# Sap10Calculator parity probe: findings as of 2026-05-18

100-cert random sample from `data/ml_training/runs/2025_2026_n250000_v18a/data.parquet`, filtered to certs with a SAP score of 20-95 (typical band). 0 errors: the calculator runs end-to-end on every cert.

## Headline

| Metric | Value |
|---|---|
| MAE | 8.41 SAP-points |
| RMSE | 13.98 |
| Bias | -2.65 (slight under-prediction) |
| Within ±1 | 18.0% |
| Within ±3 | 36.0% |
| Within ±5 | 57.0% |
| Within ±10 | 84.0% |
| Worst residual | -56 SAP-points |

The Session B success criterion is MAE ≤ 1.0 on the typical subset; at 8.41 we're about 8× that on the first pass, which is in line with ADR-0009's expectation that the first run shakes out spec-interpretation gaps.

## Dominant failure shape: flats and bungalows under-predicted

12 of the 15 worst residuals are flats or bungalows (8 flats, 4 bungalows), and all but one of those are under-predictions.

**Pattern**: the calculator charges floor + roof heat loss to dwellings that don't have exposed floor / roof surfaces (mid-floor flats, top-floor flats with party ceiling, etc.).

Worst 15 by absolute residual (residual = predicted − actual):

| Cert | Actual | Predicted | Residual | TFA (m²) | Dwelling |
|---|---|---|---|---|---|
| 0320-2756-7670-2196-2035 | 78 | 22 | -56 | 57 | Semi-detached bungalow |
| 0036-1125-8600-0165-2206 | 63 | 18 | -45 | 42 | Mid-floor flat |
| 0340-2394-5510-2925-4421 | 75 | 35 | -40 | 73 | Mid-floor flat |
| 9360-2179-9590-2495-2615 | 78 | 39 | -39 | 54 | Ground-floor flat |
| 0036-0529-1500-0700-8276 | 75 | 36 | -39 | 47 | Top-floor flat |
| 0350-2182-9590-2526-7841 | 43 | 4 | -39 | 119 | Top-floor flat |
| 2148-3061-6204-0016-7204 | 81 | 44 | -37 | 67 | Mid-floor flat |
| 0800-1364-0922-4522-3963 | 71 | 37 | -34 | 70 | Detached bungalow |
| 2110-6453-5050-8205-9605 | 63 | 31 | -32 | 43 | Ground-floor maisonette |
| 2903-8339-6962-6004-0725 | 75 | 47 | -28 | 11 | Top-floor flat |
| 0320-2850-3380-2125-1661 | 70 | 48 | -22 | 45 | Semi-detached bungalow |
| 8035-9023-1500-0237-3226 | 43 | 63 | +20 | 64 | Detached bungalow |
| 9590-7751-0022-0599-3953 | 51 | 69 | +18 | 74 | Detached house |
| 2118-1198-2619-1711-7960 | 62 | 46 | -16 | 42 | Mid-floor flat |
| 3336-3822-5500-0437-9202 | 70 | 59 | -11 | 73 | Mid-floor maisonette |

## Session B iteration backlog (priority order)

1. **S-B-flat-surfaces**: Map `dwelling_type` to exposed floor/roof flags. Mid/top flats lose their `u_floor × ground_floor_area`; mid/ground flats lose their `u_roof × top_floor_area`. Expected impact: closes most of the −20 to −56 residuals.
2. **S-B-heating-eff-fallback**: When `sap_main_heating_code` is None, fall back through `main_heating_category` + age band to a modern-condensing-boiler efficiency, not the legacy 0.80. ~28% of our 100-cert sample had a null code with category=2.
3. **S-B-electric-storage-tariff**: Electric storage heaters (codes 401-409) should price space-heating fuel at the Economy-7 low rate (Table 32 code 31, ~5.5 p/kWh), not at the standard rate (code 30). This is a 2× cost reduction on those certs.
4. **S-B-wall-uvalue-cascade-review**: Worst non-flat residuals suggest the wall U-value cascade is too conservative for recently-built / well-insulated stock. Review `domain.ml.rdsap_uvalues.u_wall` against RdSAP 10 Table 5.
5. **S-B-bungalow-investigation**: Bungalow residuals don't fit the flat-surfaces pattern (bungalows have full floor+roof). Hypothesis: the thermal-bridging y-factor + storey-count interaction over-counts the envelope. Probe specifically before deciding.
6. **S-B-pump-fan-default**: We default to 130 kWh/yr; SAP 10.3 Table 4f gives a higher figure for systems with mechanical ventilation. Marginal but consistent.

## How to reproduce

```bash
python adhoc/sap_calculator/probe_n.py          # 100 certs, seed=7
python adhoc/sap_calculator/probe_n.py 500 13   # bigger sample
python adhoc/sap_calculator/probe_worst.py      # detailed cert-by-cert dump
```

`probe_n.py` runs in ~80s. Errors: 0/100. The mapper handles every real cert shape encountered.
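
For reference, the S-B-flat-surfaces item in the backlog could be sketched roughly as follows. This is a minimal illustration, not the calculator's actual code: `ExposedSurfaces`, `exposed_surfaces` and `floor_roof_heat_loss` are hypothetical names, and the prefix rule assumes `dwelling_type` strings look like the ones in the worst-15 table. It also treats every top-floor flat as having an exposed roof, so the party-ceiling case would need an extra input.

```python
from typing import NamedTuple


class ExposedSurfaces(NamedTuple):
    """Hypothetical flags: does the dwelling lose heat through floor / roof?"""
    floor: bool
    roof: bool


def exposed_surfaces(dwelling_type: str) -> ExposedSurfaces:
    """Infer exposure from the dwelling-type prefix (assumed label format)."""
    label = dwelling_type.strip().lower()
    if label.startswith("mid-floor"):
        # Dwellings above and below: neither surface is exposed.
        return ExposedSurfaces(floor=False, roof=False)
    if label.startswith("top-floor"):
        # Simplification: ignores top-floor flats with a party ceiling.
        return ExposedSurfaces(floor=False, roof=True)
    if label.startswith("ground-floor"):
        return ExposedSurfaces(floor=True, roof=False)
    # Houses and bungalows keep both terms.
    return ExposedSurfaces(floor=True, roof=True)


def floor_roof_heat_loss(
    dwelling_type: str,
    u_floor: float,
    ground_floor_area: float,
    u_roof: float,
    top_floor_area: float,
) -> float:
    """Floor + roof fabric heat loss (W/K), dropping unexposed surfaces."""
    exposed = exposed_surfaces(dwelling_type)
    floor_loss = u_floor * ground_floor_area if exposed.floor else 0.0
    roof_loss = u_roof * top_floor_area if exposed.roof else 0.0
    return floor_loss + roof_loss


if __name__ == "__main__":
    # Illustrative U-values and areas only, not taken from any cert.
    for kind in ("Detached house", "Ground-floor flat", "Mid-floor flat", "Top-floor flat"):
        print(kind, floor_roof_heat_loss(kind, 0.5, 50.0, 0.2, 50.0))
```

With a rule of this shape, a mid-floor flat contributes no floor or roof term at all, which is the direction needed to lift the under-predicted flats in the worst-15 table. Maisonettes are handled by the same prefix rule here, which is an assumption worth checking against real certs.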