# Deterministic SAP 10.3 calculator alongside the ML model; ML becomes a residual learner

**Status: Accepted.** Builds on [ADR-0007](0007-kwh-as-ml-target.md) (the SAP10 calculator is the ground truth ML approximates) and [ADR-0008](0008-physics-as-feature.md) (we already ship ~30% of a calculator as physics features). Decision point: do we keep grinding ML accuracy on `sap_score`, or do we *write the calculator* and have ML predict its residual?

## Grill outcomes (2026-05-17)

Seven open questions were resolved through a `/grill-with-docs` session before Session A. Each lands a binding scope decision for the implementation:

| # | Question | Decision |
|---|---|---|
| 0 | Domain placement | **Option B:** new term **Calculated SAP10 Performance**, parallel to Effective Performance (ML) and Lodged Performance (gov register). Effective Performance is **not** retired now; a future ADR may promote Calculated to its current role once parity is confirmed. Process named **SAP10 Calculation**. |
| 1 | PCDB heat-pump COP source for Session A | **Stub-seam.** Define `PcdbLookup` Protocol, ship `NoOpPcdbLookup` returning None, fall back to Table 4a. Session C bundles a CSV PCDB extract under `domain/sap10_calculator/tables/pcdb/data/` and implements the lookup. |
| 2 | MCS installation factors | **Boolean input on calculator inputs, default `False`.** Plumbing in Session A; no behaviour change until the input is populated. Slice 18f (separate, tracked in HANDOFF §7-D0) lifts `mcs_installed_heat_pump` from gov API → `EpcPropertyData.MainHeatingDetail` so the calculator can apply the factor on the ~1.5% of HP certs that carry it. |
| 3 | Thermal bridging | **Global y factor** (the path SAP 10.3 specifies for RdSAP-driven assessments). Per-junction Table R2 sum requires junction-count inputs the cert doesn't carry, so it is not available on the RdSAP-driven flow. |
| 4 | Living-area fraction default | **RdSAP 10 Table 27:** direct lookup from `habitable_rooms_count`. Unambiguous, one-line table. |
| 5 | Secondary-heating allocation | **SAP 10.2/10.3 Table 11** keyed on main heating type. RdSAP doesn't redefine the fraction; it identifies the type only. Forcing rule: when main is micro-CHP and Table N9 says non-zero secondary heat with no secondary specified, assume portable electric heaters. |
| 6 | Validation cohort | **Stratified random sample of 1000 certs**; report MAE per stratum. Session A success criterion = MAE ≤ 1.0 SAP-point on the **typical subset** (excluding `sap_score` ≤ 5, `sap_score` ≥ 100, multi-heating, conservatory, RIR). Global MAE reported alongside for honesty. |
| 7 | `MeasureOverrides` shape | **Rejected as phantom mid-layer.** `Sap10Calculator.calculate(epc) -> SapResult` takes a single immutable cert. A separate **MeasureApplicator** service translates Optimised Package → cert-field changes, returning the "ending state snapshot" EpcPropertyData that Plan Phase already persists. Three pure functions in chain: applicator → calculator → result. |

## Additional findings from the grill that change Session A scope

- **SAP rating formula belongs to RdSAP, not SAP 10.3.** RdSAP §19 ("RdSAP10-specific SAP rating equations referred to as EER") defines the SAP-score equation used for RdSAP-driven assessments. SAP 10.3 §13 defines the rating for new-build assessments. The cert's `energy_rating_current` was computed by RdSAP §19, so parity validation must compute against RdSAP §19, not SAP 10.3 §13.
- **RdSAP 10 (June 2025) cross-references SAP 10.2 (March 2025) for heating-system identification (Appendix A).** RdSAP was published before SAP 10.3 (Jan 2026). Until BRE updates RdSAP to reference SAP 10.3, the calculator's heating-identification logic reads SAP 10.2 Appendix A while everything else reads SAP 10.3. Keep both PDFs in `domain/sap10_calculator/docs/specs/`.
- **RdSAP Table 29 ("Heating and hot water parameters") is a 20+-entry defaulting table** that the `cascade_defaults.py` module needs to encode.
  Current scope of `rdsap_uvalues.py` is U-values only; Table 29 extends the cascade pattern to cylinder insulation, primary-pipework insulation, boiler interlock, emitter temperature, underfloor-heating routing, solar-panel parameters, heat-network defaults. Adds ~1–2 hrs to Session A (effective Session A.5 if not split).
- **MCS field exists in gov API** but is dropped by the current mapper. Slice 18f (lift `mcs_installed_heat_pump` into `EpcPropertyData`) is a prerequisite for the MCS-factor path. ~30 min slice; can ship before Session A or in parallel.

## Problem

After six slices of physics-feature work (18b/18c/18d/20a/20a.1) the ML model is at `sap_score` MAPE 3.63%, MAE 1.86 globally; per-decile MAE is 3.86 (d0) and 2.25 (d9). Each new slice now nudges d0 MAE by only ~0.05. The user's target is MAE ≤ 0.5 across all bands. The remaining error is dominated by:

1. **Catastrophic tail noise.** d0 has 3.3% of rows with `sap_score ≤ 20` (heritage / abandoned / data-anomaly homes). MAE on those rows is structurally large because the model's prediction floor is ~30 even for the worst inputs.
2. **Calculator nuance the physics features can't reach:** monthly heat balance with solar/internal gains and utilisation factor, full SAP §J hot-water variants, PCDB heat-pump overrides, dual-fuel allocation, conservatory modes, room-in-roof handling. Each of these is a deterministic line in the SAP 10.3 spec, but we model it via tree splits over input fields.

These cannot be closed by another tree feature. They require executing the calculator.

## Decision

Build a deterministic **`Sap10Calculator`** that reads `EpcPropertyData` and emits the same outputs the certificate's BRE-approved assessor software emits: `sap_score`, `co2_emissions`, `peui_raw`, `peui_ucl`, `space_heating_kwh`, `hot_water_kwh`. Target the SAP 10.3 specification (DESNZ/BRE, 13-01-2026) and the RdSAP 10 specification (BRE, 10-06-2025), both held in `domain/sap10_calculator/docs/specs/`.

The ML model is **not deprecated**.
It is repurposed as a **residual learner** against `actual_sap − calculator_sap` (and similar deltas for the other five targets). Residual distributions are much narrower than the raw target distributions (calculator is within ~1 SAP-point on 95% of typical certs, per the working hypothesis), so the ML residual head should fit the corrections with far fewer features and reach the MAE ≤ 0.5 target.

## Why now

1. **SAP 10.3 just dropped (Jan 2026).** Building against the new spec means the calculator outputs match assessor software for any cert lodged from 2026 onward. Building against SAP 10.2 (March 2025) now would need re-derivation later.
2. **The retrofit-simulation use case demands transparency.** Surveyors, building physicists, and homeowners need to see exactly which physics line (wall U×A, ventilation ACH, solar gain on south-facing windows) contributes how much heat-loss/cost. Tree-model attribution doesn't supply that. The calculator does.
3. **Roughly 30% of the calculator is already shipped.** `rdsap_uvalues.py` (Tables 6–10, 15–20, 24, 26), `sap_efficiencies.py` (Tables 4a, 4b, 32), `envelope.py` (Σ U·A + thermal bridging), partial `ventilation.py` (slice 20a tracer), partial `demand.py` (annual heat balance), `ecf.py` (Total fuel cost, ECF, log10ECF), PV credit (slice 17a), SAP §J hot-water port (slice 17b). The pivot is mostly re-platforming, not new physics.
4. **ML residual learning has a clean home for the noise.** The catastrophic-tail rows the calculator gets wrong (data anomalies, mis-described systems) are exactly where ML *should* live, because they're not closed-form solvable. Calculator + residual head is a cleaner split of responsibility than "ML approximates the deterministic spec".

## Scope of the calculator (Session A)

A full SAP 10.3 worksheet plus the data-extraction rules from RdSAP 10 Appendix S.
Module organisation:

```
domain/sap10_calculator/
  __init__.py               # Sap10Calculator entry point + SapResult dataclass
  worksheet/
    dimensions.py           # §1
    ventilation.py          # §2 + Table 5 + Appendix Q
    heat_transmission.py    # §3 + Appendix K (thermal bridging) + Tables 6–10/15–20/24/26
    hot_water.py            # §4 + Appendix J + Appendix G (FGHRS/WWHRS/PV-diverters)
    internal_gains.py       # §5 + Appendix L (lighting)
    solar_gains.py          # §6 + Tables 6d/6e
    mean_temperature.py     # §7
    climate.py              # §8 + Appendix U (region-from-postcode, monthly external temp/wind/solar)
    space_heating.py        # §9 + Appendices A/B/D/E/N (heating systems, efficiency, heat pumps)
    fuel_cost.py            # §12 + Table 32 (fuel prices) + Appendix M (PV/wind/hydro generation)
    energy_cost_rating.py   # §13 + the SAP score formula
    co2_primary_energy.py   # §14 (emissions + primary energy)
    fee.py                  # §11 Fabric Energy Efficiency
  tables/
    table_4a_4b.py          # heating-system seasonal efficiency
    table_5.py              # ventilation rate components
    table_6.py              # monthly external temp by region
    table_6d.py             # monthly solar flux by orientation by region
    table_32.py             # fuel prices
    table_R.py              # reference values (Appendix R)
  rdsap/
    appendix_s.py           # cert → calculator input mapping
    cascade_defaults.py     # the RdSAP10 "assume-typical" rules (currently in rdsap_uvalues.py)
```

The existing `domain.sap10_ml.*` modules stay where they are during Session A; they continue serving the live ML pipeline. Session B promotes them into `domain.sap10_calculator.*` once parity is reached.
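The PCDB stub-seam from grill question 1 could be sketched roughly as follows. This is a minimal illustration under assumptions, not the agreed interface: the method name `seasonal_efficiency`, the `product_id` argument, the `resolve_efficiency` helper, the `DictPcdbLookup` class, and the `TABLE_4A_DEFAULT` mapping are all hypothetical, and the numbers in it are placeholders rather than values from SAP Table 4a.

```python
from typing import Mapping, Optional, Protocol


class PcdbLookup(Protocol):
    """Seam for PCDB overrides. The method name and signature are hypothetical."""

    def seasonal_efficiency(self, product_id: str) -> Optional[float]: ...


class NoOpPcdbLookup:
    """Session A stub: never finds a PCDB record, so callers always fall back."""

    def seasonal_efficiency(self, product_id: str) -> Optional[float]:
        return None


class DictPcdbLookup:
    """Hypothetical Session C style lookup backed by an in-memory mapping."""

    def __init__(self, records: Mapping[str, float]) -> None:
        self._records = dict(records)

    def seasonal_efficiency(self, product_id: str) -> Optional[float]:
        return self._records.get(product_id)


# Placeholder stand-in for Table 4a. Illustrative numbers only, NOT spec values.
TABLE_4A_DEFAULT = {"heat_pump": 1.7, "gas_boiler": 0.8}


def resolve_efficiency(pcdb: PcdbLookup, product_id: str, system_type: str) -> float:
    """Prefer a PCDB value when one exists, otherwise use the table default."""
    override = pcdb.seasonal_efficiency(product_id)
    if override is not None:
        return override
    return TABLE_4A_DEFAULT[system_type]
```

With `NoOpPcdbLookup` the helper always returns the table default, so swapping in a real lookup later changes results only for products the PCDB extract actually lists.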
## Sap10Calculator interface

```python
from dataclasses import dataclass
from typing import Optional

# MonthlyBreakdown, ClimateData, PcdbLookup and EpcPropertyData are project
# types that are not defined in this snippet.


@dataclass(frozen=True)
class SapResult:
    sap_score: float
    energy_cost_rating: float          # alias for sap_score before band lookup
    sap_band: str                      # A-G
    co2_emissions_kgco2_per_m2: float
    peui_raw_kwh_per_m2: float
    peui_ucl_kwh_per_m2: float
    space_heating_kwh_per_yr: float
    hot_water_kwh_per_yr: float
    monthly_breakdown: MonthlyBreakdown
    intermediate: dict[str, float]     # every named worksheet quantity, for traceability


class Sap10Calculator:
    def __init__(self, climate: ClimateData, pcdb: Optional[PcdbLookup] = None) -> None: ...
    def calculate(self, epc: EpcPropertyData) -> SapResult: ...
```

`intermediate` carries every named SAP 10.3 worksheet variable (envelope conduction W/K, ventilation rate, solar gains by month, utilisation factor, heat-pump SCOP, ECF, ...) so consumers can drill down. This replaces ADR-0008's physics-as-feature columns for retrofit-simulation consumers; the ML pipeline keeps generating them as features until the residual head is trained and validated.

## Validation

Two corpora:

1. **Calculator-vs-cert parity (Session B).** Run the calculator over 1000 RdSAP 10 certs drawn by stratified random sampling from `data/ml_training/runs/2025_2026_n250000_v18a/data.parquet`. Compare `Sap10Calculator.calculate(epc).sap_score` to the cert's `energy_rating_current`. Target: within 1.0 SAP-point of the cert value on 95% of typical certs; outliers are investigated case-by-case to find spec-interpretation gaps or PCDB requirements.
2. **Residual ML head (Session C+).** Train LightGBM on `actual_sap − calculator_sap` as the target. Validate that residual MAE is materially smaller than the current 1.86 global / 3.86 d0. If residual MAE on d0 falls below 0.5, the calculator + residual approach hits the user's target.

We do **not** retire the existing ML pipeline until both validations pass.

## What this ADR does *not* change

- **The six ML targets remain those from ADR-0007.** The residual head predicts deltas against the same six quantities.
- **ADR-0008's physics-as-feature pattern stays valid for the ML residual head.** The residual head probably needs fewer features, but the cascade U-value defaults and SAP efficiency lookups remain useful as feature builders if the calculator subset alone underfits.
- **`energy_rating_current` remains excluded from features.** Same leakage rule.
- **RdSAP 10 cert-extraction rules are now first-class in the codebase.** Rules that were ad-hoc in `transform.py` move into `domain.sap10_calculator.rdsap.appendix_s`.
- **The training parquet schema continues at v2.x.** A new column `calculator_sap_score` lands as a non-breaking addition once Session A reaches parity. The schema version bumps to v3.0.0 only when the residual targets replace the raw targets, which is a coordinated AutoGluon-repo deploy per ADR-0008's cutover discipline.

## SAP 10.2 → SAP 10.3 implications

The newer spec replaces tables we already ship:

- Table 4a/4b (heating efficiencies): likely identical, verify on read.
- Table 32 (fuel prices): almost certainly different, re-derive from Appendix in 10.3.
- Table 6d (solar flux): likely identical (climate data).
- Energy cost rating formula constants: unchanged in 10.3 vs 10.2 unless DESNZ updated the deflator.

Re-derivation work is bounded (a few hundred numbers across tables), and the `*_table_*.py` modules already have a clean shape for the cutover.

## Session plan (carried from HANDOFF §High-value next slices)

- **Session A (3–4 hrs):** Implement ventilation per §2 (replacing the slice-20a tracer), 12-month heat balance per §6 + §8 + Appendix U, solar gains per §6 + Table 6d, internal gains per §5 + Appendix L, utilisation factor per §6.4, mean internal temperature per §7. End of Session A: `Sap10Calculator.calculate(epc) -> SapResult` runs on typical certs.
- **Session B (3–4 hrs):** Edge cases: conservatory modes, room-in-roof handling, multi-heating allocation, dual fuel, secondary heating fraction (Appendix A).
  Run parity validation across 1000 certs. Iterate on spec-interpretation gaps. End of Session B: 95% of typical certs within 1 SAP-point of cert value.
- **Session C (2–3 hrs):** PCDB integration for boiler + heat-pump overrides (Appendices D, N). Residual-head training on `actual_sap − calculator_sap`. ADR-0010 if any non-trivial calculator/ML hybrid pattern emerges that ADR-0009 didn't anticipate.

## Caveats

- **Spec interpretation will need product input.** 5–10 questions per session on edge cases: multi-heating split logic, secondary heating threshold rules, PCDB-vs-Table-4b precedence, etc. These are not in the spec text and are real business decisions.
- **No reference BRE Python port is currently known.** If one surfaces, porting accelerates. If not, every line of the calculator is implemented from the spec PDF directly, with tests.
- **PCDB (Product Characteristics Database).** SAP 10.3 references the PCDB throughout for boiler/HP efficiency overrides. Without PCDB integration, the calculator carries a ~1 SAP-point penalty on PCDB-listed equipment. Defer to Session C.
- **The current ML pipeline keeps running through all three sessions.** No deprecation until residual validation lands. The branch `ara-backend-design-prd` (current ML grind) and the calculator work proceed in parallel.

## Consequences

- A new top-level domain area `domain.sap10_calculator.*` is introduced; over Sessions B/C it absorbs `domain.sap10_ml.{envelope,demand,ecf,rdsap_uvalues,sap_efficiencies,ventilation}.py`. The ML transform stops shipping those as standalone features once the residual head takes over.
- The codebase carries two SAP outputs: cert-reported `sap_score` (ground truth at training time) and calculator-emitted `sap_score` (ground truth at inference time for any RdSAP cert input). The product layer chooses; for "score this hypothetical post-retrofit state", the calculator wins.
- The deterministic calculator is **version-bound to SAP 10.3.** A future SAP 10.4 is a calculator MAJOR bump and an ADR. The ML residual head is SAP-version-agnostic only insofar as the residual distribution it learns stays stationary; in practice a spec bump retrains the residual head.
- Spec PDFs live in `domain/sap10_calculator/docs/specs/` (this repo). The repo now carries the canonical reference for what the calculator computes. License: SAP 10.3 © Crown copyright 2026; RdSAP 10 © BRE; both are public-interest references for SAP-compliant software, included for traceability.
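The parity check described under Validation, together with the typical-subset rule from the grill outcomes, could be sketched as below. It is an illustrative outline only: the `ParityRow` record, its `stratum` and `is_edge_case` fields, and the helper names are hypothetical, and the edge-case flag (multi-heating, conservatory, RIR) is assumed to be computed upstream.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class ParityRow:
    stratum: str            # hypothetical stratum label for the sampled cert
    cert_sap: float         # the cert's energy_rating_current
    calculator_sap: float   # Sap10Calculator output for the same cert
    is_edge_case: bool      # multi-heating / conservatory / RIR, assumed precomputed


def is_typical(row: ParityRow) -> bool:
    """Typical subset: excludes sap_score <= 5, sap_score >= 100 and edge cases."""
    return 5 < row.cert_sap < 100 and not row.is_edge_case


def residual(row: ParityRow) -> float:
    """Residual-head target: actual_sap minus calculator_sap."""
    return row.cert_sap - row.calculator_sap


def mae(rows: Iterable[ParityRow]) -> float:
    """Mean absolute error between cert and calculator scores."""
    rows = list(rows)
    if not rows:
        raise ValueError("cannot compute MAE over zero rows")
    return sum(abs(residual(r)) for r in rows) / len(rows)


def mae_by_stratum(rows: Iterable[ParityRow]) -> Dict[str, float]:
    """Group rows by stratum and report MAE for each group."""
    groups: Dict[str, List[ParityRow]] = defaultdict(list)
    for row in rows:
        groups[row.stratum].append(row)
    return {name: mae(members) for name, members in groups.items()}


def parity_report(rows: Iterable[ParityRow]) -> Dict[str, object]:
    """Typical-subset MAE (the success criterion) alongside the global MAE."""
    rows = list(rows)
    typical = [r for r in rows if is_typical(r)]
    return {
        "typical_mae": mae(typical),
        "global_mae": mae(rows),
        "per_stratum_mae": mae_by_stratum(rows),
    }
```

Reporting the global figure next to the typical-subset figure keeps the tail rows visible: those are the rows the residual head is expected to absorb.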