Promotes ADR-0009 from Proposed to Accepted after the grill-with-docs session resolved all seven open questions. Bundles the SAP 10.3 and RdSAP 10 specifications under docs/sap-spec/ plus a calculator design sketch (module layout, monthly-loop pseudo-code, status table).
CONTEXT.md adds three new domain terms parallel to existing performance language:
- Calculated SAP10 Performance (parallel to Effective / Lodged)
- SAP10 Calculation (process; implemented by Sap10Calculator)
- Measure Application (process; implemented by MeasureApplicator)
The ML pipeline is NOT retired: it stays as the residual head once the calculator reaches parity in Session B. ADR-0009 §"Grill outcomes" carries the seven binding scope decisions plus three Session-A-scope changes discovered during the grill (RdSAP §19 EER formula, SAP 10.2 Appendix A cross-reference, RdSAP Table 29 cascade defaults).
Deterministic SAP 10.3 calculator alongside the ML model; ML becomes a residual learner
Status: Accepted. Builds on ADR-0007 (the SAP10 calculator is the ground truth ML approximates) and ADR-0008 (we already ship ~30% of a calculator as physics features). Decision point: do we keep grinding ML accuracy on sap_score, or do we write the calculator and have ML predict its residual?
Grill outcomes (2026-05-17)
Seven open questions resolved through a /grill-with-docs session before Session A. Each lands a binding scope decision for the implementation:
| # | Question | Decision |
|---|---|---|
| 0 | Domain placement | Option B — new term Calculated SAP10 Performance, parallel to Effective Performance (ML) and Lodged Performance (gov register). Effective Performance is not retired now; a future ADR may promote Calculated to its current role once parity is confirmed. Process named SAP10 Calculation. |
| 1 | PCDB heat-pump COP source for Session A | Stub-seam. Define PcdbLookup Protocol, ship NoOpPcdbLookup returning None, fall back to Table 4a. Session C bundles a CSV PCDB extract under docs/sap-spec/ and implements the lookup. |
| 2 | MCS installation factors | Boolean input on calculator inputs, default False. Plumbing in Session A; no behaviour change until the input is populated. Slice 18f (separate, tracked in HANDOFF §7-D0) lifts mcs_installed_heat_pump from gov API → EpcPropertyData.MainHeatingDetail so calculator can apply the factor on the ~1.5% of HP certs that carry it. |
| 3 | Thermal bridging | Global y factor (the path SAP 10.3 specifies for RdSAP-driven assessments). Per-junction Table R2 sum requires junction-count inputs the cert doesn't carry — not available on the RdSAP-driven flow. |
| 4 | Living-area fraction default | RdSAP 10 Table 27 — direct lookup from habitable_rooms_count. Unambiguous, one-line table. |
| 5 | Secondary-heating allocation | SAP 10.2/10.3 Table 11 keyed on main heating type. RdSAP doesn't redefine the fraction — it identifies the type only. Forcing rule: when main is micro-CHP and Table N9 says non-zero secondary heat with no secondary specified, assume portable electric heaters. |
| 6 | Validation cohort | Stratified random of 1000 certs; report MAE per stratum. Session A success criterion = MAE ≤ 1.0 SAP-point on the typical subset (excluding sap_score ≤ 5, sap_score ≥ 100, multi-heating, conservatory, RIR). Global MAE reported alongside for honesty. |
| 7 | MeasureOverrides shape | Rejected as phantom mid-layer. Sap10Calculator.calculate(epc) -> SapResult takes a single immutable cert. A separate MeasureApplicator service translates Optimised Package → cert-field changes, returning the "ending state snapshot" EpcPropertyData that Plan Phase already persists. Three pure functions in chain: applicator → calculator → result. |
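The stub-seam in decision 1 can be sketched as below. This is a minimal illustration, not the shipped interface: the method name `heat_pump_cop`, the `product_index` argument, the helper `resolve_heat_pump_cop`, and the placeholder COP are all assumptions introduced here. Only `PcdbLookup`, `NoOpPcdbLookup`, the `None` return, and the Table 4a fallback come from the decision itself.

```python
from typing import Optional, Protocol


class PcdbLookup(Protocol):
    """Seam for PCDB heat-pump efficiency lookups (method name is hypothetical)."""

    def heat_pump_cop(self, product_index: str) -> Optional[float]:
        ...


class NoOpPcdbLookup:
    """Session A stub: knows no products, so every lookup returns None."""

    def heat_pump_cop(self, product_index: str) -> Optional[float]:
        return None


def resolve_heat_pump_cop(
    pcdb: PcdbLookup, product_index: str, table_4a_cop: float
) -> float:
    """Prefer the PCDB value; fall back to the Table 4a default when absent."""
    cop = pcdb.heat_pump_cop(product_index)
    return table_4a_cop if cop is None else cop


# Placeholder number for illustration only, NOT a value read from Table 4a.
PLACEHOLDER_TABLE_4A_COP = 2.5

cop = resolve_heat_pump_cop(
    NoOpPcdbLookup(), "hypothetical-index", PLACEHOLDER_TABLE_4A_COP
)
```

Because the calculator would depend only on the protocol, a Session C implementation backed by the CSV extract could replace the stub without touching the fallback logic.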
Additional findings from the grill that change Session A scope
- SAP rating formula belongs to RdSAP, not SAP 10.3. RdSAP §19 ("RdSAP10-specific SAP rating equations referred to as EER") defines the SAP-score equation used for RdSAP-driven assessments. SAP 10.3 §13 defines the rating for new-build assessments. The cert's `energy_rating_current` was computed by RdSAP §19, so parity validation must compute against RdSAP §19, not SAP 10.3 §13.
- RdSAP 10 (June 2025) cross-references SAP 10.2 (March 2025) for heating-system identification (Appendix A). RdSAP was published before SAP 10.3 (Jan 2026). Until BRE updates RdSAP to reference SAP 10.3, the calculator's heating-identification logic reads SAP 10.2 Appendix A while everything else reads SAP 10.3. Keep both PDFs in `docs/sap-spec/`.
- RdSAP Table 29 ("Heating and hot water parameters") is a 20+-entry defaulting table that the `cascade_defaults.py` module needs to encode. Current scope of `rdsap_uvalues.py` is U-values only; Table 29 extends the cascade pattern to cylinder insulation, primary-pipework insulation, boiler interlock, emitter temperature, underfloor-heating routing, solar-panel parameters, heat-network defaults. Adds ~1-2 hrs to Session A (effective Session A.5 if not split).
- MCS field exists in gov API but is dropped by the current mapper. Slice 18f (lift `mcs_installed_heat_pump` into `EpcPropertyData`) is a prerequisite for the MCS-factor path. ~30 min slice; can ship before Session A or in parallel.
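The cascade pattern that Table 29 would extend can be sketched as "use the lodged value when the cert carries one, otherwise read the tabulated default". The sketch below is an assumption-heavy illustration: keying defaults by age band, the parameter name, the function `cascade`, and every number are placeholders invented here, not entries from RdSAP Table 29.

```python
from typing import Mapping, Optional, Tuple

# Placeholder defaults keyed by (parameter, age_band).
# These numbers are illustrative only and are NOT taken from RdSAP Table 29.
PLACEHOLDER_DEFAULTS: Mapping[Tuple[str, str], float] = {
    ("cylinder_insulation_mm", "older"): 12.0,
    ("cylinder_insulation_mm", "newer"): 25.0,
}


def cascade(
    parameter: str,
    lodged: Optional[float],
    age_band: str,
    defaults: Mapping[Tuple[str, str], float],
) -> float:
    """Return the lodged value when present, else the tabulated default."""
    if lodged is not None:
        return lodged
    try:
        return defaults[(parameter, age_band)]
    except KeyError:
        # Fail loudly rather than invent a value the table does not define.
        raise LookupError(
            f"no default for {parameter!r} in age band {age_band!r}"
        ) from None


lodged_wins = cascade("cylinder_insulation_mm", 50.0, "older", PLACEHOLDER_DEFAULTS)
default_used = cascade("cylinder_insulation_mm", None, "older", PLACEHOLDER_DEFAULTS)
```

Raising on a missing key keeps gaps in the encoded table visible during parity validation instead of silently substituting a number.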
Problem
After six slices of physics-feature work (18b/18c/18d/20a/20a.1) the ML model is at sap_score MAPE 3.63%, MAE 1.86 globally; per-decile MAE 3.86 (d0) and 2.25 (d9). Each new slice now nudges d0 MAE by ~0.05. User's target is MAE ≤ 0.5 across all bands. The remaining error is dominated by:
- Catastrophic tail noise: d0 has 3.3% of rows with `sap_score ≤ 20` (heritage / abandoned / data-anomaly homes). MAE on those rows is structurally large because the model's prediction floor is ~30 even for the worst inputs.
- Calculator nuance the physics features can't reach: monthly heat balance with solar/internal gains and utilisation factor, full SAP §J hot-water variants, PCDB heat-pump overrides, dual-fuel allocation, conservatory modes, room-in-roof handling. Each of these is a deterministic line in the SAP10.3 spec but we model it via tree splits over input fields.
These cannot be closed by another tree feature. They require executing the calculator.
Decision
Build a deterministic Sap10Calculator that reads EpcPropertyData and emits the same outputs the certificate's BRE-approved assessor software emits: sap_score, co2_emissions, peui_raw, peui_ucl, space_heating_kwh, hot_water_kwh. Target the SAP 10.3 specification (DESNZ/BRE, 13-01-2026) and the RdSAP 10 specification (BRE, 10-06-2025), both held in docs/sap-spec/.
The ML model is not deprecated. It is repurposed as a residual learner against actual_sap − calculator_sap (and similar deltas for the other five targets). Residual distributions are much narrower than the raw target distributions (calculator is within ~1 SAP-point on 95% of typical certs, per the working hypothesis), so the ML residual head should fit the corrections with far fewer features and reach the MAE ≤ 0.5 target.
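The residual framing amounts to two lines of arithmetic: the training target is `actual_sap − calculator_sap`, and the served prediction is the calculator output plus the predicted residual. A minimal sketch follows; the function names and the three sample scores are made up for illustration and are not drawn from the training data.

```python
from typing import List, Sequence


def residual_targets(
    actual_sap: Sequence[float], calculator_sap: Sequence[float]
) -> List[float]:
    """Training target for the residual head: actual minus calculator."""
    return [a - c for a, c in zip(actual_sap, calculator_sap)]


def corrected_scores(
    calculator_sap: Sequence[float], predicted_residual: Sequence[float]
) -> List[float]:
    """Served prediction: deterministic score plus the learned correction."""
    return [c + r for c, r in zip(calculator_sap, predicted_residual)]


# Illustrative numbers only. The third row mimics a tail cert the
# calculator gets badly wrong, which is where the residual head earns its keep.
actual = [62.0, 71.0, 18.0]
calculated = [61.0, 71.5, 27.0]

targets = residual_targets(actual, calculated)
# A perfect residual model would recover the lodged scores exactly.
recovered = corrected_scores(calculated, targets)
```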
Why now
- SAP 10.3 just dropped (Jan 2026). Building against the new spec means the calculator outputs match assessor software for any cert lodged from 2026 onward. Building against SAP 10.2 (March 2025) now would need re-derivation later.
- The retrofit-simulation use case demands transparency. Surveyors, building physicists, and homeowners need to see exactly which physics line — wall U×A, ventilation ACH, solar gain on south-facing windows — contributes how much heat-loss/cost. Tree-model attribution doesn't supply that. Calculator does.
- 30% of the calculator is already shipped. `rdsap_uvalues.py` (Tables 6–10, 15–20, 24, 26), `sap_efficiencies.py` (Tables 4a, 4b, 32), `envelope.py` (Σ U·A + thermal bridging), partial `ventilation.py` (slice 20a tracer), partial `demand.py` (annual heat balance), `ecf.py` (Total fuel cost, ECF, log10ECF), PV credit (slice 17a), SAP §J hot-water port (slice 17b). The pivot is mostly re-platforming, not new physics.
- ML residual learning has a clean home for the noise. The catastrophic-tail rows the calculator gets wrong (data anomalies, mis-described systems) are exactly where ML should live, because they're not closed-form solvable. Calculator + residual head is a cleaner split of responsibility than "ML approximates the deterministic spec".
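To make the "which physics line contributes how much" point concrete, the envelope term (Σ U·A plus thermal bridging via a global y factor, as in grill decision 3) can be sketched as follows. The `Element` class, the function name, and all numbers are hypothetical, and every listed element is assumed to be exposed. This is not the code in `envelope.py`.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class Element:
    name: str
    u_value: float  # W/m²K
    area: float     # m², assumed to be exposed area


def fabric_heat_loss(elements: Sequence[Element], y_factor: float) -> Dict[str, float]:
    """Per-line fabric heat loss in W/K: conduction per element plus bridging."""
    lines = {e.name: e.u_value * e.area for e in elements}
    # Global y factor: bridging scales with total exposed area.
    lines["thermal_bridging"] = y_factor * sum(e.area for e in elements)
    lines["total"] = sum(lines.values())
    return lines


# Illustrative inputs only; the y factor here is not a value from the spec.
breakdown = fabric_heat_loss(
    [Element("walls", 1.5, 80.0), Element("roof", 0.4, 50.0)],
    y_factor=0.15,
)
```

Returning the per-element lines alongside the total is what lets a consumer attribute heat loss to a specific wall or roof, which tree-model attribution cannot do.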
Scope of the calculator (Session A)
A full SAP 10.3 worksheet plus the data-extraction rules from RdSAP 10 Appendix S. Module organisation:

    packages/domain/src/domain/sap/
      __init__.py                # Sap10Calculator entry point + SapResult dataclass
      worksheet/
        dimensions.py            # §1
        ventilation.py           # §2 + Table 5 + Appendix Q
        heat_transmission.py     # §3 + Appendix K (thermal bridging) + Tables 6–10/15–20/24/26
        hot_water.py             # §4 + Appendix J + Appendix G (FGHRS/WWHRS/PV-diverters)
        internal_gains.py        # §5 + Appendix L (lighting)
        solar_gains.py           # §6 + Tables 6d/6e
        mean_temperature.py      # §7
        climate.py               # §8 + Appendix U (region-from-postcode, monthly external temp/wind/solar)
        space_heating.py         # §9 + Appendices A/B/D/E/N (heating systems, efficiency, heat pumps)
        fuel_cost.py             # §12 + Table 32 (fuel prices) + Appendix M (PV/wind/hydro generation)
        energy_cost_rating.py    # §13 + the SAP score formula
        co2_primary_energy.py    # §14 (emissions + primary energy)
        fee.py                   # §11 Fabric Energy Efficiency
      tables/
        table_4a_4b.py           # heating-system seasonal efficiency
        table_5.py               # ventilation rate components
        table_6.py               # monthly external temp by region
        table_6d.py              # monthly solar flux by orientation by region
        table_32.py              # fuel prices
        table_R.py               # reference values (Appendix R)
      rdsap/
        appendix_s.py            # cert → calculator input mapping
        cascade_defaults.py      # the RdSAP10 "assume-typical" rules (currently in rdsap_uvalues.py)

The existing domain.ml.* modules stay where they are during Session A; they continue serving the live ML pipeline. Session B promotes them into domain.sap.* once parity is reached.
Sap10Calculator interface

    from __future__ import annotations  # annotations stay unevaluated strings

    from dataclasses import dataclass
    from typing import Optional

    # MonthlyBreakdown, ClimateData, PcdbLookup and EpcPropertyData are
    # assumed to be importable from the domain package.


    @dataclass(frozen=True)
    class SapResult:
        sap_score: float
        energy_cost_rating: float  # alias for sap_score before band lookup
        sap_band: str  # A-G
        co2_emissions_kgco2_per_m2: float
        peui_raw_kwh_per_m2: float
        peui_ucl_kwh_per_m2: float
        space_heating_kwh_per_yr: float
        hot_water_kwh_per_yr: float
        monthly_breakdown: MonthlyBreakdown
        intermediate: dict[str, float]  # every named worksheet quantity, for traceability


    class Sap10Calculator:
        def __init__(self, climate: ClimateData, pcdb: Optional[PcdbLookup] = None) -> None: ...
        def calculate(self, epc: EpcPropertyData) -> SapResult: ...

intermediate carries every named SAP10.3 worksheet variable (envelope conduction W/K, ventilation rate, solar gains by month, utilisation factor, heat-pump SCOP, ECF, ...) so consumers can drill down. This replaces ADR-0008's physics-as-feature columns for retrofit-simulation consumers; the ML pipeline keeps generating them as features until the residual head is trained and validated.
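The applicator → calculator → result chain from grill decision 7 can be sketched with stand-in types. `Cert`, `Result`, `apply_measures`, and `calculate` below are hypothetical simplifications of `EpcPropertyData`, `SapResult`, `MeasureApplicator`, and `Sap10Calculator.calculate`; the single-line physics is a placeholder so the example runs.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class Cert:
    """Hypothetical stand-in for EpcPropertyData: immutable cert fields."""
    wall_u_value: float
    wall_area_m2: float


@dataclass(frozen=True)
class Result:
    """Hypothetical stand-in for SapResult, with an intermediate dict."""
    fabric_loss_w_per_k: float
    intermediate: Dict[str, float]


def apply_measures(cert: Cert, changes: Dict[str, float]) -> Cert:
    """Stand-in for MeasureApplicator: returns a new cert, never mutates."""
    return replace(cert, **changes)


def calculate(cert: Cert) -> Result:
    """Stand-in for the calculator: a pure function of one immutable cert."""
    wall_loss = cert.wall_u_value * cert.wall_area_m2
    return Result(wall_loss, {"wall_conduction_w_per_k": wall_loss})


before = Cert(wall_u_value=1.5, wall_area_m2=80.0)
after = apply_measures(before, {"wall_u_value": 0.3})  # e.g. wall insulation

saving = (
    calculate(before).intermediate["wall_conduction_w_per_k"]
    - calculate(after).intermediate["wall_conduction_w_per_k"]
)
```

Because the calculator only ever sees a complete cert, the post-retrofit snapshot can be persisted and re-scored later without replaying the measure list.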
Validation
Two corpora:
- Calculator-vs-cert parity (Session B). Run the calculator over 1000 randomly-sampled RdSAP-10 certs from `data/ml_training/runs/2025_2026_n250000_v18a/data.parquet`. Compare `Sap10Calculator.calculate(epc).sap_score` to the cert's `energy_rating_current`. Target: MAE ≤ 1.0 on 95% of certs; outliers investigated case-by-case to find spec-interpretation gaps or PCDB requirements.
- Residual ML head (Session C+). Train LightGBM on `actual_sap − calculator_sap` as the target. Validate that residual MAE is materially smaller than the current 1.86 global / 3.86 d0. If residual MAE on d0 falls below 0.5, the calculator + residual approach hits the user's target.
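The parity report (per-stratum MAE, with the typical subset from grill decision 6 reported next to the global figure) can be sketched in plain Python. The `Row` shape, the `atypical` flag standing in for multi-heating / conservatory / RIR, and the four sample rows are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Row:
    stratum: str
    cert_sap: float   # the cert's lodged energy_rating_current
    calc_sap: float   # calculator output for the same cert
    atypical: bool = False  # multi-heating, conservatory or RIR (flagged upstream)


def is_typical(row: Row) -> bool:
    """Typical subset: drop sap_score <= 5, sap_score >= 100 and flagged certs."""
    return 5 < row.cert_sap < 100 and not row.atypical


def mae_by_stratum(rows: Iterable[Row]) -> Dict[str, float]:
    """Mean absolute calculator-vs-cert error for each stratum."""
    errors: Dict[str, List[float]] = defaultdict(list)
    for row in rows:
        errors[row.stratum].append(abs(row.cert_sap - row.calc_sap))
    return {stratum: sum(e) / len(e) for stratum, e in errors.items()}


# Made-up rows, not real certs.
rows = [
    Row("d0", 18.0, 20.0),
    Row("d0", 4.0, 30.0),                  # excluded from typical: score <= 5
    Row("d5", 62.0, 61.5),
    Row("d5", 70.0, 70.5, atypical=True),  # excluded from typical: flagged
]

global_mae = mae_by_stratum(rows)
typical_mae = mae_by_stratum(r for r in rows if is_typical(r))
```

Reporting both dictionaries side by side keeps the tail visible while the success criterion is judged on the typical subset.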
We do not retire the existing ML pipeline until both validations pass.
What this ADR does not change
- The six ML targets remain those from ADR-0007. The residual head predicts deltas against the same six quantities.
- ADR-0008's physics-as-feature pattern stays valid for the ML residual head. The residual head probably needs fewer features, but the cascade U-value defaults and SAP efficiency lookups remain useful as feature builders if the calculator subset alone underfits.
- `energy_rating_current` remains excluded from features. Same leakage rule.
- RdSAP 10 cert-extraction rules are now first-class in the codebase. Rules that were ad-hoc in `transform.py` move into `domain.sap.rdsap.appendix_s`.
- The training parquet schema continues at v2.x. A new column `calculator_sap_score` lands as a non-breaking addition once Session A reaches parity. The schema version bumps to v3.0.0 only when the residual targets replace the raw targets; that bump is a coordinated AutoGluon-repo deploy, per ADR-0008's cutover discipline.
SAP 10.2 → SAP 10.3 implications
The newer spec replaces tables we already ship:
- Table 4a/4b (heating efficiencies) — likely identical, verify on read.
- Table 32 (fuel prices) — almost certainly different, re-derive from Appendix in 10.3.
- Table 6d (solar flux) — likely identical (climate data).
- Energy cost rating formula constants — unchanged in 10.3 vs 10.2 unless DESNZ updated the deflator.
Re-derivation work is bounded — a few hundred numbers across tables — and the *_table_*.py modules already have a clean shape for the cutover.
Session plan (carried from HANDOFF §High-value next slices)
- Session A (3–4 hrs): Implement ventilation per §2 (replacing the slice-20a tracer), 12-month heat balance per §6 + §8 + Appendix U, solar gains per §6 + Table 6d, internal gains per §5 + Appendix L, utilisation factor per §6.4, mean internal temperature per §7. End of Session A: `Sap10Calculator.calculate(epc) -> SapResult` runs on typical certs.
- Session B (3–4 hrs): Edge cases: conservatory modes, room-in-roof handling, multi-heating allocation, dual fuel, secondary heating fraction (Appendix A). Run parity validation across 1000 certs. Iterate on spec-interpretation gaps. End of Session B: 95% of typical certs within 1 SAP-point of cert value.
- Session C (2–3 hrs): PCDB integration for boiler + heat-pump overrides (Appendices D, N). Residual-head training on `actual_sap − calculator_sap`. ADR-0010 if any non-trivial calculator/ML hybrid pattern emerges that ADR-0009 didn't anticipate.
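The monthly heat balance at the core of Session A can be sketched as below. It assumes the utilisation-factor form used in earlier SAP worksheets (γ = gains / loss, a = 1 + τ/15, η = (1 − γ^a) / (1 − γ^(a+1))) and the 0.024 W-to-kWh/day conversion. The exact constants, rounding rules, and section numbers should be checked against the SAP 10.3 PDF in docs/sap-spec/ before anything like this is relied on. Function and parameter names are hypothetical.

```python
import math


def utilisation_factor(gains_w: float, loss_w: float, time_constant_h: float) -> float:
    """Fraction of gains that usefully offsets heat loss (assumed SAP form)."""
    if gains_w <= 0:
        return 1.0
    if loss_w <= 0:
        return 0.0  # nothing to offset, so no gains are useful
    gamma = gains_w / loss_w
    a = 1.0 + time_constant_h / 15.0
    if math.isclose(gamma, 1.0):
        return a / (a + 1.0)  # limit of the general expression at gamma = 1
    return (1.0 - gamma**a) / (1.0 - gamma ** (a + 1.0))


def monthly_space_heat_kwh(
    heat_loss_coeff_w_per_k: float,
    internal_temp_c: float,
    external_temp_c: float,
    gains_w: float,
    time_constant_h: float,
    days: int,
) -> float:
    """Space-heating requirement for one month, floored at zero."""
    loss_w = heat_loss_coeff_w_per_k * (internal_temp_c - external_temp_c)
    eta = utilisation_factor(gains_w, loss_w, time_constant_h)
    # 0.024 converts average watts over a day into kWh per day.
    return max(0.0, 0.024 * (loss_w - eta * gains_w) * days)


# Illustrative January: 200 W/K dwelling, 19 °C inside, 4 °C outside.
january_kwh = monthly_space_heat_kwh(200.0, 19.0, 4.0, 600.0, 15.0, 31)
```

Summing this over twelve months, with monthly external temperature and gains supplied by the climate and gains modules, would give the annual figure that the partial `demand.py` currently approximates with a single annual balance.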
Caveats
- Spec interpretation will need product input. 5–10 questions per session on edge cases: multi-heating split logic, secondary heating threshold rules, PCDB-vs-Table-4b precedence, etc. These are not in the spec text and are real business decisions.
- No reference BRE Python port is currently known. If one surfaces, porting accelerates. If not, every line of the calculator is implemented from the spec PDF directly, with tests.
- PCDB (Product Characteristics Database). SAP 10.3 references the PCDB throughout for boiler/HP efficiency overrides. Without PCDB integration, calculator carries ~1 SAP-point penalty on PCDB-listed equipment. Defer to Session C.
- The current ML pipeline keeps running through all three sessions. No deprecation until residual validation lands. The branch `ara-backend-design-prd` (current ML grind) and the calculator work proceed in parallel.
Consequences
- A new top-level domain area `domain.sap.*` is introduced; over Sessions B/C it absorbs `domain.ml.{envelope,demand,ecf,rdsap_uvalues,sap_efficiencies,ventilation}.py`. The ML transform stops shipping those as standalone features once the residual head takes over.
- The codebase carries two SAP outputs: cert-reported `sap_score` (ground truth at training time) and calculator-emitted `sap_score` (ground truth at inference time for any RdSAP cert input). The product layer chooses; for "score this hypothetical post-retrofit state", calculator wins.
- The deterministic calculator is version-bound to SAP 10.3. A future SAP 10.4 is a calculator MAJOR bump and an ADR. The ML residual head is SAP-version-agnostic only insofar as the residual distribution it learns stays stationary; in practice a spec bump retrains the residual head.
- Spec PDFs live in `docs/sap-spec/` (this repo). The repo now carries the canonical reference for what the calculator computes. License: SAP 10.3 © Crown copyright 2026; RdSAP 10 © BRE. Both are public-interest references for SAP-compliant software, included for traceability.