mirror of
https://github.com/Hestia-Homes/Model.git
synced 2026-06-08 11:17:27 +00:00
ADR-0009: pivot to deterministic SAP 10.3 calculator (Accepted)
Promotes ADR-0009 from Proposed to Accepted after the grill-with-docs session resolved all seven open questions. Bundles the SAP 10.3 and RdSAP 10 specifications under docs/sap-spec/ plus a calculator design sketch (module layout, monthly-loop pseudo-code, status table).

CONTEXT.md adds three new domain terms parallel to existing performance language:
- Calculated SAP10 Performance (parallel to Effective / Lodged)
- SAP10 Calculation (process; implemented by Sap10Calculator)
- Measure Application (process; implemented by MeasureApplicator)

ML pipeline is NOT retired; it stays as the residual head once the calculator reaches parity in Session B.

ADR-0009 §"Grill outcomes" carries the seven binding scope decisions plus three Session-A-scope changes discovered during the grill (RdSAP §19 EER formula, SAP 10.2 Appendix A cross-reference, RdSAP Table 29 cascade defaults).
parent
244f4555ac
commit
8dbe873daf
5 changed files with 389 additions and 0 deletions
12
CONTEXT.md
|
|
@@ -97,6 +97,18 @@ _Avoid_: original performance, raw EPC values, recorded baseline
|
|||
The SAP / EPC Band / carbon emissions / heat demand the modelling pipeline actually scored against — equal to Lodged Performance when no Rebaselining trigger fires, replaced by ML output when triggered. The half of Baseline Performance that says "what we modelled".
|
||||
_Avoid_: modelled performance, rebaselined performance (only correct when rebaselining ran), scored values
|
||||
|
||||
**Calculated SAP10 Performance**:
|
||||
The SAP score, EPC Band, CO2 emissions, Primary Energy Intensity, space heating kWh, and hot water kWh produced by **SAP10 Calculation** from a Property's EpcPropertyData. Distinct from Effective Performance (ML output) and Lodged Performance (gov register) during the validation phase. Surfaced alongside Effective Performance in the UI; may supersede Effective Performance in a later ADR once parity is confirmed against the cert-reported SAP across ≥1000 sample certs. ADR-0009.
|
||||
_Avoid_: calculator output, computed performance, worksheet performance, SAP10 output
|
||||
|
||||
**SAP10 Calculation**:
|
||||
The process that runs the deterministic SAP 10.3 worksheet over a Property's EpcPropertyData and emits **Calculated SAP10 Performance**. Implemented by the `Sap10Calculator` service class in `domain/sap/`. Reads cert fabric/heating/geometry fields, applies the RdSAP 10 cert→input mapping, executes the 12-month heat balance per SAP 10.3 §§1-14, and returns a `SapResult` carrying the six Calculated SAP10 Performance quantities plus a monthly breakdown and worksheet-line audit trail. Distinct from **Rebaselining**, which is ML-based. ADR-0009.
|
||||
_Avoid_: SAP calculation (ambiguous with the gov calculator), SAP scoring, calculator run
|
||||
|
||||
**Measure Application**:
|
||||
The process that translates an Optimised Package into cert-field changes and produces the "ending state snapshot" EpcPropertyData that Plan Phase persists. Implemented by the `MeasureApplicator` service class in `domain/sap/` (or a sibling package). Each Measure Type's translation rules (e.g. `loft_insulation` → `roof_insulation_thickness_mm = 270mm`, `ashp` → `main_heating_details[0]` replacement) live here. Pure function — does not run SAP10 Calculation itself; the caller chains `MeasureApplicator.apply(epc, package) → Sap10Calculator.calculate(post_epc)`. ADR-0009.
|
||||
_Avoid_: measure overrides (rejected during ADR-0009 grill — phantom mid-layer), package applier, retrofit simulator
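The applicator → calculator chain described in this entry can be sketched as follows. This is a minimal illustration, not the shipped service: `EpcSnapshot` is a hypothetical two-field stand-in for `EpcPropertyData`, the `RULES` mapping and the heating-code strings are placeholders, and only the `loft_insulation` → 270 mm rule comes from the text above.

```python
from dataclasses import dataclass, replace
from typing import Callable, Mapping


# Hypothetical, heavily reduced stand-in for EpcPropertyData.
@dataclass(frozen=True)
class EpcSnapshot:
    roof_insulation_thickness_mm: int
    main_heating_code: str


# One translation rule per Measure Type: cert in, changed cert out.
Rule = Callable[[EpcSnapshot], EpcSnapshot]

RULES: Mapping[str, Rule] = {
    # Rule quoted in the glossary entry: loft insulation sets the thickness to 270 mm.
    "loft_insulation": lambda epc: replace(epc, roof_insulation_thickness_mm=270),
    # "ashp" replaces the main heating; the code value is a placeholder.
    "ashp": lambda epc: replace(epc, main_heating_code="ASHP_PLACEHOLDER"),
}


class MeasureApplicator:
    def apply(self, epc: EpcSnapshot, package: tuple[str, ...]) -> EpcSnapshot:
        """Return the ending-state snapshot; the input is never mutated."""
        post = epc
        for measure_type in package:
            post = RULES[measure_type](post)
        return post


before = EpcSnapshot(roof_insulation_thickness_mm=100, main_heating_code="GAS_BOILER_PLACEHOLDER")
after = MeasureApplicator().apply(before, ("loft_insulation", "ashp"))
```

Because the snapshot is frozen and every rule returns a new instance, `before` can still be scored as the baseline while `after` is handed to the calculator.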
|
||||
|
||||
**EPC Energy Derivation**:
|
||||
The process that derives a Property's fuel split and annual bills from its space heating kWh and hot water kWh values plus the heating fuel deduced from SAP fields. kWh values themselves come from the EPC's recorded fields (`renewable_heat_incentive.space_heating_existing_dwelling` and `.water_heating`) for SAP10 baselines, or from ML prediction when Rebaselining fires or when scoring a post-measure state. Bills are computed deterministically from delivered kWh × current Fuel Rates + standing charges + SEG credits. The UCL Correction is no longer applied at runtime — it is folded into ML training labels (see [[epc-ml-transform]] and ADR-0007).
|
||||
_Avoid_: kWh prediction (kWh is now an ML target — see Rebaselining), baseline kWh, energy estimation
|
||||
|
|
|
|||
153
docs/adr/0009-deterministic-sap-calculator.md
Normal file
|
|
@@ -0,0 +1,153 @@
|
|||
# Deterministic SAP 10.3 calculator alongside the ML model; ML becomes a residual learner
|
||||
|
||||
**Status: Accepted.** Builds on [ADR-0007](0007-kwh-as-ml-target.md) (the SAP10 calculator is the ground truth ML approximates) and [ADR-0008](0008-physics-as-feature.md) (we already ship ~30% of a calculator as physics features). Decision point: do we keep grinding ML accuracy on `sap_score`, or do we *write the calculator* and have ML predict its residual?
|
||||
|
||||
## Grill outcomes (2026-05-17)
|
||||
|
||||
Seven open questions resolved through a `/grill-with-docs` session before Session A. Each lands a binding scope decision for the implementation:
|
||||
|
||||
| # | Question | Decision |
|---|---|---|
| 0 | Domain placement | **Option B**: new term **Calculated SAP10 Performance**, parallel to Effective Performance (ML) and Lodged Performance (gov register). Effective Performance is **not** retired now; a future ADR may promote Calculated to its current role once parity is confirmed. Process named **SAP10 Calculation**. |
| 1 | PCDB heat-pump COP source for Session A | **Stub-seam.** Define `PcdbLookup` Protocol, ship `NoOpPcdbLookup` returning None, fall back to Table 4a. Session C bundles a CSV PCDB extract under `docs/sap-spec/` and implements the lookup. |
| 2 | MCS installation factors | **Boolean input on calculator inputs, default `False`.** Plumbing in Session A; no behaviour change until the input is populated. Slice 18f (separate, tracked in HANDOFF §7-D0) lifts `mcs_installed_heat_pump` from gov API → `EpcPropertyData.MainHeatingDetail` so calculator can apply the factor on the ~1.5% of HP certs that carry it. |
| 3 | Thermal bridging | **Global y factor** (the path SAP 10.3 specifies for RdSAP-driven assessments). Per-junction Table R2 sum requires junction-count inputs the cert doesn't carry, so it is not available on the RdSAP-driven flow. |
| 4 | Living-area fraction default | **RdSAP 10 Table 27**: direct lookup from `habitable_rooms_count`. Unambiguous, one-line table. |
| 5 | Secondary-heating allocation | **SAP 10.2/10.3 Table 11** keyed on main heating type. RdSAP doesn't redefine the fraction; it identifies the type only. Forcing rule: when main is micro-CHP and Table N9 says non-zero secondary heat with no secondary specified, assume portable electric heaters. |
| 6 | Validation cohort | **Stratified random of 1000 certs**; report MAE per stratum. Session A success criterion = MAE ≤ 1.0 SAP-point on the **typical subset** (excluding sap_score ≤ 5, sap_score ≥ 100, multi-heating, conservatory, RIR). Global MAE reported alongside for honesty. |
| 7 | `MeasureOverrides` shape | **Rejected as phantom mid-layer.** `Sap10Calculator.calculate(epc) -> SapResult` takes a single immutable cert. A separate **MeasureApplicator** service translates Optimised Package → cert-field changes, returning the "ending state snapshot" EpcPropertyData that Plan Phase already persists. Three pure functions in chain: applicator → calculator → result. |
|
||||
|
||||
## Additional findings from the grill that change Session A scope
|
||||
|
||||
- **SAP rating formula belongs to RdSAP, not SAP 10.3.** RdSAP §19 ("RdSAP10-specific SAP rating equations referred to as EER") defines the SAP-score equation used for RdSAP-driven assessments. SAP 10.3 §13 defines the rating for new-build assessments. The cert's `energy_rating_current` was computed by RdSAP §19, so parity validation must compute against RdSAP §19, not SAP 10.3 §13.
|
||||
- **RdSAP 10 (June 2025) cross-references SAP 10.2 (March 2025) for heating-system identification (Appendix A).** RdSAP was published before SAP 10.3 (Jan 2026). Until BRE updates RdSAP to reference SAP 10.3, the calculator's heating-identification logic reads SAP 10.2 Appendix A while everything else reads SAP 10.3. Keep both PDFs in `docs/sap-spec/`.
|
||||
- **RdSAP Table 29 ("Heating and hot water parameters") is a 20+-entry defaulting table** that the `cascade_defaults.py` module needs to encode. Current scope of `rdsap_uvalues.py` is U-values only; Table 29 extends the cascade pattern to cylinder insulation, primary-pipework insulation, boiler interlock, emitter temperature, underfloor-heating routing, solar-panel parameters, heat-network defaults. Adds ~1-2 hrs to Session A (effective Session A.5 if not split).
|
||||
- **MCS field exists in gov API** but is dropped by the current mapper. Slice 18f (lift `mcs_installed_heat_pump` into `EpcPropertyData`) is a prerequisite for the MCS-factor path. ~30 min slice; can ship before Session A or in parallel.
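Since the first finding means the rating equation comes from RdSAP §19 rather than SAP 10.3 §13, one option is to keep the equation's coefficients as data so either can be plugged in. The sketch below assumes a piecewise curve (log branch at or above a breakpoint, linear below). `RatingEquation`, `PLACEHOLDER_EQUATION`, and `epc_band` are hypothetical names, and the constants are the SAP 2012-era values as recalled (the log branch matches the `117 − 121·log10(ECF)` form quoted in the design sketch). They are placeholders only; the RdSAP §19 and SAP 10.3 §13 constants must be read from the bundled PDFs.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RatingEquation:
    """Coefficients of a piecewise rating curve: log branch at/above the breakpoint, linear below."""
    breakpoint_ecf: float
    log_intercept: float
    log_slope: float
    linear_intercept: float
    linear_slope: float

    def rating(self, ecf: float) -> float:
        if ecf >= self.breakpoint_ecf:
            return self.log_intercept - self.log_slope * math.log10(ecf)
        return self.linear_intercept - self.linear_slope * ecf


# PLACEHOLDER constants (SAP 2012-era, as recalled); NOT verified against RdSAP 10 §19.
PLACEHOLDER_EQUATION = RatingEquation(
    breakpoint_ecf=3.5,
    log_intercept=117.0,
    log_slope=121.0,
    linear_intercept=100.0,
    linear_slope=13.95,
)


def epc_band(sap_score: float) -> str:
    """Map a rating to an A-G band using the usual EPC thresholds (92/81/69/55/39/21)."""
    score = math.floor(sap_score + 0.5)  # round half up to a whole SAP point
    for floor, band in ((92, "A"), (81, "B"), (69, "C"), (55, "D"), (39, "E"), (21, "F")):
        if score >= floor:
            return band
    return "G"
```

Parity validation would then construct the calculator with the RdSAP §19 coefficients, while a new-build path could pass the SAP 10.3 §13 set without touching the worksheet code.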
|
||||
|
||||
## Problem
|
||||
|
||||
After six slices of physics-feature work (18b/18c/18d/20a/20a.1) the ML model is at sap_score MAPE 3.63%, MAE 1.86 globally; per-decile MAE 3.86 (d0) and 2.25 (d9). Each new slice now nudges d0 MAE by ~0.05. User's target is MAE ≤ 0.5 across all bands. The remaining error is dominated by:
|
||||
|
||||
1. **Catastrophic tail noise** — d0 has 3.3% of rows with `sap_score ≤ 20` (heritage / abandoned / data-anomaly homes). MAE on those rows is structurally large because the model's prediction floor is ~30 even for the worst inputs.
|
||||
2. **Calculator nuance the physics features can't reach** — monthly heat balance with solar/internal gains and utilisation factor, full SAP §J hot-water variants, PCDB heat-pump overrides, dual-fuel allocation, conservatory modes, room-in-roof handling. Each of these is a deterministic line in the SAP10.3 spec but we model it via tree splits over input fields.
|
||||
|
||||
These cannot be closed by another tree feature. They require executing the calculator.
|
||||
|
||||
## Decision
|
||||
|
||||
Build a deterministic **`Sap10Calculator`** that reads `EpcPropertyData` and emits the same outputs the certificate's BRE-approved assessor software emits: `sap_score`, `co2_emissions`, `peui_raw`, `peui_ucl`, `space_heating_kwh`, `hot_water_kwh`. Target the SAP 10.3 specification (DESNZ/BRE, 13-01-2026) and the RdSAP 10 specification (BRE, 10-06-2025), both held in `docs/sap-spec/`.
|
||||
|
||||
The ML model is **not deprecated**. It is repurposed as a **residual learner** against `actual_sap − calculator_sap` (and similar deltas for the other five targets). Residual distributions are much narrower than the raw target distributions (calculator is within ~1 SAP-point on 95% of typical certs, per the working hypothesis), so the ML residual head should fit the corrections with far fewer features and reach the MAE ≤ 0.5 target.
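To make the residual-learner split concrete, here is a toy sketch. The per-stratum mean correction is a deliberately trivial stand-in for the LightGBM head, and the rows and stratum labels are invented for illustration; only the target definition `actual_sap − calculator_sap` comes from this ADR.

```python
from collections import defaultdict
from statistics import fmean

# Toy rows: (stratum, calculator_sap, actual_sap). Values are invented.
train = [
    ("typical", 68.0, 69.0),
    ("typical", 55.0, 55.5),
    ("tail", 30.0, 18.0),
    ("tail", 32.0, 22.0),
]


def fit_residual_head(rows):
    """Stand-in for the ML head: learn the mean residual per stratum."""
    residuals = defaultdict(list)
    for stratum, calc, actual in rows:
        residuals[stratum].append(actual - calc)  # target = actual_sap - calculator_sap
    return {stratum: fmean(values) for stratum, values in residuals.items()}


def predict(head, stratum, calculator_sap):
    """Final estimate = deterministic calculator output + learned correction."""
    return calculator_sap + head.get(stratum, 0.0)


head = fit_residual_head(train)
estimate = predict(head, "typical", 60.0)
```

An unseen stratum falls back to a zero correction, so the calculator output passes through unchanged where the head has nothing to say.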
|
||||
|
||||
## Why now
|
||||
|
||||
1. **SAP 10.3 just dropped (Jan 2026).** Building against the new spec means the calculator outputs match assessor software for any cert lodged from 2026 onward. Building against SAP 10.2 (March 2025) now would need re-derivation later.
|
||||
2. **The retrofit-simulation use case demands transparency.** Surveyors, building physicists, and homeowners need to see exactly which physics line — wall U×A, ventilation ACH, solar gain on south-facing windows — contributes how much heat-loss/cost. Tree-model attribution doesn't supply that. Calculator does.
|
||||
3. **30% of the calculator is already shipped.** `rdsap_uvalues.py` (Tables 6–10, 15–20, 24, 26), `sap_efficiencies.py` (Tables 4a, 4b, 32), `envelope.py` (Σ U·A + thermal bridging), partial `ventilation.py` (slice 20a tracer), partial `demand.py` (annual heat balance), `ecf.py` (Total fuel cost, ECF, log10ECF), PV credit (slice 17a), SAP §J hot-water port (slice 17b). The pivot is mostly re-platforming, not new physics.
|
||||
4. **ML residual learning has a clean home for the noise.** The catastrophic-tail rows the calculator gets wrong (data anomalies, mis-described systems) are exactly where ML *should* live, because they're not closed-form solvable. Calculator + residual head is a cleaner split of responsibility than "ML approximates the deterministic spec".
|
||||
|
||||
## Scope of the calculator (Session A)
|
||||
|
||||
A full SAP 10.3 worksheet plus the data-extraction rules from RdSAP 10 Appendix S. Module organisation:
|
||||
|
||||
```
packages/domain/src/domain/sap/
  __init__.py                # Sap10Calculator entry point + SapResult dataclass
  worksheet/
    dimensions.py            # §1
    ventilation.py           # §2 + Table 5 + Appendix Q
    heat_transmission.py     # §3 + Appendix K (thermal bridging) + Tables 6–10/15–20/24/26
    hot_water.py             # §4 + Appendix J + Appendix G (FGHRS/WWHRS/PV-diverters)
    internal_gains.py        # §5 + Appendix L (lighting)
    solar_gains.py           # §6 + Tables 6d/6e
    mean_temperature.py      # §7
    climate.py               # §8 + Appendix U (region-from-postcode, monthly external temp/wind/solar)
    space_heating.py         # §9 + Appendices A/B/D/E/N (heating systems, efficiency, heat pumps)
    fuel_cost.py             # §12 + Table 32 (fuel prices) + Appendix M (PV/wind/hydro generation)
    energy_cost_rating.py    # §13 + the SAP score formula
    co2_primary_energy.py    # §14 (emissions + primary energy)
    fee.py                   # §11 Fabric Energy Efficiency
  tables/
    table_4a_4b.py           # heating-system seasonal efficiency
    table_5.py               # ventilation rate components
    table_6.py               # monthly external temp by region
    table_6d.py              # monthly solar flux by orientation by region
    table_32.py              # fuel prices
    table_R.py               # reference values (Appendix R)
  rdsap/
    appendix_s.py            # cert → calculator input mapping
    cascade_defaults.py      # the RdSAP10 "assume-typical" rules (currently in rdsap_uvalues.py)
```
|
||||
|
||||
The existing `domain.ml.*` modules stay where they are during Session A; they continue serving the live ML pipeline. Session B promotes them into `domain.sap.*` once parity is reached.
|
||||
|
||||
## Sap10Calculator interface
|
||||
|
||||
```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SapResult:
    sap_score: float
    energy_cost_rating: float            # alias for sap_score before band lookup
    sap_band: str                        # A-G
    co2_emissions_kgco2_per_m2: float
    peui_raw_kwh_per_m2: float
    peui_ucl_kwh_per_m2: float
    space_heating_kwh_per_yr: float
    hot_water_kwh_per_yr: float
    monthly_breakdown: MonthlyBreakdown
    intermediate: dict[str, float]       # every named worksheet quantity, for traceability


class Sap10Calculator:
    def __init__(self, climate: ClimateData, pcdb: Optional[PcdbLookup] = None) -> None: ...
    def calculate(self, epc: EpcPropertyData) -> SapResult: ...
```
|
||||
|
||||
`intermediate` carries every named SAP10.3 worksheet variable (envelope conduction W/K, ventilation rate, solar gains by month, utilisation factor, heat-pump SCOP, ECF, ...) so consumers can drill down. This replaces ADR-0008's physics-as-feature columns for retrofit-simulation consumers; the ML pipeline keeps generating them as features until the residual head is trained and validated.
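The optional `pcdb` argument in the interface follows the stub-seam decision from the grill outcomes: a `PcdbLookup` Protocol, a `NoOpPcdbLookup` that returns `None`, and a fall-back to Table 4a. A minimal sketch of that seam is below. The method name `seasonal_efficiency`, the `heating_efficiency` helper, and the table excerpt (key and number) are assumptions for illustration, not spec values or the agreed signature.

```python
from typing import Optional, Protocol


class PcdbLookup(Protocol):
    def seasonal_efficiency(self, product_id: str) -> Optional[float]:
        """Return the PCDB efficiency for a product, or None when it is not listed."""
        ...


class NoOpPcdbLookup:
    """Session A stub: knows no products, so every lookup falls through."""

    def seasonal_efficiency(self, product_id: str) -> Optional[float]:
        return None


# Hypothetical Table 4a excerpt; the key and the number are placeholders, not spec values.
TABLE_4A_PLACEHOLDER = {"heat_pump_default": 1.7}


def heating_efficiency(product_id: str, table_key: str, pcdb: Optional[PcdbLookup] = None) -> float:
    """Prefer the PCDB value; otherwise fall back to the table default."""
    lookup = pcdb if pcdb is not None else NoOpPcdbLookup()
    from_pcdb = lookup.seasonal_efficiency(product_id)
    return from_pcdb if from_pcdb is not None else TABLE_4A_PLACEHOLDER[table_key]
```

With this shape, Session C only has to supply a CSV-backed class with the same method; the calculator code that calls the lookup does not change.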
|
||||
|
||||
## Validation
|
||||
|
||||
Two corpora:
|
||||
|
||||
1. **Calculator-vs-cert parity (Session B).** Run the calculator over the 1000-cert stratified random sample of RdSAP-10 certs drawn from `data/ml_training/runs/2025_2026_n250000_v18a/data.parquet`. Compare `Sap10Calculator.calculate(epc).sap_score` to the cert's `energy_rating_current`. Target: absolute error ≤ 1.0 SAP-point on 95% of typical certs, with MAE reported per stratum; outliers investigated case-by-case to find spec-interpretation gaps or PCDB requirements.
|
||||
2. **Residual ML head (Session C+).** Train LightGBM on `actual_sap − calculator_sap` as the target. Validate that residual MAE is materially smaller than the current 1.86 global / 3.86 d0. If residual MAE on d0 falls below 0.5, the calculator + residual approach hits the user's target.
|
||||
|
||||
We do **not** retire the existing ML pipeline until both validations pass.
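The parity check in the first corpus reduces to two numbers per stratum: MAE and the share of certs within tolerance. A minimal sketch follows; the row layout, the stratum labels, the `__global__` key, and the sample values are assumptions for illustration.

```python
from collections import defaultdict
from statistics import fmean


def parity_report(rows, tolerance=1.0):
    """rows: iterable of (stratum, calculator_sap, cert_sap). Returns per-stratum and global stats."""
    errors = defaultdict(list)
    for stratum, calc, cert in rows:
        errors[stratum].append(abs(calc - cert))

    def stats(errs):
        return {
            "n": len(errs),
            "mae": fmean(errs),
            "share_within_tol": sum(e <= tolerance for e in errs) / len(errs),
        }

    report = {stratum: stats(errs) for stratum, errs in errors.items()}
    report["__global__"] = stats([e for errs in errors.values() for e in errs])
    return report


# Invented sample: three typical certs and one catastrophic-tail cert.
sample = [("typical", 68.4, 69), ("typical", 54.0, 55), ("typical", 71.2, 71), ("tail", 31.0, 12)]
report = parity_report(sample)
```

Reporting the global row next to the typical stratum keeps the tail visible: in this toy sample one bad tail cert lifts the global MAE well above the typical-subset MAE.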
|
||||
|
||||
## What this ADR does *not* change
|
||||
|
||||
- **The six ML targets remain those from ADR-0007.** The residual head predicts deltas against the same six quantities.
|
||||
- **ADR-0008's physics-as-feature pattern stays valid for the ML residual head.** The residual head probably needs fewer features, but the cascade U-value defaults and SAP efficiency lookups remain useful as feature builders if the calculator subset alone underfits.
|
||||
- **`energy_rating_current` remains excluded from features.** Same leakage rule.
|
||||
- **RdSAP 10 cert-extraction rules are now first-class in the codebase.** Rules that were ad-hoc in `transform.py` move into `domain.sap.rdsap.appendix_s`.
|
||||
- **The training parquet schema continues at v2.x.** A new column `calculator_sap_score` lands as a non-breaking addition once Session A reaches parity. The schema version bumps to v3.0.0 only when the residual targets replace the raw targets — a coordinated AutoGluon-repo deploy, per ADR-0008's cutover discipline.
|
||||
|
||||
## SAP 10.2 → SAP 10.3 implications
|
||||
|
||||
The newer spec replaces tables we already ship:
|
||||
|
||||
- Table 4a/4b (heating efficiencies) — likely identical, verify on read.
|
||||
- Table 32 (fuel prices) — almost certainly different, re-derive from Appendix in 10.3.
|
||||
- Table 6d (solar flux) — likely identical (climate data).
|
||||
- Energy cost rating formula constants — unchanged in 10.3 vs 10.2 unless DESNZ updated the deflator.
|
||||
|
||||
Re-derivation work is bounded — a few hundred numbers across tables — and the `*_table_*.py` modules already have a clean shape for the cutover.
|
||||
|
||||
## Session plan (carried from HANDOFF §High-value next slices)
|
||||
|
||||
- **Session A (3–4 hrs):** Implement ventilation per §2 (replacing the slice-20a tracer), 12-month heat balance per §6 + §8 + Appendix U, solar gains per §6 + Table 6d, internal gains per §5 + Appendix L, utilisation factor per §6.4, mean internal temperature per §7. End of Session A: `Sap10Calculator.calculate(epc) -> SapResult` runs on typical certs.
|
||||
- **Session B (3–4 hrs):** Edge cases — conservatory modes, room-in-roof handling, multi-heating allocation, dual fuel, secondary heating fraction (Appendix A). Run parity validation across 1000 certs. Iterate on spec-interpretation gaps. End of Session B: 95% of typical certs within 1 SAP-point of cert value.
|
||||
- **Session C (2–3 hrs):** PCDB integration for boiler + heat-pump overrides (Appendices D, N). Residual-head training on `actual_sap − calculator_sap`. ADR-0010 if any non-trivial calculator/ML hybrid pattern emerges that ADR-0009 didn't anticipate.
|
||||
|
||||
## Caveats
|
||||
|
||||
- **Spec interpretation will need product input.** 5–10 questions per session on edge cases: multi-heating split logic, secondary heating threshold rules, PCDB-vs-Table-4b precedence, etc. These are not in the spec text and are real business decisions.
|
||||
- **No reference BRE Python port is currently known.** If one surfaces, porting accelerates. If not, every line of the calculator is implemented from the spec PDF directly, with tests.
|
||||
- **PCDB (Product Characteristics Database).** SAP 10.3 references the PCDB throughout for boiler/HP efficiency overrides. Without PCDB integration, calculator carries ~1 SAP-point penalty on PCDB-listed equipment. Defer to Session C.
|
||||
- **The current ML pipeline keeps running through all three sessions.** No deprecation until residual validation lands. The branch `ara-backend-design-prd` (current ML grind) and the calculator work proceed in parallel.
|
||||
|
||||
## Consequences
|
||||
|
||||
- A new top-level domain area `domain.sap.*` is introduced; over Sessions B/C it absorbs `domain.ml.{envelope,demand,ecf,rdsap_uvalues,sap_efficiencies,ventilation}.py`. The ML transform stops shipping those as standalone features once the residual head takes over.
|
||||
- The codebase carries two SAP outputs: cert-reported `sap_score` (ground truth at training time) and calculator-emitted `sap_score` (ground truth at inference time for any RdSAP cert input). The product layer chooses; for "score this hypothetical post-retrofit state", calculator wins.
|
||||
- The deterministic calculator is **version-bound to SAP 10.3.** A future SAP 10.4 is a calculator MAJOR bump and an ADR. The ML residual head is SAP-version-agnostic only insofar as the residual distribution it learns stays stationary; in practice a spec bump retrains the residual head.
|
||||
- Spec PDFs live in `docs/sap-spec/` (this repo). The repo now carries the canonical reference for what the calculator computes. License: SAP 10.3 © Crown copyright 2026; RdSAP 10 © BRE — both are public-interest references for SAP-compliant software, included for traceability.
|
||||
224
docs/sap-spec/CALCULATOR_DESIGN_SKETCH.md
Normal file
|
|
@@ -0,0 +1,224 @@
|
|||
# Sap10Calculator — design sketch
|
||||
|
||||
**Source specs:** `sap-10-3-full-specification-2026-01-13.pdf` (315pp, worksheet) + `rdsap-10-specification-2025-06-10.pdf` (114pp, cert→input rules). Both in this directory.
|
||||
|
||||
**Status:** sketch only — not implemented. Drives ADR-0009 Session A scope.
|
||||
|
||||
---
|
||||
|
||||
## Core insight: SAP is a monthly loop
|
||||
|
||||
The worksheet (§§5–9) iterates over 12 months. Each month produces its own:
|
||||
- mean external temperature (Table U1 by region)
|
||||
- wind speed (Table U2)
|
||||
- horizontal solar irradiance (Table U3) → converted per orientation via Table 6d
|
||||
- mean internal temperature (Table 9 + HLP from worksheet 40)
|
||||
- gains (internal Table 5 + solar §6.1) × utilisation factor η (Table 9a)
|
||||
- losses (HLC × ΔT)
|
||||
- useful space-heating demand
|
||||
- delivered fuel demand (useful / efficiency)
|
||||
|
||||
Annual quantities are the **sum across months**. The current `demand.py` collapses all of this into a single annual `HDH × HLC / η_heating`. That's a 1-step crude approximation of a 12-step loop. The single biggest physics gain in moving to the calculator is unrolling that loop.
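The unrolled loop can be illustrated with the utilisation-factor step, which is the part an annual `HDH × HLC` collapse cannot express. The sketch below uses the Table 9a form as recalled from earlier SAP editions (γ = gains/losses, τ = TMP / (3.6 × HLP), a = 1 + τ/15); treat the formula and the zero-loss handling as assumptions to verify against the SAP 10.3 PDF. Function names and argument layout are hypothetical.

```python
import math

DAYS_IN_MONTH = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)


def utilisation_factor(gains_w, losses_w, tmp_kj_per_m2k, hlp_w_per_m2k):
    """Gain utilisation factor eta from the gain/loss ratio (Table 9a form, as recalled)."""
    if gains_w <= 0:
        return 1.0
    if losses_w <= 0:
        return 0.0  # simplification: no heat loss, so no gain is useful (demand clamps to 0 anyway)
    gamma = gains_w / losses_w
    tau = tmp_kj_per_m2k / (3.6 * hlp_w_per_m2k)  # time constant, hours
    a = 1.0 + tau / 15.0
    if math.isclose(gamma, 1.0):
        return a / (a + 1.0)
    return (1.0 - gamma ** a) / (1.0 - gamma ** (a + 1.0))


def annual_useful_kwh(hlc_w_per_k, tfa_m2, tmp, internal_temp_c, external_temp_c, gains_w):
    """Sum monthly useful space-heating demand; each sequence carries 12 monthly values."""
    hlp = hlc_w_per_k / tfa_m2
    total = 0.0
    for days, t_int, t_ext, gains in zip(DAYS_IN_MONTH, internal_temp_c, external_temp_c, gains_w):
        losses = hlc_w_per_k * (t_int - t_ext)           # W
        eta = utilisation_factor(gains, losses, tmp, hlp)
        useful_w = max(0.0, losses - eta * gains)
        total += useful_w * 24 * days / 1000             # W over the month -> kWh
    return total
```

In a month where gains exceed losses the factor drops well below 1, so surplus summer gains are not credited against winter demand; a single annual balance has no way to represent that.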
|
||||
|
||||
---
|
||||
|
||||
## Inputs (EpcPropertyData mostly already there)
|
||||
|
||||
Already mapped in `EpcPropertyData`:
|
||||
- `total_floor_area_m2`, `sap_building_parts[*].sap_floor_dimensions[*]` (room_height, perimeter, party-wall length)
|
||||
- `sap_windows[*]` (orientation, dimensions, frame_material, glazing_type, draught_proofed, window_transmission_details)
|
||||
- `door_count`, `insulated_door_count`, `insulated_door_u_value`
|
||||
- `sap_building_parts[*]` (wall/roof/floor construction, age band, insulation thickness/type)
|
||||
- `sap_heating.main_heating_details[*]` (sap_main_heating_code, fuel, emitter, controls, fraction)
|
||||
- `sap_heating.water_heating_*`, `cylinder_size`, `cylinder_insulation_thickness_mm`
|
||||
- `sap_energy_source.photovoltaic_arrays`, `solar_water_heating`
|
||||
- `open_chimneys_count`, `blocked_chimneys_count`, `flueless_gas_fires_count` (on SapVentilation)
|
||||
- `region_code` (1–22; SAP10.3 uses regions 0–21 with 0=UK avg — confirm mapping)
|
||||
- `country_code` for ENG/SCT/WAL/NIR table overrides
|
||||
|
||||
Missing / null in current corpus (likely needs slice 18e mapper fix):
|
||||
- `pressure_test` (100% null per HANDOFF §7-D) — SAP10.3 §2.3: when populated, overrides worksheet lines 9–16 entirely
|
||||
- `sap_ventilation.*` (mostly null) — fans, passive vents, AP4
|
||||
- `mechanical_ventilation` (100% null)
|
||||
|
||||
---
|
||||
|
||||
## Output: SapResult
|
||||
|
||||
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonthlyBreakdown:
    external_temp_c: tuple[float, ...]          # 12 entries
    internal_temp_c: tuple[float, ...]
    heat_loss_kwh: tuple[float, ...]
    solar_gain_kwh: tuple[float, ...]
    internal_gain_kwh: tuple[float, ...]
    utilisation_factor: tuple[float, ...]
    useful_demand_kwh: tuple[float, ...]
    delivered_main_kwh: tuple[float, ...]
    delivered_secondary_kwh: tuple[float, ...]
    delivered_hot_water_kwh: tuple[float, ...]


@dataclass(frozen=True)
class SapResult:
    sap_score: float                            # 1-100+ rating
    sap_band: str                               # A-G
    co2_emissions_kgco2_per_m2: float
    peui_kwh_per_m2: float
    space_heating_kwh_per_yr: float
    hot_water_kwh_per_yr: float
    pumps_fans_kwh_per_yr: float
    lighting_kwh_per_yr: float
    total_fuel_cost_gbp: float
    pv_export_credit_gbp: float
    monthly: MonthlyBreakdown
    worksheet: dict[int, float]                 # SAP10.3 worksheet line → value, for audit
    notes: tuple[str, ...]                      # spec-decision provenance per cert
```
|
||||
|
||||
`worksheet[line]` lets product / surveyor UIs show the SAP worksheet line-by-line. Every named quantity in §§1-14 lands here.
|
||||
|
||||
---
|
||||
|
||||
## Module layout (proposed under `packages/domain/src/domain/sap/`)
|
||||
|
||||
```
sap/
  __init__.py                  # exports Sap10Calculator, SapResult, MonthlyBreakdown
  calculator.py                # Sap10Calculator orchestrator: input -> 12-month loop -> SapResult
  worksheet/
    dimensions.py              # §1: floor area, storey height, volume; porch/conservatory inclusion
    ventilation.py             # §2: Table 2.1 + AP50/AP4 override + structural + mech vent
    heat_transmission.py       # §3: U×A per element, thermal bridging (Table R2 or global y)
    hot_water.py               # §4 + Appendix J: SAP §J port already shipped; folds in
    internal_gains.py          # §5 + Table 5/5a; Appendix L lighting
    solar_gains.py             # §6: per orientation, per month, with Z overshading
    utilisation_factor.py      # §6.4 + Table 9a η from gain/loss ratio
    mean_internal_temp.py      # §7 + Table 9/9a/9b: living area + rest, HLP, controls
    space_heating.py           # §9: useful demand, system efficiency, secondary fraction
    fuel_cost.py               # §12: monthly fuel use × Table 12 prices, PV export credit
    energy_cost_rating.py      # §13: ECF → SAP rating piecewise formula
    co2_primary_energy.py      # §14: CO2 and primary energy worksheets
  tables/
    table_2_1.py               # ventilation rates (chimney 80, flue 35, fan 10, etc.)
    table_4a_4b.py             # heating-system seasonal efficiency
    table_4c.py                # low-temp emitter adjustments
    table_4e.py                # controls bonus / mean-internal-temp adjustment
    table_5.py                 # internal gains by floor area
    table_5a.py                # additional gains: pumps, MVHR fans
    table_6b.py                # glazing solar transmittance g_⊥
    table_6c.py                # frame factors FF
    table_6d.py                # solar access factor Z + over-shading categories
    table_6e.py                # window U-value defaults
    table_9.py                 # mean internal temp (living + rest of dwelling)
    table_9a.py                # utilisation factor η formula
    table_11.py                # main/secondary heating split fraction
    table_12.py                # fuel prices (SAP10.3 replacement for SAP10.2 Table 32)
    table_R2.py                # default Ψ for non-repeating thermal bridges
  climate/
    appendix_u.py              # U1 external temp, U2 wind speed, U3 solar irradiance (each 22 regions × 12 months) + solar declination
    postcode_to_region.py      # Table U6: postcode prefix → region code
    orientation_flux.py        # convert horizontal solar U3 → per-orientation per-tilt surface flux
  rdsap/
    cert_to_inputs.py          # RdSAP 10 cert → calculator input mapping
    cascade_defaults.py        # current rdsap_uvalues.py logic, moved here
```
|
||||
|
||||
Modules in `domain.ml.*` stay in place during Session A (live ML pipeline keeps running). Session B promotes the deterministic logic out of `domain.ml.{envelope,demand,ecf,ventilation,sap_efficiencies,rdsap_uvalues}.py` into `domain.sap.*` once Session A reaches parity. The ML transform then imports calculator outputs as a single `predicted_sap_score` feature for residual learning (per ADR-0009).
|
||||
|
||||
---
|
||||
|
||||
## Sap10Calculator.calculate — orchestration sketch
|
||||
|
||||
```python
class Sap10Calculator:
    def __init__(self, climate: ClimateData = APPENDIX_U, pcdb: Optional[PcdbLookup] = None) -> None:
        self.climate = climate
        self.pcdb = pcdb

    def calculate(self, epc: EpcPropertyData) -> SapResult:
        inputs = cert_to_inputs(epc)                      # RdSAP 10 mapping; fills defaults

        # Static, region-independent quantities
        dim   = compute_dimensions(inputs)                # §1: TFA, volume, storey heights
        hlc_T = heat_transmission_w_per_k(inputs)         # §3: Σ U·A + thermal bridging
        n_inf = ventilation_ach(inputs, dim)              # §2: infiltration ACH (excl. AP override)
        eff   = heating_efficiencies(inputs, self.pcdb)   # §9.2: winter/water/secondary

        # 12-month loop
        months = []
        for m in MONTHS:
            ext_t    = self.climate.external_temp(m, inputs.region)
            wind     = self.climate.wind_speed(m, inputs.region)
            n_total  = apply_wind_shelter(n_inf, wind, inputs.sheltered_sides)
            hlc_V    = n_total * dim.volume * 0.33        # ventilation heat loss, W/K
            hlc      = hlc_T + hlc_V                      # W/K
            int_gain = internal_gain_w(dim.tfa, inputs.lighting, m)                     # §5 + Appendix L
            sol_gain = sum(solar_gain_w(w, m, inputs.region) for w in inputs.windows)   # §6
            int_t    = mean_internal_temp(hlc, dim.tfa, eff, inputs.controls, m)
            losses_w = hlc * (int_t - ext_t)
            gains_w  = int_gain + sol_gain
            eta      = utilisation_factor(gains_w, losses_w, dim.tmp)                   # §6.4 Table 9a
            useful_w   = max(0.0, losses_w - eta * gains_w)
            useful_kwh = useful_w * HOURS_PER_MONTH[m] * 1e-3
            fuel_kwh   = useful_kwh / eff.winter
            months.append(MonthlyEntry(...))

        annual    = aggregate(months)                     # sum + lighting + pumps/fans + PV
        cost      = fuel_costs(annual, inputs.fuel_prices)
        pv_export = annual.pv_export_credit_gbp           # assumed attribute on the aggregate
        ecf       = (cost - pv_export) / dim.tfa          # simplified; confirm the ECF definition against §13
        sap       = sap_rating(ecf)                       # §13: 117 - 121*log10(ECF) etc.
        co2       = co2_emissions(annual, inputs.carbon_factors)
        peui      = (annual.delivered_total) / dim.tfa
        ws_dict   = annual.worksheet                      # assumed: worksheet line → value map built during aggregation

        return SapResult(sap, ..., monthly=MonthlyBreakdown(...), worksheet=ws_dict, notes=...)
```
|
||||
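The orchestration sketch leaves `apply_wind_shelter` undefined. A possible shape for it is below, using the worksheet steps as recalled from earlier SAP editions: a shelter factor of 1 − 0.075 per sheltered side, scaling by monthly wind speed over 4 m/s, and the natural-ventilation rule for rates below 1 ACH. These constants and the helper `ventilation_heat_loss_w_per_k` are assumptions to check against SAP 10.3 §2; mechanical ventilation and the pressure-test override are not covered.

```python
def apply_wind_shelter(infiltration_ach: float, wind_speed_m_s: float, sheltered_sides: int) -> float:
    """Monthly effective air-change rate for a naturally ventilated dwelling."""
    shelter_factor = 1.0 - 0.075 * sheltered_sides           # each sheltered side trims 7.5%
    adjusted = infiltration_ach * shelter_factor * (wind_speed_m_s / 4.0)
    if adjusted >= 1.0:
        return adjusted
    return 0.5 + 0.5 * adjusted ** 2                          # 0.5 ACH floor plus a quadratic term


def ventilation_heat_loss_w_per_k(effective_ach: float, volume_m3: float) -> float:
    """Ventilation heat-loss coefficient, matching hlc_V = n × V × 0.33 in the sketch."""
    return 0.33 * effective_ach * volume_m3


# Example: 0.8 ACH infiltration, 4 m/s monthly wind, two sheltered sides, 200 m³ dwelling.
effective = apply_wind_shelter(0.8, 4.0, 2)
hlc_v = ventilation_heat_loss_w_per_k(effective, 200.0)
```

Because wind speed varies by month, this function has to sit inside the 12-month loop, which is why `hlc` is recomputed per month rather than once up front.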
|
||||
---
|
||||
|
||||
## What we already have vs what we need to write
|
||||
|
||||
| Worksheet area | Status | Module today | Action |
|---|---|---|---|
| §1 Dimensions | ~70% | implicit in `envelope.py:_part_geometry` | extract into `worksheet/dimensions.py`; handle porch/conservatory/RIR inclusion rules |
| §2 Ventilation | ~25% | `ventilation.py` (slice 20a tracer; wrong rates) | rewrite per Table 2.1 + worksheet lines 9-16 + AP override + mech vent |
| §3 Heat transmission | ~80% | `envelope.py`, `rdsap_uvalues.py` | move into `worksheet/heat_transmission.py`; add Table R2 thermal bridging |
| §4 Hot water | ~70% | `demand.py:predicted_hot_water_kwh` (SAP §J port from slice 17b) | move to `worksheet/hot_water.py`; verify against §4 + Appendix J |
| §5 Internal gains | 0% | nothing | new: Table 5 by floor area + Table 5a additions |
| §6 Solar gains | 0% | nothing | new: per-orientation per-month with Table 6b/6c/6d + Appendix U3 |
| §6.4 Utilisation factor | 0% | nothing | new: Table 9a η from gain-to-loss ratio |
| §7 Mean internal temp | 0% | nothing | new: Table 9 + HLP + Table 4e controls |
| §8 Climate | ~10% | `demand.py:_HDH_BY_REGION` (annual HDH) | replace with 12-month tables U1/U2/U3 |
| §9 Space heating | ~50% | `demand.py:predicted_space_heating_kwh` (annual), `sap_efficiencies.py` (Tables 4a/4b) | rewrite as monthly loop; add Table 4c low-temp adjustments, MCS installation factors, secondary heating fraction |
| §12 Fuel cost | ~80% | `ecf.py`, `sap_efficiencies.py` (Table 32→SAP10.3 Table 12) | verify Table 12 prices vs Table 32; add lighting+pumps/fans energy use; PV export credit already in slice 17a |
| §13 SAP rating | 0% | nothing (ML emits it) | new: §13 piecewise log formula 117 − 121·log10(ECF) etc. |
| §14 CO2 + primary | 0% | nothing | new: Table 12 carbon factors + primary-energy factors |
| Climate (Appendix U) | 0% | `_HDH_BY_REGION` only | new: Tables U1/U2/U3 module |
| RdSAP cert mapping | ~60% | scattered across `transform.py` + `envelope.py` + `demand.py` | consolidate into `rdsap/cert_to_inputs.py`; clean separation between extraction and physics |
|
||||
|
||||
**Rough effort estimate (re-validating HANDOFF §High-value next slices):**
|
||||
|
||||
- Session A (bring up the monthly loop end-to-end on typical certs): ~4-5 hrs.
  - 1.5 hrs: Appendix U tables, dimensions, ventilation rewrite per Table 2.1 + worksheet lines 9-16
  - 1.5 hrs: solar gains (per-orientation per-month) + internal gains + utilisation factor
  - 1 hr: mean internal temp + monthly loop wiring
  - 1 hr: SAP rating formula + CO2/primary-energy worksheets + end-to-end test on 5 sample certs
- Session B (edge cases + 1000-cert parity validation): ~4 hrs.
- Session C (PCDB lookups + residual head training): ~3 hrs.
|
||||
|
||||
Total ~11-12 hours of Claude-time, matching HANDOFF estimate.
|
||||
|
||||
---
|
||||
|
||||
## Open questions for the user (before Session A)
|
||||
|
||||
1. **Heat pump COP source.** SAP10.3 §9.2.7 says PCDB preferred, else Table 4a. PCDB integration is Session C. For Session A do we use Table 4a defaults only, accepting a ~1 SAP-point penalty on PCDB-listed heat pumps?
|
||||
|
||||
2. **MCS installation factors.** Worth applying (×1.39 GSHP space etc.) when MCS-certified? The cert doesn't carry MCS certification explicitly — would need a feature flag.
|
||||
|
||||
3. **Thermal bridging.** Two paths: global y factor (current) or per-junction Table R2 sum. Per-junction needs junction-count inputs that aren't on the cert. Recommendation: stay with global y for the cert-driven calculator; offer per-junction as an override path for new-build / Site Notes inputs.
|
||||
|
||||
4. **Living area fraction.** SAP10.3 §7.1 says "the largest public room"; the cert doesn't carry this. RdSAP 10 spec probably defaults it — confirm in the RdSAP read.
|
||||
|
||||
5. **Secondary heating allocation.** Table 11 splits main/secondary; the cert carries `secondary_fuel_type` but no explicit fraction. RdSAP 10 likely has a default split per main-system type — confirm.
|
||||
|
||||
6. **Validation cohort.** 1000 random certs from `data/ml_training/runs/2025_2026_n250000_v18a/data.parquet`, or filter to "clean" subset first (drop catastrophic-tail noise where cert SAP itself looks anomalous)? Recommend first pass on the full random sample so we measure raw parity, then quantify how much of the residual comes from cert-data anomalies.
|
||||
BIN
docs/sap-spec/rdsap-10-specification-2025-06-10.pdf
Normal file
Binary file not shown.
BIN
docs/sap-spec/sap-10-3-full-specification-2026-01-13.pdf
Normal file
Binary file not shown.