The cert's `floor_insulation_thickness` field carries "NI" (Not
Indicated) on 58% of corpus certs — by far the most common value. For
~2 413 of those (12% of corpus) the description also says "Solid,
insulated (assumed)" or "Suspended, insulated (assumed)" — the
assessor saw insulation but didn't measure the thickness. Our
`_parse_thickness_mm("NI")` returns 0, which feeds `u_floor` as an
explicit "0 mm" → r_f=0 → uninsulated-floor U-value. Wrong.
RdSAP 10 §5.12 Table 19 footnote (2) (page 46): "For floors which
have retrofitted insulation, use the greater of 50 mm and the
thickness according to the age band". `u_floor` now accepts a
`description` kwarg; when `_described_as_insulated(description)` is
true and the lodged thickness is missing/zero, ins_mm =
max(50, age-band default).
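A minimal sketch of the fallback rule above. The function name, the boolean `described_insulated` argument and the `AGE_BAND_DEFAULT_MM` values are assumptions for illustration; the real `u_floor` takes a `description` kwarg and its age-band table may differ.

```python
from typing import Optional

# Hypothetical age-band defaults in mm; placeholders, not copied from RdSAP 10.
AGE_BAND_DEFAULT_MM = {"K": 100, "L": 100, "M": 140}


def resolve_floor_insulation_mm(
    lodged_mm: Optional[float], described_insulated: bool, age_band: str
) -> float:
    """Pick the insulation thickness that feeds the floor U-value."""
    if lodged_mm:  # a real, non-zero lodged thickness always wins
        return float(lodged_mm)
    if described_insulated:
        # Footnote (2): greater of 50 mm and the age-band thickness.
        return float(max(50, AGE_BAND_DEFAULT_MM.get(age_band, 0)))
    return 0.0  # legacy path: "NI" parsed to 0 mm, no insulation claimed
```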
Geometry sanity-check, 100 m² × 40 m perimeter, w=0.3 (B=5):
- Uninsulated solid floor: d_t = 0.615, U = 0.60 W/m²K
- 50 mm assumption: d_t = 2.758, U = 0.31 W/m²K
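The two sanity-check figures can be reproduced with a short sketch of the solid-floor formula. The constants (clay soil conductivity 1.5 W/m·K, R_si = 0.17, R_se = 0.04, insulation conductivity 0.035 W/m·K) are assumed here because they reproduce the d_t values quoted above; `solid_floor_u` is a hypothetical name, not the production `u_floor`.

```python
import math

LAMBDA_GROUND = 1.5       # W/m·K, assumed soil conductivity
R_SI, R_SE = 0.17, 0.04   # m²K/W, assumed surface resistances
LAMBDA_INS = 0.035        # W/m·K, assumed insulation conductivity


def solid_floor_u(area_m2, perimeter_m, wall_thickness_m, ins_mm):
    """Return (d_t, U) for a solid ground floor."""
    b = 2.0 * area_m2 / perimeter_m          # characteristic dimension B
    r_f = 0.001 * ins_mm / LAMBDA_INS        # insulation resistance
    d_t = wall_thickness_m + LAMBDA_GROUND * (R_SI + r_f + R_SE)
    if d_t < b:
        u = 2 * LAMBDA_GROUND * math.log(math.pi * b / d_t + 1) / (math.pi * b + d_t)
    else:
        u = LAMBDA_GROUND / (0.457 * b + d_t)
    return d_t, u


for mm in (0, 50):
    d_t, u = solid_floor_u(100.0, 40.0, 0.3, mm)
    print(f"{mm} mm: d_t = {d_t:.3f}, U = {u:.2f} W/m²K")
```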
Parity probe at 300 certs, seed=7:
PE MAE 45.37 → 44.19 (-1.18)
PE bias 39.75 → 38.56 (-1.19)
Band J bias +41.2 → +29.7 (-11.5)
Band K bias +34.1 → +22.4 (-11.7)
Band L bias +19.6 → +11.3 (-8.3)
Band M bias +86.3 → +55.1 (-31.2)
Bands A-H mostly unchanged (max(50, 0) = 50 either way; description
overrides on older stock are rarer in this sample)
The K-L-M dwellings improved most because for them the age-band
default insulation (100-140 mm) is now applied instead of 0 mm.
Cumulative across S-B23 → S-B27:
PE MAE 57.28 → 44.19 (-13.09)
PE bias 51.56 → 38.56 (-13.00)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two related bugs both produced U=1.7 for retrofit-insulated solid-brick
walls when the spec says U=0.55 (Table 6 footnote: "If a wall is known
to have additional insulation but the insulation thickness is unknown,
use the row in the table for 50 mm insulation"):
1. _insulation_bucket(0, True) returned 0 instead of 50. The "NI"
sentinel parses to 0 via _parse_thickness_mm, then the bucket
function's "< 25 -> 0" branch ignored the insulation_present signal.
Affects 56 corpus certs lodging solid-brick with type=1 or type=3
plus thickness="NI".
2. wall_ins_present was set False whenever wall_insulation_type == 4
("as-built / assumed"), even if the description said
"...insulated (assumed)" or "...partial insulation (assumed)".
Affects 128+51 = 179 corpus certs.
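A sketch of the bucket fix from point 1. Only the "< 25 -> 0" branch and the 50 mm override come from this message; the other bucket edges are illustrative guesses, and `insulation_bucket` stands in for the private `_insulation_bucket`.

```python
def insulation_bucket(thickness_mm: float, insulation_present: bool) -> int:
    """Map a lodged thickness to a Table 6 row (mm)."""
    if thickness_mm < 25:
        # Thickness unknown ("NI" -> 0) but insulation is known to be there:
        # use the 50 mm row instead of the uninsulated row.
        return 50 if insulation_present else 0
    if thickness_mm < 75:    # illustrative edge
        return 50
    if thickness_mm < 125:   # illustrative edge
        return 100
    if thickness_mm < 175:   # illustrative edge
        return 150
    return 200
```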
The same root pattern as S-B25 (cavity-wall description disambiguation),
extended to non-cavity constructions. `_cavity_described_as_filled`
generalised to `_described_as_insulated`; now used by:
- u_wall (cavity-wall dispatcher to the Filled-cavity row, S-B23/B25)
- heat_transmission_from_cert (override wall_ins_present for non-cavity
walls so the 50 mm bucket routes per Table 6 footnote)
Parity probe at 300 certs, seed=7:
PE MAE 45.74 → 45.37 (-0.37)
PE bias 40.19 → 39.75 (-0.44)
Band D bias +42.7 → +41.6 (-1.1)
Band F bias +12.6 → +10.7 (-1.9)
Modest aggregate movement — the affected population is small (~0.6% of
corpus, ~2 certs in the 300 sample). The slice's correctness is proved
by 4 unit tests in test_rdsap_uvalues.py + 2 end-to-end tests in
test_heat_transmission.py.
Cumulative across S-B23 → S-B26:
PE MAE 57.28 → 45.37 (-11.91)
PE bias 51.56 → 39.75 (-11.81)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The RdSAP schema's `wall_insulation_type = 4` ("as-built / assumed")
covers two distinct cert populations that previously both routed to
the Cavity-as-built row (U=1.5 at band E):
686 certs: "Cavity wall, as built, no insulation (assumed)" — U=1.5 ✓
1171 certs: "Cavity wall, as built, insulated (assumed)" — should be 0.7
147 certs: "Cavity wall, as built, partial insulation (assumed)" — 0.7
The description string disambiguates. The legacy production map at
recommendations/rdsap_tables.py:753 routes the latter two to "Filled
cavity" — we match that interpretation here for parity with the cert
assessor and the production recommendation engine.
`_cavity_described_as_filled` adds the description check; the existing
filled-cavity dispatcher in u_wall now fires on either signal:
- wall_insulation_type == 2 (S-B23 — explicit filled-cavity code)
- description contains "insulated" or "partial insulation" without
the "no insulation" negation marker (S-B25 — assumed cavity-fill)
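A sketch of the two-signal dispatch described above, assuming plain substring matching on the lower-cased description. The function names are hypothetical stand-ins for `_cavity_described_as_filled` and the dispatcher inside `u_wall`.

```python
from typing import Optional

WALL_INSULATION_FILLED_CAVITY = 2


def cavity_described_as_filled(description: Optional[str]) -> bool:
    """True for "insulated" / "partial insulation" unless negated."""
    text = (description or "").lower()
    if "no insulation" in text:  # negation marker wins
        return False
    return "insulated" in text or "partial insulation" in text


def routes_to_filled_cavity(insulation_type: Optional[int],
                            description: Optional[str]) -> bool:
    """Fire on the explicit code OR on the description signal."""
    return (insulation_type == WALL_INSULATION_FILLED_CAVITY
            or cavity_described_as_filled(description))
```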
Parity probe at 300 certs, seed=7:
PE MAE 46.78 → 45.74 (-1.04)
PE bias 41.78 → 40.19 (-1.59)
Band F bias +23.2 → +12.6 (-10.6)
Band G bias +31.8 → +25.1 (-6.7)
Band H bias +30.7 → +15.5 (-15.2)
Improvements localise to bands F-H (1976-1995), the era when Building
Regs mandated cavity insulation for new-builds — making "as built,
insulated (assumed)" the modal description. SAP MAE drifted up
+0.12 (cost-side residuals surfacing now that envelope is closer to
spec; tracked for follow-up).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Full SAP assessments (~15% of corpus, 4 403 of 30 000 scanned bulk-zip
certs) lodge a measured/calculated wall U-value per BS EN ISO 6946 in
walls[i].description, e.g. "Average thermal transmittance 0.18 W/m²K".
These certs typically have wall_construction, wall_insulation_type and
construction_age_band all None, which the cascade defaults previously
resolved to U = 1.5 (uninsulated cavity at band E). RdSAP 10 §5.3:
"U values are obtained from … the construction type, date of
construction and, where applicable, thickness of additional insulation"
— but a measured value supersedes the cascade.
Corpus U-value distribution among parsed:
median 0.21, mean 0.225, range 0.06-1.84
80% at U ≈ 0.2 (Part L-compliant new-builds)
10% at U ≈ 0.1 (passivhaus / very low)
7% at U ≈ 0.3 (older retrofitted full-SAP)
3% in the tail (conversions, edge cases)
Per affected cert (100 m² new-build at U 1.5 → 0.21):
walls_w_per_k drops 129 → 21 W/K
PEUI drops ≈ 120 kWh/m²
Implementation:
- _measured_u_from_description() regex-parses the phrase from the wall
description; returns None on no-match or non-numeric so the cascade
fall-through is preserved.
- u_wall checks the measured value FIRST, before any cascade logic.
- No range cap — calculator mirrors what the assessor lodged, per the
"deterministic except for input errors" principle. Parse failure
falls through cleanly.
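A minimal sketch of the parse step, assuming the phrase always reads "Average thermal transmittance <number> W/m²K". The regex and the public function name are illustrative; the real `_measured_u_from_description` may accept more variants.

```python
import re
from typing import Optional

# Assumed phrase shape; case-insensitive, number captured in group 1.
_MEASURED_U_RE = re.compile(
    r"average thermal transmittance\s+([0-9]*\.?[0-9]+)\s*W/m", re.IGNORECASE
)


def measured_u_from_description(description: Optional[str]) -> Optional[float]:
    """Return the lodged U-value, or None so the cascade can take over."""
    if not description:
        return None
    match = _MEASURED_U_RE.search(description)
    if match is None:
        return None
    # No range cap: mirror whatever the assessor lodged.
    return float(match.group(1))
```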
Parity probe at 300 certs, seed=7: headlines unchanged. Direct check
on the sample: 0/300 certs carry an "Average thermal transmittance"
description. The v18a parquet filters full-SAP certs out somewhere
upstream, so this slice is invisible in the parquet-based probe. The
slice's correctness is proved by:
- 4 unit tests in test_rdsap_uvalues.py (tracer + regression on
ordinary descriptions + parse-failure fallback + filled-cavity
description still routes correctly)
- 1 end-to-end test in test_heat_transmission.py exercising a
synthetic full-SAP cert through heat_transmission_from_cert
- All 274 domain tests passing, no regressions
Follow-up tooling: a bulk-zip-based parity probe that doesn't filter
to the parquet's subset is needed to measure this slice's corpus
impact. Separate dig.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Previous bound (20, 95) excluded full-SAP new-builds (sap_score 90+,
which carry the dramatic wall U-value gap) and deepest-tail heritage
certs (sap_score ≤ 20). Widening so the sample reflects the
populations where the calculator's biggest spec gaps live.
New baseline at 300 certs, seed=7:
SAP MAE 5.34 → 4.59 (-0.75)
PE MAE 48.99 → 46.78 (-2.21)
PE bias 42.07 → 41.78 (-0.29)
Note: only ~0.7% of certs in the v18a parquet have age_band=None,
while the raw bulk zip has 15% full-SAP "Average thermal transmittance"
certs. The parquet is filtering them somewhere upstream — to be chased
in separate work. Until then, parity-probe MAE will under-show the true
corpus impact of slices that target full-SAP certs.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The cert encodes filled-cavity walls as
(wall_construction=4 cavity, wall_insulation_type=2 filled,
wall_insulation_thickness="NI"). The previous cascade parsed "NI"→0
and ran the thickness-bucketed table, returning U=1.5 (the
"Cavity as built" row) — treating retrofit-filled cavities as if they
were uninsulated. Spec (RdSAP 10 Table 6, page 33) has a dedicated
"Filled cavity" row at U=0.7 for bands A-E, 0.40 at F, 0.35 at G-H,
and "as built" from band I onward.
Adds:
- WALL_INSULATION_FILLED_CAVITY constant (code 2 per RdSAP schema,
confirmed empirically on 8 000 corpus certs against walls.description)
- _CAVITY_FILLED_ENG row in domain.ml.rdsap_uvalues
- dispatcher in u_wall when (construction=cavity, insulation_type=2)
- wall_insulation_type plumbing through heat_transmission_from_cert
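A sketch of the new dispatch, using the row values quoted above. `FILLED_CAVITY_U` and `u_cavity_wall` are hypothetical names, and `as_built_u` is assumed to come from the existing cascade.

```python
WALL_INSULATION_FILLED_CAVITY = 2

# "Filled cavity" row as quoted above: 0.7 for A-E, 0.40 at F, 0.35 at G-H.
FILLED_CAVITY_U = {band: 0.70 for band in "ABCDE"}
FILLED_CAVITY_U.update({"F": 0.40, "G": 0.35, "H": 0.35})


def u_cavity_wall(age_band: str, insulation_type: int, as_built_u: float) -> float:
    """Use the filled-cavity row when lodged; bands I onward stay as built."""
    if insulation_type == WALL_INSULATION_FILLED_CAVITY:
        return FILLED_CAVITY_U.get(age_band, as_built_u)
    return as_built_u
```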
Parity probe (300 certs, seed=7) before → after:
- PE MAE 57.28 → 48.99 (-8.3)
- PE bias 51.56 → 42.07 (-9.5)
- Band C bias +65.3 → +47.8 (-17.5)
- Band D bias +67.9 → +45.7 (-22.2)
- Band E bias +77.0 → +58.8 (-18.2)
- Band F bias +43.8 → +25.4 (-18.4)
- Band K-L bias unchanged (filled-cavity row falls back to as-built
from band I onward per spec footnote; correct no-op)
Future slices already lit up by the same enumeration:
- type=1 external / type=3 internal insulation rows (~440 certs)
- type=6 filled + external / type=7 filled + internal (~22 certs)
- type=None "Average thermal transmittance X W/m²K" string parse
(1 358 certs — biggest follow-up)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds primary-energy breakdown (space heating, hot water, lighting,
pumps, PV) per cert plus stratified bias reports by main_heating_
category, construction_age_band, and dwelling_type. Used to localise
the +51 kWh/m² PEUI bias to envelope-side over-prediction on pre-1996
fabric, which the bare SAP-residual ranking didn't surface.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Per user suggestion: the iteration history in this chat has likely
accreted blind spots that a long context window can't shed (e.g. I
spent slices comparing our delivered kWh to the cert's primary kWh
without noticing the apples-to-oranges error). A fresh agent reading
the SAP 10.2 + RdSAP 10 PDFs cold against the current calculator may
spot gaps faster.
HANDOVER_FRESH_REVIEW.md gives the fresh agent:
- Current state (MAE 5.34, primary-energy bias +51 kWh/m²)
- Repo layout pointer
- Priority-ordered dig list (PEUI mystery first)
- Validated truths
- Dead-end list (don't repeat S-B5 NI thickness switch etc.)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Wires SAP 10.2 Table 12 "Primary energy factor" column into Table 12
helpers and onto CalculatorInputs as three per-end-use factors (space
heating, hot water, other). calculate_sap_from_inputs now emits
primary_energy_kwh_per_yr and primary_energy_kwh_per_m2 on SapResult,
matching the cert's `energy_consumption_current` field (PEUI).
Triggered by a decomposition that revealed I'd been comparing our
delivered energy to the cert's primary energy — apples to oranges.
With proper primary-energy comparison the actual finding is:
300-cert primary-energy diff (cert calibration prices):
energy MAE: 57.3 kWh/m²
energy bias: +51.6 (we over-predict by ~50%)
energy P50: +49.5
This is a much bigger systemic bug than the SAP MAE 5.34 suggested.
Closing it requires investigating either (a) demand model
over-prediction, (b) HW losses, (c) PEF values per fuel, or (d) cert
reporting convention differences. Targeted for the next context.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Per SAP 10.3 §2 worksheet line 22 / RdSAP10 §4.1: effective infiltration =
raw_ACH × (1 - 0.075 × sheltered_sides). Default 2 sheltered sides for
typical UK terraced/semi-detached layout (the cert doesn't lodge a
sheltered-sides count, so we apply the spec's typical default).
infiltration_ach() gains a `sheltered_sides` kwarg defaulting to 0
(spec-pure intermediate result; existing unit tests keep that contract).
cert_to_inputs passes sheltered_sides=2.
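The shelter adjustment as a sketch; `effective_infiltration_ach` is a hypothetical name standing in for the updated `infiltration_ach()`.

```python
def effective_infiltration_ach(raw_ach: float, sheltered_sides: int = 0) -> float:
    """Apply the shelter factor 1 - 0.075 x sheltered_sides to the raw rate."""
    shelter_factor = 1.0 - 0.075 * sheltered_sides
    return raw_ach * shelter_factor


# Default of 0 keeps the spec-pure intermediate result; the mapper passes 2.
print(effective_infiltration_ach(1.0))      # no shelter adjustment
print(effective_infiltration_ach(1.0, 2))   # 15% reduction
```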
Found via energy decomposition: our predicted total energy was running
+15.7 kWh/m² over cert (10% over) — wind shelter knocks ~15% off
infiltration, contributing to closing that gap.
300-cert parity probe:
MAE 5.43 → 5.34 (-0.09)
bias -0.52 → +0.29 (back near zero)
within ±10: 86.3% → 86.7%
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
SAP 10.2 Table 11 allocates a fraction (10-20%) of space heating to a
secondary system based on main heating category. Per Appendix A §A.2.2,
this is applied:
- Always for electric storage heater main systems (codes 401-407, 409,
421); a portable electric heater (code 693) is defaulted when no
secondary is recorded.
- Otherwise only when the cert lodges a secondary_heating_type.
Calculator gains secondary_heating_fraction, secondary_heating_efficiency,
secondary_heating_fuel_cost_gbp_per_kwh on CalculatorInputs and a
secondary_heating_fuel_kwh_per_yr on SapResult. Monthly loop splits
demand: q_main = q_heat × (1 - frac), q_secondary = q_heat × frac, each
converted to fuel via its own efficiency. Cost = main_kwh × main_price
+ secondary_kwh × secondary_price + ... .
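A sketch of the conditional rule and the demand split. The function names are hypothetical, and the Table 11 fraction is passed in rather than looked up.

```python
from typing import Optional, Tuple

# Storage-heater main codes that always force a secondary system.
STORAGE_HEATER_CODES = set(range(401, 408)) | {409, 421}


def secondary_fraction(main_code: int, table_11_fraction: float,
                       lodged_secondary: Optional[int]) -> float:
    """Apply the Table 11 fraction only when the rule above says so."""
    if main_code in STORAGE_HEATER_CODES or lodged_secondary is not None:
        return table_11_fraction
    return 0.0


def split_space_heating(q_heat_kwh: float, frac: float,
                        main_eff: float, secondary_eff: float) -> Tuple[float, float]:
    """Return (main fuel kWh, secondary fuel kWh) for one month."""
    q_main = q_heat_kwh * (1.0 - frac)
    q_secondary = q_heat_kwh * frac
    return q_main / main_eff, q_secondary / secondary_eff
```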
The initial implementation applied 10% unconditionally and regressed
300-cert MAE 5.45 → 6.58 (bias -2.65). Restricting it to the conditional
rule above returns the aggregate to flat:
300-cert: MAE 5.45 → 5.43 (flat)
bias +0.22 → -0.52
within ±5: 62.7% → 64.3%
The slice is spec-correct and architecturally enables the secondary-
heating channel; aggregate MAE moves are small because most certs
don't lodge a secondary and most non-storage mains don't force one.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Per user suggestion (switch from probe-driven to worksheet-driven
iteration), enumerates the §§1-15 worksheet + Appendices A-U state in
the calculator with a status grade and a prioritised gap list. Becomes
the roadmap for Session B remaining slices.
Next slice from this list: Table 11 secondary heating allocation —
10% fraction on most boiler-main certs that we currently model as 0.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Wires photovoltaic_arrays into the calculator as a per-kWh cost credit
against the ECF numerator. Total annual PV kWh = sum(peak_power_kw)
× 850 (UK-average yield per Appendix M, single national figure since
ratings use UK-average weather per S-B18). Credit rate is Table 12
code 60 (PV export tariff) — 5.59 p/kWh under SAP spec prices, 13.19
p/kWh under cert-calibration prices.
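The credit arithmetic as a sketch; `pv_cost_credit_gbp` is a hypothetical helper, and the rate is passed in pence per kWh.

```python
from typing import Iterable

PV_YIELD_KWH_PER_KWP = 850.0  # single national figure used by this slice


def pv_cost_credit_gbp(peak_powers_kw: Iterable[float],
                       export_rate_p_per_kwh: float) -> float:
    """Annual PV generation times the export rate, in pounds."""
    annual_kwh = sum(peak_powers_kw) * PV_YIELD_KWH_PER_KWP
    return annual_kwh * export_rate_p_per_kwh / 100.0


# Two arrays totalling 3.5 kWp at the spec export rate of 5.59 p/kWh.
print(round(pv_cost_credit_gbp([2.0, 1.5], 5.59), 2))
```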
This is the first slice from the worksheet-driven phase (per user
suggestion). PV was identified as a clear systemic gap that probe-
driven iteration hadn't surfaced because only ~5-10% of certs have
PV and the corpus probe is biased toward the most-frequent shapes.
100-cert: MAE 4.39 → 4.49 (small regression; bias -0.17 → -0.07)
300-cert: MAE 5.44 → 5.45 (essentially flat; bias 0.11 → 0.22)
Net spec-correct, aggregate MAE neutral. The certs that DO have PV
should see the right cost story now; ML residual will pick up the
fidelity gap (no orientation/overshading/pitch on our yield).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
SAP 10.2 Appendix U explicit rule: "Calculations for fabric energy
efficiency (FEE), regulation compliance (TER and DER, TPER and DPER)
and for ratings (SAP rating and environmental impact rating) are done
with UK average weather. Other calculations (such as for energy use and
costs on EPCs) are done using local weather."
Our calculator was using the cert's region_code for everything. Spec
mandates region 0 (UK average) for rating outputs. MAE is neutral on
both the 100-cert and 300-cert samples (most certs sit close to UK
average), but the change is spec-correct and aligns with what the cert
assessor's SAP rating actually computes.
Found by switching from probe-driven to worksheet-driven iteration —
per user suggestion this is the more efficient mode once the easy
wins from probe-driven have been extracted.
100-cert: MAE 4.39 (unchanged)
300-cert: MAE 5.44
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Refines S-B16 with a fuel-conditional rule for the Unknown tariff code
(RdSAP energy_tariff=3): all-electric dwellings whose meter_type the
assessor couldn't pin down are almost always E7-eligible (gas dwellings
default to Single). For non-electric end-uses (gas main heating), the
meter_type doesn't affect cost, so Unknown stays standard for them.
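A sketch of the fuel-conditional rule, using the RdSAP meter codes confirmed in S-B16 (1, 4, 5 off-peak; 2 Single; 3 Unknown). `meter_is_off_peak` is a hypothetical name.

```python
from typing import Optional

OFF_PEAK_METER_TYPES = {1, 4, 5}  # dual, dual (24 hour), off-peak 18 hour
UNKNOWN_METER_TYPE = 3


def meter_is_off_peak(meter_type: Optional[int], main_fuel_is_electric: bool) -> bool:
    """Unknown meters count as off-peak only for electric main heating."""
    if meter_type in OFF_PEAK_METER_TYPES:
        return True
    if meter_type == UNKNOWN_METER_TYPE:
        return main_fuel_is_electric
    return False  # Single, missing, or unrecognised
```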
Hand-trace confirmation: 3 of the 4 worst residuals (0800-1364,
0036-1125, 0340-2394) all have meter_type=3 AND electric main fuel —
applying off-peak to these recovers the parity loss S-B16 introduced.
100-cert parity probe:
MAE 5.04 → 4.39 (recovered to S-B15 best state)
bias -1.20 → -0.17
within ±10: 93% → 96%
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Confirmed against the official RdSAP enum in
datatypes/epc/domain/epc_codes.csv:
1 = dual (off-peak / Economy-7)
2 = Single (standard tariff)
3 = Unknown (verified against Elmhurst assessor software:
treated as Single)
4 = dual (24 hour) (off-peak)
5 = off-peak 18 hour (off-peak)
Different from the SAP-Schema enum (1=standard / 2=off-peak) — the
transform.py docstring referenced the SAP enum, not RdSAP. Our corpus
is RdSAP so we use the RdSAP codes.
This locks in the meter_type-based tariff selection from S-B15 with
the authoritative enum, replacing the earlier heating-code heuristic.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Per user guidance: trust the cert's lodged meter_type as the source of
truth for tariff selection, rather than inferring tariff from heating
code lists. SAP10 meter_type enum (verified empirically on the 250k
corpus: 75% type 2, 14% type 1, 11% type 3):
1 = Off-peak (Economy-7 / dual rate)
2 = Single (Standard)
3 = Off-peak (24-hour heating)
The transform.py docstring describes 1=Standard / 2=Off-peak but that
contradicts the 75% type-2 distribution (UK demographics don't put 75%
of dwellings on off-peak). The inverted reading parity-tests correctly.
Tariff routing rules:
- Space heating: off-peak rate when main fuel is electric AND meter is
off-peak; else standard main-fuel rate.
- Hot water: off-peak rate when water fuel is electric AND meter is
off-peak; else water-fuel rate.
- Lighting + pumps + fans: always standard electricity (Table 12a
notwithstanding — cert software empirically uses standard here).
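The three routing rules as a sketch. `route_end_use_rates` and its keyword arguments are hypothetical; rates are in p/kWh and the off-peak decision is taken as an input.

```python
from typing import Tuple


def route_end_use_rates(*, main_is_electric: bool, water_is_electric: bool,
                        meter_is_off_peak: bool, main_rate: float,
                        water_rate: float, off_peak_rate: float,
                        standard_elec_rate: float) -> Tuple[float, float, float]:
    """Return (space heating, hot water, other) unit rates."""
    space = off_peak_rate if (main_is_electric and meter_is_off_peak) else main_rate
    hot_water = off_peak_rate if (water_is_electric and meter_is_off_peak) else water_rate
    other = standard_elec_rate  # lighting, pumps, fans: always standard
    return space, hot_water, other
```

For a gas-heated dwelling with an electric immersion on an off-peak meter, only the hot-water line moves to the off-peak rate.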
100-cert parity probe:
MAE 4.40 → 4.39 (flat in aggregate; structurally cleaner code)
RMSE 5.63 → 5.56
bias +0.16 → -0.17
within ±10: 96% (unchanged)
The meter_type seam replaces the e7_eligible_main_codes set on
PriceTable. Conceptually cleaner: tariff is a property of the meter,
not the heating system.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Hand-trace cert 0340-2394 (73m² mid-floor flat, code 691 electric room
heater, actual SAP 75, predicted 56) confirmed the cert software
applies off-peak rates to electric room heaters when the dwelling has
the E7-tariff hallmarks (electric immersion HW cylinder). Extending
the cert-calibration E7-eligible set from {191-196, 401-409, 421-425}
to add {691-696}.
100-cert parity probe:
MAE 4.48 → 4.40 (-0.08)
RMSE 5.81 → 5.63
bias -0.52 → +0.16 (essentially centered)
within ±10: 95% → 96%
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Water heating codes 907 (single-point gas) and 909 (electric instantaneous)
describe no-cylinder, point-of-use systems with no primary circuit.
The predicted_hot_water_kwh model was adding 366 kWh cylinder-storage
loss + 245 kWh primary-pipework loss on top of useful demand for these
certs — over-counting HW by 600+ kWh.
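A sketch of the fix, assuming the two loss terms are flat annual figures as quoted above. `hot_water_kwh` is a hypothetical simplification of `predicted_hot_water_kwh`.

```python
INSTANTANEOUS_WATER_CODES = {907, 909}  # point-of-use, no cylinder
CYLINDER_STORAGE_LOSS_KWH = 366.0
PRIMARY_PIPEWORK_LOSS_KWH = 245.0


def hot_water_kwh(useful_demand_kwh: float, water_heating_code: int) -> float:
    """Add storage and primary losses only when a cylinder exists."""
    if water_heating_code in INSTANTANEOUS_WATER_CODES:
        return useful_demand_kwh
    return useful_demand_kwh + CYLINDER_STORAGE_LOSS_KWH + PRIMARY_PIPEWORK_LOSS_KWH
```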
Discovered hand-tracing cert 2903-8339 (11m² Top-floor flat studio,
water_heating_code=909, actual SAP 75, predicted 55).
100-cert parity probe:
MAE 4.53 → 4.48 (-0.05)
RMSE 5.96 → 5.81
bias -0.57 → -0.52
Smaller MAE delta than S-B12 because instantaneous-HW certs are a
smaller subset, but the affected dwellings are exactly the worst-
residual tail.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The legacy water_heating_efficiency(901, main_code) returns 0.80 (gas
boiler default) when sap_main_heating_code is None — even if the main
system is a heat pump (category=4, efficiency 2.30). For "from main
system" water codes (901/902/914), we must inherit through the FULL
main-heating cascade including the category fallback.
Discovered by hand-tracing cert 0320-2850 (Semi-detached bungalow,
heat-pump main with no SAP code lodged, actual SAP 70, predicted 49).
HW was being charged at 0.80 eff for a 2.30-eff dwelling — 2.9× too
much HW fuel.
100-cert parity probe:
MAE 4.66 → 4.53 (-0.13)
RMSE 6.27 → 5.96
bias -0.70 → -0.57
within ±10: 94% → 95%
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Hand-tracing cert 0800-1364 (Detached bungalow, code 191/direct-electric,
actual SAP 71, predicted 37) showed the cert assessor applies off-peak
rates to direct-electric main heating despite SAP 10.2 Table 12a
specifying 90% high-rate. Adds e7_eligible_main_codes to PriceTable so
each price source carries its own rule:
- SAP_10_2_SPEC_PRICES: {401-409, 421-425} (storage only, per Table 12a)
- CERT_CALIBRATION: {191-196, 401-409, 421-425} (empirically what
the cert software does)
100-cert parity probe:
MAE 4.99 → 4.66 (recovered to pre-S-B9 best state)
bias -1.03 → -0.70
within ±1: 23% → 24%
within ±3: 47% → 48%
within ±10: 93% → 94%
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Separates the SAP-spec source of truth from the empirical cert-
calibration prices. cert_to_inputs() now accepts a `prices: PriceTable`
parameter defaulting to SAP_10_2_SPEC_PRICES (3.64 gas, 16.49 elec,
9.40 7h-low — verbatim from SAP 10.2 §12.2 / Table 12). Parity probe
passes the empirical cert_calibration_prices() factory from
domain.sap.tables.table_12_cert_calibration which carries the lower
prices that match the cert assessor software's actual output (3.48,
13.19, 5.50).
This split is documented in both table modules: cert calibration is
explicitly NOT spec-correct, it just matches observed cert behaviour
for parity testing.
100-cert parity probe with cert-calibration prices:
MAE 6.66 → 4.99 (recovered from spec-price regression; also -0.41
from absolute baseline thanks to other S-B fixes)
RMSE 10.29 → 7.13
bias -4.66 → -1.03
within ±1: 20% → 23%
within ±3: 38% → 47%
within ±5: 63% → 67%
within ±10: 82% → 93%
Session-B progress overall (S-B2 baseline → here): MAE 8.41 → 4.99,
within ±1 more than doubled (10% → 23%).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Verified against the SAP 10.2 spec (14-03-2025): Table 12 unit prices
are IDENTICAL to SAP 10.3 Table 12. Both specs mandate (§12.2): "Fuel
costs are calculated using the fuel prices given in Table 12. Other
prices must not be used for calculation of SAP ratings." The legacy
ML-pipeline prices in domain.ml.sap_efficiencies (3.48 gas, 13.19 elec,
5.50 E7-low) do NOT match either SAP 10.2 or 10.3 and appear to be a
pre-2022 holdover.
New module domain.sap.tables.table_12 carries the spec-correct
values:
mains gas: 3.64 (was 3.48 legacy)
standard electricity: 16.49 (was 13.19)
7h-low / Economy-7: 9.40 (was 5.50)
24h-heating: 14.04 (was 6.61)
Also corrects an S-B4 bug: SAP 10.2 Table 12a shows direct-acting
electric heating (codes 191-196) runs at 90% high-rate on 7h tariffs,
not 0% — only true storage heaters (401-409, 421-425) bill at the
low rate. _E7_SPACE_HEATING_CODES narrowed accordingly.
100-cert parity probe with spec-correct prices:
MAE 4.66 → 6.66 (regression vs legacy prices)
bias -0.70 → -4.66 (over-counting cost)
spec-correctness: SAP 10.2 verbatim
The MAE regression confirms the corpus's lodged ratings were NOT
calculated against the published SAP 10.2 Table 12 prices. The cert
ratings appear to use the legacy lower prices despite reporting
sap_version=10.2. Three paths forward documented in next commit's
discussion thread.
Also adds the SAP 10.2 spec PDF to docs/sap-spec/.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When the main heating is electric storage / direct-electric (codes
191-196, 401-409, 421-425), the cert almost always carries an
Economy-7 tariff and the immersion HW cylinder runs on the off-peak
timer. In that case, bill HW at the lower of {7h-low (5.5 p/kWh),
water_heating_fuel rate} so we never over-charge an HW fuel that's
already cheaper than off-peak.
100-cert parity probe:
MAE 4.90 → 4.66 (-0.24)
bias -1.44 → -0.70 (over-correction halved)
within ±3: 46% → 48%
within ±5: 67% → 68%
within ±10: 93% → 94%
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
SAP 10.3 §12 charges fuel costs by end-use, not by main heating fuel.
For a gas-heated dwelling with an electric immersion hot-water cylinder,
HW bills at the electric rate (13.19 p/kWh) not the gas main-heating
rate (3.48 p/kWh) — a 3.8× cost difference for HW that propagates
straight to ECF. Lighting, central-heating pumps, and fans are always
billed as electric, regardless of main fuel.
Discovered by hand-tracing cert 8035-9023 (Detached bungalow, actual
SAP 43, predicted 63). Trace showed our hot-water + lighting + pumps
lines were charging mains-gas rates throughout, under-counting cost by
~£290/yr.
100-cert parity probe (biggest single Session-B slice so far):
MAE 5.70 → 4.90 (-0.80, -14%)
RMSE 7.48 → 6.68 (-11%)
within ±1: 20% → 24%
within ±3: 37% → 46%
within ±5: 54% → 67%
bias +1.50 → -1.44 (over-corrected by ~3 SAP points)
The over-correction (bias now slightly negative) means we're now
under-predicting on average. Next slice tackles where we're charging
too much electricity — probably HW on dwellings with combi boilers (no
immersion, water still on main fuel) and the water_heating_code 901
("from main system") inheritance path.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replaces the two hardcoded glazing defaults (g⊥=0.63, FF=0.7) in the
cert→inputs mapper with spec-driven lookups:
- g_perpendicular by glazing_type (Table 6b):
single → 0.85, double 2002+ → 0.72, low-E soft → 0.63,
secondary → 0.76, triple → 0.68. Default 0.72 when missing.
- frame_factor by frame_material (Table 6c):
wood/PVC/composite → 0.70, aluminium/steel/metal → 0.83.
Measured values from window_transmission_details / SapWindow.frame_factor
still take precedence. Overshading factor stays at 0.77 ("average") since
RdSAP 10 doesn't lodge a per-window overshading code.
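A sketch of the two lookups with measured values taking precedence. The dictionary keys are hypothetical labels for the cert's glazing and frame enums, and the 0.70 frame-factor fallback for unknown materials is an assumption carried over from the old hardcoded default.

```python
from typing import Optional

G_PERPENDICULAR = {
    "single": 0.85, "double_2002_plus": 0.72, "low_e_soft": 0.63,
    "secondary": 0.76, "triple": 0.68,
}
FRAME_FACTOR = {
    "wood": 0.70, "pvc": 0.70, "composite": 0.70,
    "aluminium": 0.83, "steel": 0.83, "metal": 0.83,
}


def g_perpendicular(glazing_type: Optional[str],
                    measured: Optional[float] = None) -> float:
    """Measured value first, then the table, then the 0.72 default."""
    if measured is not None:
        return measured
    return G_PERPENDICULAR.get(glazing_type, 0.72)


def frame_factor(frame_material: Optional[str],
                 measured: Optional[float] = None) -> float:
    """Measured value first, then the table, then an assumed 0.70."""
    if measured is not None:
        return measured
    return FRAME_FACTOR.get(frame_material, 0.70)
```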
100-cert parity probe:
MAE 5.65 → 5.70 (flat)
exact-match within ±1: 18% → 20%
bias +1.13 → +1.50
Slight bias drift toward over-prediction is expected — bigger solar
gains reduce predicted heating demand. Net: the engine is now more
spec-correct (more exact matches), but composition of errors elsewhere
needs the next slice to bring bias back toward 0.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Maps the Table 9 main_heating_control code to SAP control type 1/2/3:
codes 2101-2104 = type 1, 2105-2109 = type 2, 2110+ = type 3. Default
remains type 2 when code is missing or unrecognised.
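The mapping as a sketch; `control_type_from_code` is a hypothetical name.

```python
from typing import Optional


def control_type_from_code(code: Optional[int]) -> int:
    """Map a main_heating_control code to SAP control type 1, 2 or 3."""
    if code is None:
        return 2  # default when missing
    if 2101 <= code <= 2104:
        return 1
    if 2105 <= code <= 2109:
        return 2
    if code >= 2110:
        return 3
    return 2  # default when unrecognised
```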
Two other fixes were tried and reverted in this slice, based on the
100-cert parity probe:
- NI-thickness → None (the "wall insulated but thickness unknown,
use 50mm row" path): over-corrected in aggregate because many "NI"
certs are genuinely uninsulated. Reverted to legacy NI→0 with a
note to revisit once wall_insulation_type is used as a stronger
signal.
- boiler-age efficiency rescue (cat 1/2, A-F → 0.74, K-M → 0.85):
same issue: stacked with the NI fix it over-shot, and on its own it
gave only a marginal MAE gain with no bias improvement. Dropped
pending further investigation.
100-cert parity probe:
MAE 5.72 → 5.65 (-0.07; control-type-only is a small net win)
RMSE 7.58 → 7.48 (-0.10)
bias +1.20 → +1.13
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Splits the single CalculatorInputs.fuel_unit_cost_gbp_per_kwh into three
end-use lines — space_heating, hot_water, other — to match SAP 10.3 §12
which charges different tariffs per end-use on Economy-7 dwellings.
cert→inputs rule: when sap_main_heating_code is in the electric-storage
(401-409), high-heat-retention storage (421-425), or direct-electric
(191-196) ranges, space heating bills at the 7h-low rate (5.5p/kWh)
while hot water + lighting + pumps stay on standard electricity
(13.19p/kWh). All other fuels use a single rate across all three end-
uses.
100-cert parity probe impact:
MAE 7.53 → 5.72 (-1.81, -24%)
RMSE 11.60 → 7.58 (-4.02, -35%)
worst residual -56 → -25 (Semi-detached bungalow)
within ±10: 85% → 91%
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds services/ml_training_data/src/ml_training_data/sap_parity_probe.py
— samples N certs from the v18a corpus, streams them via BulkZipReader,
runs Sap10Calculator, prints MAE/RMSE/bias + worst-N residuals. Baseline
across 100 certs: MAE 8.41, RMSE 13.98, bias -2.65, 0 errors.
docs/sap-spec/PARITY_FINDINGS.md captures the dominant failure pattern
(flats + bungalows under-predicted, 10 of the worst-15 are flats whose
floor/roof are party with neighbouring dwellings) and the priority-
ordered Session B iteration backlog (S-B-flat-surfaces first).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pure-function ParityCase / ParityReport / build_parity_report for the
Session B 1000-cert parity check (ADR-0009). Aggregates per-cert
(predicted, actual) sap pairs into global + typical-subset MAE, RMSE,
bias, and the worst-N residuals for spec-iteration. Cert→case mapping
(corpus load, calculator run, actual-sap lookup) sits at a higher
layer. Keeping this module pure makes it trivial to test, so the harder
integration code can build on already-tested aggregation.
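A sketch of the aggregation, assuming residual = predicted - actual so that positive bias means over-prediction. `Case` and `summarise` are hypothetical stand-ins for `ParityCase` and `build_parity_report`; the typical-subset split is omitted.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Case:
    cert_id: str
    predicted: float
    actual: float

    @property
    def residual(self) -> float:
        return self.predicted - self.actual


def summarise(cases: List[Case], worst_n: int = 3) -> Dict[str, object]:
    """Global MAE, RMSE, bias and the worst-N cases by absolute residual."""
    residuals = [c.residual for c in cases]
    n = len(residuals)
    return {
        "mae": sum(abs(r) for r in residuals) / n,
        "rmse": math.sqrt(sum(r * r for r in residuals) / n),
        "bias": sum(residuals) / n,
        "worst": sorted(cases, key=lambda c: abs(c.residual), reverse=True)[:worst_n],
    }
```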
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds domain.sap.rdsap.cert_to_inputs.cert_to_inputs(epc) which produces a
typed CalculatorInputs from an EpcPropertyData, and a thin
Sap10Calculator.calculate(epc) entry point that wraps the mapper + the
S-A7a orchestrator. Defaults follow RdSAP 10 (Table 27 for living-area
fraction, Table 5 for ventilation, Table 12 for fuel cost + CO2 factor)
and SAP 10.3 Tables 4a/4b for heating efficiency via the existing
domain.ml.sap_efficiencies cascade.
Deferred to Session B: conservatory modes, room-in-roof, secondary
heating split (Table 11), multi-fuel weighted cost, thermal-mass
parameter from construction type, control-temp adjustment from
main_heating_control code.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Wires SAP 10.3 §§5-13 into a 12-month heat-balance loop driven by a typed
CalculatorInputs aggregate, returning a typed SapResult with the score,
ECF, costs/CO2 totals, and a 12-entry monthly breakdown. Physics
assembly only — the cert→inputs mapper lands in S-A7b. η/T_internal
solved with two-pass iteration per SAP 10.3 §7.3.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Tenth slice of the SAP10 Calculator Session A (ADR-0009). Ships four
pure functions under domain.sap.worksheet.rating implementing the SAP
10.3 rating formulas:
energy_cost_factor(total_cost_gbp, total_floor_area_m2)
-> equation (7): ECF = 0.36 × cost / (TFA + 45)
Deflator 0.36 sourced from Table 12 (page 191).
sap_rating(ecf)
-> equations (8)/(9), continuous (un-rounded) SAP value:
ECF ≥ 3.5: 108.8 − 120.5 × log10(ECF)
ECF < 3.5: 100 − 16.21 × ECF
Naturally rises above 100 for net energy exporters (negative ECF).
sap_rating_integer(ecf)
-> integer SAP value as published on the EPC: round to nearest, clamp
to minimum 1 per §13.
environmental_impact_rating(co2_emissions_kg_per_yr, total_floor_area_m2)
-> equations (10)-(12), continuous EI rating:
CF = CO2 / (TFA + 45)
CF ≥ 28.3: 200 − 95 × log10(CF)
CF < 28.3: 100 − 1.34 × CF
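The four formulas above as a self-contained sketch. Rounding halves upward in `sap_rating_integer` is an assumption about tie-breaking; the message only says "round to nearest".

```python
import math


def energy_cost_factor(total_cost_gbp: float, total_floor_area_m2: float) -> float:
    return 0.36 * total_cost_gbp / (total_floor_area_m2 + 45.0)


def sap_rating(ecf: float) -> float:
    """Continuous SAP value; log branch at or above ECF 3.5."""
    if ecf >= 3.5:
        return 108.8 - 120.5 * math.log10(ecf)
    return 100.0 - 16.21 * ecf


def sap_rating_integer(ecf: float) -> int:
    """Nearest integer (halves up, assumed), clamped to a minimum of 1."""
    return max(1, math.floor(sap_rating(ecf) + 0.5))


def environmental_impact_rating(co2_emissions_kg_per_yr: float,
                                total_floor_area_m2: float) -> float:
    cf = co2_emissions_kg_per_yr / (total_floor_area_m2 + 45.0)
    if cf >= 28.3:
        return 200.0 - 95.0 * math.log10(cf)
    return 100.0 - 1.34 * cf
```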
8 AAA cycles cover: ECF formula hand-computed, SAP linear branch (typical
home), SAP log branch (high cost), boundary continuity at ECF=3.5,
net-exporter SAP > 100, integer rounding + min-1 clamp, EI linear branch,
EI log branch.
Orchestrator (S-A7) wires these into Sap10Calculator alongside the monthly
heat balance loop from S-A5e.
Ninth slice of the SAP10 Calculator Session A (ADR-0009). Ships
monthly_heat_requirement_kwh implementing the Table 9c step-10 formula:
L_m = H × (T_i,m − T_e,m) (W)
Q_heat,m = 0.024 × (L_m − η_m × G_m) × n_m (kWh)
with the table's clamp: Q_heat is set to 0 when negative or below 1 kWh
per month (summer months and well-insulated dwellings in shoulder
months).
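A sketch of the step-10 formula and its clamp. The parameter names are illustrative and may not match the shipped signature.

```python
def monthly_heat_requirement_kwh(heat_loss_coeff_w_per_k: float,
                                 t_internal_c: float, t_external_c: float,
                                 utilisation: float, gains_w: float,
                                 days_in_month: int) -> float:
    """Q_heat for one month, clamped to 0 when negative or below 1 kWh."""
    loss_w = heat_loss_coeff_w_per_k * (t_internal_c - t_external_c)
    q_heat = 0.024 * (loss_w - utilisation * gains_w) * days_in_month
    return q_heat if q_heat >= 1.0 else 0.0


# Winter month: H = 200 W/K, 19 C inside, 5 C outside, 600 W gains at 0.95.
print(round(monthly_heat_requirement_kwh(200.0, 19.0, 5.0, 0.95, 600.0, 31), 2))
```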
The orchestrator (S-A6) iterates utilisation factor + mean internal
temperature until they converge before calling this function.
5 AAA cycles cover: typical-winter-month hand-computed worked example,
summer month with gains exceeding losses clamping to 0, gains-scaling
direction check, external-temperature direction check, and the sub-1-kWh
clamp per the Table 9c note.
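Illustrative sketch of the step-10 formula and clamp described above. The
function name is from this message; the keyword parameter names are
assumptions made for readability.

```python
def monthly_heat_requirement_kwh(
    *,
    heat_transfer_coefficient_w_per_k: float,  # H
    mean_internal_temp_c: float,               # T_i,m
    external_temp_c: float,                    # T_e,m
    utilisation_factor: float,                 # eta_m
    total_gains_w: float,                      # G_m
    days_in_month: int,                        # n_m
) -> float:
    """Q_heat,m = 0.024 x (L_m - eta_m x G_m) x n_m, clamped per Table 9c."""
    loss_rate_w = heat_transfer_coefficient_w_per_k * (
        mean_internal_temp_c - external_temp_c
    )
    q_kwh = 0.024 * (loss_rate_w - utilisation_factor * total_gains_w) * days_in_month
    # Negative or sub-1-kWh months count as zero demand.
    return q_kwh if q_kwh >= 1.0 else 0.0
```

With H = 200 W/K, T_i = 19 °C, T_e = 4.3 °C, η = 0.95, G = 600 W over 31
days, L = 2940 W and the useful gains are 570 W.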
Eighth slice of the SAP10 Calculator Session A (ADR-0009). Implements
SAP 10.3 mean internal temperature with three public helpers under
domain.sap.worksheet.mean_internal_temperature:
elsewhere_heating_temperature_c(hlp, control_type)
-> Table 9 T_h2 formula:
control type 1: T_h2 = 21 − 0.5 × HLP
control type 2 or 3: T_h2 = 21 − HLP + HLP² / 12
HLP clamped to 6.0 per Table 9 note (e).
off_period_temperature_reduction_c(t_off, T_h, T_e, R, G, H, η, τ)
-> Table 9b u value (°C drop below T_h over an off-period):
t_c = 4 + 0.25·τ
T_sc = (1−R)(T_h−2) + R·(T_e + η·G/H)
quadratic branch when t_off ≤ t_c, linear when t_off > t_c.
mean_internal_temperature_c(...)
-> Table 9c steps 1-8: living-area zone (off 7+8 h, T_h1=21°C) and
elsewhere zone (off 7+8 h for control 1/2 or 9+8 h for control 3,
T_h2 from above), blended by living_area_fraction, plus the
Table 4e control-type temperature adjustment.
Step 9 (re-compute utilisation factor with the new T_i) and step 10
(Q_heat = 0.024 × (L − η·G) × n_m) live in the next slice's monthly loop.
7 AAA cycles cover: T_h2 formulas for control types 1 vs 2, HLP > 6 clamp
per note (e), off-period u quadratic branch (t_off ≤ t_c), off-period u
linear branch (t_off > t_c), full mean_internal_temperature hand-computed
worked example, and control-type-3 longer first off-period dropping mean
temp slightly below control-type-2.
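Illustrative sketch of the two lower-level helpers. Function names and the
T_h2, t_c and T_sc expressions are from this message. The two branch
expressions for u are not spelled out above; they are written here as I
read Table 9b and should be checked against the spec. Keyword parameter
names are assumptions.

```python
def elsewhere_heating_temperature_c(hlp: float, control_type: int) -> float:
    """Table 9 T_h2, with HLP clamped to 6.0 per note (e)."""
    hlp = min(hlp, 6.0)
    if control_type == 1:
        return 21.0 - 0.5 * hlp
    if control_type in (2, 3):
        return 21.0 - hlp + hlp**2 / 12.0
    raise ValueError(f"unsupported control_type: {control_type}")


def off_period_temperature_reduction_c(
    *,
    t_off_h: float,                            # t_off
    t_heating_c: float,                        # T_h
    t_external_c: float,                       # T_e
    responsiveness: float,                     # R
    gains_w: float,                            # G
    heat_transfer_coefficient_w_per_k: float,  # H
    utilisation_factor: float,                 # eta
    time_constant_h: float,                    # tau
) -> float:
    """Table 9b u value: degrees C drop below T_h over one off-period."""
    t_c = 4.0 + 0.25 * time_constant_h
    t_sc = (1.0 - responsiveness) * (t_heating_c - 2.0) + responsiveness * (
        t_external_c
        + utilisation_factor * gains_w / heat_transfer_coefficient_w_per_k
    )
    if t_off_h <= t_c:
        # Quadratic branch (assumed form).
        return 0.5 * t_off_h**2 * (t_heating_c - t_sc) / (24.0 * t_c)
    # Linear branch (assumed form); matches the quadratic one at t_off = t_c.
    return (t_heating_c - t_sc) * (t_off_h - 0.5 * t_c) / 24.0
```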
Seventh slice of the SAP10 Calculator Session A (ADR-0009). Ships
utilisation_factor(*, total_gains_w, heat_loss_rate_w, time_constant_h)
implementing SAP 10.3 Table 9a:
a = 1 + τ / 15
γ = G / L
if γ > 0 and γ ≠ 1: η = (1 − γ^a) / (1 − γ^(a+1))
if γ = 1: η = a / (a + 1)
if heat_loss_rate ≤ 0: η = 1 (dwelling in net surplus)
η caps the contribution of internal + solar gains when they outpace the
heat-loss rate. The orchestrator, which lands in a future slice, computes
time_constant_h = TMP / (3.6 × HLP) and passes it in here.
5 AAA cycles cover: small γ → η ≈ 1, γ = 1 special-case formula,
zero/negative heat loss returning η = 1, large γ dropping η well below
0.5, and higher τ (more thermal mass) raising η for the same γ.
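Illustrative sketch of the Table 9a branches listed above, using the
signature from this message. Returning η = 1 when gains are zero or
negative (γ ≤ 0) is an assumption; the message only covers γ > 0 and
non-positive heat loss.

```python
import math


def utilisation_factor(
    *, total_gains_w: float, heat_loss_rate_w: float, time_constant_h: float
) -> float:
    """SAP 10.3 Table 9a utilisation factor for gains."""
    if heat_loss_rate_w <= 0.0:
        return 1.0  # dwelling in net surplus
    gamma = total_gains_w / heat_loss_rate_w
    if gamma <= 0.0:
        return 1.0  # assumption: no gains to utilise
    a = 1.0 + time_constant_h / 15.0
    if math.isclose(gamma, 1.0, rel_tol=1e-9):
        # The general formula is 0/0 at gamma = 1; use the limit instead.
        return a / (a + 1.0)
    return (1.0 - gamma**a) / (1.0 - gamma ** (a + 1.0))
```

With τ = 15 h (a = 2) and γ = 0.5 this gives (1 − 0.25) / (1 − 0.125).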
Sixth slice of the SAP10 Calculator Session A (ADR-0009). Two layers
under domain.sap.worksheet.solar_gains:
1. surface_solar_flux_w_per_m2(orientation, pitch_deg, region, month)
— implements Appendix U §U3.2 polynomial that converts the horizontal
solar irradiance from Table U3 to per-orientation per-pitch surface
flux:
S(orient, p, m) = S_h,m × R_h-inc
R_h-inc = A cos²(φ-δ) + B cos(φ-δ) + C
where A, B, C are cubics in sin(p/2) with coefficients k1-k9 from
Table U5. Reads latitude φ from Table U4 and solar declination δ
from Table U3 footer (already in domain.sap.climate.appendix_u).
2. window_solar_gain_w(area_m2, surface_flux, g⊥, FF, Z)
— implements §6.1 equation (5): G = 0.9 × A × S × g⊥ × FF × Z.
Orientation enum maps the 8 SAP compass orientation codes to the 5 Table U5
columns: N and S each have their own column; NE/NW, E/W and SE/SW each
share one.
7 AAA cycles cover: UK average South vertical July hand-computed flux,
rooflight pitch=0 collapses to horizontal Table U3 directly, North-vertical
summer > winter (diffuse signal), NE/NW share constants symmetry, equation
(5) window gain, zero-area edge case, out-of-range region validation.
Tables 6b (g⊥), 6c (frame factor), 6d (overshading Z) defaults deferred
to the cert→inputs mapper slice — callers pass them explicitly here so
the physics stays cert-shape-independent.
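Illustrative sketch of the two layers above. window_solar_gain_w follows
equation (5) as quoted. horizontal_to_inclined_ratio is a hypothetical
helper name; the grouping of k1-k9 into A, B, C (three coefficients per
cubic, with C carrying a constant +1) is my reading of §U3.2 and should be
checked against the spec. The caller supplies the Table U5 coefficients;
none are reproduced here.

```python
import math
from typing import Sequence


def horizontal_to_inclined_ratio(
    pitch_deg: float,
    latitude_deg: float,
    declination_deg: float,
    k: Sequence[float],  # k1..k9 for one Table U5 orientation column
) -> float:
    """R_h-inc = A cos^2(phi - delta) + B cos(phi - delta) + C."""
    s = math.sin(math.radians(pitch_deg / 2.0))
    a = k[0] * s**3 + k[1] * s**2 + k[2] * s
    b = k[3] * s**3 + k[4] * s**2 + k[5] * s
    c = k[6] * s**3 + k[7] * s**2 + k[8] * s + 1.0  # assumed +1 constant
    cos_term = math.cos(math.radians(latitude_deg - declination_deg))
    return a * cos_term**2 + b * cos_term + c


def window_solar_gain_w(
    area_m2: float,
    surface_flux_w_per_m2: float,
    g_perp: float,
    frame_factor: float,
    overshading_z: float,
) -> float:
    """Equation (5): G = 0.9 x A x S x g_perp x FF x Z."""
    return 0.9 * area_m2 * surface_flux_w_per_m2 * g_perp * frame_factor * overshading_z
```

At pitch = 0 every sin(p/2) term vanishes, so the ratio is 1 regardless of
the coefficients, matching the rooflight test case above.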
Fourth slice of the SAP10 Calculator Session A (ADR-0009). Ports the
per-element conduction HLC logic out of domain.ml.envelope into a typed
HeatTransmission breakdown under domain.sap.worksheet. Aggregates Σ U×A
across walls, roof, floor, party walls, windows, doors, plus thermal-
bridging y × total exposed area, summed across every building part.
The orchestrator can now read walls_w_per_k / roof_w_per_k / floor_w_per_k
etc. directly off the result for audit + monthly-loop wiring, rather than
seeing a single envelope_heat_loss scalar.
U-value cascade still routes through domain.ml.rdsap_uvalues (migrates to
domain.sap.rdsap.cascade_defaults in Session B per ADR-0009 module-layout
plan). domain.ml.envelope stays in place to keep the ML transform's
physics-feature pipeline running until Session B.
6 AAA cycles cover: per-element breakdown for a baseline age-G cavity
mid-terrace, window net-wall subtraction, insulated-door U-value blending,
cavity-party-wall contribution per Table 15, thermal-bridging scaling by
age band per Table 21, and multi-part (main + extension) aggregation.
192 tests pass across domain.sap + domain.ml — no regressions.
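Illustrative sketch of the Σ U×A aggregation plus thermal bridging. The
HeatTransmission name and the walls_w_per_k / roof_w_per_k / floor_w_per_k
fields are from this message; the other field names, the (kind, U, area)
tuple shape, and excluding party walls from the exposed area are
assumptions.

```python
from dataclasses import astuple, dataclass
from typing import Iterable, Tuple

# (kind, u_value_w_per_m2k, area_m2); kind labels are illustrative.
Element = Tuple[str, float, float]
_KINDS = ("walls", "roof", "floor", "party_walls", "windows", "doors")


@dataclass(frozen=True)
class HeatTransmission:
    walls_w_per_k: float
    roof_w_per_k: float
    floor_w_per_k: float
    party_walls_w_per_k: float
    windows_w_per_k: float
    doors_w_per_k: float
    thermal_bridging_w_per_k: float

    @property
    def total_w_per_k(self) -> float:
        return sum(astuple(self))


def heat_transmission(elements: Iterable[Element], y: float) -> HeatTransmission:
    """Sum U x A per element kind across all building parts, then add y x area."""
    sums = {kind: 0.0 for kind in _KINDS}
    exposed_area_m2 = 0.0
    for kind, u_value, area_m2 in elements:
        sums[kind] += u_value * area_m2
        if kind != "party_walls":  # assumption: party walls are not exposed
            exposed_area_m2 += area_m2
    return HeatTransmission(
        walls_w_per_k=sums["walls"],
        roof_w_per_k=sums["roof"],
        floor_w_per_k=sums["floor"],
        party_walls_w_per_k=sums["party_walls"],
        windows_w_per_k=sums["windows"],
        doors_w_per_k=sums["doors"],
        thermal_bridging_w_per_k=y * exposed_area_m2,
    )
```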
Third slice of the SAP10 Calculator Session A (ADR-0009). Ports the SAP
10.2 / RdSAP10 §4.1 air-change-rate worksheet for the no-pressure-test
path. Returns an InfiltrationBreakdown carrying each named worksheet line
so callers can audit per SAP convention:
(8) openings_ach — Table 2.1 rate × count / volume
(10) additional_ach — (storey_count − 1) × 0.1
(11) structural_ach — 0.25 steel/timber-frame, 0.35 masonry
(12) floor_ach — 0.2 unsealed timber / 0.1 sealed / 0
(13) draught_lobby_ach — 0.05 absent, 0.0 present
(15) window_ach — 0.25 − 0.2 × (pct_dp / 100)
(16) total_ach — sum of all of the above
Table 2.1 rates: open chimney 80, open flue 20, closed-fire chimney 10,
solid-fuel-boiler chimney 20, other-heater chimney 35, blocked chimney
20, intermittent fan 10, passive vent 10, flueless gas fire 40 (all
m³/hour per opening).
9 AAA cycles cover the baseline calculation, each Table 2.1 opening
contribution, frame-vs-masonry structural baseline, suspended-timber
floor sealed/unsealed split, draught-lobby presence, window draught-
proofing scale, multi-opening aggregation, and volume_m3 ≤ 0 validation.
Pressure-test override (worksheet lines 17-21) and mechanical-ventilation
adjustments (Table 4g, n_eff formula §2.6.6) are out of scope for this
slice — separate later slices per ADR-0009.
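Illustrative sketch of the no-pressure-test worksheet lines listed above.
The InfiltrationBreakdown name, line values and Table 2.1 rates are from
this message; the infiltration() function name, its keyword arguments and
the string labels for openings and floor types are assumptions.

```python
from dataclasses import dataclass
from typing import Mapping

TABLE_2_1_RATES_M3_PER_H = {
    "open_chimney": 80, "open_flue": 20, "closed_fire_chimney": 10,
    "solid_fuel_boiler_chimney": 20, "other_heater_chimney": 35,
    "blocked_chimney": 20, "intermittent_fan": 10, "passive_vent": 10,
    "flueless_gas_fire": 40,
}
_FLOOR_ACH = {"unsealed_timber": 0.2, "sealed_timber": 0.1}


@dataclass(frozen=True)
class InfiltrationBreakdown:
    openings_ach: float       # line (8)
    additional_ach: float     # line (10)
    structural_ach: float     # line (11)
    floor_ach: float          # line (12)
    draught_lobby_ach: float  # line (13)
    window_ach: float         # line (15)

    @property
    def total_ach(self) -> float:  # line (16)
        return (self.openings_ach + self.additional_ach + self.structural_ach
                + self.floor_ach + self.draught_lobby_ach + self.window_ach)


def infiltration(*, volume_m3: float, openings: Mapping[str, int],
                 storey_count: int, frame_construction: bool, floor: str,
                 has_draught_lobby: bool,
                 pct_draught_proofed: float) -> InfiltrationBreakdown:
    if volume_m3 <= 0:
        raise ValueError("volume_m3 must be positive")
    opening_flow = sum(TABLE_2_1_RATES_M3_PER_H[name] * count
                       for name, count in openings.items())
    return InfiltrationBreakdown(
        openings_ach=opening_flow / volume_m3,
        additional_ach=(storey_count - 1) * 0.1,
        structural_ach=0.25 if frame_construction else 0.35,
        floor_ach=_FLOOR_ACH.get(floor, 0.0),
        draught_lobby_ach=0.0 if has_draught_lobby else 0.05,
        window_ach=0.25 - 0.2 * (pct_draught_proofed / 100.0),
    )
```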
Second slice of the SAP10 Calculator Session A (ADR-0009). Ships a frozen
Dimensions dataclass + dimensions_from_cert(epc) pure function under
domain/sap/worksheet/. Aggregates geometry across every sap_building_parts
entry (main dwelling + each extension): total floor area, volume, storey
count, area-weighted average storey height, ground/top floor area,
ground-floor heat-loss perimeter, gross wall area, party wall area.
Top-level epc.total_floor_area_m2 is the authoritative TFA; per-storey
sums drive the wall-area calculations. Volume = TFA × avg_storey_height.
5 AAA cycles cover: single-storey single-part, two-storey scaling,
main+extension aggregation, empty-cert fallback to default 2.5 m height,
and a non-default-height terrace exercising party-wall scaling.
Edge cases (porches, conservatories, integral garages, RIR storey
treatment) deferred to later slices per ADR-0009 Session A scope.
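Illustrative sketch of the area-weighted storey height and the volume rule
above. Both function names and the (floor_area_m2, height_m) tuple shape
are assumptions; the real dimensions_from_cert reads sap_building_parts
off the cert.

```python
from typing import Sequence, Tuple

DEFAULT_STOREY_HEIGHT_M = 2.5  # empty-cert fallback named in the message


def average_storey_height_m(storeys: Sequence[Tuple[float, float]]) -> float:
    """Area-weighted mean height over every storey of every building part."""
    total_area = sum(area for area, _ in storeys)
    if total_area <= 0:
        return DEFAULT_STOREY_HEIGHT_M
    return sum(area * height for area, height in storeys) / total_area


def dwelling_volume_m3(
    total_floor_area_m2: float, storeys: Sequence[Tuple[float, float]]
) -> float:
    """Volume = authoritative top-level TFA x area-weighted storey height."""
    return total_floor_area_m2 * average_storey_height_m(storeys)
```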
First slice of the SAP10 Calculator Session A (ADR-0009). Ships the three
SAP 10.3 Appendix U monthly tables across 22 climate regions (region 0 =
UK average; 1-21 named per spec) as a pure-data module under the new
domain/sap/ package:
- Table U1: mean external temperature (°C)
- Table U2: wind speed (m/s)
- Table U3: mean global solar irradiance on horizontal plane (W/m²)
- Table U3 footer: monthly solar declination (°, region-independent)
Lookups validate region (0..21) and month (1..12) and raise ValueError
on out-of-range inputs. 11 AAA tests cover happy-path lookups across
multiple regions/months plus boundary and error cases.
Promotes ADR-0009 from Proposed to Accepted after the grill-with-docs
session resolved all seven open questions. Bundles the SAP 10.3 and
RdSAP 10 specifications under docs/sap-spec/ plus a calculator design
sketch (module layout, monthly-loop pseudo-code, status table).
CONTEXT.md adds three new domain terms parallel to existing performance
language:
- Calculated SAP10 Performance (parallel to Effective / Lodged)
- SAP10 Calculation (process; implemented by Sap10Calculator)
- Measure Application (process; implemented by MeasureApplicator)
ML pipeline is NOT retired — it stays as the residual head once the
calculator reaches parity in Session B. ADR-0009 §"Grill outcomes" carries
the seven binding scope decisions plus three Session-A-scope changes
discovered during the grill (RdSAP §19 EER formula, SAP 10.2 Appendix A
cross-reference, RdSAP Table 29 cascade defaults).
v20a added ventilation_heat_loss_w_per_k as a standalone feature but never
connected it to the HLC inside predicted_space_heating_kwh, so the
downstream physics aggregates (predicted_ecf, predicted_total_fuel_cost,
predicted_log10_ecf — the top-10 model features) never saw the
infiltration signal. Importance for ventilation_heat_loss_w_per_k was rank
58/196 (importance 30) vs rank 21 (importance 86) for the envelope feature.
Adds the ventilation column to the envelope-conduction HLC before
applying HDH and efficiency, so chimney + draught-proofing signals flow
through the physics aggregates the model actually uses. Default 0 keeps
backwards compatibility.
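Illustrative sketch of the wiring change. Only the function name, the
"add ventilation to the HLC before HDH and efficiency" ordering, and the
default of 0 are from this message; the parameter names and the
W/K x degree-hours / 1000 conversion are assumptions about how the real
function is shaped.

```python
def predicted_space_heating_kwh(
    envelope_heat_loss_w_per_k: float,
    heating_degree_hours: float,  # K*h over the heating season (assumed unit)
    efficiency: float,
    ventilation_heat_loss_w_per_k: float = 0.0,  # default keeps old behaviour
) -> float:
    """Delivered space-heating energy from a combined fabric + ventilation HLC."""
    hlc_w_per_k = envelope_heat_loss_w_per_k + ventilation_heat_loss_w_per_k
    useful_kwh = hlc_w_per_k * heating_degree_hours / 1000.0
    return useful_kwh / efficiency
```

Callers that omit the ventilation argument get the pre-change value.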
Adds SAP10.2 §C tracer-bullet infiltration model as a new physics-as-feature
column alongside envelope_heat_loss_w_per_k. ACH = structural baseline
(0.35 masonry / 0.25 timber-or-system-built) + open chimneys at 40 m³/h each
minus a draught-proofing reduction scaled by window_pct_draught_proofed,
then multiplied by dwelling volume and converted to W/K. Targets the d0
(decile-0) catastrophic-low-SAP tail
where chimney + leakage signals dominate but envelope conduction alone
under-counts heat loss.
Scope deferred to follow-ups: MVHR/MEV factors (mechanical_ventilation is
100% null in the corpus), pressure-test override (pressure_test also 100%
null - slice 18e mapper fix), open flues / passive vents / flueless gas
fires (sap_ventilation sparsely populated).
Many real certs carry main_heating_category=4 (heat pump) but a null
sap_main_heating_code, so seasonal_efficiency() was returning the 0.80
gas-boiler default, a roughly 3x COP under-count that dragged down
predictions for the high-SAP heat-pump tail. Adds main_heating_category +
main_fuel_type fallbacks:
cat=4 -> 2.30, cat=7 -> 1.00, cat=10 routes by fuel
(electric=1.00, gas=0.55, oil=0.65), cat=5 warm air -> 0.76.
Explicit SAP codes still win.
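Illustrative sketch of the fallback order. The efficiencies and category
numbers are from this message; the string fuel labels, the
code_efficiencies lookup argument, and falling back to 0.80 for a cat=10
system with an unrecognised fuel are assumptions.

```python
from typing import Mapping, Optional

DEFAULT_EFFICIENCY = 0.80  # gas-boiler default
_CATEGORY_EFFICIENCY = {4: 2.30, 5: 0.76, 7: 1.00}
_CATEGORY_10_BY_FUEL = {"electric": 1.00, "gas": 0.55, "oil": 0.65}


def seasonal_efficiency(
    sap_main_heating_code: Optional[int],
    main_heating_category: Optional[int] = None,
    main_fuel_type: Optional[str] = None,
    code_efficiencies: Optional[Mapping[int, float]] = None,
) -> float:
    """Explicit SAP code first, then category (+ fuel), then the default."""
    if code_efficiencies and sap_main_heating_code in code_efficiencies:
        return code_efficiencies[sap_main_heating_code]
    if main_heating_category in _CATEGORY_EFFICIENCY:
        return _CATEGORY_EFFICIENCY[main_heating_category]
    if main_heating_category == 10:
        return _CATEGORY_10_BY_FUEL.get(main_fuel_type, DEFAULT_EFFICIENCY)
    return DEFAULT_EFFICIENCY
```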
When wall_construction integer is missing or WALL_UNKNOWN, u_wall now
parses the top-level walls[i].description for material keywords
(sandstone/limestone/granite/whinstone/cob/system built/timber frame/
solid brick/cavity) before falling through to the cavity-by-age default.
Explicit construction codes still win. Threaded through
envelope_heat_loss_w_per_k via a joined wall description string off the
top-level walls list.
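Illustrative sketch of the keyword fallback. The keyword list is from this
message; the helper names, first-match-in-list-order precedence,
whole-word matching, and passing the WALL_UNKNOWN value as an argument are
assumptions.

```python
import re
from typing import Optional, Union

_WALL_KEYWORDS = (
    "sandstone", "limestone", "granite", "whinstone", "cob",
    "system built", "timber frame", "solid brick", "cavity",
)


def wall_material_from_description(description: Optional[str]) -> Optional[str]:
    """Return the first material keyword found in the description, if any."""
    if not description:
        return None
    text = description.lower()
    for keyword in _WALL_KEYWORDS:
        # Whole-word match so "cob" does not fire on e.g. "cobbled".
        if re.search(r"\b" + re.escape(keyword) + r"\b", text):
            return keyword
    return None


def resolve_wall_material(
    wall_construction: Optional[int],
    description: Optional[str],
    wall_unknown: int,
) -> Union[int, str, None]:
    """Explicit code wins; else keyword; None means use the cavity-by-age default."""
    if wall_construction is not None and wall_construction != wall_unknown:
        return wall_construction
    return wall_material_from_description(description)
```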
Table 18 age-band roof defaults assume joist insulation >= 100mm, which
mis-rates heritage roofs the surveyor explicitly described as
uninsulated. u_roof now reads roofs[i].description and routes
"no insulation" / "uninsulated" -> 2.30 W/m^2K and "limited insulation"
-> 1.50 W/m^2K, threaded through envelope_heat_loss_w_per_k via a single
joined description string off the top-level roofs list.
Explicit insulation_thickness_mm still wins over description.
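Illustrative sketch of the description routing. The phrases and U-values
are from this message; the helper name and the "return None to fall
through to the Table 18 default" convention are assumptions.

```python
from typing import Optional


def roof_u_value_override(
    description: Optional[str],
    insulation_thickness_mm: Optional[float] = None,
) -> Optional[float]:
    """U-value (W/m^2K) implied by the description, or None for no override."""
    if insulation_thickness_mm is not None:
        return None  # explicit thickness wins over the description
    text = (description or "").lower()
    if "no insulation" in text or "uninsulated" in text:
        return 2.30
    if "limited insulation" in text:
        return 1.50
    return None
```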
predicted_total_fuel_cost_gbp was silently mispricing every non-gas
property because primary_main_fuel_type / water_heating_fuel store the
gov EPC API enum (26=mains gas, 27=LPG, 28=oil, 29=electricity) and our
_FUEL_UNIT_PRICE dict is keyed by Table 32 codes (1=gas, 4=oil, 30=elec).
Codes 26-29 hit the dict's default 3.48 p/kWh -- silently treating
electric immersion as gas.
Concrete impact on the OX1 5LR Sep 2025 cert (worst-predicted: real SAP=41,
model 84): water_heating_fuel=29 (electric immersion). Real DHW cost 2941 kWh
* 13.19p = £388/yr; we computed 2941 kWh * 3.48p = £102 (~4x under). Net
predicted_total_fuel_cost £292 vs implied real £2513 -- predicted_ecf
0.49 (~SAP 93) vs real ECF 4.24 (SAP 41).
Effect: every off-gas property's predicted_ecf was systematically too
low, dragging the model's catastrophic-low-SAP predictions toward
mid-band. Expected to substantially reduce decile-0 bias on retrain.
New _API_TO_TABLE32 map covers codes 0-29. 4 new AAA tests; VERSION
2.2.0 -> 2.3.0 (MINOR; behavioural fix to existing column values).
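Illustrative sketch of the code translation. The dict names, the three
API-to-Table-32 pairs (26->1, 28->4, 29->30) and the 3.48 / 13.19 p/kWh
prices are taken or inferred from this message; the real map covers codes
0-29 and is not reproduced here (27=LPG is omitted because its Table 32
code is not stated). Passing unmapped codes through unchanged and the
helper names are assumptions.

```python
# Partial, illustrative subset of the real map.
_API_TO_TABLE32 = {26: 1, 28: 4, 29: 30}

# Keyed by Table 32 code; prices in p/kWh as quoted in the message.
_FUEL_UNIT_PRICE = {1: 3.48, 30: 13.19}
_DEFAULT_UNIT_PRICE = 3.48


def unit_price_p_per_kwh(fuel_code: int) -> float:
    """Translate a gov EPC API fuel enum to Table 32 before the price lookup."""
    table32_code = _API_TO_TABLE32.get(fuel_code, fuel_code)
    return _FUEL_UNIT_PRICE.get(table32_code, _DEFAULT_UNIT_PRICE)


def annual_cost_gbp(energy_kwh: float, fuel_code: int) -> float:
    return energy_kwh * unit_price_p_per_kwh(fuel_code) / 100.0
```

With the translation in place, the 2941 kWh electric-immersion case from
the message prices at the electricity rate instead of the gas default.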