Commit graph

4895 commits

Author SHA1 Message Date
Khalim Conn-Kowlessar
5cc68ab3fd §4 slice 2: hot_water_other_uses_monthly_l_per_day (line (42c)m)
Appendix J equation J11 — daily hot water use for non-shower / non-bath
purposes (sinks, dishwashers, etc.) is annual-avg V_d,other,ave = 9.8 ×
N + 14, modulated month-by-month by the Table J2 monthly factors and
reduced by 5% when the dwelling meets the 125 L/person/day water-use
target.

Validated against both Elmhurst non-RR fixtures to better than 1e-3 L:
  - 000490 N=2.1468 → V_d,other,ave ≈ 35.04, Jan = 38.5426
  - 000474 N=1.8896 → V_d,other,ave ≈ 32.52, Jan = 35.7697
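
A minimal sketch of the J11 relation described above, for illustration only.
The function name and signature are hypothetical (not the repository's), and
the January factor 1.10 is inferred from the fixture figures above rather
than copied from Table J2:

```python
def other_uses_hot_water_l_per_day(
    occupancy: float,
    monthly_factor: float,
    meets_water_target: bool = False,
) -> float:
    """Daily hot water for non-shower, non-bath uses in one month (litres/day)."""
    annual_average = 9.8 * occupancy + 14.0  # V_d,other,ave as quoted above
    if meets_water_target:
        annual_average *= 0.95  # 5% reduction for the 125 L/person/day target
    return annual_average * monthly_factor


# January for the two fixtures; 1.10 is an inferred factor (assumption).
january_000490 = other_uses_hot_water_l_per_day(2.1468, 1.10)
january_000474 = other_uses_hot_water_l_per_day(1.8896, 1.10)
```

Passing the monthly factor in as an argument keeps the sketch independent of
the Table J2 values, which are not reproduced in this message.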

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 15:43:56 +00:00
Khalim Conn-Kowlessar
aff678e8eb §4 slice 1: assumed_occupancy (worksheet line (42), Appendix J)
First slice of the §4 worksheet-driven rewrite (xlsx rows 207-304).
New module `domain/sap/worksheet/water_heating.py` lands the line-ref
mapped functions; subsequent slices append below.

`assumed_occupancy(tfa)` implements the SAP10.2 Appendix J Table 1b
piecewise formula. Validated against:
  - canonical xlsx worked example  (TFA Q23 → N U209)
  - Elmhurst U985-0001-000474       (TFA 56.79 → N 1.8896)
  - Elmhurst U985-0001-000490       (TFA 66.06 → N 2.1468)
  - boundary case TFA ≤ 13.9        (N=1 floor)
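
For orientation, a sketch of the piecewise occupancy formula. The
coefficients are the Table 1b values as understood here and should be
checked against the spec; the body below is an illustration, not the code in
`water_heating.py`:

```python
import math


def assumed_occupancy(tfa_m2: float) -> float:
    """Assumed number of occupants N for a total floor area in m²."""
    if tfa_m2 <= 13.9:
        return 1.0  # floor: N = 1 at or below 13.9 m²
    excess = tfa_m2 - 13.9
    return 1.0 + 1.76 * (1.0 - math.exp(-0.000349 * excess**2)) + 0.0013 * excess


n_000474 = assumed_occupancy(56.79)
n_000490 = assumed_occupancy(66.06)
```

Both fixture values above (1.8896 and 2.1468) come out of this expression to
four decimal places.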

The legacy `domain.ml.demand._default_occupants_sap_j` mirror stays in
place until the §4 worksheet rewrite is complete; both sources will be
reconciled in a later slice once dependent callers move over.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 15:27:03 +00:00
Khalim Conn-Kowlessar
d90827446a docs: sweep stale handover, mark §3 Full, scaffold §4 slice plan
§3 close (LINE_31/33/36/37 exact for both non-RR Elmhurst worksheets) is
now landed across slices 344a9c9d..cf244762. HANDOVER_S3_CLOSE.md was
written as a mid-stream working brief; with §3 done it now creates doc
rot, so it's removed in favour of SPEC_COVERAGE.md as the single source
of truth.

SPEC_COVERAGE.md updates:
  - §3 marked Full (non-RR); RR sub-area deferral noted
  - §4 carries the ordered slice plan for the worksheet-driven rewrite
    (xlsx rows 207–304, line refs (42)..(65))
  - Hierarchy callout: the canonical SAP10.2 algorithm lives in the
    repo-root xlsx, not in any handover doc

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 15:18:46 +00:00
Khalim Conn-Kowlessar
cf244762d5 Elmhurst 000474: §3 LINE_33 + LINE_37 close exactly
Closes the second non-RR Elmhurst worksheet (mid-terrace, 3 parts).
LINE_33 (209.1084) and LINE_37 (232.1169) reproduce to within 0.1 W/K.

Cert inputs lodged on the fixture:
  - Ext1 SapFloorDimension(is_exposed_floor=True) — Table 20 route
  - Ext2 ground floor (tiny 1.35 m², P=3.30) stays on Table 19 fn 1
    suspended-timber default for age B (cascade → U≈1.25, worksheet 1.25)
  - door_count=2 → 3.70 m² total door area
  - WINDOW_TOTAL_AREA_M2=11.72 split across two glazing types
    (Type 1: 6.22 m² post-2002 raw U=2.0, Type 2: 5.50 m² pre-2002 raw
    U=2.8). Area-weighted aggregate raw U=2.37 reproduces the worksheet's
    25.37 W/K through the curtain-resistance transform.

Non-RR §3 scope closed:
  - LINE_31  exact (existing test)
  - LINE_33  exact ← this slice + the 000490 slice
  - LINE_36  exact (existing test, y × LINE_31)
  - LINE_37  exact ← this slice + the 000490 slice

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 14:14:08 +00:00
Khalim Conn-Kowlessar
4479fc69ac Elmhurst 000490: §3 LINE_33 + LINE_37 close exactly
End-to-end §3 fabric heat loss now matches the Elmhurst worksheet to
0.1 W/K (the worksheet displays per-element U-values to 2 d.p.; our
cascade keeps full precision so the totals differ at the third decimal).

Cert inputs lodged on the fixture:
  - roof_insulation_thickness=300 mm on Main and Ext1 → Table 16 U=0.14
  - door_count=2 (cascade default 1.85 m²/door → 3.70 m² worksheet area)
  - WINDOW_TOTAL_AREA_M2=9.03 with WINDOW_AVG_RAW_U_VALUE=2.8 (pre-2002
    double-glazed PVC, 12mm gap; Table 24 row → U_eff=2.518)
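
The raw-to-effective window U conversion quoted above can be sketched as
follows. The 0.04 m²K/W curtain resistance is an assumption taken from the
SAP convention, not from this message, and the function name is
hypothetical:

```python
CURTAIN_RESISTANCE_M2K_PER_W = 0.04  # assumed SAP curtain/blind allowance


def effective_window_u(raw_u: float) -> float:
    """Add the curtain resistance in series with the raw window U-value."""
    return 1.0 / (1.0 / raw_u + CURTAIN_RESISTANCE_M2K_PER_W)


u_eff = effective_window_u(2.8)
windows_w_per_k = 9.03 * u_eff  # aggregate area × effective U for 000490
```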

Per-part window/door apportionment cancels in the §3 line totals — net
wall sums to the same value whether openings sit on Main or Ext1 — so a
single aggregate area/U pair reproduces (33) exactly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 14:11:07 +00:00
Khalim Conn-Kowlessar
269dd991b5 Elmhurst 000490 fixture: tag Ext1 floor as exposed timber
Per the worksheet docstring on this fixture, Extension 1 hangs off the
main from the first storey upward — its lowest dimension is an exposed
timber floor (over outside air), not a ground floor on soil. Set
is_exposed_floor=True so heat_transmission_from_cert routes Ext1 through
the Table 20 lookup (U=1.20 W/m²K at age B, insulation unknown) instead
of BS EN ISO 13370.

Combined with the Table 19 fn 1 default that routes Main to the
suspended-timber branch (U≈0.71), §3 LINE_28A floor sum lands at
≈32.4 W/K — matching the worksheet's 0.71×14.85 + 1.20×18.18.

A new floor-sum regression test pins the combined behaviour; the existing
LINE_31/36 parametrised test still passes (the exposed-floor route
contributes its area to LINE_31 the same way the ground-floor route did).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 13:28:23 +00:00
Khalim Conn-Kowlessar
6b99ad0a55 heat_transmission: route exposed/semi-exposed floors through Table 20
SapFloorDimension gains an is_exposed_floor flag (default False) signalling
that the floor sits over outside air or unheated space rather than soil —
typical for an extension that hangs off the main from the first storey
upward (Elmhurst 000490 Extension 1 is exactly this shape).

heat_transmission_from_cert now consults the flag on the part's ground
SapFloorDimension and dispatches to u_exposed_floor (Table 20) instead
of the BS EN ISO 13370 / Table 19 cascade. Basement floor still wins
priority (Table 23 § 5.17 overrides everything else for that part).
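
The routing priority described above, as a hedged sketch. `FloorRoute`,
`FloorFlags` and `has_basement` are hypothetical names for illustration;
only `is_exposed_floor` appears in this change:

```python
from dataclasses import dataclass
from enum import Enum


class FloorRoute(Enum):
    BASEMENT = "table_23"
    EXPOSED = "table_20"
    GROUND = "table_19_or_iso_13370"


@dataclass(frozen=True)
class FloorFlags:
    has_basement: bool = False      # hypothetical flag
    is_exposed_floor: bool = False  # flag added in this commit (default False)


def select_floor_route(flags: FloorFlags) -> FloorRoute:
    """Pick the U-value route for one building part's lowest floor."""
    if flags.has_basement:
        return FloorRoute.BASEMENT  # basement overrides everything else
    if flags.is_exposed_floor:
        return FloorRoute.EXPOSED   # over outside air or unheated space
    return FloorRoute.GROUND        # soil-contact cascade


route_ext1 = select_floor_route(FloorFlags(is_exposed_floor=True))
```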

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 13:22:44 +00:00
Khalim Conn-Kowlessar
e2c37300ec u_exposed_floor: Table 20 lookup for exposed/semi-exposed upper floors
RdSAP10 §5.13 Table 20 (page 47) gives U-values for upper floors that
sit over outside air (exposed) or enclosed unheated space (semi-exposed) —
e.g. an extension hanging off the main from the first storey upward.
The spec collapses both into the same lookup: keyed on age band ×
insulation thickness, no geometry needed.

Elmhurst worksheet U985-0001-000490 Extension 1 records U=1.20 W/m²K
for its exposed timber floor (age B, no insulation). Table 20 row
"A to G, insulation unknown or as built" returns 1.20 exactly.

Caller wiring (heat_transmission_from_cert routing on a floor_position
discriminator) lands in the next slice.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 13:19:46 +00:00
Khalim Conn-Kowlessar
344a9c9d5e u_floor: route age A,B unknowns to suspended-timber branch (Table 19 fn 1)
RdSAP10 §5.12 Table 19 footnote (1): when floor_construction is unknown,
age bands A and B default to suspended timber, not solid. Previously
u_floor always used the BS EN ISO 13370 solid-floor formula, which
under-counted ~14% on pre-1929 dwellings.

Elmhurst worksheet U985-0001-000490 Main Dwelling (A=14.85, P=7.42,
w=0.400, age B) records floor U=0.71 W/m²K — the suspended-floor formula
on §5.12 page 46 reproduces this exactly. The solid branch returned 0.66.

Description prefixes "Solid, ..." / "Suspended, ..." take precedence over
the age-band default since they're explicit assessor observations.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 13:17:35 +00:00
Khalim Conn-Kowlessar
49e8c65ae8 Handover: replace stale docs with focused §3-close + Table-11 brief
Delete HANDOVER_FRESH_REVIEW (22-slice, MAE-5.34 era) and
HANDOVER_SYSTEMATIC_REVIEW (pre-Elmhurst-conformance). Both described
a state the Elmhurst worksheet work has since superseded.

Add HANDOVER_S3_CLOSE.md with:
- Accurate §3 status: §1/§2 fully done; LINE_31/LINE_36 exact for
  non-RR fixtures; LINE_33 gap diagnosed as missing floor_construction
  codes (not a window-area problem as previously assumed)
- Concrete investigation steps to close LINE_33 for 000474 + 000490
- Table 11 Secondary Heating framed as next slice after §3

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 13:03:09 +00:00
Khalim Conn-Kowlessar
2fd0fe1c08 §3 exact conformance: non-RR LINE_31 + LINE_36 match Elmhurst worksheets
LINE_31 (total external element area) = Σ_parts (gross_wall + roof +
floor). Window and door areas cancel in the net-wall expansion, so LINE_31
is independent of the window/door split. This lets us assert the exact
Elmhurst worksheet (31) for the two non-RR fixtures (000474, 000490)
without needing window-area input data.

LINE_36 = y × LINE_31 follows for free. Both 000474 and 000490 use age
band B throughout (y = 0.15), giving:
  000474: 0.15 × 153.39 = 23.0085
  000490: 0.15 × 164.85 = 24.7275

The per-storey-perimeter fix (e6c768c3) was the prerequisite; without it,
upper storeys with a smaller perimeter than the ground floor were
over-counted (e.g. 000474 Main: 7.07 m ground vs 5.27 m first storey).
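
A small sketch of the LINE_31 and LINE_36 relations in this message.
`PartAreas` and both function names are hypothetical, and the per-part areas
are illustrative rather than fixture data:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartAreas:
    """Hypothetical per-part areas in m²; not the repository's own type."""
    gross_wall: float
    roof: float
    floor: float


def external_element_area_m2(parts: list[PartAreas]) -> float:
    # Gross wall already contains windows and doors, so the opening split
    # never appears in the sum.
    return sum(p.gross_wall + p.roof + p.floor for p in parts)


def thermal_bridging_w_per_k(y: float, line_31_m2: float) -> float:
    return y * line_31_m2


# Illustrative areas only (not taken from either fixture).
parts = [PartAreas(80.0, 30.0, 30.0), PartAreas(12.0, 6.0, 6.0)]
line_31 = external_element_area_m2(parts)

# LINE_36 for the two fixtures, using the LINE_31 values quoted above.
line_36_000474 = thermal_bridging_w_per_k(0.15, 153.39)
line_36_000490 = thermal_bridging_w_per_k(0.15, 164.85)
```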

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 12:47:01 +00:00
Khalim Conn-Kowlessar
a374bd075e P6.1 follow-on: use BuildingPartIdentifier enum in ml/transform + tests
Replace the string literal "Main Dwelling" / "Extension 1" comparisons
in `_building_part_aggregates` and the four affected tests with the
typed `BuildingPartIdentifier.MAIN` / `.EXTENSION_1` enum values, so
the transform is consistent with the typed domain introduced in the P6.1
cert→inputs adapter. Fixes a latent mismatch that would silently return
`main=None` if the string ever drifted.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 12:46:47 +00:00
Khalim Conn-Kowlessar
e6c768c356 Wall + party-wall area = Σ (perim_i × height_i), not ground × avg × count
SAP §3 wall heat-loss area sums each storey individually:
`Σ (heat_loss_perimeter_i × room_height_i)`. Pre-fix used the short-cut
`ground_perimeter × avg_height × storey_count`, which over-counts upper
storeys whenever they have a smaller perimeter than the ground (set-back
top floors, ground-floor additions, etc.). RdSAP §5.10 party-wall area
follows the same per-storey-sum convention.
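
The difference between the two formulas, sketched with the figures from the
new tests in this commit. The 2.5 m storey height is an assumption chosen to
match the 40 m² and 50 m² totals; the function names are hypothetical:

```python
Storey = tuple[float, float]  # (heat_loss_perimeter_m, room_height_m)


def gross_wall_area_m2(storeys: list[Storey]) -> float:
    """Per-storey sum: Σ perimeter_i × height_i."""
    return sum(perimeter * height for perimeter, height in storeys)


def shortcut_wall_area_m2(storeys: list[Storey]) -> float:
    """Pre-fix short-cut: ground perimeter × average height × storey count."""
    ground_perimeter = storeys[0][0]
    avg_height = sum(height for _, height in storeys) / len(storeys)
    return ground_perimeter * avg_height * len(storeys)


# Two-storey main: ground perimeter 10 m, first storey 6 m, same height.
storeys = [(10.0, 2.5), (6.0, 2.5)]
per_storey = gross_wall_area_m2(storeys)
shortcut = shortcut_wall_area_m2(storeys)
```

The two agree only when every storey shares the ground-floor perimeter.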

Surfaced by Elmhurst 000474 Main (ground perim 7.07, first 5.27): our
gross wall area was over-counted by ~10 m² and the downstream (29a) W/K by ~15 W/K
on this cert. Documented at the time as follow-up #2; this slice closes
it. The §3 partial-conformance test's gap-#2 entry is removed; gap #1
(RR sub-areas) remains.

Fix lives in two parallel code paths:
- dimensions.py: per-storey accumulation inside the existing fd loop
- heat_transmission.py: _part_geometry now emits gross_wall_area_m2 and
  party_wall_area_m2 directly, dropping the avg_height + storey_count
  intermediate fields (no other consumer)

Tests:
- New: gross_wall_area_sums_per_storey_perimeter_times_height_…
  (2-storey main, ground 10 m / first 6 m, same height — expects
  Σ=40 m² not ground×avg×count=50)
- New: party_wall_area_sums_per_storey_party_length_… (same shape,
  ground party 5 / first party 3 → Σ=20 not 25)
- New: walls_w_per_k_uses_sum_of_per_storey_perimeter_… (heat-
  transmission counterpart: 0.6 × 40 = 24 W/K not 30)

829 tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 10:33:14 +00:00
Khalim Conn-Kowlessar
6ea5727a4e Dimensions: storey_count is dwelling height (max across parts), not sum
SAP §2 (9) "ns" is the dwelling height — the tallest part — which drives
the (10) additional-infiltration adjustment. Pre-fix code summed
`len(sap_floor_dimensions)` across parts and incremented for every
sap_room_in_roof block, so a 2-storey main + 1-storey side extension
returned ns=3 instead of 2, and a 2-part RR-bearing cert could return
ns=4 or 5. The (10) ach output overstated by 0.1 per spurious storey.

Fix tracks per-part `floor_count + (1 if RR else 0)` and emits
`max(per_part)`. TFA and volume sums on §1 are unaffected — those are
genuine Σ per RdSAP §3.9.1.
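
A sketch of the max-across-parts rule, assuming each part is reduced to a
hypothetical `(floor_count, has_room_in_roof)` pair (not the repository's
data model):

```python
Part = tuple[int, bool]  # (floor_count, has_room_in_roof)


def dwelling_storey_count(parts: list[Part]) -> int:
    """ns for line (9): storeys of the tallest part, room-in-roof counted once."""
    return max(floors + (1 if has_rr else 0) for floors, has_rr in parts)


def summed_storey_count(parts: list[Part]) -> int:
    """Pre-fix behaviour, kept here only for comparison."""
    return sum(floors + (1 if has_rr else 0) for floors, has_rr in parts)


main_plus_ext = [(2, False), (1, False)]  # 2-storey main + 1-storey extension
ns_fixed = dwelling_storey_count(main_plus_ext)
ns_old = summed_storey_count(main_plus_ext)
```

The parentheses around the conditional matter: without them Python reads the
expression as `(floors + 1) if has_rr else 0`.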

Surfaced by Elmhurst 000474 (2-storey + 2 side extensions): worksheet
says ns=2; we previously had to pass `storey_count=fixture.LINE_9_STOREYS`
explicitly in the §2 Elmhurst conformance test. With the fix, the test
now derives `storey_count` from `dims.storey_count` and the
`LINE_9_STOREYS` field cross-checks the derivation against (9).

Tests:
- New: dwelling_storey_count_is_max_across_parts_not_sum (2-storey main
  + 1-storey ext expects ns=2)
- New: room_in_roof_on_main_adds_one_to_dwelling_storey_count_only_once
  (main with RR + ext without RR expects ns=3, not 5)
- Updated: main_plus_extension_sums_areas_perimeters_and_walls assertion
  ns==2 → ns==1 (both parts single-storey)
- Updated: all_rir_shapes_apply_section_1_2_45m_convention_uniformly —
  storey_delta is now ≤1 not len(parts_with_rr); TFA/volume deltas
  remain Σ per the spec
- Updated: §2 Elmhurst test consumes dims.storey_count + asserts
  dims.storey_count == fixture.LINE_9_STOREYS as an Arrange precondition

826 tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 10:27:38 +00:00
Khalim Conn-Kowlessar
883028c89e P6.1 follow-on: unbox BuildingPartIdentifier at backend boundaries
Threads the strict BuildingPartIdentifier type (introduced in a8b443f6)
through the two remaining backend touchpoints:

- EpcBuildingPartModel.from_*: SQLModel column expects a string, so
  unbox the enum with .identifier.value before binding to the DB.
- documents_parser end-to-end tests: swap bare-string equality
  ("main" / "extension_1") for identity checks against the enum
  members (BuildingPartIdentifier.MAIN / EXTENSION_1).

Documents_parser test pack passes (105/105). No dedicated SQLModel test
covers EpcBuildingPartModel.from_*; the .value line is exercised
transitively via db_writer.py / local_runner.py in production.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 09:58:23 +00:00
Khalim Conn-Kowlessar
a8b443f669 SAP calculator entry point + cert→inputs adapter + strict P6.1 identifiers
Lands the production code that the just-committed Elmhurst conformance
fixtures (6455d48b) exercise: the SAP10.3 calculator orchestrator
(domain.sap.calculator.Sap10Calculator), the RdSAP-driven cert→inputs
mapper (domain.sap.rdsap.cert_to_inputs), and the EpcPropertyData
strict-type pass that P6.1 starts.

calculator.py is the entry point. It offers two surfaces, depending on
the caller's shape:
- Sap10Calculator().calculate(epc) — full RdSAP mapper + worksheet loop
- calculate_sap_from_inputs(inputs) — pure physics over typed inputs

P6.1 introduces BuildingPartIdentifier as a strictly-typed replacement
for bare-string matching on SapBuildingPart.identifier (motivated by
the pain point at worksheet/dimensions.py:74-82). Two boundary factories
canonicalise raw inputs: from_api_string for the gov-EPC API, and
extension(n) for site-notes / construction id flows.
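
To illustrate the idea only: a toy enum with two boundary factories. The
class name, member values, normalisation rules and the two-extension limit
are all assumptions; the real `BuildingPartIdentifier` may differ in every
one of them:

```python
from enum import Enum


class PartIdSketch(Enum):
    """Toy stand-in for a strictly-typed building-part identifier."""
    MAIN = "main"
    EXTENSION_1 = "extension_1"
    EXTENSION_2 = "extension_2"

    @classmethod
    def from_api_string(cls, raw: str) -> "PartIdSketch":
        # Canonicalise free-text labels such as "Main Dwelling" / "Extension 1".
        normalised = raw.strip().lower().replace(" ", "_")
        if normalised in ("main", "main_dwelling"):
            return cls.MAIN
        return cls(normalised)  # raises ValueError on an unknown label

    @classmethod
    def extension(cls, n: int) -> "PartIdSketch":
        return cls(f"extension_{n}")  # raises ValueError if n is out of range


main = PartIdSketch.from_api_string("Main Dwelling")
ext1 = PartIdSketch.from_api_string("Extension 1")
```

Failing loudly on an unknown label is the point of the exercise: a drifted
string raises at the boundary instead of silently matching nothing.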

Also catches up two transitive deps that 6455d48b implicitly required
but I missed:
- ml/rdsap_uvalues.py — party-wall U-value rows that heat_transmission
  resolves; the U=0.0 branch the 000516 fixture exercises lands here.
- ml/tests/_fixtures.py — make_minimal_sap10_epc that every Elmhurst
  fixture imports. Without this catch-up, checking out 6455d48b in
  isolation would ImportError.

Out of scope (will commit separately): ml/transform.py legacy envelope
drift; backend/ FastAPI + documents_parser layer; etl/ scratch.

824 tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 09:54:30 +00:00
Khalim Conn-Kowlessar
6455d48b9d Elmhurst SAP10.2 worksheet conformance: §1/§2/§3 + 6 fixtures + README
Lands real-cert ground-truth conformance tests for the SAP10.2 worksheet,
asserting our §1 dimensions, §2 ventilation, and §3 heat-transmission
output line-by-line against six Elmhurst-lodged worksheets (000474,
000477, 000480, 000487, 000490, 000516). Each fixture covers a distinct
shape: with/without room-in-roof, single-part vs main+extensions, age
A and B, party-wall U=0.0 vs U=0.25, 1/2/3 sheltered sides, varying
draught-proofing %, and the (12) suspended-timber quirk.

§1/§2/§3 module updates back the new line-refs (LINE_31 external-element
area, LINE_33 fabric loss, LINE_37 total fabric loss; per-fixture (12)
floor / (15) window / (21) shelter-adjusted ach; SapRoomInRoof storey
contribution via the 2.45 m §3.9.1 convention).

The §3 test currently asserts invariants only ((33) = Σ per-element,
(37) = (33) + (36)) because SapRoomInRoof only carries floor_area —
gable/slope/stud/flat-ceiling sub-areas the worksheet itemizes are not
yet modelled. LINE_3* constants capture the worksheet ground truth for
when that gap closes.

Adds a SAP-domain README with a step-by-step guide for adding new
Elmhurst fixtures from the assessor's PDF pair (Summary + worksheet),
including the field-by-field cert → EpcPropertyData mapping table and
the gotchas surfaced across the six fixtures (storey-height +0.25
convention, party-wall U code mapping, has_suspended_timber_floor flag
truth table, (25) effective-ach formula, Energy Rating vs EPC Costs
wind-speed trap).

366 tests pass (was 360 pre-pairs 5-6).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 09:48:30 +00:00
Khalim Conn-Kowlessar
a1c9d2a14d Record post-P5 parity-probe baseline (2026-05-19)
100-cert probe, seed=7, sap_score window 5..99. MAE 4.29
(vs 8.41 on 2026-05-18 with the older 20..95 window — the
delta blends calculator improvements with sample-window
change, so this is logged as the post-P5 reference, not as
"P5 reduced MAE".)

P5 itself was pure trace exposure; the calculator's SAP
output should be numerically unchanged. The headline finding
from this run is primary-energy over-prediction: PE MAE
44.40 kWh/m², bias +39.66 — now the dominant signal with
SAP residuals halved. Each end-use PE contribution surfaces
on SapResult.intermediate per P5.12, so the next session
can localise the bias without re-instrumenting.
2026-05-19 16:19:01 +00:00
Khalim Conn-Kowlessar
411c477d09 P5.14: SAP 10.2 worksheet trace + RdSAP10 deflator drift note
Closes the second half of P5 (HANDOVER_SYSTEMATIC_REVIEW §2.5):
- Adds test_bre_worked_examples.py — one comprehensive test that
  locks every published SapResult.intermediate key against its
  SAP 10.2 worksheet item number ((4) TFA, (33) fabric heat loss,
  (39) HTC, (40) HLP, (73) gains, (93) mean internal temp, (98c)
  space heating, (240e/247/250) costs, (252) PV credit, (256)
  deflator, (257) ECF, (261-272) per-end-use CO2, (275-287)
  primary energy per m²). All formulas derived independently from
  the worksheet pages 131-148; passes against the synthetic
  100 m² baseline.
- Explicit caveat in module docstring: BRE-published worked
  examples don't exist in any of the three SAP-spec PDFs we have
  (rdSAP10, SAP10.2, SAP10.3, all grepped). The test is
  spec-formula-derived, not BRE-validated. Structure stays if
  BRE numbers surface later; only expected values change.

Also surfaces and documents an RdSAP10 spec drift in
PARITY_FINDINGS.md: Table 32 (page 95 of rdSAP10) gives
Energy Cost Deflator = 0.42, vs the code's 0.36 (SAP10.2 Table 12,
worksheet item (256)). Not changed in P5 — needs ADR-level
resolution on whether the calculator targets SAP10.2 (0.36) or
RdSAP10 (0.42) ratings.

P5 (SapResult.intermediate population + BRE worked-example
fixtures) is now complete on this branch.
2026-05-19 15:32:42 +00:00
Khalim Conn-Kowlessar
0fa39e859c P5.13: SapResult.intermediate exposes per-end-use CO2 breakdown
Closes the second §11-sketch gap noted in HANDOVER_SYSTEMATIC_REVIEW
("primary energy AND CO2 per end-use"). Lifts the single co2 = total
× factor expression into five named locals (main_heating, secondary,
hot_water, pumps_fans, lighting) and exposes them on `intermediate`.
The five components sum exactly to the top-level co2_kg_per_yr — no
PV deduction in the current implementation.
2026-05-19 12:24:59 +00:00
Khalim Conn-Kowlessar
f09e83b6a1 P5.12: align per-end-use primary energy to §11 sketch (per-m²)
P5.9 exposed the four primary-energy components as absolute kWh/yr
keys (space_heating_primary_kwh_per_yr, …). HANDOVER_SYSTEMATIC_REVIEW
§11 specifies these as `_pe_kwh_per_m2` because primary energy enters
the rating equation per floor area. Renamed to match the sketch:
- space_heating_pe_kwh_per_m2
- hot_water_pe_kwh_per_m2
- other_pe_kwh_per_m2
- pv_pe_offset_kwh_per_m2

Chain check now verifies max(0, sum − pv_offset) ≈
result.primary_energy_kwh_per_m2 (the top-level per-m² field).
Absolute kWh/yr values remain recoverable via tfa_m2 on `intermediate`.
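
The chain check amounts to the following, sketched with a hypothetical
function name and illustrative numbers (not calculator output):

```python
import math


def net_primary_energy_kwh_per_m2(
    space_heating: float,
    hot_water: float,
    other: float,
    pv_offset: float,
) -> float:
    """Sum the per-end-use components, subtract the PV offset, floor at zero."""
    return max(0.0, space_heating + hot_water + other - pv_offset)


# Illustrative component values in kWh/m².
net = net_primary_energy_kwh_per_m2(50.0, 20.0, 10.0, 5.0)
clipped = net_primary_energy_kwh_per_m2(1.0, 1.0, 1.0, 10.0)
chain_ok = math.isclose(net, 75.0)
```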
2026-05-19 12:21:15 +00:00
Khalim Conn-Kowlessar
550b1fbcd0 P5.11: SapResult.intermediate exposes PV export credit
Final P5 slice. PV credit was the missing term linking the per-end-use
fuel costs (P5.6) to the top-level total_fuel_cost_gbp: total =
max(0, sum(per-end-use) − pv_credit). With this key, every step of
the §13 cost chain — per-fuel cost → PV credit → total → ECF →
rating — is auditable from `intermediate`. P5 trace exposure is
complete.
2026-05-19 10:41:18 +00:00
Khalim Conn-Kowlessar
02f92e2b0c P5.10: SapResult.intermediate exposes rating-equation spec constants
Promotes _FLOOR_AREA_OFFSET_M2 → FLOOR_AREA_OFFSET_M2 (§13 ECF
denominator, Table 12) and _ECF_LOG_THRESHOLD → ECF_LOG_THRESHOLD
(SAP rating linear/log regime boundary at ECF = 3.5). Together with
the deflator (P5.7) they fully document the §13 rating curve in
trace mode.
2026-05-19 10:37:49 +00:00
Khalim Conn-Kowlessar
3d56898944 P5.9: SapResult.intermediate exposes primary-energy breakdown
Lifts the inlined primary-energy sum into four named components:
space heating ((main + secondary) × space_heating PEF), hot water,
other ((pumps_fans + lighting) × other PEF), and the PV offset at
other PEF (Appendix M). Together with the top-level
primary_energy_kwh_per_yr they show whether the floor-at-zero
clip was applied.
2026-05-19 10:35:10 +00:00
Khalim Conn-Kowlessar
537e18bc2e P5.8: SapResult.intermediate exposes CO2 chain
Adds delivered_fuel_kwh_per_yr (sum of all five end-use kWh) and
co2_factor_kg_per_kwh (mirrors the SAP10 input). Together with the
top-level co2_kg_per_yr they make the §15 equation traceable:
co2 = delivered_fuel × factor.
2026-05-19 10:32:59 +00:00
Khalim Conn-Kowlessar
27d40539c3 P5.7: SapResult.intermediate exposes ECF and energy-cost deflator
Promotes `_ENERGY_COST_DEFLATOR` to `ENERGY_COST_DEFLATOR` so the
§13 Table 12 constant can be referenced in trace mode alongside the
ECF it scales. ECF mirrors the top-level field; the deflator is the
only fixed worksheet constant the SAP rating depends on.
2026-05-19 10:29:53 +00:00
Khalim Conn-Kowlessar
2104c8c2da P5.6: SapResult.intermediate exposes per-end-use fuel costs
Per-end-use £/yr costs (main heating, secondary heating, hot water,
pumps_fans, lighting) lifted from the inlined total_cost sum into named
locals and populated on `intermediate`. §12 sweep slices can now diff
each line against the spec (Table 12 unit prices, future Table 12a
fractional blending, Table 12c heat-network DLF) without re-deriving
the cost decomposition.

Behaviour-preserving — `total_fuel_cost_gbp` reconciles bit-for-bit.

136 SAP tests pass.
2026-05-19 10:24:27 +00:00
Khalim Conn-Kowlessar
44b1d0d923 P5.5: SapResult.intermediate exposes useful_space_heating_kwh_per_yr
§9 / Table 9c step 10 output keyed by worksheet name on `intermediate`.
Mirrors the top-level `space_heating_kwh_per_yr` field so spec sweep
slices refer to the worksheet name regardless of field renames.

135 SAP tests pass.
2026-05-19 10:22:53 +00:00
Khalim Conn-Kowlessar
80845b0919 P5.4: SapResult.intermediate exposes HLC, HLP, τ, annual averages
heat_transfer_coefficient_w_per_k (HLC), heat_loss_parameter_w_per_m2k
(HLP), time_constant_h, and the two annual averages
(internal_gains_annual_avg_w, mean_internal_temp_annual_avg_c) populated
on `intermediate`. The averages let sweep slices verify monthly-loop
outputs without re-summing 12 months.

134 SAP tests pass.
2026-05-19 10:21:44 +00:00
Khalim Conn-Kowlessar
443a7697ff P5.3: SapResult.intermediate exposes ventilation group
infiltration_ach (the cert-derived input) and infiltration_w_per_k
(the derived HLC_V = ACH × volume × 0.33 from SAP 10.2 §4.1) populated
on `intermediate`. Diagnostic surface for the §4 / Table 4g sweep.
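
The derived value is a one-line product; a sketch with a hypothetical
function name and illustrative inputs:

```python
import math


def infiltration_w_per_k(ach: float, volume_m3: float) -> float:
    """Ventilation heat loss coefficient: ACH × volume × 0.33."""
    return ach * volume_m3 * 0.33


# Illustrative dwelling: 0.5 air changes per hour, 250 m³.
hlc_v = infiltration_w_per_k(0.5, 250.0)
matches = math.isclose(hlc_v, 41.25)
```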

133 SAP tests pass.
2026-05-19 10:20:27 +00:00
Khalim Conn-Kowlessar
d5b1d0d483 P5.2: SapResult.intermediate exposes heat transmission group
Seven fabric W/K components from `inputs.heat_transmission` populated on
`intermediate`: walls, roof, floor, party_walls, windows, doors,
thermal_bridging. Handover §11 / §5 (sap-spec sweep).

132 SAP tests pass.
2026-05-19 10:19:21 +00:00
Khalim Conn-Kowlessar
aa07265606 P5.1: add SapResult.intermediate; populate dimensions group
First slice of P5 trace mode mechanical half (ADR-0010 / handover §11).
SapResult.intermediate: dict[str, float] now exposes worksheet-named
variables for per-section diffing against BRE worked examples and hand
calcs. Dimensions group lands first: tfa_m2, volume_m3, storey_count.

Subsequent slices (P5.2 heat transmission → P5.8 primary energy)
extend the same dict; field defined here so the structural change
lands once and later slices are pure additions.

131 SAP tests pass; 310 packages/domain tests pass.
2026-05-19 10:17:55 +00:00
Khalim Conn-Kowlessar
62289ec6f6 P2.4: correct table_12 CO2 factors to SAP 10.2 (14-03-2025); P2 complete
ADR-0010 §1: the file was a SAP 10.2 prices + SAP 10.3 CO2 hybrid,
incorrectly labelled "SAP 10.3" throughout. Realigns the CO2 column
to SAP 10.2 PDF page 189 — the table the calculator's Validation
Cohort certs were emitted against.

CO2 corrections (kg CO2e per kWh delivered):
  - Mains gas:               0.214 → 0.210
  - LPG (2, 3, 5, 9):        0.24  → 0.241 (precision restore)
  - Biogas (7):              0.029 → 0.024
  - HVO (71):                0.041 → 0.036
  - FAME (73):               0.058 → 0.018
  - B30K (75):               0.226 → 0.214
  - Bioethanol (76):         0.072 → 0.105
  - Coal / anthracite (11, 15): 0.398 → 0.395
  - Smokeless (12):          0.398 → 0.366
  - Wood logs (20):          0.023 → 0.028
  - Wood pellets (22, 23):   0.048 → 0.053
  - Wood chips (21):         0.018 → 0.023
  - Dual fuel (10):          0.084 → 0.087
  - Standard electricity (all grid tariffs):
                             0.086 → 0.136 (biggest swing — the
                             annual-average factor changes between
                             SAP 10.2 and 10.3 by -37%)
  - Heat-network variants realigned to match their parent fuels
  - _DEFAULT_CO2_KG_PER_KWH: 0.214 → 0.210

Header docstring rewritten:
  - Re-labelled "SAP 10.2 (14-03-2025 amendment)"
  - Dropped the misleading "+25% shift from SAP 10.2" block — those
    13.19 → 16.49 figures were SAP 10.1 → SAP 10.2, not 10.2 → 10.3
  - Notes the SAP 10.3 re-pointing trigger (corpus migration)

New test file packages/domain/src/domain/sap/tests/test_table_12.py
locks SAP 10.2 values for mains gas, standard electricity, 7h low,
24h heating, bulk LPG, heating oil, default, plus sanity checks
on the unchanged unit price + PE factor columns.

All 161 SAP + ml_training_data tests pass. CO2 corrections don't
affect SAP score (cost-driven) or PEUI (PEF-driven), so golden
fixtures and probe pinned values remain green.

P2 complete:
  P2.1 (ac1aa56a) — probe swap to spec prices
  P2.2 (28e9dd38) — golden fixtures migrated to loose smoke test
  P2.3 (cd6ac9b1) — cert-cal file deleted
  P2.4 (this)     — CO2 factors corrected

Next: P1 (parquet re-extract with inspection_date) + P3 (Validation
Cohort filter) unblock the cohort-clean probe baseline.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 10:10:04 +00:00
Khalim Conn-Kowlessar
cd6ac9b16d P2.3: delete table_12_cert_calibration.py (no remaining consumers)
ADR-0010 §2: the cert-calibration price table was bug-masking
pre-March-2025 SAP values fit against a mixture-distribution of two
spec-version regimes. P2.1 swapped the probe to SAP_10_2_SPEC_PRICES,
P2.2 migrated the golden fixtures, leaving no external consumers.
File deletion is mechanical at this point.

Also updates the cert_to_inputs() docstring at L741-L751: removes the
stale reference to CERT_CALIBRATION_PRICES, points at ADR-0010 and
the Validation Cohort filter as the parity-validation mechanism.

All 152 SAP + ml_training_data tests pass with the file gone.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 10:04:28 +00:00
Khalim Conn-Kowlessar
28e9dd3864 P2.2: migrate golden fixtures to SAP 10.2 spec prices; loose smoke test
ADR-0010 §10: the cert-based fixtures contained compensating errors
under cert-cal prices and are scheduled for replacement by BRE
worked-example fixtures (P5). Until P5 lands they stay as a loose
smoke test catching catastrophic regressions only.

Changes:
  - Swap prices=cert_calibration_prices() → prices=SAP_10_2_SPEC_PRICES.
    Last external consumer of cert_calibration_prices — P2.3 can now
    delete table_12_cert_calibration.py cleanly.
  - Loosen tolerance: SAP ±1 → ±5, PE ±10 → ±25. The cert-cal prices
    had been numerically tuned around these specific certs, so spec
    prices alone produce a -3 to +3 SAP drift across the set.
  - Retire 9390-2722-3520-2105-8715 early (heat-network mid-floor
    flat). It drifted to SAP residual -7 because cert-cal had absorbed
    heat-network DLF + Table 12c interactions. Cert JSON remains in
    fixtures/golden/ per ADR-0010 §10; a BRE worked-example covering
    the heat-network path will subsume it during P5.

Remaining 6 fixtures pass at ±5 SAP under spec prices. The whole
suite retires when P5 lands.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 10:01:05 +00:00
Khalim Conn-Kowlessar
bb9c5ac017 docs: ADR-0010 retargets calculator to SAP 10.2; rewrite handover
Adds ADR-0010 superseding ADR-0009's spec-version target, PCDB
sequencing, and cert-calibration layer. Captures the conclusions
of a grill-with-docs session:

  1. Active spec target is SAP 10.2 (14-03-2025), not SAP 10.3 — no
     SAP-10.3-lodged certs exist in the corpus to validate against.
  2. table_12_cert_calibration is deleted (not "re-derived at the
     end"). It was pre-March-2025 spec prices fit against a mixture
     distribution of two spec-version regimes, with downstream-
     component bugs absorbed into the fit — not Elmhurst deviation.
  3. Validation Cohort: filter the corpus to inspection_date ≥
     2025-07-01 so every cert in the probe was lodged on SAP 10.2
     (14-03-2025) prices. One spec, one signal.
  4. PCDB integration is promoted from "Session C deferred" to
     prerequisite P4 — dominates residual variance on heat pumps and
     the 78% of gas-boiler certs lodging main_heating_data_source=1.
  5. Trace mode (SapResult.intermediate) and BRE worked-example
     fixtures replace the 7 cert-based golden fixtures, which
     contained compensating errors.
  6. Strict-type EpcPropertyData via codes.csv-derived canonical
     enums (P6) — the in-source motivation lives at
     dimensions.py:74-82 (Khalim's comment, included in this commit).
  7. Worksheet-faithful structure is a sweep-time principle: each
     worksheet module mirrors SAP 10.2 worksheet line numbering.
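
The Validation Cohort filter from item 3 could look roughly like this. The
record shape and names are hypothetical; only the 2025-07-01 cutoff comes
from the ADR:

```python
from datetime import date

COHORT_START = date(2025, 7, 1)  # cutoff stated in item 3


def in_validation_cohort(inspection_date: date) -> bool:
    """True when the cert was inspected on or after the cohort start date."""
    return inspection_date >= COHORT_START


# Illustrative records; the real corpus schema is not shown here.
certs = [
    {"id": "a", "inspection_date": date(2025, 6, 30)},
    {"id": "b", "inspection_date": date(2025, 7, 1)},
    {"id": "c", "inspection_date": date(2025, 9, 15)},
]
cohort_ids = [c["id"] for c in certs if in_validation_cohort(c["inspection_date"])]
```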

CONTEXT.md additions:
  - Refined "Calculated SAP10 Performance" and "SAP10 Calculation"
    to reference SAP 10.2 + ADR-0010.
  - New term "SAP Spec Version" — domain-meaningful because the
    same EpcPropertyData yields different sap_score under different
    spec revisions.
  - New term "Validation Cohort" — the version-locked sub-corpus.

HANDOVER_SYSTEMATIC_REVIEW.md is rewritten section-by-section to
reflect ADR-0010: §1 framing, §2 status pointer, new §2.5 with the
six prerequisites P1–P6 in dependency order, §3 diagnosis (cert-cal
was stale prices, not Elmhurst deviation), §4 scope (PCDB IN,
SAP 10.3 stays OUT), §5 approach (worksheet-faithful principle as
§5.5), §7 tension dissolved, §7b findings re-framed, §8 dead-ends
re-classified as conditional, §9 cohort filter, §10 fixture
strategy, §11 trace mode as prerequisite, §12 prereqs-first,
§13 Phase 0/Phase 1 workflow, §14 ADR-0010 reference, §15 final
note.

P2.1 (commit ac1aa56a) already lands the first ADR-0010 slice
(probe swap to spec prices).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 09:54:24 +00:00
Khalim Conn-Kowlessar
ac1aa56ab1 P2.1: extract predict_sap_for_cert; swap probe to SAP 10.2 spec prices
ADR-0010 P2: cert-calibration layer is deleted, the probe uses
SAP_10_2_SPEC_PRICES (already defined in cert_to_inputs.py). Extracts
a pure predict_sap_for_cert(cert_document, *, prices) -> int helper
out of main()'s inline pipeline so the spec-prices path is unit-
testable in isolation; the helper is also reusable for P3's cohort-
filtered probe variant.

The pinned regression value (SAP=67 for cert 6035-7729 under spec
prices, vs the cert's lodged SAP of 73 under cert-cal prices) lives
in services/ml_training_data/tests/unit/test_sap_parity_probe.py.
It will drift as P4 (PCDB) and the section sweep land their fixes;
that's expected.

cert_calibration_prices is still imported by test_golden_fixtures.py
and the table_12_cert_calibration module is intact. P2.2/P2.3 retire
those.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 09:51:42 +00:00
Khalim Conn-Kowlessar
377962f8bd docs: strengthen handover with §7b outstanding findings + PCDB roadmap
§7b "Outstanding findings to pick up during the systematic pass"
collects spec-correct fixes that were reverted because they regressed
SAP MAE against the corpus — but the spec basis is unambiguous and
they WILL be the right answer once cert-calibration is re-derived.
Treat as TODOs, not dead-ends. Documents:

  Finding 1 — HW cylinder zero-loss for combi (PE MAE -6.64 measured)
  Finding 2 — Standing charges Table 12 note (a)
  Finding 3 — Cat=10 room-heater Table 12a fractional blending
  Finding 4 — Lighting Appendix L proper (L1-L12 cascade)
  Finding 5 — Internal-gains Table 5 water-heating + losses rows
  Finding 6 — Storage-loss-factor table values 3× off spec
  Finding 7 — Heat-pump fallback (needs PCDB)
  Finding 8 — Smaller gaps carried forward

Each documents the spec section/page reference, the current code
bug, empirical impact where measured, and when to pick up during the
section-by-section sweep.

PCDB section strengthened from "deferred to Session C" to an explicit
roadmap: data source URL, lookup key (main_heating_index_number),
fields needed, recommended sequencing (after spec sweep so cert-cal
is re-derivable), and why-not-now (cert-cal currently masks PCDB gaps).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 07:35:19 +00:00
Khalim Conn-Kowlessar
3363f63f5e docs: handover for systematic section-by-section RdSAP 10 review
The slice-by-slice "fix the biggest residual" approach has hit a
ceiling at SAP MAE ~4.6 because the cert-calibration prices absorb
multiple structural deviations from spec. Any spec-correct fix in one
component breaks the calibration for others. Three failed slices this
session (standing charges, cat=10 routing, combi zero-loss) made the
pattern unambiguous.

Pivot: systematic section-by-section spec verification. Read the
RdSAP 10 + SAP 10.2 spec in order, check each table / formula /
footnote against the corresponding code, fix gaps one at a time.
Build the spec-correct engine first; re-derive cert-cal calibration
once at the end as a thin Elmhurst-compatibility layer.

Handover doc covers:
- Critical framing (deterministic, not assessor judgement)
- Current state (SAP MAE 4.61, PE MAE 43.32 at f4a8d2a0)
- Why the slice-by-slice approach won't converge
- Scope decisions (RdSAP 10 + SAP 10.2 only; park full-SAP + PCDB)
- Section-to-code mapping
- Known dead-ends to skip
- Cert-calibration vs spec-correctness tension and how to resolve it
- The 7 golden fixtures and their compensating-error caveats
- Trace mode recommendation (ADR-0009's `intermediate` field)
- Specific §1-3 starting tasks
- Workflow recap

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 07:30:27 +00:00
Khalim Conn-Kowlessar
f4a8d2a017 tests: golden-fixture regression set — 7 currently-correct corpus certs
Pins 7 certs from a 1000-cert random sample that satisfy:
  |SAP rounded-int residual| ≤ 1
  |PE residual| ≤ 10 kWh/m²
  main_heating_category != 4 OR main_heating_data_source != 1
    (non-PCDB-heat-pump — PCDB lookup is deferred)
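
The three selection criteria above can be read as a single predicate. This is a sketch only: the function name and argument names are hypothetical, and the residuals are assumed to be precomputed as (ours minus cert).

```python
def is_golden_candidate(sap_residual, pe_residual, heating_category, data_source):
    """Return True when a cert meets the golden-fixture selection criteria.

    sap_residual: rounded-int SAP residual.
    pe_residual: primary-energy residual in kWh/m².
    heating_category / data_source: the cert's main_heating_category and
    main_heating_data_source codes.
    """
    within_sap = abs(sap_residual) <= 1
    within_pe = abs(pe_residual) <= 10.0
    # Excludes only the combination category 4 AND data source 1
    # (the PCDB heat-pump case, whose lookup is deferred).
    not_pcdb_heat_pump = heating_category != 4 or data_source != 1
    return within_sap and within_pe and not_pcdb_heat_pump
```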

Cert mix: 6 cat=2 gas/oil boilers (3 PCDB, 3 Table 4b) + 1 cat=6 heat
network. Age bands A, C, D (×3), F, J, L. TFAs 75-526. Mix of
detached / semi-detached / mid-terrace / mid-floor flat. The cleanest
PE match in the set (cert 7536-3827) has PE residual -0.29 kWh/m².

Purpose: regression anchor. Future slices that improve aggregate MAE
silently break individual certs unless caught here. Each cert's
expected residual is recorded in `_EXPECTATIONS` so the diff is
human-inspectable when a regression fires.

The set is acknowledged to contain compensating-errors cases: some
certs match SAP within ±1 because the cert-calibration prices absorb
multiple structural deviations from spec. Hand-trace of 7536-3827
showed PE matched (-0.29) but cost was £143 (12%) under cert's implied
cost — a multi-factor gap (price calibration + missing gas standing
charge + lighting over-prediction) that cancels back into SAP ±1. We
accept this with the tolerance choice: tightening to PE ±5 in our
sample would have yielded zero fixtures.

Tolerance can tighten over the session as we close the PE bias
(currently +38 kWh/m² systematic).

All 301 domain tests pass; no behaviour changed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 07:06:58 +00:00
Khalim Conn-Kowlessar
afdf297f3b slice S-B31: Table 12c DLF on heat-network main and HW-from-main
Heat-network certs (cat=6) were under-predicted in cost — SAP bias
+6.31 across 13 sample certs, PE bias -15.6 (we under-predicted PE).
Root cause: missing distribution-loss-factor application.

SAP 10.2 spec references:
  - Table 12 note (k): "Cost is per unit of heat generated (i.e.
    before distribution losses); emission and primary factors are per
    unit of fuel used by the heat generator."
  - §C3.1: "Where a heat network is listed in the PCDB, the DLF is
    already factored into the cost, CO2 and PE factors recorded
    therein, so a DLF of 1 should be entered in worksheet (306) to
    avoid double counting." (Implication: non-PCDB networks MUST
    apply DLF.)
  - Table 12c (p. 193): DLF by age band, 1.20 (A pre-1900) →
    1.50 (K+ 2007+).
  - RdSAP 10 §10.11 Table 29 cross-references Table 12c.

Mechanism: setting main_heating_efficiency = 1/DLF (and water_eff
when HW inherits from main via codes 901/902/914) makes the
calculator's main_fuel_kwh = q_useful × DLF = q_generated, which
multiplied by the per-kWh-generated unit price gives the cost the
spec mandates.
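
The mechanism can be illustrated in a few lines. This is a sketch with hypothetical names (`heat_network_fuel_and_cost` is not a function in the repository); it only shows that an efficiency of 1/DLF turns useful heat into generated heat before the per-kWh-generated price is applied.

```python
def heat_network_fuel_and_cost(q_useful_kwh, dlf, price_per_kwh_generated):
    """Apply a distribution loss factor by encoding it as an efficiency.

    Setting efficiency = 1 / DLF makes fuel = q_useful / efficiency
    equal to q_useful * DLF, i.e. the heat generated before losses.
    """
    efficiency = 1.0 / dlf
    main_fuel_kwh = q_useful_kwh / efficiency  # equals q_useful_kwh * dlf
    cost = main_fuel_kwh * price_per_kwh_generated
    return main_fuel_kwh, cost
```

With a DLF of 1.0 (the PCDB-listed case quoted from §C3.1) the function leaves the useful heat unchanged, which is the "avoid double counting" behaviour.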

Affects:
  - Heat-network main heating (sap_main_heating_code in 301-304 OR
    main_heating_category == 6)
  - HW from main on such certs (water_heating_code in 901/902/914)

Trade-off: CO2/PE for heat-network certs will under-predict ~20%
versus the spec's "fuel-burned × per-fuel-factor" formula, because
our architecture uses one main_fuel_kwh value for cost AND CO2/PE.
For SAP-rating purposes (the priority) this is acceptable; the PE
bias actually moves in the right direction here (cat=6 PE bias
-15.6 → -5.6) because the under-counting partially cancels a
pre-existing larger under-count.

Parity probe at 300 certs, seed=7:
  SAP MAE 4.69 → 4.61 (-0.08)
  SAP bias 0.98 → 0.87 (-0.11)
  PE  MAE 43.32 → 43.11 (-0.21)
  cat=6 PE bias -15.6 → -5.6  (+10.0, correct direction)
  cat=6 PE MAE  40.3 → 35.8   (-4.5)
  cat=6 our_pe  158.5 → 225.0 (cert 230.6 — converged)

Cumulative across S-B23 → S-B31:
  SAP MAE  5.34 → 4.61 (-0.73)
  PE  MAE 57.28 → 43.11 (-14.17)
  PE bias 51.56 → 38.64 (-12.92)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 22:59:36 +00:00
Khalim Conn-Kowlessar
f14f76daf8 docs: pin spec-aligned secondary-heating fraction per Appendix A
An attempted slice (S-B30, not committed) hypothesised that
`main_heating_fraction=1` on the cert meant "no secondary heating" and
overrode Table 11's 10% default. Probe at 300 certs penalised it:
SAP MAE 4.69 → 4.85, SAP bias 0.98 → 1.61. The hypothesis was wrong
and I should have read the spec before coding.

SAP 10.2 Appendix A1 (p. 43) defines `main_heating_fraction` as the
allocation between TWO main heating systems when both exist; not as
the main-vs-secondary fraction. 99% of corpus certs have =1, meaning
"single main, 100% allocation".

SAP 10.2 Appendix A4(d) (p. 45) is explicit: "If any fixed secondary
heater has been identified, the calculation proceeds with the
identified secondary heater" and "Table 11 gives the fraction of the
heating that is assumed to be supplied by the secondary system" —
no override based on main_heating_fraction.

Adds:
- Regression test pinning the spec behaviour
  (test_main_heating_fraction_does_not_override_table11_secondary_default)
- Regression test for the already-spec-aligned fallback path
- _secondary_fraction docstring explaining why main_heating_fraction
  is NOT consulted (with reference to the failed attempt)
- secondary_heating_type kwarg on make_sap_heating (test-only, was
  missing — needed to construct the regression fixture)

Probe at 300 certs unchanged from prior baseline:
  SAP MAE 4.69, bias 0.98
  PE MAE 43.32, bias 37.69

The hand-trace finding that cert 9036-0827 over-predicts cost remains
real, but the secondary-heating fraction is per-spec. The residual
~£33 gap on that cert is most likely missing PCDB efficiency lookup
(cert has main_heating_data_source=1 and index_number=10241 — PCDB
data — and we fall back to category-default 0.80 vs typical PCDB-
listed condensing-boiler 0.90+). Deferred to Session C per ADR-0009.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 22:22:04 +00:00
Khalim Conn-Kowlessar
3ab09845e7 slice S-B29: parse measured U from full-SAP floor + roof descriptions
Parallel of S-B24 (walls) for the other envelope elements. Full-SAP
assessments lodge a measured/calculated U-value directly in the
description ("Average thermal transmittance X W/m²K") for floors
(~1 391 corpus certs) and roofs (~1 140 certs). Per spec:
  - §5.11 (roofs) opening clause defers to assessor's value when
    present
  - §5.12 (floors): "Unless provided by the assessor the floor
    U-value is calculated according to BS EN ISO 13370"

Both u_floor and u_roof now invoke `_measured_u_from_description`
first; if it parses a value, they return it directly and skip the
cascade. No range cap (consistent with S-B24 design — calculator
mirrors what the assessor lodged).

Parity probe at 300 certs, seed=7: headlines unchanged (same parquet
sampling gap as S-B24 — full-SAP certs filtered out upstream). Slice
correctness proved by:
- 1 unit test for u_floor measured-U parse
- 1 unit test for u_roof measured-U parse
- existing 287 tests passing, no regressions

A bulk-zip-based probe to measure the corpus-wide impact remains the
needed tooling investment (see S-B24 commit message).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 21:54:17 +00:00
Khalim Conn-Kowlessar
25261d5c8b slice S-B28: §5.11.4 — roof "NI" + insulated description → 50 mm joist row
346 corpus certs lodge roof_insulation_thickness="NI" (Not Indicated,
parsed to 0 by _parse_thickness_mm). When the description also signals
retrofit insulation ("Pitched, insulated (assumed)" / "Flat,
insulated" / "Roof room(s), insulated (assumed)"), our cascade
returned the uninsulated Table 16 row-0 value (U=2.30).

RdSAP 10 §5.11.4 (page 44, end of section): "If retrofit insulation
present of unknown thickness use 50 mm". That maps to Table 16 row
"Insulation at joists at ceiling level, 50 mm" = 0.68 W/m²K. The fix
is the analog of S-B27 for roofs: when insulation_thickness_mm==0
(the "NI" sentinel) and _described_as_insulated(description), return
0.68 instead of the row-0 lookup.

Per-cert delta: ΔU = 2.30 - 0.68 = 1.62 W/m²K on the affected slice; for a
typical 80 m² roof that is a 130 W/K HLC reduction ≈ 12 kWh/m² PEUI per cert.
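
A sketch of the fallback and the per-cert arithmetic, using only the two U-values quoted in this message. The function names are hypothetical, and returning `None` for any non-zero thickness stands in for "defer to the normal Table 16 cascade".

```python
U_ROOF_UNINSULATED = 2.30  # Table 16 row-0 value quoted above, W/m²K
U_ROOF_50MM_JOISTS = 0.68  # "Insulation at joists, 50 mm" row, W/m²K


def u_roof_ni_fallback(insulation_thickness_mm, described_as_insulated):
    """Resolve the "NI" sentinel (parsed to 0 mm) for roofs.

    Returns the 50 mm joist-row value when the description signals
    insulation, the uninsulated value when it does not, and None when a
    real thickness was lodged (caller should use the full table).
    """
    if insulation_thickness_mm != 0:
        return None
    if described_as_insulated:
        return U_ROOF_50MM_JOISTS
    return U_ROOF_UNINSULATED


def hlc_reduction_w_per_k(roof_area_m2):
    """Heat-loss-coefficient reduction for a roof moved between the two rows."""
    return roof_area_m2 * (U_ROOF_UNINSULATED - U_ROOF_50MM_JOISTS)
```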

Parity probe at 300 certs, seed=7:
  SAP MAE 4.72 → 4.69 (-0.03)  ← first SAP MAE drop in 3 slices
  PE MAE  44.19 → 43.32 (-0.87)
  PE bias 38.56 → 37.69 (-0.87)

Cumulative across S-B23 → S-B28:
  PE MAE  57.28 → 43.32 (-13.96)
  PE bias 51.56 → 37.69 (-13.87)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 21:49:44 +00:00
Khalim Conn-Kowlessar
1f49fa03cd slice S-B27: Table 19 footnote (2) — floor "NI" + insulated description
The cert's `floor_insulation_thickness` field carries "NI" (Not
Indicated) on 58% of corpus certs — by far the most common value. For
~2 413 of those (12% of corpus) the description also says "Solid,
insulated (assumed)" or "Suspended, insulated (assumed)" — the
assessor saw insulation but didn't measure the thickness. Our
`_parse_thickness_mm("NI")` returns 0, which feeds `u_floor` as an
explicit "0 mm" → r_f=0 → uninsulated-floor U-value. Wrong.

RdSAP 10 §5.12 Table 19 footnote (2) (page 46): "For floors which
have retrofitted insulation, use the greater of 50 mm and the
thickness according to the age band". `u_floor` now accepts a
`description` kwarg; when `_described_as_insulated(description)` is
true and the lodged thickness is missing/zero, ins_mm =
max(50, age-band default).

Geometry sanity-check, 100 m² × 40 m perimeter, w=0.3 (B=5):
- Uninsulated solid floor: d_t = 0.615, U = 0.60 W/m²K
- 50 mm assumption:        d_t = 2.758, U = 0.31 W/m²K
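
The sanity-check figures can be reproduced with the BS EN ISO 13370 slab-on-ground formula. The sketch below is not the repository's `u_floor`; the constants (clay soil λ = 1.5 W/m·K, R_si = 0.17, R_se = 0.04, insulation λ = 0.035 W/m·K) are assumptions chosen because they reproduce the d_t values quoted above.

```python
import math

LAMBDA_GROUND = 1.5  # W/m·K, assumed soil conductivity
R_SI = 0.17          # m²K/W, assumed internal surface resistance
R_SE = 0.04          # m²K/W, assumed external surface resistance
LAMBDA_INS = 0.035   # W/m·K, assumed insulation conductivity


def retrofit_thickness_mm(age_band_default_mm):
    """Table 19 footnote (2): greater of 50 mm and the age-band default."""
    return max(50, age_band_default_mm)


def solid_floor_u(area_m2, perimeter_m, wall_thickness_m, insulation_mm):
    """Return (d_t, U) for a slab-on-ground floor."""
    b = 2.0 * area_m2 / perimeter_m              # characteristic dimension B
    r_f = (insulation_mm / 1000.0) / LAMBDA_INS  # insulation resistance
    d_t = wall_thickness_m + LAMBDA_GROUND * (R_SI + r_f + R_SE)
    if d_t < b:
        # Uninsulated or moderately insulated floor
        u = (2.0 * LAMBDA_GROUND / (math.pi * b + d_t)) * math.log(
            math.pi * b / d_t + 1.0
        )
    else:
        # Well-insulated floor
        u = LAMBDA_GROUND / (0.457 * b + d_t)
    return d_t, u
```

For the 100 m² × 40 m case with w = 0.3, `solid_floor_u(100, 40, 0.3, 0)` and `solid_floor_u(100, 40, 0.3, 50)` round to the two rows listed above.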

Parity probe at 300 certs, seed=7:
  PE MAE  45.37 → 44.19 (-1.18)
  PE bias 39.75 → 38.56 (-1.19)
  Band J bias +41.2 → +29.7 (-11.5)
  Band K bias +34.1 → +22.4 (-11.7)
  Band L bias +19.6 → +11.3 (-8.3)
  Band M bias +86.3 → +55.1 (-31.2)
  Bands A-H mostly unchanged (max(50, 0) = 50 either way; description
    overrides on older stock are rarer in this sample)

The K-L-M dwellings improved most because for them the age-band
default insulation (100-140 mm) is now applied instead of 0 mm.

Cumulative across S-B23 → S-B27:
  PE MAE  57.28 → 44.19 (-13.09)
  PE bias 51.56 → 38.56 (-13.00)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 21:40:18 +00:00
Khalim Conn-Kowlessar
361f91546b slice S-B26: NI thickness + assumed-insulated descriptions route to 50mm row
Two related bugs both produced U=1.7 for retrofit-insulated solid-brick
walls when the spec says U=0.55 (Table 6 footnote: "If a wall is known
to have additional insulation but the insulation thickness is unknown,
use the row in the table for 50 mm insulation"):

1. _insulation_bucket(0, True) returned 0 instead of 50. The "NI"
   sentinel parses to 0 via _parse_thickness_mm, then the bucket
   function's "< 25 -> 0" branch ignored the insulation_present signal.
   Affects 56 corpus certs lodging solid-brick with type=1 or type=3
   plus thickness="NI".

2. wall_ins_present was set False whenever wall_insulation_type == 4
   ("as-built / assumed"), even if the description said
   "...insulated (assumed)" or "...partial insulation (assumed)".
   Affects 128+51 = 179 corpus certs.

The same root pattern as S-B25 (cavity-wall description disambiguation),
extended to non-cavity constructions. `_cavity_described_as_filled`
generalised to `_described_as_insulated`; now used by:
- u_wall (cavity-wall dispatcher to the Filled-cavity row, S-B23/B25)
- heat_transmission_from_cert (override wall_ins_present for non-cavity
  walls so the 50 mm bucket routes per Table 6 footnote)
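
A sketch of the first fix in isolation. The real `_insulation_bucket` has further thickness buckets that are not described in this message, so the hypothetical helper below only resolves the "NI" case and passes every other thickness through unchanged.

```python
def resolve_unknown_wall_thickness(thickness_mm, insulation_present):
    """Map the "NI" sentinel (parsed to 0 mm) to the 50 mm row when the
    wall is known to be insulated, per the Table 6 footnote.

    Any other thickness is returned as-is for the normal bucketing step.
    """
    if thickness_mm == 0 and insulation_present:
        return 50
    return thickness_mm
```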

Parity probe at 300 certs, seed=7:
  PE MAE  45.74 → 45.37 (-0.37)
  PE bias 40.19 → 39.75 (-0.44)
  Band D bias +42.7 → +41.6 (-1.1)
  Band F bias +12.6 → +10.7 (-1.9)

Modest aggregate movement — the affected population is small (~0.6% of
corpus, ~2 certs in the 300 sample). The slice's correctness is proved
by 4 unit tests in test_rdsap_uvalues.py + 2 end-to-end tests in
test_heat_transmission.py.

Cumulative across S-B23 → S-B26:
  PE MAE  57.28 → 45.37 (-11.91)
  PE bias 51.56 → 39.75 (-11.81)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 21:19:33 +00:00
Khalim Conn-Kowlessar
6b934710d0 slice S-B25: description-based dispatch for as-built / assumed cavity
The RdSAP schema's `wall_insulation_type = 4` ("as-built / assumed")
covers two distinct cert populations that previously both routed to
the Cavity-as-built row (U=1.5 at band E):

  686 certs: "Cavity wall, as built, no insulation (assumed)" — U=1.5 ✓
 1171 certs: "Cavity wall, as built, insulated (assumed)" — should be 0.7
  147 certs: "Cavity wall, as built, partial insulation (assumed)" — 0.7

The description string disambiguates. The legacy production map at
recommendations/rdsap_tables.py:753 routes the latter two to "Filled
cavity" — we match that interpretation here for parity with the cert
assessor and the production recommendation engine.

`_cavity_described_as_filled` adds the description check; the existing
filled-cavity dispatcher in u_wall now fires on either signal:
- wall_insulation_type == 2 (S-B23 — explicit filled-cavity code)
- description contains "insulated" or "partial insulation" without
  the "no insulation" negation marker (S-B25 — assumed cavity-fill)
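
The description rule can be sketched as a plain substring check. This is an illustration under the wording given above, not the repository's `_cavity_described_as_filled`; the function name is hypothetical.

```python
def described_as_filled(description):
    """True when a wall description signals assumed insulation.

    Matches "insulated" or "partial insulation" unless the explicit
    "no insulation" negation marker is present.
    """
    if not description:
        return False
    text = description.lower()
    if "no insulation" in text:
        return False
    return "insulated" in text or "partial insulation" in text
```

Because this is a naive substring test, a word such as "uninsulated" would also match; the three corpus descriptions listed above do not use it.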

Parity probe at 300 certs, seed=7:
  PE MAE  46.78 → 45.74 (-1.04)
  PE bias 41.78 → 40.19 (-1.59)
  Band F bias +23.2 → +12.6 (-10.6)
  Band G bias +31.8 → +25.1 (-6.7)
  Band H bias +30.7 → +15.5 (-15.2)

Improvements localise to bands F-H (1976-1995), the era when Building
Regs mandated cavity insulation for new-builds — making "as built,
insulated (assumed)" the modal description. SAP MAE drifted up
+0.12 (cost-side residuals surfacing now that envelope is closer to
spec; tracked for follow-up).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 21:06:10 +00:00
Khalim Conn-Kowlessar
15613309df slice S-B24: parse measured U from full-SAP wall description
Full SAP assessments (~15% of corpus, 4 403 of 30 000 scanned bulk-zip
certs) lodge a measured/calculated wall U-value per BS EN ISO 6946 in
walls[i].description, e.g. "Average thermal transmittance 0.18 W/m²K".
These certs typically have wall_construction, wall_insulation_type and
construction_age_band all None, which the cascade defaults previously
resolved to U = 1.5 (uninsulated cavity at band E). RdSAP 10 §5.3:
"U values are obtained from … the construction type, date of
construction and, where applicable, thickness of additional insulation"
— but a measured value supersedes the cascade.

Corpus U-value distribution among parsed:
  median 0.21, mean 0.225, range 0.06-1.84
  80% at U ≈ 0.2 (Part L-compliant new-builds)
  10% at U ≈ 0.1 (passivhaus / very low)
  7%  at U ≈ 0.3 (older retrofitted full-SAP)
  3%  in the tail (conversions, edge cases)

Per affected cert (100 m² new-build at U 1.5 → 0.21):
  walls_w_per_k drops 129 → 21 W/K
  PEUI drops ≈ 120 kWh/m²

Implementation:
- _measured_u_from_description() regex-parses the phrase from the wall
  description; returns None on no-match or non-numeric so the cascade
  fall-through is preserved.
- u_wall checks the measured value FIRST, before any cascade logic.
- No range cap — calculator mirrors what the assessor lodged, per the
  "deterministic except for input errors" principle. Parse failure
  falls through cleanly.
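
A minimal sketch of the parse described above, assuming the phrase always takes the form "Average thermal transmittance <value> W/m²K". The pattern and the function name are illustrative; the repository's `_measured_u_from_description` may differ.

```python
import re

# Capture whatever token sits between the phrase and the unit, then let
# float() decide whether it is numeric.
_MEASURED_U = re.compile(
    r"Average thermal transmittance\s+(\S+)\s*W/m", re.IGNORECASE
)


def measured_u_from_description(description):
    """Return the lodged U-value as a float, or None to fall through.

    None is returned on a missing description, on no match, and on a
    non-numeric token, so the caller's cascade is preserved.
    """
    if not description:
        return None
    match = _MEASURED_U.search(description)
    if match is None:
        return None
    try:
        return float(match.group(1))
    except ValueError:
        return None
```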

Parity probe at 300 certs, seed=7: headlines unchanged. Direct check
on the sample: 0/300 certs carry an "Average thermal transmittance"
description. The v18a parquet filters full-SAP certs out somewhere
upstream, so this slice is invisible in the parquet-based probe. The
slice's correctness is proved by:
- 4 unit tests in test_rdsap_uvalues.py (tracer + regression on
  ordinary descriptions + parse-failure fallback + filled-cavity
  description still routes correctly)
- 1 end-to-end test in test_heat_transmission.py exercising a
  synthetic full-SAP cert through heat_transmission_from_cert
- All 274 domain tests passing, no regressions

Follow-up tooling: a bulk-zip-based parity probe that doesn't filter
to the parquet's subset is needed to measure this slice's corpus
impact. Separate dig.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 20:50:39 +00:00
Khalim Conn-Kowlessar
136f149d46 tooling: widen parity probe sap_score range to (5, 99)
Previous bound (20, 95) excluded full-SAP new-builds (sap_score 90+,
which carry the dramatic wall U-value gap) and deepest-tail heritage
certs (sap_score ≤ 20). Widening so the sample reflects the
populations where the calculator's biggest spec gaps live.

New baseline at 300 certs, seed=7:
  SAP MAE 5.34 → 4.59 (-0.75)
  PE MAE  48.99 → 46.78 (-2.21)
  PE bias 42.07 → 41.78 (-0.29)

Note: in the v18a parquet only ~0.7% of certs have age_band=None,
while the raw bulk zip has 15% full-SAP "Average thermal transmittance"
certs. The parquet is filtering them somewhere upstream — to be chased
in separate work. Until then, parity-probe MAE will under-show the true
corpus impact of slices that target full-SAP certs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 20:38:22 +00:00
Khalim Conn-Kowlessar
9a509e4102 slice S-B23: RdSAP 10 Table 6 "Filled cavity" row dispatch
The cert encodes filled-cavity walls as
(wall_construction=4 cavity, wall_insulation_type=2 filled,
wall_insulation_thickness="NI"). The previous cascade parsed "NI"→0
and ran the thickness-bucketed table, returning U=1.5 (the
"Cavity as built" row) — treating retrofit-filled cavities as if they
were uninsulated. Spec (RdSAP 10 Table 6, page 33) has a dedicated
"Filled cavity" row at U=0.7 for bands A-E, 0.40 at F, 0.35 at G-H,
and "as built" from band I onward.

Adds:
- WALL_INSULATION_FILLED_CAVITY constant (code 2 per RdSAP schema,
  confirmed empirically on 8 000 corpus certs against walls.description)
- _CAVITY_FILLED_ENG row in domain.ml.rdsap_uvalues
- dispatcher in u_wall when (construction=cavity, insulation_type=2)
- wall_insulation_type plumbing through heat_transmission_from_cert
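
A sketch of the row dispatch using only the values quoted in this message (0.7 for bands A-E, 0.40 at F, 0.35 at G-H, as-built from band I onward). The dict and function names are hypothetical, and the as-built value is passed in by the caller rather than looked up.

```python
# "Filled cavity" row as quoted above, keyed by age band.
FILLED_CAVITY_U = {
    "A": 0.7, "B": 0.7, "C": 0.7, "D": 0.7, "E": 0.7,
    "F": 0.40,
    "G": 0.35, "H": 0.35,
}


def filled_cavity_u(age_band, as_built_u):
    """U-value for a filled cavity wall.

    Bands without a dedicated entry (I onward) fall back to the as-built
    value supplied by the caller.
    """
    return FILLED_CAVITY_U.get(age_band, as_built_u)
```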

Parity probe (300 certs, seed=7) before → after:
- PE MAE  57.28 → 48.99 (-8.3)
- PE bias 51.56 → 42.07 (-9.5)
- Band C bias +65.3 → +47.8 (-17.5)
- Band D bias +67.9 → +45.7 (-22.2)
- Band E bias +77.0 → +58.8 (-18.2)
- Band F bias +43.8 → +25.4 (-18.4)
- Band K-L bias unchanged (filled-cavity row falls back to as-built
  from band I onward per spec footnote; correct no-op)

Future slices already lit up by the same enumeration:
- type=1 external / type=3 internal insulation rows (~440 certs)
- type=6 filled + external / type=7 filled + internal (~22 certs)
- type=None "Average thermal transmittance X W/m²K" string parse
  (1 358 certs — biggest follow-up)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 20:15:41 +00:00