The flat floor-exposure heuristic keys on dwelling_type: a flat defaults
to has_exposed_floor=False (assuming a heated dwelling below). The
Elmhurst Summary path lodges a ground-floor flat's vertical position as a
"Ground floor" floor_type rather than the API floor_heat_loss=1 exposed
code, and the mapper can label such a flat "Top-floor flat" — so the
cascade dropped the ground floor entirely (a ground floor is in contact
with the ground and carries heat loss).
Treat a "ground floor" floor_type as a heat-loss floor, overriding the
dwelling-level suppression upward — mirroring the existing "another
dwelling below" party override downward.
Worksheet-validated to 1e-4 on simulated case 45 (a ground-floor flat
the mapper labelled "Top-floor flat"): floor (28a) 0 -> 25.38 W/K,
fabric (33) 75.63 -> 101.0104, HTC (39) 112.93 -> 145.3579, all matching
the P960 exactly; SAP 67.81 -> 62.52. RdSAP-21.0.1 corpus within-0.5
69.5% -> 69.7% (MAE 0.859 -> 0.854). Floors ratcheted. Pinned in
test_heat_transmission (ground-floor billed + party-floor suppressed).
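
The override order described above can be sketched as follows. This is a minimal illustration, not the calculator's code: `FloorInputs`, `has_heat_loss_floor` and the substring matching on `floor_type` are hypothetical stand-ins for the real mapper output and cascade.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FloorInputs:
    """Hypothetical, trimmed-down inputs; the real mapper output has more fields."""
    dwelling_is_flat: bool
    floor_type: Optional[str] = None  # e.g. "Ground floor", "Another dwelling below"


def has_heat_loss_floor(inputs: FloorInputs) -> bool:
    floor_type = (inputs.floor_type or "").strip().lower()
    # Downward override: a party floor over another dwelling carries no heat loss.
    if "another dwelling below" in floor_type:
        return False
    # Upward override: a ground floor is in contact with the ground, so it is
    # billed even when the dwelling-level default would suppress it.
    if "ground floor" in floor_type:
        return True
    # Dwelling-level default: a flat assumes a heated dwelling below.
    return not inputs.dwelling_is_flat
```

The floor-level checks run before the dwelling-level default, so a mislabelled "Top-floor flat" with a lodged ground floor is still billed.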
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds whole-dwelling property_type/built_form to EpcSimulation (folded by
apply_simulations) and maps those override components. property_type drives
party-wall heat loss + ASHP/solar/wall eligibility, so a landlord correction now
moves both the SAP calc and the measure menu; built_form has no calculator
consumer today (feeds the ML transform). Written as the landlord text value
(park-home check is text-only). Refines ADR-0032 dec-4.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Extends WallType coverage to timber/stone/system-built/cob/park-home/curtain and
adds RoofType "Pitched, N mm loft insulation" -> roof_insulation_thickness. The
"(assumed) insulated"/"partial" wall states stay deferred (ambiguous code, needs
Elmhurst validation per ADR-0032); property_type/built_form carry no SAP weight.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The SQLModel had drifted to a `bill_` prefix on the Bill Derivation block, but
the FE-owned Drizzle table uses unprefixed names (`heating_kwh`, `hot_water_kwh`
… `total_annual_bill_gbp`) plus a nullable `fuel_rates_period`. INSERTs failed
with UndefinedColumn. Rename the columns to mirror the live table column-for-
column (the prefix's anti-clash purpose is moot: `heating_kwh` != the recorded
`space_heating_kwh`), and add the `fuel_rates_period` column — left None until
Bill Derivation threads the snapshot period through.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The Baseline stage is the first consumer to read these off a persisted EPC
end-to-end, surfacing three gaps that only manifest on real API data:
- Only the 21.0.1 mapper copied through the recorded current-performance
scalars (SAP rating, CO2, PEUI) and *no* mapper mapped the EPC band, so
Lodged Performance raised for 17.x/18.0/19.0/20.0.0 certs. Overlay all four
from the raw payload in `from_api_response`, once, for every schema version.
- Likewise the `renewable_heat_incentive` block (baseline space/water-heating
kWh) was only mapped by the 21.x paths. Gap-fill it centrally from the raw
payload when a mapper left it unset.
- The FE-owned `epc_property` date columns are Postgres `timestamp`s while the
SQLModel mirror types them `str`, so a read hands back a `datetime` and
`date.fromisoformat()` raised. Normalise via `_as_date()`.
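
A minimal sketch of the date normalisation in the last bullet, assuming the column can come back as a `datetime`, a `date`, an ISO string, or None. The name `as_date_sketch` is illustrative; the real `_as_date()` may handle more cases.

```python
from datetime import date, datetime
from typing import Optional, Union


def as_date_sketch(value: Union[str, date, datetime, None]) -> Optional[date]:
    if value is None:
        return None
    # datetime is a subclass of date, so it must be checked first.
    if isinstance(value, datetime):
        return value.date()
    if isinstance(value, date):
        return value
    # Keep only the leading YYYY-MM-DD so timestamp-like strings also parse.
    return date.fromisoformat(value[:10])
```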
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The deduplicated `epc.roofs[]` list cannot be indexed 1:1 against the
building parts (190/329 multi-part certs have len(roofs) != len(parts)),
so every part's `u_roof` consumed a SINGLE join of all roof descriptions.
That leaked one part's insulation state onto another: a "Flat, no
insulation" extension dragged a "Pitched, insulated (assumed)" main roof
to the uninsulated 2.30, over-stating its heat loss ~3x. 3-part certs were
systematically under-rated (56% within-0.5, mean -0.79 SAP).
Partition the non-RR roof descriptions into flat vs pitched/sloping and
match each part to its own kind (`_main_roof_descriptions_by_kind`),
falling back to the global join when a part's kind has no matching entry.
Corpus cert 100010129331: roof 110.5 -> 31.3 W/K, +13.10 -> -0.05 SAP.
RdSAP-21.0.1 within-0.5 68.8% -> 69.5% (MAE 0.888 -> 0.859; PE 13.9 ->
13.6); 3-part cohort 56% -> 61%. Floors/ceilings ratcheted. Pinned in
test_heat_transmission (by_kind split + mixed-roof no-contamination).
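
The by-kind matching with its fallback can be sketched as below. This is an assumption-laden illustration: the helper names are hypothetical, the input list is assumed to have room-in-roof entries already filtered out, and classifying by a leading "Flat" is a simplification of whatever `_main_roof_descriptions_by_kind` actually does.

```python
from typing import Dict, List


def roof_kind(description: str) -> str:
    """Classify a description as 'flat' or 'pitched' (sloping counts as pitched)."""
    return "flat" if description.strip().lower().startswith("flat") else "pitched"


def descriptions_by_kind(descriptions: List[str]) -> Dict[str, List[str]]:
    by_kind: Dict[str, List[str]] = {"flat": [], "pitched": []}
    for text in descriptions:
        by_kind[roof_kind(text)].append(text)
    return by_kind


def descriptions_for_part(part_kind: str, descriptions: List[str]) -> List[str]:
    matched = descriptions_by_kind(descriptions)[part_kind]
    # Fall back to the global list when this part's kind has no matching entry.
    return matched if matched else list(descriptions)
```

With `["Pitched, insulated (assumed)", "Flat, no insulation"]`, a pitched main part sees only the pitched entry, so the flat extension can no longer drag it to the uninsulated value.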
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The §5.8 Table-14 added-insulation R-value adjustment was gated to
WALL_SOLID_BRICK, so a stone (granite/sandstone) wall lodging
wall_insulation_type 1/3 ("External"/"Internal") + a thickness fell
through the §5.6 thin-wall branch and was billed at its UNINSULATED U
(e.g. sandstone 520 mm + 100 mm internal: 1.64 instead of 0.30 → ~5×
the wall heat loss). Mirror the brick insulation branch into the stone
block, feeding the RAW §5.6 U₀ into the §5.8 chain per the same rule the
brick branch and the dry-lined granite pin 000565 already follow (the
Table-6 footnote (a) 1.7 cap does not apply on the insulated path).
Corpus cert 100052159386 (sandstone 520 mm + 100 mm internal): -26.20 ->
-4.08 SAP, walls 300 -> 55 W/K. RdSAP-21.0.1 corpus within-0.5 68.6% ->
68.8% (SAP MAE 0.942 -> 0.888; PE MAE 14.3 -> 13.9; CO2 0.27 -> 0.26);
floors/ceilings ratcheted. Unit-pinned in test_rdsap_uvalues.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
scripts/run_first_run_e2e.py runs the real Ingestion -> Baseline -> Modelling
pipeline against the DB by composing build_first_run_pipeline + dispatch_first_run
with the live source clients (the Lambda handler can't run locally — its
_source_clients_from_env still raises, #1136). Unlike run_modelling_e2e it runs
real ingestion (persists EPC/spatial/solar) and has no inspect-only mode, so it's
gated behind --confirm (preview otherwise); measure scoping comes only from the
Scenario's exclusions (the pipeline threads no --measures), and the modelling
batch is all-or-nothing; both limitations are documented.
Extract the shared env/engine/S3 plumbing into scripts/e2e_common.py (public
load_env/build_engine/s3_parquet_reader) so both runners share one source and
neither imports the other's privates.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The gov API lodges a NON-SEPARATED conservatory (conservatory_type=4) as a
glazed "building part" carrying only {floor_area, room_height,
double_glazed, glazed_perimeter} — no fabric, no floor dimensions. The
four fields were undeclared on the 21.0.1 SapBuildingPart, so `from_dict`
dropped them and the conservatory was silently lost: it billed no §6.1
window/rooflight/floor and added nothing to TFA (5 corpus certs over-rated
— too little heat loss → SAP too high).
Fix (21.0.1 schema + mapper):
- declare the four glazed fields on `SapBuildingPart`;
- `_api_sap_conservatory` builds `EpcPropertyData.sap_conservatory` from
the glazed BP (identified by a lodged `glazed_perimeter`; only type-4
conservatories lodge it — separated ones, §6.2, lodge nothing);
- exclude the glazed BP from the fabric building-part loop (it is billed
by the §6.1 cascade, not as a dwelling part);
- `_total_floor_area_from_building_parts` adds the conservatory floor area
to TFA (drives occupancy → §4/§5 demand).
Validation is cross-mapper parity, NOT a corpus back-solve: the API mapper
feeds the SAME worksheet-validated §6.1 cascade (`conservatory_geometry`,
pinned to 1e-4 against the case-44 Summary) as the Elmhurst path — so the
API conservatory fabric is correct by construction. `from_api_response`
on an injected type-4 cert reproduces the glazed wall (perimeter × ground-
floor room height = 22.05), glazed roof (floor/cos20 = 12.77) and Table 25
double U_eff (2.758 wall / 2.993 roof); a separated (type 2/3) cert lodges
no glazed BP → disregarded per §6.2.
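
The two glazed-area expressions quoted above (perimeter × room height, floor/cos20) reduce to the arithmetic below. The function names are hypothetical and the inputs in the usage lines are illustrative, not taken from any cert; the real `conservatory_geometry` may apply further rules.

```python
import math

# 20 degree roof pitch, per the floor/cos20 expression quoted above.
CONSERVATORY_ROOF_PITCH_DEG = 20.0


def glazed_wall_area_m2(glazed_perimeter_m: float, room_height_m: float) -> float:
    # Glazed wall: lodged glazed perimeter times the ground-floor room height.
    return glazed_perimeter_m * room_height_m


def glazed_roof_area_m2(floor_area_m2: float) -> float:
    # Glazed roof: plan floor area projected onto the pitched roof plane.
    return floor_area_m2 / math.cos(math.radians(CONSERVATORY_ROOF_PITCH_DEG))


# Illustrative inputs only: 10 m perimeter, 2.4 m height, 10 m2 floor.
wall = glazed_wall_area_m2(10.0, 2.4)
roof = glazed_roof_area_m2(10.0)
```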
Gauges: corpus within-0.5 67.9% → 68.6% (MAE 0.959 → 0.942; floor 0.67→0.68,
ceiling 0.97→0.95); /tmp eval mean|err| 0.822 → 0.817. Harness 47/47
0-raised; regression = the 3 pre-existing fails; pyright net-zero (65=65).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The new pipeline left no per-Property record of a run (the old engine set
property.has_recommendations and populated property_details_epc). Restore the
marker: PropertyRepository.mark_modelled sets has_recommendations (true when the
Plan carries measures, mirroring the old engine) and bumps updated_at, so a
first-run under the new process is identifiable as updated_at >= 2026-06-01.
ModellingOrchestrator marks each Property after its Scenarios (true if any
Scenario yielded a measure); run_modelling_e2e's --persist path marks it too
(its compute runs on in-memory fakes, so the DB UoW sets it directly). Adds the
has_recommendations/updated_at columns to the PropertyRow mirror.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Close the §6.1 conservatory demand cascade per RdSAP 10 §6.1 + Table 25.
Solar gains (§6, solar_gains.py) — Table 25 note (PDF p.51): "The
orientation of windows in a conservatory is not recorded, thus solar
gains are calculated using the default solar flux (East/West orientation,
with 20° pitch for roof windows)." The glazed wall bills onto the (76)
East line (vertical, average-overshading Z); the glazed roof onto the
(82) roof-window line (20° pitch, Z=1.0), both at Table 25 g=0.76, FF=0.70.
TFA-occupancy (mapper) — §6.1: the conservatory floor area is added to the
dwelling total floor area. TFA drives occupancy → §5 internal gains + §4
hot-water demand, so the non-separated conservatory's floor area now
enters `EpcPropertyData.total_floor_area_m2` (the worksheet's (4) = 95.38
carries it). Separated conservatories (§6.2) stay excluded.
Pinned against the case-44 P960 demand cascade at abs=1e-4: (73) internal
gains 625.1759, (83) solar gains 495.8655, (95) useful gains 1079.6510,
(99) space heating per m² 89.8073 — the full §6.1 chain reproduces EXACTLY.
The whole-dwelling SAP (72.9517) / CO2 (3241.8656) are not pinned: the
case-44 Summary omits the House-Coal secondary heater (SAP 633) the P960
descriptor carries (cf. case 43), so the cascade computes no secondary
heating, which accounts for the entire residual (+349.77 kg CO2). This is a
Summary-input defect, independent of §6.1; every conservatory-affected line
ref is exact. Worksheet harness
stays 47/47 0-raised; corpus unchanged (API path; mirror is the next slice).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
SAP 10.2 §2 (17)-(18): a measured/design air permeability at 50 Pa from a
Blower Door test routes infiltration via `(18) = AP50/20 + (8)`, in
preference to the components-based (16) estimate. The Elmhurst extractor
read only the AP4 ("Pulse") column of §12.2, so a Blower Door result
(§12.2 "Pressure Test Result (AP50)") fell through to the structural-
infiltration default — over-counting ventilation heat loss.
Surfaced by simulated case 44 (AP50 4.50): effective air change rate was
0.81 vs the worksheet's 0.58 (+38% ventilation loss). The cascade already
supports `air_permeability_ap50` (preferred over AP4); this wires the read
end to end (extractor → ElmhurstSiteNotes → SapVentilation → cert_to_inputs).
Pinned against the case-44 P960 §2 at abs=1e-4: (18) infiltration 0.3417
(= 4.5/20 + 0.1167) and (25) Jan effective ach 0.5812. Worksheet harness
stays 47/47 0-raised.
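
As a worked check of the `(18) = AP50/20 + (8)` routing, a minimal sketch follows. The function names are illustrative, and the AP4 ("Pulse") branch is deliberately left out.

```python
from typing import Optional


def infiltration_from_ap50(ap50: float, line_8: float) -> float:
    """Worksheet line (18) when a Blower Door AP50 is lodged: AP50/20 + (8)."""
    return ap50 / 20.0 + line_8


def infiltration_rate(ap50: Optional[float], line_8: float, line_16: float) -> float:
    # Prefer the measured/design AP50; otherwise fall back to the
    # components-based (16) estimate. AP4 handling is omitted in this sketch.
    if ap50 is not None:
        return infiltration_from_ap50(ap50, line_8)
    return line_16


# Case-44 figures quoted above: 4.5/20 + 0.1167 = 0.3417
case_44 = infiltration_rate(4.5, 0.1167, line_16=0.0)
```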
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Every worklist UPRN now carries schema · engine SAP / lodged · flag. Tally:
64 healthy, 19 MVHR-not-credited (🚩 flag B), 6 heat-pump fuel-39 (🚩 flag A),
4 sparse/NOT MAPPABLE (⛔), 3 Elmhurst-pinned. MVHR is the largest accuracy gap.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Autonomous-run triage of the moderate eng-vs-lodged gaps resolves them into two
patterns, both flagged for owner review (not auto-fixable):
- Heat-pump fuel code 39 mis-priced as gas (over-rates; both gap directions).
- MVHR heat recovery modelled as plain extract loss → systematic UNDER-rating
(~8-12 SAP) on every full-SAP cert carrying a mechanical_vent_system_index_number.
New memory mvhr-heat-recovery-not-modelled; needs the Appendix Q / PCDB MVHR
efficiency model.
Findings doc updated with the classification.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Some SAP-Schema-17.x/18.0.0 certs lodge sap_openings width/height in MILLIMETRES
mixed with metre rows in the same array (e.g. a 2025x2100 mm window beside a
3.06x1 m one). The 17.1 mapper read them all as metres → a 4.25M m2 window →
HTC in the millions → SAP clamped to 1.
Fix (TDD, datatypes/epc/domain/mapper.py): _sanitise_opening_dimension_m treats
any dimension > 50 m as mm and divides by 1000; _sap_opening_area_m2 applies it
to areas. Wired into the window, roof-window, and door-area-weighting paths.
The 3 broken certs (uprn_10093117227 / 10090317693 / 10091636031) now score
90 / 81 / 79 instead of 1.
3 RED->GREEN slices + refactor; new test class
TestFromSapSchema17_1OpeningUnitSanitisation + sap_17_1_mm_openings.json fixture;
0 new pyright errors; no regressions.
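
The unit heuristic can be sketched as below. Only the 50 m threshold and the divide-by-1000 rule come from this message; the function names are shortened stand-ins, and the real helpers may differ in signature.

```python
# Any lodged dimension above this is assumed to be in millimetres.
MAX_PLAUSIBLE_DIMENSION_M = 50.0


def sanitise_dimension_m(value: float) -> float:
    # A window or door wider or taller than 50 m is implausible, so read it as mm.
    return value / 1000.0 if value > MAX_PLAUSIBLE_DIMENSION_M else value


def opening_area_m2(width: float, height: float) -> float:
    # Sanitise each dimension independently so mixed mm/m rows are both handled.
    return sanitise_dimension_m(width) * sanitise_dimension_m(height)


mm_row = opening_area_m2(2025, 2100)  # mm row from the example above
m_row = opening_area_m2(3.06, 1)      # metre row is left untouched
```

The 2025x2100 mm row becomes about 4.25 m2 instead of 4.25M m2, while the 3.06x1 m row keeps its 3.06 m2.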
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Schema coverage (datatypes/epc/domain/mapper.py):
- SAP-Schema-18.0.0: full-SAP shape ≡ 17.1 → from_sap_schema_17_1, no normalisation.
- SAP-Schema-16.0: same reduced-field 16.x path; default the omitted `tenure`
field in _normalize_sap_schema_16_x (metadata; SAP cascade never reads it).
Genuinely sparse 16.x certs (missing core fabric fields) still fail loud.
- Regression tests + sap_18_0_0.json / sap_16_0.json fixtures; 0 new pyright errors.
Autonomous triage of the worklist (scripts/hyde/autonomous_run_findings.md):
- Found + diagnosed 2 bugs (flagged, NOT fixed): (1) MAPPER — full-SAP openings
lodged in mm read as m → multi-million-m2 windows → SAP clamps to 1 (uprn_
10093117227 / 10090317693 / 10091636031); (2) CALCULATOR — database heat-pump
fuel code 39 mis-priced as gas, over-rates ~14 (uprn_10093114053).
- Most certs map within +/-4 of lodged.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- SAP-Schema-16.3: same reduced-field RdSAP shape as 16.2 — generalise the
normaliser to _normalize_sap_schema_16_x and route both 16.2/16.3 through it.
uprn_44012843 maps → SAP 79 (lodged 81).
- SAP-Schema-17.0: structurally identical to the full-SAP 17.1 schema (measured
sap_opening_types), so it parses with the 17.1 dataclass and reuses
from_sap_schema_17_1 with no normalisation. uprn_10023444324 → 80, uprn_
10023444320 → 81.
- Regression tests (16.3 dispatch, 17.0 dispatch) + sap_16_3.json / sap_17_0.json
fixtures; 0 new pyright errors. All 7 e2e UPRNs now map.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
SAP-Schema-16.2 (datatypes/epc/domain/mapper.py):
- 16.2 is structurally an RdSAP-17.1 cert under a different name; add
_normalize_sap_schema_16_2 (field renames + defaults) and dispatch to the
tested from_rdsap_schema_17_1 mapper. uprn_100020933699 maps → SAP 71.
- Honour a "Single glazed" windows description when multiple_glazing_type="ND"
(was defaulting to double) → RdSAP-21 code 5; eng 72→71 (lodged 70).
- 4 regression tests + sap_16_2.json fixture; 0 new pyright errors.
Flat party-wall fix (domain/sap10_calculator/worksheet/heat_transmission.py):
- Full-SAP flats carry flatness in dwelling_type, not property_type, so the
party-wall default fell through to the 0.25 house value instead of the RdSAP
Table-15 flat 0.0. Add _is_flat_or_maisonette_dwelling fallback + regression
test. uprn_10093116529 80→81 (matches the cert's lodged party u_value 0).
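
A minimal sketch of the fallback, assuming both fields are free-text labels. The constants are the two defaults quoted above; the function names and the substring test are hypothetical simplifications of `_is_flat_or_maisonette_dwelling`.

```python
from typing import Optional

HOUSE_PARTY_WALL_U = 0.25  # house default quoted above
FLAT_PARTY_WALL_U = 0.0    # RdSAP Table-15 flat value quoted above


def is_flat_or_maisonette(text: Optional[str]) -> bool:
    lowered = (text or "").lower()
    return "flat" in lowered or "maisonette" in lowered


def default_party_wall_u(property_type: Optional[str], dwelling_type: Optional[str]) -> float:
    # Full-SAP certs carry flatness in dwelling_type, so check both fields.
    if is_flat_or_maisonette(property_type) or is_flat_or_maisonette(dwelling_type):
        return FLAT_PARTY_WALL_U
    return HOUSE_PARTY_WALL_U
```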
Accuracy corpus pins (tests/domain/sap10_calculator/test_real_cert_sap_accuracy.py):
- uprn_10093116543 (SAP-17.1 gas-combi semi): engine 81 (Elmhurst 77; documented
full-SAP→RdSAP residual — measured wall/floor U + PCDB boiler vs RdSAP defaults).
- uprn_10093116529 (SAP-17.1 g/f flat): engine 81 (Elmhurst 78).
devcontainer: add poppler-utils (pdfinfo) for the documents-parser PDF fixtures.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Routes run_modelling through prop.effective_epc and dumps each target's
property_overrides before the run, so a landlord wall override moves the
calculated SAP. Records the overlay design in ADR-0032.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>