Extends `SapRoomInRoof` with six optional fields capturing the RdSAP10
§3.9.2 Simplified Type 2 lodgement: common_wall_length_m / height_m
plus two gable length/height pairs.
Type 2 fires when `common_wall_height_m` is set and < 1.8 m (otherwise
the space is a separate storey). Geometry per spec page 23:
A_common_wall = L × (0.25 + H)
A_gable = L × (0.25 + H_gable)
− Σ ((H_gable − H_common_wall_i)² / 2)
A_RR_final = A_RR − Σ A_common_wall − Σ A_gable
(− party / sheltered / connected when lodged, future
slice when a fixture exercises them)
Common walls and gables route to walls_w_per_k at U_main_wall (per spec:
"Common wall U-value is inferred from the U-value of the main wall in
the building part below"). A_RR_final routes to roof_w_per_k at
u_rr_default_all_elements (Table 18 col 4).
Synthetic test: 1-storey cavity-uninsulated dwelling at age B + RR
(floor 10 m², common_wall_length 5 m × 1 m height). Pins
walls_w_per_k = 60 × 1.5 + 6.25 × 1.5 = 99.375 W/K and
roof_w_per_k = 30 × 0.40 + 26.025 × 2.30 = 71.857 W/K at abs=0.001.
No production fixture exercises Type 2 yet — synthetic test is the
unit-level guard until a Type 2 cert lands in the corpus.
Reference: RdSAP 10 (10-06-2025) §3.9.2 page 22-23.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Surfaces four cert lodgements that the §2 ventilation cascade was
missing on the cert→inputs path. Without them, `cert_to_inputs` was
defaulting:
- extract_fans_count → 0 (PDF: 1-2 fans per fixture)
- percent_draughtproofed → 0 (PDF: 75-100% per fixture)
- sheltered_sides → 2 (PDF: 1-3 per fixture — hardcoded TODO)
- has_suspended_timber_floor → False (PDF: True on 000477/000487)
Net effect on (25)m monthly effective ACH ranged from -19% (000477)
to +5% (000490) → propagated 1:1 through HLC × ΔT → useful space heat
→ main + secondary fuel kWh → cost / SAP integer.
Schema:
- `SapVentilation` gains 4 new optional fields: `sheltered_sides`,
`has_suspended_timber_floor`, `suspended_timber_floor_sealed`,
`has_draught_lobby`. RdSAP cert lodges these but the type didn't
surface them.
- `cert_to_inputs.cert_to_inputs` reads them when set; falls back to
the SAP10.2 §2 worst-case defaults (sheltered=2, no timber floor,
no draught lobby) when the cert hasn't lodged. Removes the long-
standing `sheltered_sides=2` hardcode + 4 TODOs.
- `make_minimal_sap10_epc` accepts a `sap_ventilation` kwarg.
Per-fixture build_epc() updates lodge the U985 PDF values verbatim.
E2E pin: new parametrized test
`test_elmhurst_cert_to_inputs_monthly_infiltration_ach_matches_u985_
worksheet` asserts `inputs.monthly_infiltration_ach[m] == LINE_25_
EFFECTIVE_ACH[m]` at abs=1e-3 across all 6 fixtures + 12 months
(72 assertions). All pass.
Useful space heating drift:
000474: useful 10821.69 → 10765.85 (Δ -55.8 kWh vs PDF 10612.86 → +1.4% over, was +2.0%)
000490: useful 11262.05 → 11184.06 (Δ -78.0 kWh vs PDF 11183.28 → +0.007% — essentially exact)
SAP integer status:
000474: 62 = PDF 62 (delta 0) ✓
000490: 58 vs PDF 57 (delta 1; continuous 57.77 vs 57.40)
— remaining residual is pumps_fans hardcoded at 130 kWh
vs PDF 160 (Table 4f cascade not yet implemented → -£4 cost
+ 0.3 continuous SAP). Next slice.
Tightens `result.secondary_heating_fuel_kwh_per_yr` pin abs=10 → abs=0.1
(was loose to absorb the +0.7% useful overshoot which has now closed).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
SapFloorDimension gains an is_exposed_floor flag (default False) signalling
that the floor sits over outside air or unheated space rather than soil —
typical for an extension that hangs off the main from the first storey
upward (Elmhurst 000490 Extension 1 is exactly this shape).
heat_transmission_from_cert now consults the flag on the part's ground
SapFloorDimension and dispatches to u_exposed_floor (Table 20) instead
of the BS EN ISO 13370 / Table 19 cascade. Basement floor still wins
priority (Table 23 § 5.17 overrides everything else for that part).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Lands the production code that the just-committed Elmhurst conformance
fixtures (6455d48b) exercise: the SAP10.3 calculator orchestrator
(domain.sap.calculator.Sap10Calculator), the RdSAP-driven cert→inputs
mapper (domain.sap.rdsap.cert_to_inputs), and the EpcPropertyData
strict-type pass that P6.1 starts.
calculator.py is the entry point. Two surfaces depending on the caller's
shape:
- Sap10Calculator().calculate(epc) — full RdSAP mapper + worksheet loop
- calculate_sap_from_inputs(inputs) — pure physics over typed inputs
P6.1 introduces BuildingPartIdentifier as a strictly-typed replacement
for bare-string matching on SapBuildingPart.identifier (motivated by
the pain point at worksheet/dimensions.py:74-82). Two boundary factories
canonicalise raw inputs: from_api_string for the gov-EPC API, and
extension(n) for site-notes / construction id flows.
Also catches up two transitive deps that 6455d48b implicitly required
but I missed:
- ml/rdsap_uvalues.py — party-wall U-value rows that heat_transmission
resolves; the U=0.0 branch the 000516 fixture exercises lands here.
- ml/tests/_fixtures.py — make_minimal_sap10_epc that every Elmhurst
fixture imports. Without this catch-up, checking out 6455d48b in
isolation would ImportError.
Out of scope (will commit separately): ml/transform.py legacy envelope
drift; backend/ FastAPI + documents_parser layer; etl/ scratch.
824 tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
15 new features wired through schema -> domain -> mapper -> transform:
Main Dwelling fabric (11):
- wall_insulation_type, wall_insulation_thickness_mm, wall_dry_lined,
wall_thickness_mm, party_wall_construction
- roof_insulation_location, roof_insulation_thickness_mm
- floor_construction, floor_insulation, floor_insulation_thickness_mm,
floor_heat_loss
Dwelling-level scalars (4):
- multiple_glazed_proportion, number_baths, number_baths_wwhrs,
extract_fans_count
Thickness strings like '50mm'/'NI'/'ND' parsed via _parse_thickness_mm; NI
(no insulation) lands as 0mm so the model sees the physical zero rather than
a missing value. Categorical sentinels ('NA'/'NI'/'ND') become None.
Also fixed long-standing typo `multiple_glazed_propertion` -> `_proportion`
in domain dataclass + its lone DB-model usage.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two production fixes surfaced by the live run:
- mapper.from_rdsap_schema_21_0_1 now sets the three ML target scalars
(energy_rating_current, co2_emissions_current, energy_consumption_current).
They were silently None for every cert before, leaving the only labels as
the kWh fields from renewable_heat_incentive.
- train_baseline coerces object-dtype columns to numeric (None -> NaN) and
drops rows with null target per fit, so LightGBM accepts the frame.
E2E on 500 real certs (~1s):
sap_score R^2=0.604 MAPE=0.084
co2_emissions R^2=0.813 MAPE=0.130
peui_raw R^2=0.979 MAPE=0.026
space_heating_kwh R^2=0.823 MAPE=0.213
hot_water_kwh R^2=0.519 MAPE=0.115
peui_ucl excluded: UCL correction still needs wiring.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Currently fails on SapWindow.glazing_gap (first of ~30 fields the dataclass
incorrectly treats as required). Will go GREEN once 14j sweeps Optional.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
SAP10 EPCs with measured PV carry photovoltaic_supply as a nested
list of arrays (peak_power, pitch, orientation, overshading) rather
than the legacy unmeasured wrapper {none_or_no_details:
{percent_roof_area: N}}. The schema-21 dataclasses now accept both
shapes via Union[PhotovoltaicSupply, List[List[PhotovoltaicArray]]],
and from_dict._coerce now dispatches list values onto list type
variants of multi-type Unions.
EpcPropertyData.SapEnergySource gains
photovoltaic_arrays: Optional[List[PhotovoltaicArray]] — populated
when the measured shape is present, otherwise None. The legacy
photovoltaic_supply field is preserved for the fallback case.
Both schema-21.0.0 and 21.0.1 mappers dispatch via the new
_map_schema_21_pv helper.
Unblocks Slice 11 (PV feature aggregation in EpcMlTransform).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Thirteen window-aggregate features land on the transform: count,
total area, eight SAP-octant area columns (N/NE/E/SE/S/SW/W/NW),
area-weighted draught-proofing pct, and area-weighted u_value +
solar transmittance (nullable, populated only when windows carry
transmission_details). Windows with orientation outside 1-8 (0,
NR) contribute to count and total area but no octant.
Also: epc codes CSV (gov api /api/codes export, RdSAP-Schema-21.x +
older versions) moved next to EpcPropertyData as epc_codes.csv —
canonical SAP enum source for upcoming categorical-share slices.
.gitignore exception added so the reference CSV is tracked.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>