mirror of
https://github.com/Hestia-Homes/Model.git
synced 2026-06-30 13:10:47 +00:00
Merge pull request #1268 from Hestia-Homes/feature/hyde_make_it_more_accurate_with_tests
Feature/hyde make it more accurate with tests
This commit is contained in:
commit
4e3eb52a37
7 changed files with 252 additions and 86 deletions
|
|
@ -163,6 +163,47 @@ epc.json + elmhurst_summary.pdf + elmhurst_worksheet.pdf under
|
||||||
After all: re-run the modelling_e2e failure sweep to confirm the cohort skips drop
|
After all: re-run the modelling_e2e failure sweep to confirm the cohort skips drop
|
||||||
to zero (or only P6-class), and that no new gate regression slipped in.
|
to zero (or only P6-class), and that no new gate regression slipped in.
|
||||||
|
|
||||||
|
### ✅ OUTCOMES (2026-06-23, "do all") — 553 tests pass, gate 26/26, 0 new pyright
|
||||||
|
- **P1 ✅ FIXED** — `_normalize_sap_schema_16_x` derives `built_form` from the
|
||||||
|
dwelling_type text (`_derive_built_form_16x`: Mid-terrace→4 / End-terrace→3 /
|
||||||
|
Semi-detached→2 / Detached→1; flats → modal 4). built_form is ML-only (SAP calc
|
||||||
|
never reads it) so SAP-neutral; gate verified insensitive (26/26). 5 SAP-16.0
|
||||||
|
flat certs now map (8742→SAP 66=lodged, etc.). +2 tests.
|
||||||
|
- **P2 ✅ FIXED** — `photovoltaic_supply` as a measured-array LIST crashed
|
||||||
|
`from_rdsap_schema_{17_0,17_1,18_0,19_0,20_0_0}` (`'list' has no attribute
|
||||||
|
none_or_no_details`). All 5 now route through `_map_schema_21_pv`, whose list
|
||||||
|
branch is made dict-tolerant (`_pv_array_field` reads dict OR dataclass) — they
|
||||||
|
now CAPTURE the PV arrays like 21.0.x instead of crashing/dropping. cert
|
||||||
|
6102-…-2292 (uprn 22086693, RdSAP-20.0.0) maps with 2 arrays @1.14kW. +test.
|
||||||
|
- **P3 ✅ generalised / cert fail-loud** — `_normalize_sap_schema_16_x` now handles
|
||||||
|
`windows` as a dict (not just list). cert 9768 (uprn 100062614005, 16.1) still
|
||||||
|
fails loud — it ALSO omits door_count/habitable_room_count/glazed_area
|
||||||
|
(genuinely sparse "High performance glazing" cert) → CORRECT to fail; resilience
|
||||||
|
skips+reports it.
|
||||||
|
- **P6 ✅ same as P3** — cert 8257 (uprn 10070009758, 16.2) — identical sparse
|
||||||
|
"High performance glazing" pattern, fails loud on door_count. Correct.
|
||||||
|
- **P4 ⚖️ FOR KHALIM (gate decision)** — left `multiple_glazed_proportion` fail-loud.
|
||||||
|
Re-confirmed: defaulting it (ANY value) still regresses the gate (29/36 +
|
||||||
|
25/36) because it makes a frozen-fixture cert a prediction donor — the field is
|
||||||
|
ML-only (not carried to EpcPropertyData), so the regression is donor-pool
|
||||||
|
composition, NOT the value. Closing this gap needs a GATE RE-BASELINE decision
|
||||||
|
(whether the donor-pool tip is benign per the #1245 precedent) — your call; I did
|
||||||
|
not unilaterally loosen the tighten-only gate.
|
||||||
|
- **P5 🔍 FOR KHALIM (large recursive alignment)** — RdSAP-21.0.0 lacks the
|
||||||
|
ADR-0028 Optional widenings 21.0.1 has, and the gap is RECURSIVE (not just the
|
||||||
|
~13 top-level + SapWindow): `SapBuildingPart` (13 fields incl.
|
||||||
|
construction_age_band/wall_construction/identifier — CORE FABRIC),
|
||||||
|
`SapEnergySource` (pv_battery_count, wind_turbine_details), MainHeatingDetail,
|
||||||
|
ShowerOutlets, PvBatteries, Addendum, + top-level low_energy_fixed_lighting_
|
||||||
|
bulbs_count/multiple_glazed_proportion/suggested_improvements. Aligning to
|
||||||
|
21.0.1's "everything Optional, tolerate sparse" means the 21.0.0 mapper must
|
||||||
|
None-coalesce CORE fabric — a modelling decision in your domain. Partial widening
|
||||||
|
was REVERTED (half-aligned schema is worse). cert 3135 (uprn 100021919725) stays
|
||||||
|
fail-loud → resilience skips+reports. Recommend: align RdSapSchema21_0_0 fully to
|
||||||
|
21.0.1 (recursive) + review mapper coalescing, in a dedicated change.
|
||||||
|
- **Corpus-validation pending** for the 2 now-mappable certs (P1 uprn 10070004512,
|
||||||
|
P2 uprn 22086693) — Elmhurst build + summary/worksheet + commit.
|
||||||
|
|
||||||
- [x] 🔧 10096028301 — SAP-Schema-19.1.0 (full-SAP g/f FLAT, band M, combi PCDB 17929, MEV, AP50 3.5) · eng 82 / elm 82 (lodged 85) · PINNED engine 82. 🔧 mapper added: `from_sap_schema_19_1_0`. Built in Elmhurst (boiler 17929 via search, control CBE/2106, water from primary, MEV on, AP50 Blower Door 3.5+cert). Engine EXACTLY matches Elmhurst worksheet (82.11 vs 82); engine-on-Elmhurst-inputs 82.16 ≈ 82 → calculator faithful. −3 vs lodged = measured-U-vs-RdSAP-default + MEV extract-not-recovery (documented). No mapper change beyond coverage.
|
- [x] 🔧 10096028301 — SAP-Schema-19.1.0 (full-SAP g/f FLAT, band M, combi PCDB 17929, MEV, AP50 3.5) · eng 82 / elm 82 (lodged 85) · PINNED engine 82. 🔧 mapper added: `from_sap_schema_19_1_0`. Built in Elmhurst (boiler 17929 via search, control CBE/2106, water from primary, MEV on, AP50 Blower Door 3.5+cert). Engine EXACTLY matches Elmhurst worksheet (82.11 vs 82); engine-on-Elmhurst-inputs 82.16 ≈ 82 → calculator faithful. −3 vs lodged = measured-U-vs-RdSAP-default + MEV extract-not-recovery (documented). No mapper change beyond coverage.
|
||||||
- [x] 🔧 100021943298 — SAP-Schema-16.1 (g/f FLAT, band B, solid-brick internal, combi PCDB 10328) · eng 76 / elm 75 (lodged 72) · PINNED engine 76. 🔧 mapper added: `from_sap_schema_16_1`. Built in Elmhurst (boiler 10328 via search, control CBE/2106, water from primary, wall insulation thickness Unknown); worksheet 75 → engine within ~1 (tightest agreement, reduced-field). Boiler-select + water-heating + control dialogs all driven via automation (two-step row→Select / cascade + coordinate-OK). No mapper change beyond coverage.
|
- [x] 🔧 100021943298 — SAP-Schema-16.1 (g/f FLAT, band B, solid-brick internal, combi PCDB 10328) · eng 76 / elm 75 (lodged 72) · PINNED engine 76. 🔧 mapper added: `from_sap_schema_16_1`. Built in Elmhurst (boiler 10328 via search, control CBE/2106, water from primary, wall insulation thickness Unknown); worksheet 75 → engine within ~1 (tightest agreement, reduced-field). Boiler-select + water-heating + control dialogs all driven via automation (two-step row→Select / cascade + coordinate-OK). No mapper change beyond coverage.
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -189,6 +189,15 @@ def _sap_opening_area_m2(width: Any, height: Any) -> float:
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _pv_array_field(array: Any, name: str) -> Any:
|
||||||
|
"""Read a PV-array field from either a `PhotovoltaicArray` dataclass (21.0.x,
|
||||||
|
where the schema Union parsed the nested list) or a raw dict (older schemas
|
||||||
|
that leave the list unparsed)."""
|
||||||
|
if isinstance(array, dict):
|
||||||
|
return cast(Dict[str, Any], array).get(name)
|
||||||
|
return getattr(array, name)
|
||||||
|
|
||||||
|
|
||||||
def _map_schema_21_pv(
|
def _map_schema_21_pv(
|
||||||
es_pv_supply: Any,
|
es_pv_supply: Any,
|
||||||
) -> tuple[Optional[PhotovoltaicSupply], Optional[List[PhotovoltaicArray]]]:
|
) -> tuple[Optional[PhotovoltaicSupply], Optional[List[PhotovoltaicArray]]]:
|
||||||
|
|
@ -206,12 +215,18 @@ def _map_schema_21_pv(
|
||||||
None. With no PV data at all, both are None.
|
None. With no PV data at all, both are None.
|
||||||
"""
|
"""
|
||||||
if isinstance(es_pv_supply, list):
|
if isinstance(es_pv_supply, list):
|
||||||
|
# The nested list's leaves are `PhotovoltaicArray` dataclasses when the
|
||||||
|
# schema's Union type parsed them (21.0.x), or raw dicts when an older
|
||||||
|
# schema (17.0–20.0.0) types the field as the wrapper only and leaves the
|
||||||
|
# list unparsed — `_pv_array_field` reads either shape.
|
||||||
flattened = [
|
flattened = [
|
||||||
PhotovoltaicArray(
|
PhotovoltaicArray(
|
||||||
peak_power=_measurement_value(array.peak_power),
|
peak_power=_measurement_value(_pv_array_field(array, "peak_power")),
|
||||||
pitch=int(_measurement_value(array.pitch)),
|
pitch=int(_measurement_value(_pv_array_field(array, "pitch"))),
|
||||||
orientation=_pv_orientation(array.orientation),
|
orientation=_pv_orientation(_pv_array_field(array, "orientation")),
|
||||||
overshading=int(_measurement_value(array.overshading)),
|
overshading=int(
|
||||||
|
_measurement_value(_pv_array_field(array, "overshading"))
|
||||||
|
),
|
||||||
)
|
)
|
||||||
for inner_list in es_pv_supply
|
for inner_list in es_pv_supply
|
||||||
for array in inner_list
|
for array in inner_list
|
||||||
|
|
@ -538,6 +553,7 @@ class EpcPropertyDataMapper:
|
||||||
@staticmethod
|
@staticmethod
|
||||||
def from_rdsap_schema_17_0(schema: RdSapSchema17_0) -> EpcPropertyData:
|
def from_rdsap_schema_17_0(schema: RdSapSchema17_0) -> EpcPropertyData:
|
||||||
es = schema.sap_energy_source
|
es = schema.sap_energy_source
|
||||||
|
pv_supply, pv_arrays = _map_schema_21_pv(es.photovoltaic_supply)
|
||||||
return EpcPropertyData(
|
return EpcPropertyData(
|
||||||
uprn=schema.uprn,
|
uprn=schema.uprn,
|
||||||
assessment_type=schema.assessment_type,
|
assessment_type=schema.assessment_type,
|
||||||
|
|
@ -656,17 +672,8 @@ class EpcPropertyDataMapper:
|
||||||
is_dwelling_export_capable=False,
|
is_dwelling_export_capable=False,
|
||||||
wind_turbines_terrain_type=str(es.wind_turbines_terrain_type),
|
wind_turbines_terrain_type=str(es.wind_turbines_terrain_type),
|
||||||
electricity_smart_meter_present=False,
|
electricity_smart_meter_present=False,
|
||||||
photovoltaic_supply=(
|
photovoltaic_supply=pv_supply,
|
||||||
PhotovoltaicSupply(
|
photovoltaic_arrays=pv_arrays,
|
||||||
none_or_no_details=PhotovoltaicSupplyNoneOrNoDetails(
|
|
||||||
percent_roof_area=es.photovoltaic_supply.none_or_no_details.percent_roof_area,
|
|
||||||
)
|
|
||||||
)
|
|
||||||
# ADR-0028: photovoltaic_supply can be absent, None, or a
|
|
||||||
# sparse shape without none_or_no_details — guard the read.
|
|
||||||
if getattr(es.photovoltaic_supply, "none_or_no_details", None)
|
|
||||||
else None
|
|
||||||
),
|
|
||||||
),
|
),
|
||||||
sap_building_parts=[
|
sap_building_parts=[
|
||||||
SapBuildingPart(
|
SapBuildingPart(
|
||||||
|
|
@ -954,6 +961,7 @@ class EpcPropertyDataMapper:
|
||||||
@staticmethod
|
@staticmethod
|
||||||
def from_rdsap_schema_17_1(schema: RdSapSchema17_1) -> EpcPropertyData:
|
def from_rdsap_schema_17_1(schema: RdSapSchema17_1) -> EpcPropertyData:
|
||||||
es = schema.sap_energy_source
|
es = schema.sap_energy_source
|
||||||
|
pv_supply, pv_arrays = _map_schema_21_pv(es.photovoltaic_supply)
|
||||||
# ADR-0028: instantaneous_wwhrs holds bath/shower ROOM counts.
|
# ADR-0028: instantaneous_wwhrs holds bath/shower ROOM counts.
|
||||||
iw = schema.sap_heating.instantaneous_wwhrs
|
iw = schema.sap_heating.instantaneous_wwhrs
|
||||||
return EpcPropertyData(
|
return EpcPropertyData(
|
||||||
|
|
@ -1097,15 +1105,8 @@ class EpcPropertyDataMapper:
|
||||||
is_dwelling_export_capable=False,
|
is_dwelling_export_capable=False,
|
||||||
wind_turbines_terrain_type=str(es.wind_turbines_terrain_type),
|
wind_turbines_terrain_type=str(es.wind_turbines_terrain_type),
|
||||||
electricity_smart_meter_present=False,
|
electricity_smart_meter_present=False,
|
||||||
photovoltaic_supply=(
|
photovoltaic_supply=pv_supply,
|
||||||
PhotovoltaicSupply(
|
photovoltaic_arrays=pv_arrays,
|
||||||
none_or_no_details=PhotovoltaicSupplyNoneOrNoDetails(
|
|
||||||
percent_roof_area=es.photovoltaic_supply.none_or_no_details.percent_roof_area,
|
|
||||||
)
|
|
||||||
)
|
|
||||||
if getattr(es.photovoltaic_supply, "none_or_no_details", None)
|
|
||||||
else None
|
|
||||||
),
|
|
||||||
),
|
),
|
||||||
sap_building_parts=[
|
sap_building_parts=[
|
||||||
SapBuildingPart(
|
SapBuildingPart(
|
||||||
|
|
@ -1145,6 +1146,7 @@ class EpcPropertyDataMapper:
|
||||||
@staticmethod
|
@staticmethod
|
||||||
def from_rdsap_schema_18_0(schema: RdSapSchema18_0) -> EpcPropertyData:
|
def from_rdsap_schema_18_0(schema: RdSapSchema18_0) -> EpcPropertyData:
|
||||||
es = schema.sap_energy_source
|
es = schema.sap_energy_source
|
||||||
|
pv_supply, pv_arrays = _map_schema_21_pv(es.photovoltaic_supply)
|
||||||
# ADR-0028: instantaneous_wwhrs holds bath/shower ROOM counts.
|
# ADR-0028: instantaneous_wwhrs holds bath/shower ROOM counts.
|
||||||
iw = schema.sap_heating.instantaneous_wwhrs
|
iw = schema.sap_heating.instantaneous_wwhrs
|
||||||
return EpcPropertyData(
|
return EpcPropertyData(
|
||||||
|
|
@ -1295,16 +1297,8 @@ class EpcPropertyDataMapper:
|
||||||
is_dwelling_export_capable=False,
|
is_dwelling_export_capable=False,
|
||||||
wind_turbines_terrain_type=str(es.wind_turbines_terrain_type),
|
wind_turbines_terrain_type=str(es.wind_turbines_terrain_type),
|
||||||
electricity_smart_meter_present=False,
|
electricity_smart_meter_present=False,
|
||||||
photovoltaic_supply=(
|
photovoltaic_supply=pv_supply,
|
||||||
PhotovoltaicSupply(
|
photovoltaic_arrays=pv_arrays,
|
||||||
none_or_no_details=PhotovoltaicSupplyNoneOrNoDetails(
|
|
||||||
percent_roof_area=es.photovoltaic_supply.none_or_no_details.percent_roof_area,
|
|
||||||
)
|
|
||||||
)
|
|
||||||
if es.photovoltaic_supply
|
|
||||||
and es.photovoltaic_supply.none_or_no_details
|
|
||||||
else None
|
|
||||||
),
|
|
||||||
),
|
),
|
||||||
sap_building_parts=[
|
sap_building_parts=[
|
||||||
SapBuildingPart(
|
SapBuildingPart(
|
||||||
|
|
@ -1375,6 +1369,7 @@ class EpcPropertyDataMapper:
|
||||||
@staticmethod
|
@staticmethod
|
||||||
def from_rdsap_schema_19_0(schema: RdSapSchema19_0) -> EpcPropertyData:
|
def from_rdsap_schema_19_0(schema: RdSapSchema19_0) -> EpcPropertyData:
|
||||||
es = schema.sap_energy_source
|
es = schema.sap_energy_source
|
||||||
|
pv_supply, pv_arrays = _map_schema_21_pv(es.photovoltaic_supply)
|
||||||
return EpcPropertyData(
|
return EpcPropertyData(
|
||||||
uprn=schema.uprn,
|
uprn=schema.uprn,
|
||||||
assessment_type=schema.assessment_type,
|
assessment_type=schema.assessment_type,
|
||||||
|
|
@ -1496,17 +1491,8 @@ class EpcPropertyDataMapper:
|
||||||
is_dwelling_export_capable=False,
|
is_dwelling_export_capable=False,
|
||||||
wind_turbines_terrain_type=str(es.wind_turbines_terrain_type),
|
wind_turbines_terrain_type=str(es.wind_turbines_terrain_type),
|
||||||
electricity_smart_meter_present=False,
|
electricity_smart_meter_present=False,
|
||||||
photovoltaic_supply=(
|
photovoltaic_supply=pv_supply,
|
||||||
PhotovoltaicSupply(
|
photovoltaic_arrays=pv_arrays,
|
||||||
none_or_no_details=PhotovoltaicSupplyNoneOrNoDetails(
|
|
||||||
percent_roof_area=es.photovoltaic_supply.none_or_no_details.percent_roof_area,
|
|
||||||
)
|
|
||||||
)
|
|
||||||
# ADR-0028: photovoltaic_supply can be absent, None, or a
|
|
||||||
# sparse shape without none_or_no_details — guard the read.
|
|
||||||
if getattr(es.photovoltaic_supply, "none_or_no_details", None)
|
|
||||||
else None
|
|
||||||
),
|
|
||||||
),
|
),
|
||||||
sap_building_parts=[
|
sap_building_parts=[
|
||||||
SapBuildingPart(
|
SapBuildingPart(
|
||||||
|
|
@ -1581,6 +1567,12 @@ class EpcPropertyDataMapper:
|
||||||
@staticmethod
|
@staticmethod
|
||||||
def from_rdsap_schema_20_0_0(schema: RdSapSchema20_0_0) -> EpcPropertyData:
|
def from_rdsap_schema_20_0_0(schema: RdSapSchema20_0_0) -> EpcPropertyData:
|
||||||
es = schema.sap_energy_source
|
es = schema.sap_energy_source
|
||||||
|
# `photovoltaic_supply` is polymorphic in 20.0.0 too (None / wrapper dict /
|
||||||
|
# measured-array LIST) — route it through the shared dispatch like 21.0.0 so a
|
||||||
|
# cert lodging measured arrays as a list (e.g. cert 6102-6227-8000-0083-2292,
|
||||||
|
# uprn 22086693) maps instead of raising "'list' object has no attribute
|
||||||
|
# 'none_or_no_details'" and sinking the whole prediction cohort.
|
||||||
|
pv_supply, pv_arrays = _map_schema_21_pv(es.photovoltaic_supply)
|
||||||
# ADR-0027: instantaneous_wwhrs holds bath/shower ROOM counts.
|
# ADR-0027: instantaneous_wwhrs holds bath/shower ROOM counts.
|
||||||
iw = schema.sap_heating.instantaneous_wwhrs
|
iw = schema.sap_heating.instantaneous_wwhrs
|
||||||
return EpcPropertyData(
|
return EpcPropertyData(
|
||||||
|
|
@ -1737,16 +1729,8 @@ class EpcPropertyDataMapper:
|
||||||
is_dwelling_export_capable=False,
|
is_dwelling_export_capable=False,
|
||||||
wind_turbines_terrain_type=str(es.wind_turbines_terrain_type),
|
wind_turbines_terrain_type=str(es.wind_turbines_terrain_type),
|
||||||
electricity_smart_meter_present=False,
|
electricity_smart_meter_present=False,
|
||||||
photovoltaic_supply=(
|
photovoltaic_supply=pv_supply,
|
||||||
PhotovoltaicSupply(
|
photovoltaic_arrays=pv_arrays,
|
||||||
none_or_no_details=PhotovoltaicSupplyNoneOrNoDetails(
|
|
||||||
percent_roof_area=es.photovoltaic_supply.none_or_no_details.percent_roof_area,
|
|
||||||
)
|
|
||||||
)
|
|
||||||
if es.photovoltaic_supply
|
|
||||||
and es.photovoltaic_supply.none_or_no_details
|
|
||||||
else None
|
|
||||||
),
|
|
||||||
),
|
),
|
||||||
sap_building_parts=[
|
sap_building_parts=[
|
||||||
SapBuildingPart(
|
SapBuildingPart(
|
||||||
|
|
@ -3258,6 +3242,34 @@ def _default_missing_post_town(data: Dict[str, Any]) -> Dict[str, Any]:
|
||||||
return {**data, "post_town": ""}
|
return {**data, "post_town": ""}
|
||||||
|
|
||||||
|
|
||||||
|
def _derive_built_form_16x(dwelling_type: Any) -> int:
|
||||||
|
"""Derive the RdSAP `built_form` code from a cert's `dwelling_type` text when
|
||||||
|
the numeric field is omitted (some 16.0 certs). The form is stated in the
|
||||||
|
description for houses/bungalows ("Mid-terrace house" → 4, "End-terrace" → 3,
|
||||||
|
"Semi-detached" → 2, "Detached" → 1). Flats ("Ground-floor flat") do not encode
|
||||||
|
a form, so fall back to the modal 4 (Mid-terrace) — `built_form` is an ML-only
|
||||||
|
feature the SAP calculator never reads, so the fallback is SAP-neutral.
|
||||||
|
|
||||||
|
`dwelling_type` arrives as a plain string or a `{'value': ...}` dict (real-API
|
||||||
|
shape); both are handled."""
|
||||||
|
if isinstance(dwelling_type, dict):
|
||||||
|
dwelling_type = cast(Dict[str, Any], dwelling_type).get("value")
|
||||||
|
text = str(dwelling_type or "").lower()
|
||||||
|
if "semi-detached" in text:
|
||||||
|
return 2
|
||||||
|
if "detached" in text:
|
||||||
|
return 1
|
||||||
|
if "enclosed end" in text:
|
||||||
|
return 5
|
||||||
|
if "enclosed mid" in text:
|
||||||
|
return 6
|
||||||
|
if "end-terrace" in text:
|
||||||
|
return 3
|
||||||
|
if "mid-terrace" in text or "terrace" in text:
|
||||||
|
return 4
|
||||||
|
return 4 # flats / unstated form → modal built_form (SAP- and gate-neutral)
|
||||||
|
|
||||||
|
|
||||||
def _normalize_sap_schema_16_x(data: Dict[str, Any]) -> Dict[str, Any]:
|
def _normalize_sap_schema_16_x(data: Dict[str, Any]) -> Dict[str, Any]:
|
||||||
"""Rewrite a `SAP-Schema-16.2`/`16.3` API doc onto the `RdSAP-Schema-17.1`
|
"""Rewrite a `SAP-Schema-16.2`/`16.3` API doc onto the `RdSAP-Schema-17.1`
|
||||||
shape so it can reuse the tested `from_rdsap_schema_17_1` mapper.
|
shape so it can reuse the tested `from_rdsap_schema_17_1` mapper.
|
||||||
|
|
@ -3294,6 +3306,13 @@ def _normalize_sap_schema_16_x(data: Dict[str, Any]) -> Dict[str, Any]:
|
||||||
if isinstance(windows, list) and windows:
|
if isinstance(windows, list) and windows:
|
||||||
window_list: List[Any] = cast(List[Any], windows)
|
window_list: List[Any] = cast(List[Any], windows)
|
||||||
d.setdefault("window", window_list[0])
|
d.setdefault("window", window_list[0])
|
||||||
|
elif isinstance(windows, dict):
|
||||||
|
# Some 16.x certs (e.g. "High performance glazing" certs 9768-…-7974 /
|
||||||
|
# 8257-…-4992) lodge `windows` as a single dict rather than a list — take
|
||||||
|
# it directly. (These particular certs still fail loud below: they also omit
|
||||||
|
# door_count / habitable_room_count / glazed_area — genuinely insufficient
|
||||||
|
# data — but a windows-as-dict cert that DOES carry the rest now maps.)
|
||||||
|
d.setdefault("window", windows)
|
||||||
d.setdefault("schema_version_original", d.get("schema_version", ""))
|
d.setdefault("schema_version_original", d.get("schema_version", ""))
|
||||||
# Some 16.x certs (notably 16.0) omit `tenure` — RdSapSchema17_1 requires it.
|
# Some 16.x certs (notably 16.0) omit `tenure` — RdSapSchema17_1 requires it.
|
||||||
# It is address/occupancy metadata the SAP cascade never reads, so default a
|
# It is address/occupancy metadata the SAP cascade never reads, so default a
|
||||||
|
|
@ -3313,6 +3332,18 @@ def _normalize_sap_schema_16_x(data: Dict[str, Any]) -> Dict[str, Any]:
|
||||||
# still fail loud — that is correct, the fabric data is insufficient.
|
# still fail loud — that is correct, the fabric data is insufficient.
|
||||||
d.setdefault("insulated_door_count", 0)
|
d.setdefault("insulated_door_count", 0)
|
||||||
|
|
||||||
|
# Some 16.x certs (notably 16.0, e.g. cert 8742-6624-9300-2780-4926) lodge
|
||||||
|
# `property_type`/`dwelling_type` but omit `built_form` — RdSapSchema17_1
|
||||||
|
# requires it. `built_form` is an ML-only feature (the SAP-10 calculator never
|
||||||
|
# reads it — only `sap10_ml/transform`), so it is SAP-neutral; and a default is
|
||||||
|
# gate-neutral here (verified — the component-accuracy fixture has no
|
||||||
|
# built-form-missing cert, so unlike `multiple_glazed_proportion` it does not
|
||||||
|
# tip the donor pool). DERIVE it from the dwelling_type description where the
|
||||||
|
# form is stated (houses), falling back to the modal value for flats (whose
|
||||||
|
# dwelling_type — "Ground-floor flat" — does not encode form).
|
||||||
|
if "built_form" not in d:
|
||||||
|
d["built_form"] = _derive_built_form_16x(d.get("dwelling_type"))
|
||||||
|
|
||||||
# NB: we deliberately do NOT default `multiple_glazed_proportion` here. A 16.x
|
# NB: we deliberately do NOT default `multiple_glazed_proportion` here. A 16.x
|
||||||
# cert that omits it (e.g. 16.3 cert 0418-3986-7250-2884-7970) is left to fail
|
# cert that omits it (e.g. 16.3 cert 0418-3986-7250-2884-7970) is left to fail
|
||||||
# the RdSapSchema17_1 parse and be handled by the cohort skip-and-report path —
|
# the RdSapSchema17_1 parse and be handled by the cohort skip-and-report path —
|
||||||
|
|
|
||||||
|
|
@ -191,6 +191,30 @@ class TestFromRdSapSchema20_0_0:
|
||||||
schema = from_dict(RdSapSchema20_0_0, load("20_0_0.json"))
|
schema = from_dict(RdSapSchema20_0_0, load("20_0_0.json"))
|
||||||
return EpcPropertyDataMapper.from_rdsap_schema_20_0_0(schema)
|
return EpcPropertyDataMapper.from_rdsap_schema_20_0_0(schema)
|
||||||
|
|
||||||
|
def test_photovoltaic_supply_as_dict_list_is_mapped_not_crashed(self) -> None:
|
||||||
|
# 20.0.0 types `photovoltaic_supply` as the wrapper only (not the 21.0.x
|
||||||
|
# Union), so a cert lodging measured arrays as a LIST leaves the leaves as
|
||||||
|
# raw dicts. The mapper previously did `es.photovoltaic_supply.none_or_no_
|
||||||
|
# details` → "'list' object has no attribute 'none_or_no_details'", sinking
|
||||||
|
# the whole prediction cohort. Regression for cert 6102-6227-8000-0083-2292
|
||||||
|
# (uprn 22086693). It now routes through `_map_schema_21_pv`, whose
|
||||||
|
# dict-tolerant array reader captures the arrays.
|
||||||
|
data = load("20_0_0.json")
|
||||||
|
data["sap_energy_source"]["photovoltaic_supply"] = [
|
||||||
|
[{"pitch": 2, "peak_power": {"value": 1.14, "quantity": "kW"},
|
||||||
|
"orientation": 5, "overshading": 1}],
|
||||||
|
[{"pitch": 2, "peak_power": 1.14, "orientation": 5, "overshading": 1}],
|
||||||
|
]
|
||||||
|
schema = from_dict(RdSapSchema20_0_0, data)
|
||||||
|
|
||||||
|
result = EpcPropertyDataMapper.from_rdsap_schema_20_0_0(schema)
|
||||||
|
|
||||||
|
arrays = result.sap_energy_source.photovoltaic_arrays
|
||||||
|
assert arrays is not None and len(arrays) == 2
|
||||||
|
assert arrays[0].peak_power == 1.14
|
||||||
|
assert arrays[0].orientation == 5
|
||||||
|
assert result.sap_energy_source.photovoltaic_supply is None
|
||||||
|
|
||||||
def test_uprn(self, result: EpcPropertyData) -> None:
|
def test_uprn(self, result: EpcPropertyData) -> None:
|
||||||
assert result.uprn == 12457
|
assert result.uprn == 12457
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -523,6 +523,32 @@ class TestFromSapSchema16_2:
|
||||||
assert epc.uprn == 100020933894
|
assert epc.uprn == 100020933894
|
||||||
assert Sap10Calculator().calculate(epc).sap_score == 61 # lodged 56
|
assert Sap10Calculator().calculate(epc).sap_score == 61 # lodged 56
|
||||||
|
|
||||||
|
def test_16_x_missing_built_form_is_derived_from_dwelling_type(self) -> None:
|
||||||
|
# Some 16.0 certs (e.g. cert 8742-6624-9300-2780-4926) lodge
|
||||||
|
# `property_type`/`dwelling_type` but omit `built_form`, which RdSapSchema17_1
|
||||||
|
# requires — previously aborted the prediction cohort. The normaliser derives
|
||||||
|
# it from the dwelling_type text ("Semi-detached house" → 2). built_form is
|
||||||
|
# ML-only (SAP calc never reads it), so this is SAP- and gate-neutral.
|
||||||
|
data = load("sap_16_0.json")
|
||||||
|
del data["built_form"]
|
||||||
|
assert data["dwelling_type"] == "Semi-detached house"
|
||||||
|
|
||||||
|
epc = EpcPropertyDataMapper.from_api_response(data)
|
||||||
|
|
||||||
|
assert isinstance(epc, EpcPropertyData)
|
||||||
|
assert epc.built_form == "2"
|
||||||
|
|
||||||
|
def test_16_x_missing_built_form_for_flat_falls_back_to_modal(self) -> None:
|
||||||
|
# A flat's dwelling_type ("Ground-floor flat") does not encode a form, so the
|
||||||
|
# derivation falls back to the modal 4 (Mid-terrace) — keeps the cert mappable.
|
||||||
|
data = load("sap_16_0.json")
|
||||||
|
del data["built_form"]
|
||||||
|
data["dwelling_type"] = "Ground-floor flat"
|
||||||
|
|
||||||
|
epc = EpcPropertyDataMapper.from_api_response(data)
|
||||||
|
|
||||||
|
assert epc.built_form == "4"
|
||||||
|
|
||||||
def test_16_x_missing_insulated_door_count_defaults_to_zero(self) -> None:
|
def test_16_x_missing_insulated_door_count_defaults_to_zero(self) -> None:
|
||||||
# Some 16.x certs lodge `door_count` but omit `insulated_door_count`,
|
# Some 16.x certs lodge `door_count` but omit `insulated_door_count`,
|
||||||
# which RdSapSchema17_1 requires — previously raised "missing required
|
# which RdSapSchema17_1 requires — previously raised "missing required
|
||||||
|
|
|
||||||
|
|
@ -1,6 +1,6 @@
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
from sqlmodel import Session, col, delete
|
from sqlmodel import Session, col, update
|
||||||
|
|
||||||
from domain.modelling.plan import Plan
|
from domain.modelling.plan import Plan
|
||||||
from infrastructure.postgres.modelling import PlanModel, RecommendationModel
|
from infrastructure.postgres.modelling import PlanModel, RecommendationModel
|
||||||
|
|
@ -10,7 +10,12 @@ from repositories.plan.plan_repository import PlanRepository
|
||||||
class PlanPostgresRepository(PlanRepository):
|
class PlanPostgresRepository(PlanRepository):
|
||||||
"""Maps a Plan and its Plan Measures onto the live ``plan`` /
|
"""Maps a Plan and its Plan Measures onto the live ``plan`` /
|
||||||
``recommendation`` tables (ADR-0017). Does not commit — the Unit of Work
|
``recommendation`` tables (ADR-0017). Does not commit — the Unit of Work
|
||||||
owns the transaction (ADR-0012)."""
|
owns the transaction (ADR-0012).
|
||||||
|
|
||||||
|
A re-run INSERTs a fresh Plan rather than deleting the prior one (the cascade
|
||||||
|
delete was slow); when the new Plan is the default it demotes any prior
|
||||||
|
default Plan for the same (property_id, scenario_id) to ``is_default=False``,
|
||||||
|
so readers can select the current Plan via ``is_default=True``."""
|
||||||
|
|
||||||
def __init__(self, session: Session) -> None:
|
def __init__(self, session: Session) -> None:
|
||||||
self._session = session
|
self._session = session
|
||||||
|
|
@ -24,15 +29,20 @@ class PlanPostgresRepository(PlanRepository):
|
||||||
portfolio_id: int,
|
portfolio_id: int,
|
||||||
is_default: bool,
|
is_default: bool,
|
||||||
) -> int:
|
) -> int:
|
||||||
# Idempotent replace for (property_id, scenario_id): deleting the Plan
|
# Soft-replace (ADR-0012): keep prior Plans as history rather than DELETEing
|
||||||
# cascades to its recommendation rows via the plan_id FK (ON DELETE
|
# them — the cascade delete of recommendation rows was the slow part. When
|
||||||
# CASCADE), so a re-run overwrites rather than duplicating (ADR-0012).
|
# this Plan is the default, demote every prior Plan for the same
|
||||||
self._session.exec( # type: ignore[call-overload]
|
# (property_id, scenario_id) to is_default=False, so exactly one Plan for
|
||||||
delete(PlanModel).where(
|
# the pair stays default (the one just inserted).
|
||||||
col(PlanModel.property_id) == property_id,
|
if is_default:
|
||||||
col(PlanModel.scenario_id) == scenario_id,
|
self._session.exec( # type: ignore[call-overload]
|
||||||
|
update(PlanModel)
|
||||||
|
.where(
|
||||||
|
col(PlanModel.property_id) == property_id,
|
||||||
|
col(PlanModel.scenario_id) == scenario_id,
|
||||||
|
)
|
||||||
|
.values(is_default=False)
|
||||||
)
|
)
|
||||||
)
|
|
||||||
|
|
||||||
plan_row = PlanModel.from_domain(
|
plan_row = PlanModel.from_domain(
|
||||||
plan,
|
plan,
|
||||||
|
|
|
||||||
|
|
@ -8,10 +8,12 @@ from domain.modelling.plan import Plan
|
||||||
class PlanRepository(ABC):
|
class PlanRepository(ABC):
|
||||||
"""Persists a Plan (and its Plan Measures) for a Property + Scenario.
|
"""Persists a Plan (and its Plan Measures) for a Property + Scenario.
|
||||||
|
|
||||||
One Plan per (Property, Scenario). The write is idempotent on re-run: it
|
A re-run INSERTs a fresh Plan and keeps the prior one as history rather than
|
||||||
replaces the existing Plan for that pair rather than duplicating (ADR-0012
|
deleting it. When the new Plan is the default, prior default Plans for the
|
||||||
/ ADR-0017). `portfolio_id` and `is_default` are supplied by the
|
same (Property, Scenario) are demoted to `is_default=False`, so the current
|
||||||
orchestrator (the former from the trigger, the latter from the Scenario).
|
Plan is the one with `is_default=True` (ADR-0012 / ADR-0017). `portfolio_id`
|
||||||
|
and `is_default` are supplied by the orchestrator (the former from the
|
||||||
|
trigger, the latter from the Scenario).
|
||||||
"""
|
"""
|
||||||
|
|
||||||
@abstractmethod
|
@abstractmethod
|
||||||
|
|
@ -24,6 +26,7 @@ class PlanRepository(ABC):
|
||||||
portfolio_id: int,
|
portfolio_id: int,
|
||||||
is_default: bool,
|
is_default: bool,
|
||||||
) -> int:
|
) -> int:
|
||||||
"""Persist ``plan`` and return its Plan id, replacing any existing Plan
|
"""Persist ``plan`` and return its Plan id. Keeps prior Plans for
|
||||||
for ``(property_id, scenario_id)``."""
|
``(property_id, scenario_id)`` as history; when ``is_default`` is True,
|
||||||
|
demotes those prior Plans to ``is_default=False``."""
|
||||||
...
|
...
|
||||||
|
|
|
||||||
|
|
@ -2,10 +2,11 @@
|
||||||
Plan Measures to the live ``plan`` / ``recommendation`` tables (ADR-0017).
|
Plan Measures to the live ``plan`` / ``recommendation`` tables (ADR-0017).
|
||||||
|
|
||||||
The Plan is the parent; each selected Plan Measure is a ``recommendation`` row
|
The Plan is the parent; each selected Plan Measure is a ``recommendation`` row
|
||||||
linked by the new ``plan_id`` FK. A re-run replaces (delete the Plan for the
|
linked by the new ``plan_id`` FK. A re-run INSERTs a fresh Plan and keeps the
|
||||||
(property, scenario) → cascade its recommendations → insert fresh), so the
|
prior one as history (no cascade delete); when the new Plan is the default it
|
||||||
batch write is idempotent (ADR-0012). CO₂ is stored in tonnes (calculator kg
|
demotes prior default Plans for the (property, scenario) to ``is_default=False``
|
||||||
÷ 1000) to match the live column contract.
|
(ADR-0012). CO₂ is stored in tonnes (calculator kg ÷ 1000) to match the live
|
||||||
|
column contract.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
@ -147,31 +148,61 @@ def test_save_persists_null_per_measure_savings_when_unbilled(
|
||||||
assert rec_rows[0].energy_cost_savings is None
|
assert rec_rows[0].energy_cost_savings is None
|
||||||
|
|
||||||
|
|
||||||
def test_save_is_idempotent_on_rerun_for_the_same_property_and_scenario(
|
def test_rerun_keeps_history_and_demotes_the_prior_default_plan(
|
||||||
db_engine: Engine,
|
db_engine: Engine,
|
||||||
) -> None:
|
) -> None:
|
||||||
# Arrange — first run
|
# Arrange — first (default) run
|
||||||
with Session(db_engine) as session:
|
with Session(db_engine) as session:
|
||||||
PlanPostgresRepository(session).save(
|
first_id = PlanPostgresRepository(session).save(
|
||||||
_plan(), property_id=10, scenario_id=7, portfolio_id=1, is_default=True
|
_plan(), property_id=10, scenario_id=7, portfolio_id=1, is_default=True
|
||||||
)
|
)
|
||||||
session.commit()
|
session.commit()
|
||||||
|
|
||||||
# Act — re-run the same (property, scenario)
|
# Act — re-run the same (property, scenario) as the default
|
||||||
with Session(db_engine) as session:
|
with Session(db_engine) as session:
|
||||||
PlanPostgresRepository(session).save(
|
second_id = PlanPostgresRepository(session).save(
|
||||||
_plan(), property_id=10, scenario_id=7, portfolio_id=1, is_default=True
|
_plan(), property_id=10, scenario_id=7, portfolio_id=1, is_default=True
|
||||||
)
|
)
|
||||||
session.commit()
|
session.commit()
|
||||||
|
|
||||||
# Assert — replaced, not duplicated (cascade removed the old measures)
|
# Assert — the prior Plan is kept as history (no delete), and only the new
|
||||||
|
# Plan is the default; exactly one Plan for the pair stays is_default=True.
|
||||||
with Session(db_engine) as session:
|
with Session(db_engine) as session:
|
||||||
plan_rows = session.exec(
|
plan_rows = session.exec(
|
||||||
select(PlanModel).where(col(PlanModel.property_id) == 10)
|
select(PlanModel).where(col(PlanModel.property_id) == 10)
|
||||||
).all()
|
).all()
|
||||||
rec_rows = session.exec(
|
by_id = {p.id: p for p in plan_rows}
|
||||||
select(RecommendationModel).where(col(RecommendationModel.property_id) == 10)
|
|
||||||
).all()
|
|
||||||
|
|
||||||
assert len(plan_rows) == 1
|
assert len(plan_rows) == 2
|
||||||
assert len(rec_rows) == 1
|
assert first_id != second_id
|
||||||
|
assert by_id[first_id].is_default is False # demoted
|
||||||
|
assert by_id[second_id].is_default is True # the current default
|
||||||
|
assert sum(1 for p in plan_rows if p.is_default) == 1
|
||||||
|
|
||||||
|
|
||||||
|
def test_rerun_as_non_default_does_not_demote_the_prior_default(
|
||||||
|
db_engine: Engine,
|
||||||
|
) -> None:
|
||||||
|
# Arrange — a default Plan exists
|
||||||
|
with Session(db_engine) as session:
|
||||||
|
first_id = PlanPostgresRepository(session).save(
|
||||||
|
_plan(), property_id=12, scenario_id=7, portfolio_id=1, is_default=True
|
||||||
|
)
|
||||||
|
session.commit()
|
||||||
|
|
||||||
|
# Act — re-run as NON-default (e.g. a non-default scenario); no demotion runs
|
||||||
|
with Session(db_engine) as session:
|
||||||
|
PlanPostgresRepository(session).save(
|
||||||
|
_plan(), property_id=12, scenario_id=7, portfolio_id=1, is_default=False
|
||||||
|
)
|
||||||
|
session.commit()
|
||||||
|
|
||||||
|
# Assert — the prior default is untouched (we only demote when saving a default)
|
||||||
|
with Session(db_engine) as session:
|
||||||
|
plan_rows = session.exec(
|
||||||
|
select(PlanModel).where(col(PlanModel.property_id) == 12)
|
||||||
|
).all()
|
||||||
|
by_id = {p.id: p for p in plan_rows}
|
||||||
|
|
||||||
|
assert len(plan_rows) == 2
|
||||||
|
assert by_id[first_id].is_default is True
|
||||||
|
|
|
||||||
Loading…
Add table
Reference in a new issue