Slices 66+67: Elmhurst mapper surfaces country_code + heating ints + has_draught_lobby

Closes 9 mapper-side load-bearing field gaps surfaced by the cohort
000474 mapper-vs-hand-built diff (was 12, now 5 remaining):

**Slice 66 — country code + draught-lobby fix:**
- Set `country_code="ENG"` in `from_elmhurst_site_notes`. The Elmhurst
  U985 / P960 surveyor toolchain operates on English certs only; the
  Summary doesn't lodge country explicitly but the cascade's `u_floor`
  / `u_basement_floor` / `u_door` read it for table selection. Cohort
  hand-builts already encode 'ENG' so the cascade was tolerating the
  None default; matching the canonical value closes the diff.
- `_map_elmhurst_ventilation` now sets `has_draught_lobby=True` only
  when Summary lodges "Yes"/"Present". The cohort's modal lodgement
  "Unable to determine" maps to `False` — matching the cohort hand-
  built convention (conservative no-lobby cascade path). The legacy
  `draught_lobby` field is unchanged; the cascade reads
  `has_draught_lobby` in preference.

**Slice 67 — heating field surfacing:**
- `boiler_flue_type`: Add `_ELMHURST_FLUE_TYPE_TO_SAP10` map (Open=1,
  Balanced=2, Fan-assisted balanced=3, Room-sealed=4). Cohort 000474's
  "Balanced" Summary §14 lodgement → 2, matching hand-built.
- `emitter_temperature`: `_elmhurst_emitter_temperature_int` parses
  the Summary §14 "Design flow temperature" string to int (≥45 °C →
  1, lower → 0; "Unknown" defaults to 1 per Table 4d worst-case).
- `central_heating_pump_age`: dual-encode int alongside the existing
  `_str` field via `_elmhurst_pump_age_int` (Unknown → 0, Pre 2013 →
  1, otherwise → 2). The cascade reads `_str`; the int is for cross-
  mapper field parity only.
- `main_heating_number=1`: default single main heating.
- `water_heating_fuel`: parse Summary §15 "Water Heating Fuel Type"
  via the existing `_elmhurst_main_fuel_int` map. Cohort 000474's
  "Mains gas" → 26.

All 11 newly-surfaced fields are metadata-only on the SAP cascade
(grep confirms none feature in `packages/domain/src/domain/sap/`
outside test fixtures). All 66 cohort cascade pins remain GREEN at
1e-4. Pyright 35-error baseline preserved on mapper.py.

Diff count for cohort 000474:
  Slice 63 baseline: 50
  Slice 64 (Cat A bulk):    14
  Slice 65 (HW handbuilt):  12
  Slice 66 (country+lobby): 10
  Slice 67 (heating ints):  **5**

Remaining 5 diffs:
- 3× `sap_building_parts[*].party_wall_construction`: None vs 0
  (cohort sentinel convention — needs mapper-side fix to surface 0
  when no party wall is lodged, OR hand-built update to drop sentinel)
- `sap_heating.main_heating_details[0].central_heating_pump_age_str`:
  mapped='Unknown' vs handbuilt=None (hand-built should populate the
  str dual)
- `sap_windows: LEN 7 vs 5` (Cat C structural — cohort hand-built
  collapsed by glazing-type group, mapper extracts 1:1 with §11 table)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
Khalim Conn-Kowlessar 2026-05-25 16:59:34 +00:00
parent 4997039f1a
commit ca39d072be

View file

@ -279,6 +279,17 @@ class EpcPropertyDataMapper:
address_line_1=address_line_1,
post_town=pd.town,
postcode=pd.postcode,
# Elmhurst's U985 / P960 surveyor toolchains operate on
# certs lodged with the GOV.UK EPB API for England (the
# cohort + 001479 are all English postcodes — PR1, BD3,
# etc.). The Summary PDF doesn't lodge a country code
# field, but the cascade reads `country_code` to pick the
# English (vs Welsh / Scottish / NI) U-value cascade
# variants. Set 'ENG' explicitly so `u_floor` /
# `u_basement_floor` / `u_door` resolve to the English
# tables that every cohort hand-built fixture already
# encodes.
country_code="ENG",
report_reference=pd.reference_number,
roofs=[],
walls=[],
@ -2487,6 +2498,64 @@ def _elmhurst_heat_emitter_int(emitter: str) -> Optional[int]:
return _ELMHURST_HEAT_EMITTER_TO_SAP10.get(emitter)
# Elmhurst boiler flue-type strings → SAP10 integer codes. Codes mirror
# the cohort hand-built fixtures and the API mapper (which surfaces the
# int directly from the schema). "Balanced" is the modal Elmhurst
# lodgement on combi boilers (cohort 6 + cert 001479).
_ELMHURST_FLUE_TYPE_TO_SAP10: Dict[str, int] = {
"Open": 1,
"Balanced": 2,
"Fan-assisted balanced": 3,
"Room-sealed": 4,
}
def _elmhurst_flue_type_int(flue_type: Optional[str]) -> Optional[int]:
"""Map the Elmhurst Summary §14 "Flue Type" string to a SAP10
integer code (matching `MainHeatingDetail.boiler_flue_type` on the
API mapper path). Unknown strings return None."""
if flue_type is None:
return None
return _ELMHURST_FLUE_TYPE_TO_SAP10.get(flue_type)
def _elmhurst_emitter_temperature_int(
design_flow_temperature: Optional[str],
) -> int:
"""Map the Elmhurst Summary §14 "Design flow temperature" string to
the SAP10 emitter-temperature integer code. "Unknown" (the modal
cohort lodgement) defaults to 1 (high-temp, 45 °C the worst-case
assumption for a gas boiler that hasn't been measured); a numeric
flow temperature string maps to 1 when 45 °C, 0 otherwise per
SAP10.2 Table 4d. Returns int (never None) `MainHeatingDetail.
emitter_temperature` is Union[int, str] but the API mapper always
surfaces an int."""
if design_flow_temperature is None or design_flow_temperature == "Unknown":
return 1
try:
value = float(design_flow_temperature)
except (TypeError, ValueError):
return 1
return 1 if value >= 45.0 else 0
def _elmhurst_pump_age_int(age_str: Optional[str]) -> Optional[int]:
"""Map the Elmhurst Summary §14 "Heat pump age" / "Central heating
pump age" string to the SAP10 integer code consumed by the API
mapper's `MainHeatingDetail.central_heating_pump_age` field. The
cascade reads the str field (`_str` suffix) via internal_gains.py;
the int dual-encoding exists purely for cross-mapper field
parity. "Unknown" 0, "Pre 2013" 1, modern post-2013 2."""
if age_str is None:
return None
s = age_str.strip().lower()
if s in ("", "unknown"):
return 0
if "pre 2013" in s:
return 1
return 2
def _elmhurst_secondary_fuel_from_sap_code(
sap_code: Optional[int],
) -> Optional[int]:
@ -2574,6 +2643,13 @@ def _map_elmhurst_sap_heating(survey: ElmhurstSiteNotes) -> SapHeating:
s.outlet_type == "Electric shower"
for s in survey.baths_and_showers.showers
)
# Water heating fuel: Summary §15 "Water Heating Fuel Type" lodges
# the fuel name as a string ("Mains gas", "Electricity", ...). Map
# to the SAP10 int code via the same lookup used for main fuel;
# falls back to None for unrecognised strings.
water_heating_fuel = _elmhurst_main_fuel_int(
survey.water_heating.water_heating_fuel_type,
)
return SapHeating(
instantaneous_wwhrs=InstantaneousWwhrs(),
main_heating_details=[
@ -2585,11 +2661,14 @@ def _map_elmhurst_sap_heating(survey: ElmhurstSiteNotes) -> SapHeating:
# to defaults that drop the standing-charge component.
main_fuel_type=main_fuel_int if main_fuel_int is not None else mh.fuel_type,
heat_emitter_type=heat_emitter_int if heat_emitter_int is not None else mh.heat_emitter,
emitter_temperature=mh.design_flow_temperature,
emitter_temperature=_elmhurst_emitter_temperature_int(mh.design_flow_temperature),
fan_flue_present=mh.fan_assisted_flue,
boiler_flue_type=_elmhurst_flue_type_int(mh.flue_type),
main_heating_control=sap_control_int if sap_control_int is not None else control,
central_heating_pump_age=_elmhurst_pump_age_int(mh.heat_pump_age),
central_heating_pump_age_str=mh.heat_pump_age,
main_heating_category=main_heating_category,
main_heating_number=1,
# Per RdSAP, a PCDB-listed boiler is data source 1
# (manufacturer measured efficiency); the integer index
# number drives PCDB lookup in the cascade.
@ -2605,6 +2684,7 @@ def _map_elmhurst_sap_heating(survey: ElmhurstSiteNotes) -> SapHeating:
else survey.water_heating.water_heating_code
),
water_heating_code=survey.water_heating.water_heating_sap_code,
water_heating_fuel=water_heating_fuel,
secondary_heating_type=mh.secondary_heating_sap_code,
secondary_fuel_type=_elmhurst_secondary_fuel_from_sap_code(
mh.secondary_heating_sap_code,
@ -2672,6 +2752,13 @@ def _map_elmhurst_ventilation(
return SapVentilation(
ventilation_type=None,
draught_lobby=v.draught_lobby != "Not present",
# `has_draught_lobby` is the canonical §2 (13) gate the cascade
# reads in preference to the legacy `draught_lobby` field. Only
# an explicit "Yes" / "Present" lodgement enables the +0.05 ACH
# contribution; "Unable to determine" (the modal cohort
# lodgement), "Not present", and "No" all default to False, the
# conservative no-lobby cascade path.
has_draught_lobby=v.draught_lobby in ("Yes", "Present"),
pressure_test=v.pressure_test_method,
open_flues_count=v.open_flues_count,
closed_flues_count=v.open_chimneys_closed_fire_count,