Model/backend/documents_parser
Khalim Conn-Kowlessar 043620802f Slice S0380.53: Elmhurst §14.0 "Main Heating SAP Code" extraction + strict-raise
Cert 000565 surfaced an Elmhurst extractor schema gap. §14.0 lodges
"Main Heating SAP Code 224" identifying Main 1 as an Air Source Heat
Pump (SAP 10.2 Table 4a row 224: "Air source heat pump, 2013 or
later") — but the extractor was dropping the line. The mapper
therefore produced a `MainHeatingDetail` with `sap_main_heating_code
= None` AND `main_heating_index_number = None` (because `PCDF boiler
Reference = 0` for HP certs), leaving the cascade to fall back to
the 0.80 gas-boiler default efficiency.

Cascade impact on cert 000565 main_heating_fuel_kwh_per_yr pin:
- Before: actual 62,375.80 kWh/yr (= 59,008 / 0.80 wrong default)
  Δ +27,665.01 vs U985-0001-000565.pdf expected 34,710.79
- After:  actual 29,353.32 kWh/yr (= 59,008 / 1.70 HP COP via §A4.1)
  Δ −5,357.47 (remaining gap is on the space_heating side, not
  heating efficiency)

The strict-raise mirrors [[unmapped-api-code]] (Slice S0380.51) and
[[unmapped-elmhurst-label]] (cylinder size / glazing type) — when
neither the §14.0 SAP code nor the PCDB boiler reference identifies
Main 1, the mapper raises `UnmappedElmhurstLabel("main_heating",
...)` so the coverage gap surfaces at extraction time instead of as
an opaque downstream SAP delta. Per user end-of-S0380.52 directive:
"if we're missing mapping on EpcPropertyDataMapper - let's raise an
exception".

Spec source: SAP 10.2 §A4 Appendix A "Heat pump cascade", Table 4a
row 224 (Air source heat pump, 2013 or later) — `seasonal_efficiency`
reads the SAP code when no PCDB Table 105/362 record overrides.

Touched:
- datatypes/epc/surveys/elmhurst_site_notes.py: `MainHeating.
  main_heating_sap_code: Optional[int]` field added (treat 0 as None
  per Elmhurst convention — PCDB-listed boilers lodge §14.0 SAP code
  as 0 and identify themselves via the PCDB index instead)
- backend/documents_parser/elmhurst_extractor.py:
  `_extract_main_heating` reads §14.0 "Main Heating SAP Code" via the
  existing `_local_val` slice helper; 0/absent → None
- datatypes/epc/domain/mapper.py: `_map_elmhurst_sap_heating` passes
  `sap_main_heating_code=mh.main_heating_sap_code` to
  `MainHeatingDetail`, and raises `UnmappedElmhurstLabel` when
  neither identifier resolves

Cohort regression check: 415 pass + 10 expected 000565 failures
(unchanged from S0380.52 — same pins, different residuals). Pyright
net-zero on all 3 touched files.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-01 16:28:47 +00:00
..
handler address JTK review comments 2026-04-20 15:11:17 +00:00
tests Slice S0380.52: cert 000565 Elmhurst-only mapper-driven cascade pin + glazing-label coverage 2026-06-01 16:28:47 +00:00
__init__.py Map to RdSapSiteNotes from site notes JSON 🟥 2026-04-16 13:54:03 +00:00
db_writer.py include updating epc_property_data to pashub to ara workflow 2026-04-29 09:55:14 +00:00
elmhurst_extractor.py Slice S0380.53: Elmhurst §14.0 "Main Heating SAP Code" extraction + strict-raise 2026-06-01 16:28:47 +00:00
extractor.py Handle wall thickness "Unmeasurable" 🟩 2026-04-30 16:41:16 +00:00
local_runner.py update local runner to work for elmhurst 2026-04-24 14:01:36 +00:00
parser.py load ecmk site notes to db 2026-04-29 11:20:47 +00:00
pdf.py update local runner to work for elmhurst 2026-04-24 14:01:36 +00:00