Model/backend/documents_parser
Khalim Conn-Kowlessar 6b02bad018 Slice S0380.64: Elmhurst per-extension wall_construction mappings + strict-raise
Pre-S0380.64 the mapper silently fell through to wall_construction=None
on three Elmhurst code lodgements that the cohort PDFs use:

  - "SG Stone: granite or whinstone" (cert 000565 Ext1)
  - "B Basement wall" (cert 000565 Ext3 + Ext4)
  - "CF Cavity masonry filled" party wall (cert 000565 Ext1)

Cascade impact on cert 000565 (vs U985-0001-000565.pdf worksheet):
  - sap_score                30 → 29 EXACT (was Δ +1)
  - sap_score_continuous     30.23 → 29.14 (Δ +1.72 → +0.63)
  - space_heating_kwh_per_yr 57909 → 59274 (Δ −1100 → +266)
  - HTC                      1281 → 1321 W/K (was 234 W/K short
    of worksheet line 39 monthly avg 1515.38)

Spec basis:
  - SG → 1 (WALL_STONE_GRANITE per domain.sap10_ml.rdsap_uvalues)
    is the granite-specific Elmhurst variant of "ST Stone"; same
    SAP10 enum, no cascade behaviour change for stone walls.
  - B → 6 (BASEMENT_WALL_CONSTRUCTION_CODE per
    datatypes/epc/domain/epc_property_data.py:361) routes the
    cascade through `part.main_wall_is_basement` →
    `u_basement_wall(age_band)` per RdSAP 10 §5.17 / Table 23
    (heat_transmission.py:640). Empirically established from a
    2026 50k-bulk GOV.UK API sweep (88% co-occurrence with
    walls[].description = "Basement wall").
  - CF → 4 (Cavity, RdSAP 10 Table 15 row 3 spec U=0.20). The
    cascade's `u_party_wall` returns 0.0 / 0.5 / 0.25 for code 4
    today, so CF conservatively rounds up to the cavity-unfilled
    U=0.5 — matches the pre-existing
    `_API_PARTY_WALL_CONSTRUCTION_TO_SAP10[3]` approximation
    until `u_party_wall` gains a filled-cavity branch (TODO).

Strict-coverage gate per [[reference-unmapped-api-code]] mirror:
`_elmhurst_wall_construction_int` and
`_elmhurst_party_wall_construction_int` now raise
`UnmappedElmhurstLabel` on a non-empty Elmhurst code that isn't in
the lookup dict, rather than silently returning None. Empty
lodgements (absent fields) continue to return None, so the cascade's
own defaults apply. The silent-None failure mode is what hid cert
000565's ~300 W/K cascade fabric-loss gap from the audit chain
until the S0380.64 space-heating residual probe surfaced it.
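
A minimal sketch of the strict-raise behaviour described above, not the
actual mapper code. The dict name, the exception's base class, the
function name without its leading underscore, and the assumption that
the function receives the short code (e.g. "SG") are all hypothetical;
only the SG → 1 and B → 6 entries are taken from this commit message.

```python
from typing import Optional


class UnmappedElmhurstLabel(ValueError):
    """Hypothetical definition: a non-empty Elmhurst code has no mapping."""


# Hypothetical lookup; only the two wall codes named in this commit are shown.
_ELMHURST_WALL_CONSTRUCTION_TO_SAP10 = {
    "SG": 1,  # Stone: granite or whinstone
    "B": 6,   # Basement wall
}


def elmhurst_wall_construction_int(code: Optional[str]) -> Optional[int]:
    # Absent or blank field: return None so downstream defaults apply.
    if code is None or not code.strip():
        return None
    key = code.strip()
    try:
        return _ELMHURST_WALL_CONSTRUCTION_TO_SAP10[key]
    except KeyError:
        # Non-empty but unknown code: fail loudly instead of returning None.
        raise UnmappedElmhurstLabel(f"unmapped Elmhurst wall code: {key!r}") from None
```

The point of the split is that "no data lodged" and "data lodged but not
understood" stop sharing the same None return value.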

Cohort coverage swept: every Summary PDF in the test fixtures
folder lodges only {SO, CA, CW, SG, B} wall types and
{'', S, U, CU, CF} party-wall types — the new dict entries cover
all observed codes, so strict-raise does not regress any cohort
fixture (478 pass, 9 expected 000565 cascade-gap fails; was 427
pass + 10 fails per HANDOVER_CERT_000565_COST_CASCADE.md).
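
The coverage claim above reduces to a subset check, sketched here under
assumptions. The observed sets are copied from this message; the mapped
sets are hypothetical stand-ins for the keys of the real lookup dicts,
which may hold more codes than shown.

```python
# Codes observed in the cohort Summary PDFs, as listed in this message.
WALL_CODES_OBSERVED = {"SO", "CA", "CW", "SG", "B"}
PARTY_WALL_CODES_OBSERVED = {"", "S", "U", "CU", "CF"}

# Hypothetical stand-ins for the keys of the real lookup dicts.
WALL_CODES_MAPPED = {"SO", "CA", "CW", "SG", "B"}
PARTY_WALL_CODES_MAPPED = {"S", "U", "CU", "CF"}


def unmapped_codes(observed, mapped):
    """Return the non-empty observed codes that strict-raise would reject."""
    # The empty string stands for an absent field, which still maps to None.
    return sorted(code for code in observed if code and code not in mapped)


wall_gaps = unmapped_codes(WALL_CODES_OBSERVED, WALL_CODES_MAPPED)
party_gaps = unmapped_codes(PARTY_WALL_CODES_OBSERVED, PARTY_WALL_CODES_MAPPED)
```

Both gap lists being empty is what "strict-raise does not regress any
cohort fixture" means in practice.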

Pyright net-zero on touched files (mapper.py 32 → 32 errors;
test_summary_pdf_mapper_chain.py 13 → 13 errors — all pre-existing
in unrelated sections).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 08:57:25 +00:00
handler                 address JTK review comments  2026-04-20 15:11:17 +00:00
tests                   Slice S0380.64: Elmhurst per-extension wall_construction mappings + strict-raise  2026-05-29 08:57:25 +00:00
__init__.py             Map to RdSapSiteNotes from site notes JSON 🟥  2026-04-16 13:54:03 +00:00
db_writer.py            include updating epc_property_data to pashub to ara workflow  2026-04-29 09:55:14 +00:00
elmhurst_extractor.py   Slice S0380.58: Elmhurst per-extension Room(s) in Roof extraction + TFA fix  2026-05-28 22:58:43 +00:00
extractor.py            Handle wall thickness "Unmeasurable" 🟩  2026-04-30 16:41:16 +00:00
local_runner.py         update local runner to work for elmhurst  2026-04-24 14:01:36 +00:00
parser.py               load ecmk site notes to db  2026-04-29 11:20:47 +00:00
pdf.py                  update local runner to work for elmhurst  2026-04-24 14:01:36 +00:00