fix(elmhurst-mapper): classify top-floor flat from roof type, not room-in-roof

`_elmhurst_dwelling_type` derived a flat's roof exposure from
`room_in_roof is not None`, so a top-floor flat whose roof is a plain
external "PS Pitched, sloping ceiling" (no room-in-roof) fell through to
"Mid-floor flat". The cascade's `_dwelling_exposure` then treats a
mid-floor flat's roof as a party ceiling (RdSAP 10 §5 / §3 — party
surfaces carry no heat loss) and drops the entire roof term: cert
001431's ~105 m² roof at U=2.3 (241.68 W/K at (30)) vanished, collapsing
(33) fabric heat loss 320.06 → 78.38 W/K and over-rating SAP by ~5 points
(on top of the age-band roof-U bug — see prior commit).
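
As a quick sanity check on the figures above, the arithmetic can be sketched as below. This is only an illustration using the numbers quoted in this message; the variable names are made up for the example and are not identifiers from the codebase.

```python
# Figures quoted in the commit message (worksheet box numbers in brackets).
roof_heat_loss = 241.68     # (30) roof term, W/K
roof_u_value = 2.3          # W/m²K, age band C, uninsulated
fabric_with_roof = 320.06   # (33) fabric heat loss with the roof charged, W/K

# Roof area implied by the quoted (30) term; the message quotes it as
# roughly 105 m².
implied_roof_area = roof_heat_loss / roof_u_value

# What (33) collapses to when the roof is treated as a party ceiling
# and its term is dropped (the mid-floor misclassification).
fabric_without_roof = round(fabric_with_roof - roof_heat_loss, 2)

print(f"implied roof area: {implied_roof_area:.2f} m²")
print(f"(33) without the roof term: {fabric_without_roof} W/K")
```

The roof accounts for roughly three quarters of the fabric heat loss here, which is why losing it moves the rating by several SAP points.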

Read the roof TYPE instead — the dual of the floor's "Another dwelling
below" signal. A flat's roof is a party ceiling only when its Elmhurst
code is S / A / NR (Same/Another dwelling or Non-residential space
above); F / PN / PA / PS are exposed external roofs, so the dwelling is
on the top storey. `has_exposed_roof = room_in_roof present OR
_elmhurst_roof_is_exposed(roof)` — which is exactly what the function's
own docstring already described as the intent ("RR present or external
roof"), now implemented.
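
A minimal, self-contained sketch of the rule described above, for readers who want to try it without the mapper. It works on plain strings rather than the real `ElmhurstFloorDetails` / `ElmhurstRoofDetails` objects; `leading_code` is a hypothetical stand-in for the mapper's `_leading_code` (assumed here to return the first whitespace-delimited token), and the "G Ground floor" label in the usage lines is an assumed example value.

```python
from typing import Optional

# Roof codes that mean "something heated or occupied above", i.e. a
# party ceiling: Same dwelling, Another dwelling, Non-residential space.
PARTY_ROOF_CODES = frozenset({"S", "A", "NR"})


def leading_code(label: str) -> str:
    """Hypothetical helper: first token of a label like 'PS Pitched, ...'."""
    parts = label.split()
    return parts[0] if parts else ""


def roof_is_exposed(roof_type: Optional[str]) -> bool:
    """True when the roof code is anything other than a party-ceiling code."""
    if roof_type is None:
        return False
    return leading_code(roof_type) not in PARTY_ROOF_CODES


def flat_position(
    floor_location: Optional[str],
    roof_type: Optional[str],
    has_room_in_roof: bool,
) -> str:
    """Floor-position prefix for a flat, following the rule in this commit."""
    has_dwelling_below = "dwelling below" in (floor_location or "").lower()
    has_exposed_roof = has_room_in_roof or roof_is_exposed(roof_type)
    if has_dwelling_below and has_exposed_roof:
        return "Top-floor"
    if has_dwelling_below:
        return "Mid-floor"
    return "Ground-floor"


# The 001431 case: party floor, plain external sloping roof, no room-in-roof.
print(flat_position("A Another dwelling below", "PS Pitched, sloping ceiling", False))
# A flat with a dwelling both below and above.
print(flat_position("A Another dwelling below", "A Another dwelling above", False))
# Assumed label for a flat that is not over another dwelling.
print(flat_position("G Ground floor", "A Another dwelling above", False))
```

Under the old room-in-roof-only test the first call would have produced "Mid-floor", which is the misclassification this commit removes.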

With both upstream fixes the full chain (Summary PDF → extractor →
mapper → cert_to_inputs → calculator) reproduces the worksheet's §11a
unrounded SAP 56.3649 at abs < 1e-4, with (30)/(33)/(37) matching to
the decimal. 001431 is the only flat fixture reclassified; 000784
(top-floor, RR) and 000910 (ground-floor) are unchanged. The regression
suite is green bar the 3 pre-existing, unrelated failures.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Khalim Conn-Kowlessar 2026-06-10 08:18:51 +00:00
parent 1033526812
commit b473f6a1ec
2 changed files with 73 additions and 6 deletions


@@ -180,6 +180,45 @@ def test_summary_001431_topfloor_extracts_main_property_age_band() -> None:
     assert survey.construction_age_band == "C 1930-1949"
+def test_summary_001431_topfloor_flat_classified_as_top_floor() -> None:
+    # Arrange: the recommendation "after" Summary lodges §6.0 "Position
+    # of flat in block of flats: Top Floor": floor "A Another dwelling
+    # below" (party) + roof "PS Pitched, sloping ceiling" (an exposed
+    # external roof, NOT a room-in-roof). The mapper must classify it
+    # Top-floor (roof exposed), not Mid-floor, so the cascade charges
+    # the roof heat-loss term.
+    pages = _summary_pdf_to_textract_style_pages(_SUMMARY_001431_TOPFLOOR_PDF)
+    site_notes = ElmhurstSiteNotesExtractor(pages).extract()
+    # Act
+    epc = EpcPropertyDataMapper.from_elmhurst_site_notes(site_notes)
+    # Assert
+    assert epc.dwelling_type == "Top-floor flat"
+def test_summary_001431_topfloor_full_chain_sap_matches_worksheet_pdf() -> None:
+    # Arrange: gas-boiler-upgrade-with-cylinder recommendation "after"
+    # worksheet (P960-0001-001431). Top-floor flat, PS sloping roof at
+    # U=2.3 (age C, uninsulated) → (30) roof 241.68 W/K, (33) fabric
+    # 320.06, (37) HLC 348.76. Worksheet §11a lodges unrounded SAP
+    # 56.3649. Exercises both upstream fixes: the Date-Built age band
+    # (roof U 2.3 not 0.4) and the top-floor flat classification (roof
+    # not dropped).
+    pages = _summary_pdf_to_textract_style_pages(_SUMMARY_001431_TOPFLOOR_PDF)
+    site_notes = ElmhurstSiteNotesExtractor(pages).extract()
+    epc = EpcPropertyDataMapper.from_elmhurst_site_notes(site_notes)
+    # Act
+    result = calculate_sap_from_inputs(
+        cert_to_inputs(epc, prices=SAP_10_2_SPEC_PRICES)
+    )
+    # Assert
+    worksheet_unrounded_sap = 56.3649
+    assert abs(result.sap_score_continuous - worksheet_unrounded_sap) < 1e-4
 def test_summary_000474_mapper_produces_three_building_parts() -> None:
     # Arrange: cert U985-0001-000474 is a mid-terrace with 3 building
     # parts (Main + 2 extensions) per the hand-built worksheet fixture


@@ -296,6 +296,7 @@ class EpcPropertyDataMapper:
         built_form=built_form,
         property_type=property_type,
         floor=survey.floor,
+        roof=survey.roof,
         room_in_roof=survey.room_in_roof,
     )
@@ -2311,26 +2312,53 @@ _ELMHURST_INSULATION_CODE_TO_SAP10: Dict[str, int] = {
 }
+# Elmhurst roof codes that denote a party ceiling (another/same dwelling
+# or non-residential space ABOVE), so the flat's roof is NOT a heat-loss
+# surface: S (Same dwelling above), A (Another dwelling above), NR
+# (Non-residential space above). Every other roof code (F / PN / PA / PS)
+# is an exposed external roof, so the dwelling is on the top storey.
+_ELMHURST_PARTY_ROOF_CODES: frozenset[str] = frozenset({"S", "A", "NR"})
+def _elmhurst_roof_is_exposed(roof: Optional[ElmhurstRoofDetails]) -> bool:
+    """Whether a flat's roof is an exposed (heat-loss) external roof.
+    The dual of the floor's "Another dwelling below" signal: the roof is
+    a party ceiling only when its Elmhurst code is S / A / NR (a dwelling
+    or non-residential space above). A plain external roof, including a
+    "PS Pitched, sloping ceiling" with no room-in-roof, is exposed, so
+    the flat sits on the top storey."""
+    if roof is None:
+        return False
+    return _leading_code(roof.roof_type) not in _ELMHURST_PARTY_ROOF_CODES
 def _elmhurst_dwelling_type(
     *,
     built_form: str,
     property_type: str,
     floor: Optional[ElmhurstFloorDetails],
+    roof: Optional[ElmhurstRoofDetails],
     room_in_roof: Optional[ElmhurstRoomInRoof],
 ) -> str:
     """Compose `EpcPropertyData.dwelling_type` from the Elmhurst Summary's
-    property-type + attachment + floor-location + RR presence.
+    property-type + attachment + floor-location + roof-type + RR presence.
     For HOUSES: returns `f"{built_form} {property_type.lower()}"`, the
     historical contract ("Mid-Terrace house", "Detached house").
     For FLATS: derives the floor-position prefix ("Top-floor",
-    "Mid-floor", "Ground-floor") from `floor.location` + RR presence:
-    - floor lodges "dwelling below" → roof exposed (RR present or
-      external roof) → Top-floor; roof party (no RR/external) →
-      Mid-floor;
+    "Mid-floor", "Ground-floor") from `floor.location` + roof exposure:
+    - floor lodges "dwelling below" → roof exposed (RR present OR an
+      external roof type, per `_elmhurst_roof_is_exposed`) → Top-floor;
+      roof party (dwelling above, no RR) → Mid-floor;
     - floor not over another dwelling → Ground-floor.
+    Reading the roof TYPE (not just room-in-roof presence) is the dual of
+    reading the floor location: a top-floor flat can have a plain external
+    sloping ceiling and no room-in-roof, which the RR-only test wrongly
+    routed to Mid-floor (dropping the roof heat-loss term).
     The cascade's `_dwelling_exposure` (cert_to_inputs.py) is prefix-
     matched on the lowercase result; correct flat-prefix detection is
     the gate for floor / roof party-surface routing (RdSAP 10 §5).
@@ -2339,7 +2367,7 @@ def _elmhurst_dwelling_type(
         return f"{built_form} {property_type.lower()}".strip()
     floor_loc = (floor.location if floor is not None else "") or ""
     has_dwelling_below = "dwelling below" in floor_loc.lower()
-    has_exposed_roof = room_in_roof is not None
+    has_exposed_roof = room_in_roof is not None or _elmhurst_roof_is_exposed(roof)
     if has_dwelling_below and has_exposed_roof:
         position = "Top-floor"
     elif has_dwelling_below: