Commit graph

2435 commits

Author SHA1 Message Date
Khalim Conn-Kowlessar
06989d6b0f fix(elmhurst-extractor): allocate single-glazed alt-wall windows to the alternative wall
The §11 layout parser keys a window's wall Location on the glazing-prefix /
orientation tokens around its data row. An alt-wall window lodges its
"Alternative wall 1" Location wrapped across the lines bracketing the W×H×A
row. For a DOUBLE-glazed alt window the prefix line also carries the glazing
phrase ("Double between 2002   Alternative wall"), so the partition breaks
there and the location survives into the window's pre-data slice. For a
SINGLE-glazed alt window the "Alternative wall" line stands alone with no
glazing-type word, so _partition_after_manuf scanned past it and swallowed
it into the PREVIOUS window's suffix — the window then defaulted to
"External wall" and its opening deducted from the wrong wall.

Fix: treat a standalone wall-location line ("Alternative wall" / "External
wall" / "Party wall") as a window boundary in _partition_after_manuf, so it
attaches to the following window's prefix. Surfaced by simulated case 34
(cert 001431 electric-storage flat): 2 of 4 single-glazed alt-wall windows
were mis-allocated, splitting 2.75/10.78 m² instead of the worksheet's
4.63/8.90 corridor/external opening areas.

Elmhurst-extractor only; API gauge unaffected. Regression gate green (3
pre-existing fails unrelated); worksheet harness 47/47 unchanged. Case 34's
alt-wall opening area now matches the worksheet; the corridor wall net area
is correct (the cert's residual is now isolated to the unheated-corridor
door, a separate slice).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 07:54:06 +00:00
Khalim Conn-Kowlessar
48b36d3d7e fix(elmhurst-mapper): carry §7 alternative-wall "Sheltered Wall" flag
The Elmhurst Summary §7 lodges "Alternative Wall N Sheltered Wall: Yes" for
a sub-area adjacent to an unheated buffer (e.g. a flat's corridor wall),
but the extractor dropped it and _map_elmhurst_alternative_wall never set
SapAlternativeWall.is_sheltered — so the cascade billed the sub-area at its
full exposed U instead of the RdSAP 10 Table 4 (p.22) sheltered U =
1/(1/U + 0.5).

The calculator already applies is_sheltered (_alt_wall_w_per_k) and the
gov-API path already wires sheltered_wall=="Y"; this brings the Elmhurst
front-end to parity. Three-part change: AlternativeWall.sheltered field +
_alternative_walls_from_lines parse ("Alternative Wall N Sheltered Wall") +
_map_elmhurst_alternative_wall is_sheltered=a.sheltered.

Surfaced by simulated case 34 (cert 001431 electric-storage flat): the
6.02 m² corridor wall billed at full U=1.50 (9.03 W/K) instead of the
sheltered 0.86 (5.18 W/K) — +3.85 W/K, -1.61 SAP. Post-fix the alt wall
matches the worksheet's (29a) 5.177 and case 34 closes from -1.61 to -0.30
(remaining residual is a separate window/wall area-allocation thread).
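
Worked check of the figures above (illustrative sketch, not the calculator's
`_alt_wall_w_per_k`; `sheltered_u` is a hypothetical name). Rounding the
sheltered U to 2 dp before multiplying reproduces the quoted 5.177; that
rounding step is inferred from the numbers, not stated in the commit.

```python
def sheltered_u(u_exposed: float, r_shelter: float = 0.5) -> float:
    """Sheltered U-value from the formula quoted above: 1 / (1/U + 0.5)."""
    return 1.0 / (1.0 / u_exposed + r_shelter)


area_m2 = 6.02
full_w_per_k = area_m2 * 1.50                  # exposed billing: 9.03 W/K
u_sheltered = round(sheltered_u(1.50), 2)      # assumed 2 dp rounding: 0.86
sheltered_w_per_k = area_m2 * u_sheltered      # 5.1772 W/K
```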

Elmhurst-mapper only: API SAP gauge unchanged (57.6% within 0.5); worksheet
harness 47/47 unaffected; regression gate green (3 pre-existing fails
unrelated); pyright net-zero.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 07:35:46 +00:00
Khalim Conn-Kowlessar
f3dcd7b43e fix(elmhurst-mapper): single-storey flat with exposed roof is Top-floor, not Ground-floor
The Elmhurst dwelling-type classifier keyed "Top-floor flat" on a "dwelling
below" floor lodgement. A single-storey flat exposed BOTH top (a real
external roof) AND bottom (floor over partially-heated space, no dwelling
below) therefore fell through to "Ground-floor flat" — which the cascade's
_dwelling_exposure maps to has_exposed_roof=False, dropping the external
roof entirely.

Surfaced by simulated case 34 (cert 001431 reconfigured as a slimline
electric-storage flat): the worksheet bills (30) External roof = 39.98 m²
x U=2.30 = 91.95 W/K — the dominant heat-loss element — but the cascade
dropped it, under-stating space-heating demand by 42% (6550 vs 11357
kWh/yr) and over-predicting SAP by +21.76 (57.07 vs worksheet 35.31).

Fix: an exposed (non-party) roof puts the flat on the top storey
regardless of what is below it. Classify as "Top-floor flat" whenever the
roof is exposed; the flat's exposed floor is recovered downstream by the
existing per-BP is_above_partially_heated_space / is_exposed_floor override
in heat_transmission (§3). Party-roof flats ("another dwelling above") are
unaffected and stay Ground-/Mid-floor.

This is an Elmhurst-mapper (dwelling_type) bug, NOT a calculator bug: the
calculator correctly trusts dwelling_type, and the gov-API path supplies
the position directly (cert 0036 — a genuine ground-floor flat whose API
data lodges a "Pitched, no access" roof construction under another dwelling
— stays party, 2.51 W/K). API SAP gauge unchanged (57.6% within 0.5);
worksheet harness 47/47 unaffected; case 34 roof now exact (residual -1.61
is a separate flat-corridor wall-U thread). Regression gate green (3
pre-existing fails unrelated).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 07:23:56 +00:00
Khalim Conn-Kowlessar
b0a47cda05 fix(elmhurst-mapper): strip interleaved Alternative-wall fragments from glazing label
When a property lodges an Alternative Wall, pdftotext interleaves the §11
"Location" column ("Alternative wall 1") into the wrapped glazing-TYPE cell,
producing labels like "Double between 2002 Alternative wall and 2021 1
Alternative wall" (cert 001431 storage-heater variants, simulated case 34).

The existing greedy trailing-suffix strip (\s+Alternative wall.*$) truncates
at the FIRST "Alternative wall", losing "and 2021" and yielding the
unmatchable "Double between 2002". Added a fallback that removes EVERY
"<External|Alternative|Party> wall [n]" fragment and any stray 1-2 digit
location index from the raw label, then retries the lookup. Loss-free: no
glazing-type key contains a wall-location phrase or a bare 1-2 digit number
(install-date years are 4 digits).
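
A minimal sketch of the fallback described above, assuming two regexes of my
own (the mapper's actual patterns are not shown in this message, and
`strip_wall_fragments` is a hypothetical name). It contrasts the greedy
suffix strip with the remove-every-fragment approach on the quoted label.

```python
import re

# Hypothetical patterns, written from the description in the commit message.
_WALL_FRAGMENT_RE = re.compile(
    r"\b(?:External|Alternative|Party) wall(?:\s+\d{1,2}\b)?"
)
_STRAY_INDEX_RE = re.compile(r"\b\d{1,2}\b")  # 4-digit years never match


def strip_wall_fragments(raw_label: str) -> str:
    """Remove every wall-location fragment and stray 1-2 digit index."""
    label = _WALL_FRAGMENT_RE.sub(" ", raw_label)
    label = _STRAY_INDEX_RE.sub(" ", label)
    return " ".join(label.split())  # collapse the gaps left behind


raw = "Double between 2002 Alternative wall and 2021 1 Alternative wall"
greedy = re.sub(r"\s+Alternative wall.*$", "", raw)
cleaned = strip_wall_fragments(raw)
```

Here `greedy` loses "and 2021", while `cleaned` keeps both years.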

Unblocks the Summary cascade for any property with an Alternative Wall;
Summary-path only (the API path receives structured glazing codes, so the
API gauge is unaffected). Regression gate green (1 pre-existing fail
unrelated).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 07:07:08 +00:00
Khalim Conn-Kowlessar
85d6f8468c feat(elmhurst-extractor): capture section 15.1 Immersion Heater (Dual/Single)
The Elmhurst Summary section 15.1 "Hot Water Cylinder" block lodges
"Immersion Heater: Dual" / "Single"; the extractor dropped it, so the
Summary path left immersion_heating_type = None while the API path already
captured it. Capturing it drives SAP Table 13's high-rate-fraction
DHW-cost split (RdSAP 10 section 10.5 p.54: 1 = dual, 2 = single) and
brings the two front-ends to parity.

Three-file change: WaterHeating.immersion_type field +
_extract_water_heating parse (scoped to the 15.1..15.2 slice) +
_elmhurst_immersion_type_code mapper (strict-raise on an unmapped label,
mirroring _elmhurst_cylinder_insulation_code).

Safe to land now that the preceding commit zeroes the high-rate fraction
for 18-/24-hour tariffs: the 20 solid-fuel corpus certs (solid fuel 4-11:
WHC 903 dual immersion, 18-hour meter, 110 L) carry a dual immersion, but
their 18-hour tariff bills 100% low-rate per Table 12a's 7-/10-hour scope
— so they stay EXACT instead of regressing to the 10-hour-column ~0.10.
7-/10-hour Summary immersion certs now correctly cost the Table 13
high-rate fraction instead of falling to the immersion=None 100%-low
default.

Regression gate green (3 pre-existing fails unrelated); API gauge
unchanged (Summary-path-only): 57.6% within 0.5, mean|err| 1.185.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 22:16:21 +00:00
Khalim Conn-Kowlessar
020ac6f220 fix(elmhurst-mapper): strip wrapped building-part fragment from glazing label
pdftotext can wrap the §11 building-part column onto the glazing-TYPE
token without an intervening glazing-gap descriptor, e.g. "Double between
2002 and 2021 1st" (the "1st" marks the 1st Extension). The existing
trailing-gap fallback only strips the fragment when preceded by "N mm";
the bare ordinal raised UnmappedElmhurstLabel.

New `_ELMHURST_GLAZING_LABEL_TRAILING_BP_RE` strips a trailing ordinal
("1st"/"2nd"/…) or "Main" and retries the lookup. No glazing-type key
ends in an ordinal or "Main", so it is loss-free. Surfaced by worksheet
`simulated case 33` (direct-acting electric boiler + immersion), which
previously could not be routed through the Summary cascade.
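
Sketch of the trailing-fragment strip (illustrative; the pattern below is an
assumption and may differ from the real
`_ELMHURST_GLAZING_LABEL_TRAILING_BP_RE`, whose source is not quoted here).

```python
import re

# Assumed pattern: a trailing ordinal ("1st", "2nd", ...) or the word "Main".
_TRAILING_BP_RE = re.compile(r"\s+(?:\d+(?:st|nd|rd|th)|Main)$")


def strip_trailing_bp(label: str) -> str:
    """Drop a wrapped building-part fragment from the end of a glazing label."""
    return _TRAILING_BP_RE.sub("", label)


fixed = strip_trailing_bp("Double between 2002 and 2021 1st")
```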

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 21:25:42 +00:00
Daniel Roth
d571ccaee5 remove number of address2uprn tests 2026-06-10 16:26:54 +00:00
Daniel Roth
26d34c345c get last_submission_date from hubspot 2026-06-10 14:45:25 +00:00
Khalim Conn-Kowlessar
5a74897fed fix(water-heating): gate DHW separate-timing on programmer + boiler age (RdSAP 10 §10.5)
`_separately_timed_dhw` returned True for any boiler+cylinder+from-main
cert, applying the SAP 10.2 Table 2b note b) ×0.9 temperature-factor
reduction unconditionally. For the lpg-boiler "before" worksheet (pre-
1998 LPG boiler SAP code 115 + 210 L cylinder, NO cylinder thermostat,
control 2113 "Room thermostat and TRVs" — no programmer) this dropped
the (53) temperature factor to 0.702 (= 0.60 × 1.3 × 0.9) where the
worksheet lodges 0.78 (= 0.60 × 1.3), under-counting cylinder storage
loss (55) by ~119 kWh/yr and over-rating SAP by ~0.25.

RdSAP 10 §10.5 (PDF p.57) "Hot water separately timed":
    No programmer, pre-1998 boiler → No
    Programmer, pre-1998 boiler    → Yes
    Post-1998 boiler               → Yes
DHW is therefore NOT separately timed only when a pre-1998 boiler is
paired with a no-programmer control. Add the two SAP 10.2 Table 4c(2) /
Table 4b lookups (controls without a programmer = {2101, 2103, 2111,
2113}; pre-1998 gas/LPG boilers 110-119 + oil 124/125/128) and return
False for that combination; every other boiler+cylinder cert keeps the
separately-timed default, so the change is confined to old low-control
stock and the heating corpus + goldens are unchanged.
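
The gate can be sketched as below (illustrative only; function and constant
names are hypothetical, the code sets are the ones listed above, and the
0.60 / 1.3 / 0.9 factors are taken from the arithmetic quoted in this
message rather than looked up independently).

```python
NO_PROGRAMMER_CONTROLS = frozenset({2101, 2103, 2111, 2113})
PRE_1998_BOILERS = frozenset(range(110, 120)) | frozenset({124, 125, 128})


def separately_timed_dhw(boiler_code: int, control_code: int) -> bool:
    """False only for a pre-1998 boiler paired with a no-programmer control."""
    return not (
        boiler_code in PRE_1998_BOILERS and control_code in NO_PROGRAMMER_CONTROLS
    )


def temperature_factor(separately_timed: bool, base: float = 0.60,
                       no_thermostat_multiplier: float = 1.3) -> float:
    """(53) for the no-cylinder-thermostat case described in the commit."""
    factor = base * no_thermostat_multiplier
    return factor * 0.9 if separately_timed else factor


timed = separately_timed_dhw(115, 2113)   # lpg-boiler "before" worksheet
factor_53 = temperature_factor(timed)
```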

Effect: the full chain (Summary PDF → extractor → mapper → cert_to_inputs
→ calculator) now reproduces the lpg-boiler worksheet's §11a unrounded
SAP -6.6499 at abs < 1e-4 (was -6.4013). Full regression suite green bar
the 3 pre-existing unrelated fails.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 10:07:27 +00:00
Khalim Conn-Kowlessar
90de1fc976 fix(elmhurst-mapper): map "Bottled gas" main fuel to bottled LPG, not mains gas
An LPG-boiler dwelling on the Summary → from_elmhurst_site_notes path
mapped to main_fuel_type=26 (mains gas), making it indistinguishable
from a mains-gas boiler downstream — wrong Table 12/32 cost / CO2 / PE
(bottled LPG is ~10.30 p/kWh vs mains gas 3.48), and it defeats any
"non-gas → gas only with a mains-gas connection" gate (an LPG dwelling
looks already-gas).

Root cause: the recommendation worksheets lodge the boiler carrier as
§15.0 "Water Heating Fuel Type: Bottled gas" (§14.0 carries only SAP
code 115, a Table 4b gas-family row, + "Main gas: Yes" in §14.2 — a
mains-gas CONNECTION, not the heating fuel). "Bottled gas" was absent
from `_ELMHURST_MAIN_FUEL_TO_SAP10`, so the §15.0 fuel resolved to None
and `_elmhurst_gas_boiler_main_fuel` fell through priority-1 to the
mains-gas meter flag → 26.

Map "Bottled gas" → 3 (bottled LPG MAIN heating): code 3 routes via
`API_FUEL_TO_TABLE_32`/`API_FUEL_TO_TABLE_12` → Table-code 3 (10.30 /
9.46 p/kWh). NOT the legacy "LPG bottled": 5 entry — API code 5 =
anthracite, and `canonical_fuel_code` resolves the same-valued Table-32
code 5 to anthracite (3.64 p/kWh), so a 5 here mis-prices the dwelling
as cheap solid fuel (verified: a 5 mapping moved SAP the WRONG way,
42.33 → 45.11; code 3 moves it to -6.40 vs the worksheet's -6.6499).
Also add 3 to `_GAS_LPG_MAIN_FUEL_CODES` so the §15.0-lodged bottled-LPG
water fuel is adopted as the boiler's space-heating carrier (priority 1)
instead of the meter flag.
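
Sketch of the resulting priority order (illustrative; this is not
`_elmhurst_gas_boiler_main_fuel` itself, the label table holds only the two
labels quoted in these commits, and the exception class is a local stand-in
for the tripwire of the same name).

```python
from typing import Optional


class MissingMainFuelType(Exception):
    """Local stand-in for the tripwire named in the commit."""


MAINS_GAS = 26
BOTTLED_LPG = 3
# Assumption: only the labels quoted above; the real mapping is larger.
LABEL_TO_FUEL = {"Mains gas": MAINS_GAS, "Bottled gas": BOTTLED_LPG}
GAS_LPG_CODES = frozenset({MAINS_GAS, BOTTLED_LPG})


def gas_boiler_main_fuel(water_fuel_label: Optional[str],
                         mains_gas_meter: bool) -> int:
    code = LABEL_TO_FUEL.get(water_fuel_label)
    if code in GAS_LPG_CODES:      # priority 1: the lodged gas/LPG water fuel
        return code
    if mains_gas_meter:            # priority 2: the mains-gas meter flag
        return MAINS_GAS
    raise MissingMainFuelType(water_fuel_label)
```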

Effect: main_fuel_type=3 (bottled LPG) and water_heating_fuel=3 (was
None). Mains-gas certs still → 26 (full regression suite green bar the 3
pre-existing unrelated fails); the MissingMainFuelType tripwire still
fires for genuinely-undeterminable carriers.

Spec: SAP 10.2 Table 12 / RdSAP 10 Table 32 (PDF p.95) — bottled LPG
main heating fuel code 3.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 08:48:15 +00:00
Khalim Conn-Kowlessar
b473f6a1ec fix(elmhurst-mapper): classify top-floor flat from roof type, not room-in-roof
`_elmhurst_dwelling_type` derived a flat's roof exposure from
`room_in_roof is not None`, so a top-floor flat whose roof is a plain
external "PS Pitched, sloping ceiling" (no room-in-roof) fell through to
"Mid-floor flat". The cascade's `_dwelling_exposure` then treats a
mid-floor flat's roof as a party ceiling (RdSAP 10 §5 / §3 — party
surfaces carry no heat loss) and drops the entire roof term: cert
001431's 105 m² roof at U=2.3 = 241.68 W/K (30) vanished, collapsing
(33) fabric heat loss 320.06 → 78.38 and over-rating SAP by ~5 points
(on top of the age-band roof-U bug — see prior commit).

Read the roof TYPE instead — the dual of the floor's "Another dwelling
below" signal. A flat's roof is a party ceiling only when its Elmhurst
code is S / A / NR (Same/Another dwelling or Non-residential space
above); F / PN / PA / PS are exposed external roofs, so the dwelling is
on the top storey. `has_exposed_roof = room_in_roof present OR
_elmhurst_roof_is_exposed(roof)` — which is exactly what the function's
own docstring already described as the intent ("RR present or external
roof"), now implemented.
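
The roof-type rule above, as a minimal sketch (names are hypothetical and
this is not the body of `_elmhurst_roof_is_exposed`; the code sets are the
ones listed in this message).

```python
PARTY_ROOF_CODES = frozenset({"S", "A", "NR"})         # dwelling/space above
EXPOSED_ROOF_CODES = frozenset({"F", "PN", "PA", "PS"})  # external roofs


def has_exposed_roof(roof_code: str, room_in_roof_present: bool) -> bool:
    """Exposed when a room-in-roof is present OR the roof type is external."""
    return room_in_roof_present or roof_code in EXPOSED_ROOF_CODES


top_floor = has_exposed_roof("PS", room_in_roof_present=False)
```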

With both upstream fixes the full chain (Summary PDF → extractor →
mapper → cert_to_inputs → calculator) reproduces the worksheet's §11a
unrounded SAP 56.3649 at abs < 1e-4, with (30)/(33)/(37) matching to
the decimal. Only flat fixture reclassified; 000784 (top-floor, RR) and
000910 (ground-floor) unchanged. Regression suite green bar the 3
pre-existing unrelated fails.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 08:18:51 +00:00
Khalim Conn-Kowlessar
1033526812 fix(elmhurst-extractor): read Main Property age band from §3.0 Date Built block
The Elmhurst Summary §3.0 "Date Built" lodges the per-building-part age
bands; the Main row reads "Main Property" / "C 1930-1949". But "Main
Property" ALSO heads the §4.0 Dimensions table, so the global
`_str_val("Main Property")` collides with it: when pdftotext renders
"3.0 Date Built:" glued onto its "Main Property" row token on one
layout line (as the recommendation worksheets do), the first standalone
"Main Property" match is the §4 dimensions header — returning its next
token "Floor" as the "age band".

That garbage age propagated to `u_roof`: for a "Pitched, sloping
ceiling" (PS) roof with no lodged insulation thickness, `u_roof` returns
the spec uninsulated U=2.3 for the correct age C but U=0.4 for the
unparseable "Floor" — collapsing the roof heat-loss term and inflating
SAP by ~14 points on the affected cert.

Scope the read to the Date-Built block (between "3.0 Date Built" and
"4.0 Dimensions") and take the first age row — a line beginning with a
single A-M band letter + space ("C 1930-1949", "A before 1900",
"J 2003-2006"). Building-part name rows never start that way, and the
Main row precedes any extension / room-in-roof rows.
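
Sketch of the scoped read (illustrative; `main_age_band` and the regex are
assumptions written from the description above, not the extractor's code).

```python
import re
from typing import Iterable, Optional

# A single A-M band letter followed by a space, e.g. "C 1930-1949".
_AGE_ROW_RE = re.compile(r"^[A-M] \S")


def main_age_band(lines: Iterable[str]) -> Optional[str]:
    """First age row between "3.0 Date Built" and "4.0 Dimensions"."""
    in_block = False
    for line in lines:
        text = line.strip()
        if "4.0 Dimensions" in text:
            break
        if "3.0 Date Built" in text:
            in_block = True          # the header may be glued to "Main Property"
            continue
        if in_block and _AGE_ROW_RE.match(text):
            return text
    return None


layout = [
    "3.0 Date Built: Main Property",
    "C 1930-1949",
    "1st Extension",
    "J 2003-2006",
    "4.0 Dimensions",
    "Main Property",
    "Floor",
]
band = main_age_band(layout)
```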

Regression: full sap10_calculator + documents_parser suite green bar the
3 pre-existing unrelated fails (2 stone-wall U tests, test_total_floor_
area); the multi-bp / "A before 1900" fixtures (000516, 001431_case*,
6035) keep their age bands.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 16:41:00 +00:00
Jun-te Kim
3b7d26fe34 added test for 1000 examples 2026-06-09 16:02:21 +00:00
Daniel Roth
5178cd02c5 UploadedFile, FileTypeEnum, FileSourceEnum importable from infrastructure.postgres.uploaded_file_table 🟩
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-09 11:50:51 +00:00
Jun-te Kim
b48700e964 Merge branch 'main' into feature/junte+khalim 2026-06-08 16:56:15 +00:00
Jun-te Kim
0ccb0e0bf9
Merge pull request #1196 from Hestia-Homes/feautre/additional_properties_for_tracking
Feature/additional properties for tracking
2026-06-08 16:19:34 +01:00
Jun-te Kim
a1b4bf4e98 added 4 deal properties 2026-06-08 14:42:06 +00:00
Khalim Conn-Kowlessar
24492aa4ba Merge origin/main into feature/bill-derivation (calculator + mapper fixes)
Pulls in 42 commits of calculator/mapper accuracy fixes from the per-cert
mapper-validation and floor/roof/heating fronts.

Conflict resolutions:
- mapper `_is_elmhurst_roof_window`: main dropped the branch's "wall location →
  vertical" guard (it broke cert 000516's rooflight), but that re-broke cert
  001431's two External-wall U>3.0 windows (which must stay vertical). The two
  certs lodge a BYTE-IDENTICAL §11 row, so neither location nor U separates
  them — the real discriminator is the room-in-roof context. Replaced the
  unconditional U>3.0 backstop with one gated on the BP having a room-in-roof
  (`_elmhurst_bp_has_room_in_roof`): 000516's Main BP has a "Room in roof type
  1" (→ rooflight), 001431's does not (→ vertical). Validated against BOTH —
  full Elmhurst worksheet suite 1038 pass + the 001431 window-extraction pin.
- property_postgres_repository: kept main's `ids_by_uprn` method + the branch's
  `_restrictions_of` helper.
- sap_fuel.py: the branch relocated it to domain/billing/ (already carrying
  main's to_table_32_code normalization), so kept the old path deleted.

Fallout from main's fabric fixes (validated by the boiler-3 real-cert pin which
still reproduces at delta 0):
- re-pinned the boiler-1 + boiler-instant-hw ASHP snapshot scores;
- main's §14.2 gas-boiler main-fuel derivation resolved the BGB/102 baseline
  gap, so `test_gas_boiler_instant_hw_before_baselines` is now a passing test
  (was an xfail tripwire).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 13:12:21 +00:00
Daniel Roth
bd4ad9022c Merge branch 'main' into feature/handle-new-magicplan-response-structure 2026-06-08 12:36:27 +00:00
KhalimCK
1b94da16d0
Merge pull request #1189 from Hestia-Homes/feature/per-cert-mapper-validation
Feature/per cert mapper validation
2026-06-08 13:19:58 +01:00
Daniel Roth
c22ee3821b Merge branch 'main' into feature/handle-new-magicplan-response-structure 2026-06-08 09:57:26 +00:00
Daniel Roth
41a40c9ba0 Fix Pylance unknowns in SQLModel table tests and correct pytest paths 2026-06-08 09:56:54 +00:00
Khalim Conn-Kowlessar
6b04514645 fix(mapper): resolve gas-boiler main fuel from §14.2 mains-gas meter
A Summary §14.0 Table 4b gas boiler (SAP code 101-119) lodges no §14.0
"Fuel Type" string in the newer Elmhurst export. The carrier was resolved
only from §15.0 "Water Heating Fuel Type" — fine when the same boiler
heats the water, but a gas boiler paired with a SEPARATE electric
immersion lodges §15.0 "Electricity", so `_elmhurst_gas_boiler_main_fuel`
returned None and the cascade strict-raised MissingMainFuelType.

Cert 001431 boiler-1/boiler-2 "before" variants are exactly this config:
§14.0 SAP code 102/104 (mains-gas boiler), §15.0 electric immersion
(code 909), §14.2 Meters "Main gas: Yes". The meter flag is the
authoritative carrier signal — a 101-119 boiler on mains gas burns mains
gas — so adopt it (SAP10 main_fuel 26 per _ELMHURST_MAIN_FUEL_TO_SAP10
"Mains gas") when §15.0 can't disambiguate. §15.0 gas/LPG still wins when
present (keeps LPG-vs-mains-gas precision); no mains-gas meter + non-gas
§15.0 still strict-raises rather than guessing.

Spec: SAP 10.2 Table 4b "Seasonal efficiency for gas and liquid fuel
boilers" (PDF p.168), rows 101-119. Both certs now resolve main_fuel=26
and compute (was: hard raise).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-06 17:48:04 +00:00
Khalim Conn-Kowlessar
27375d93a4 fix(u-value): solid brick as-built U by thickness — §5.7 Table 13
A 440 mm (>420 mm) solid brick AS-BUILT wall computed U = 1.70 (the
220 mm bucket default) instead of the RdSAP-correct 1.10. The §5.7
Table 13 thickness path only fired for *insulated* brick (external/
internal + thickness > 0); the as-built case fell through to the
Table 6 cavity/solid age-band default.

Spec: RdSAP 10 Specification (9th June 2025), §5.7 "U-values for
uninsulated brick walls, age bands A to E", Table 13 (PDF p.40):
  ≤200 mm → 2.5, 200–280 mm → 1.7, 280–420 mm → 1.4, >420 mm → 1.1.
Table 6 footnote (b) on the "Solid brick as built" row (PDF p.40):
"Or from 5.7 if wall thickness is other than 200mm to 280mm" — the
thickness table supersedes the flat 1.7 default whenever a documentary
wall thickness is lodged (200–280 mm gives 1.7 either way). The §5.8 /
Table 14 dry-lining R is added on top only when the wall is dry-lined,
per the §5.7 closing sentence.
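
The Table 13 bands quoted above, as a lookup sketch (illustrative;
`solid_brick_as_built_u` is a hypothetical name, and treating exactly 280 mm
as the 1.7 band is an assumption, since the quoted ranges share that
endpoint).

```python
def solid_brick_as_built_u(thickness_mm: float) -> float:
    """Uninsulated solid-brick U-value by wall thickness (age bands A to E)."""
    if thickness_mm <= 200:
        return 2.5
    if thickness_mm <= 280:   # assumption: 280 mm falls in the 1.7 band
        return 1.7
    if thickness_mm <= 420:
        return 1.4
    return 1.1


case_21 = solid_brick_as_built_u(440)   # previously billed at 1.70
case_20 = solid_brick_as_built_u(220)   # unchanged by this fix
```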

Validated against the user-generated Elmhurst worksheet "simulated
case 21" (replica of API cert 2818-3053-3203-2655-9204: mid-terrace,
age band B, solid brick as-built 440 mm, room-in-roof). New §3 cascade
pin `test_section_3_wall_u_by_thickness_case21_match_pdf` routes the
Summary through the real extractor + mapper and pins:
  (31) 155.1000, (33) 175.6208, (36) 23.2650, (37) 198.8858 — all 1e-4.
External walls Main U → 1.1000; Sheltered RR gable → 1/(1/1.10+0.5) =
0.71 (was 0.92). Pinned on §3 only (case-6 precedent): its code-908
instantaneous multi-point gas water heater has a separate §4 (219) gap.

Cross-check: sim case 20 (220 mm) stays at 1.70 — unchanged.

API SAP accuracy (scripts/eval_api_sap_accuracy.py, 896 computed certs):
% |err| < 0.5 SAP vs lodged: 42.6% → 43.8%; mean |err| 2.045 → 2.010.

Regression: tests/domain/sap10_calculator/ (1861), backend/
documents_parser/tests/ (574), datatypes/epc/ + rdsap golden fixtures
all green (pre-existing test_total_floor_area excepted). pyright strict
net-zero. No solid-brick fixture pin shifted (200–280 mm unchanged).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-06 14:40:06 +00:00
Khalim Conn-Kowlessar
1ed6d06804 fix(mapper): drop only U=0 internal RR stud walls, keep positive-U ones
A Detailed room-in-roof lodges "Stud Wall" surfaces, but the cascade billed
every one through Table 17 from its insulation — over-counting fabric on
internal studs that carry no heat loss. sim case 20's two studs lodge §8.1
Default U-value 0.00 and the P960 worksheet omits them from BOTH fabric heat
loss (§3: (33)=285.9847) and total exposed area (31)=239.68; the cascade
computed ~0.52 each → (33) +4.16 W/K and continuous SAP 43.05 vs 43.6322.

Gate the drop on the lodged Default U-value: 0.00 → internal knee wall,
return None (no heat loss, no area); positive → a real exposed knee wall
(cert 000565 Ext2 Detailed: 0.31 / 0.10) that still falls through to the
Table-17 path. The earlier over-broad "drop all studs" zeroed 000565's
genuine studs — this keeps them.

Pins test_summary_001431_case20_fabric_heat_loss_matches_worksheet_line_33
((33)=285.9847 at 1e-4); case 20 continuous SAP now EXACT (43.6322). 2850
pass (the lone test_total_floor_area failure is pre-existing on base);
pyright strict net-zero (32=32).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-06 10:47:30 +00:00
Khalim Conn-Kowlessar
795d36b732 fix(extractor): re-join §11 windows whose Area cell split onto its own line
Sim case 20's §11 lodges 5 windows but only 1 surfaced. The "W H Area"
cells tokenize inconsistently: a narrow Area column keeps all three on one
line ("1.80 2.10 3.78" — matches _WIDTH_HEIGHT_AREA_RE), but a wider Area
column triggers pdftotext's 2+-space split, dropping the Area onto its own
line ("5.79 2.00" then "11.58"). The 3-decimal data anchor never matched
those four rows, so they were lost — gutting §6 solar gains (5 windows →
1) and dropping continuous SAP 43.05 → 38.32 vs the worksheet's 43.6322.

Pre-merge a "W H" line + a following lone-decimal Area into the canonical
"W H Area" line, gated on Area ≈ W × H (the §11 Area is always the product)
so a frame factor / g-value / U-value below a dimension line is never
absorbed. One-line layouts (3 decimals) are untouched.
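
A minimal sketch of the pre-merge (illustrative; names, regexes and the 0.01
tolerance are assumptions, not the extractor's actual values).

```python
import re
from typing import List

_WH_RE = re.compile(r"^(\d+\.\d+) (\d+\.\d+)$")
_LONE_DECIMAL_RE = re.compile(r"^\d+\.\d+$")


def rejoin_split_area(lines: List[str], tol: float = 0.01) -> List[str]:
    """Merge a "W H" line with a following lone Area when Area ≈ W × H."""
    out: List[str] = []
    i = 0
    while i < len(lines):
        wh = _WH_RE.match(lines[i].strip())
        nxt = lines[i + 1].strip() if i + 1 < len(lines) else ""
        if wh and _LONE_DECIMAL_RE.match(nxt):
            width, height = float(wh.group(1)), float(wh.group(2))
            if abs(width * height - float(nxt)) <= tol:
                out.append(f"{wh.group(1)} {wh.group(2)} {nxt}")
                i += 2
                continue
        out.append(lines[i])
        i += 1
    return out


rows = rejoin_split_area(
    ["5.79 2.00", "11.58", "1.80 2.10 3.78", "1.20 1.00", "0.70"]
)
```

The trailing "0.70" stays on its own line because it is not the product of
the dimensions above it.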

Pins via test_summary_001431_case20_extracts_all_five_section11_windows
(Summary_001431_case20.pdf mirrors sap worksheets/golden fixture debugging/
simulated case 20/). 573 documents_parser tests pass; pyright strict net-zero.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-06 10:35:21 +00:00
Jun-te Kim
98297f803a
Merge pull request #1186 from Hestia-Homes/feature/landlord_data
fix
2026-06-05 20:03:55 +01:00
Jun-te Kim
e60ca6ee5d source of the problem in address2uprn 2026-06-05 19:03:33 +00:00
Khalim Conn-Kowlessar
efa10fe6cb S0380.239: system-build walls take masonry structural infiltration (0.35)
RdSAP 10 §2 (Ventilation, "Walls" row): "Structural infiltration: 0.25
for steel or timber frame or 0.35 for masonry construction ... System
build: treated as masonry." `_is_timber_or_steel_frame` wrongly included
wall_construction code 6 (system build) alongside code 5 (timber frame),
handing system-build dwellings the 0.25 structural ACH instead of 0.35.

On the cat-10 room-heater fixture (ref 001431, walls SY System Build →
code 6) this under-stated the infiltration rate (18) by exactly 0.10
(0.45 vs worksheet 0.55), dropping the effective air change (25), the
ventilation heat loss (38)m = 0.33 × (25)m × (5), and the heat-transfer
coefficient (39) — so space-heating demand (98) came out 404 kWh low
((211) 11158.6 vs worksheet 11563.2). Restrict the 0.25 branch to code 5
only; code 6 (and everything else) is masonry at 0.35.
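
Sketch of the corrected branch (illustrative; `structural_infiltration` is a
hypothetical name, and the codes are the wall_construction values named
above).

```python
TIMBER_FRAME = 5
SYSTEM_BUILD = 6


def structural_infiltration(wall_construction_code: int) -> float:
    """0.25 ACH for code 5 only; everything else is masonry at 0.35."""
    return 0.25 if wall_construction_code == TIMBER_FRAME else 0.35


shortfall = structural_infiltration(SYSTEM_BUILD) - structural_infiltration(TIMBER_FRAME)
```

`shortfall` is the 0.10 by which (18) was under-stated before the fix.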

Pins the rating-block (38)m ventilation heat loss mean = 83.3613 W/K at
abs 1e-4 and asserts the classifier treats the system-build wall as
masonry. §4 suite green (2415 passed, 1 skipped); no existing fixture
relied on system-build → 0.25.

Residual after this slice: SAP +0.03 / cost -£0.95 — a small fabric (33)
gap (-0.15 W/K) plus lighting (232) +1.0 kWh remain as separate causes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 19:00:02 +00:00
Khalim Conn-Kowlessar
cbdee9ec3c S0380.238: single-point instantaneous water heaters incur no distribution loss
Water heating SAP code 909 (electric instantaneous) and 907 (single-point
gas) heat water at the point of use, serving one outlet with no
distribution pipework. Per SAP 10.2 §4 (p.23, l.1416): "'Single-point'
heaters, which are located at the point of use and serve only one outlet,
do not have distribution losses either." So worksheet (46)m = 0 and the
heat-required line collapses to SAP 10.2 worksheet l.7704
  (62)m = 0.85 × (45)m + (46)m + (57)m + (59)m + (61)m
        = 0.85 × (45)m   (all loss terms zero for a no-cylinder system).

`distribution_loss_monthly_kwh` already supported the
`is_instantaneous_at_point_of_use` flag (and its docstring already named
codes 907/909), but `water_heating_from_cert` hard-coded it to False, so
the cascade applied (46)m = 0.15 × (45)m to single-point heaters. That
0.15 distribution loss exactly cancelled the 0.85 reduction, leaving
(62)m = (45)m. On the cat-10 room-heater fixture (ref 001431, code 909)
that over-stated the water fuel (219) as 2082.6250 instead of the
worksheet's 1770.2313, and inflated the (65)m heat gains (692.47 vs
worksheet 442.55) which in turn suppressed space-heating demand.
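
Worked check (illustrative sketch; `heat_required` is a hypothetical name,
and the monthly formula is applied here to the annual totals quoted above
purely to show that they sit in the 0.85 ratio).

```python
def heat_required(energy_content_45: float, single_point: bool,
                  other_losses: float = 0.0) -> float:
    """(62) = 0.85 × (45) + (46) + other losses; (46) = 0 for single-point."""
    distribution_46 = 0.0 if single_point else 0.15 * energy_content_45
    return 0.85 * energy_content_45 + distribution_46 + other_losses


before_fix = heat_required(2082.6250, single_point=False)  # 0.85 + 0.15 cancel
after_fix = heat_required(2082.6250, single_point=True)
```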

Thread the cert's existing instantaneous flag (`_INSTANTANEOUS_WATER_CODES`
= {907, 909}) through `_water_heating_worksheet_and_gains` into both the
demand-pass and final `water_heating_from_cert` calls.

Pins (219) water fuel = 1770.2313 at abs 1e-4 via the extractor → mapper →
rating cascade. §4 suite green (2414 passed, 1 skipped); no existing
fixture exercised the 907/909 path. The residual space-heating fuel gap
((211) 11158.59 vs worksheet 11563.17) this exposes is a separate cause —
next slice.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 19:00:02 +00:00
Khalim Conn-Kowlessar
97f44b5364 fix(extractor): capture all 17 openable §11 windows on cert 001431
cert 001431's §11 lodges 17 windows but only 14 surfaced, via two distinct gaps:

1. Extractor (_extract_windows_from_layout): the one "Double glazing, known
   data" row whose §11 Data-Source cell is "BFRC data" was rejected — it is
   laid out as a standalone keyword line with the U-value on the next line
   and lodges no Frame Type/Factor/Gap cells, so it never matched the joined
   "<source> <U>" Manufacturer-line shape. Now anchored by a standalone
   data-source form, with the RdSAP 10 §3.7 default frame factor (0.7) for
   the absent frame cell.

2. Mapper (_is_elmhurst_roof_window): the two "Double pre 2002" rows
   (U 3.1 / 3.4 > 3.0) were reclassified as roof windows by the U-value
   backstop even though both are lodged on an "External wall". A window
   lodged on a wall is vertical by definition; guard the U-value backstop so
   it only fires when location/BP give no roof signal.

With both closed: 17 sap_windows, 0 misrouted to sap_roof_windows.

Re-homed onto the mapper-validation line from feature/bill-derivation
(orig f68cea27); the modelling-only regression test
(tests/domain/modelling/test_window_extraction_001431.py) stays on
bill-derivation. KNOWN: the mapper guard breaks cert 000516's
test_summary_pdf_mapper_chain pins (W6 U=3.10 routing) — must be resolved
before this is PR'd to main.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 19:00:02 +00:00
Daniel Roth
198d2afdb1 Merge branch 'main' into feature/handle-new-magicplan-response-structure 2026-06-05 14:35:56 +00:00
Daniel Roth
8e349704b1 move magic plan handler to applications/ 2026-06-05 14:33:26 +00:00
Khalim Conn-Kowlessar
f68cea27c9 fix(extractor): capture all 17 openable §11 windows on cert 001431
The Modelling glazing overlay's draught-proofing recompute (RdSAP 10 §8.1 —
a count over openable windows + doors) needs every openable window captured
with its draught_proofed flag. cert 001431's §11 lodges 17 windows but only
14 surfaced, via two distinct gaps:

1. Extractor (_extract_windows_from_layout): the one "Double glazing, known
   data" row whose §11 Data-Source cell is "BFRC data" was rejected — it is
   laid out as a standalone keyword line with the U-value on the next line
   and lodges no Frame Type/Factor/Gap cells, so it never matched the joined
   "<source> <U>" Manufacturer-line shape. Now anchored by a standalone
   data-source form, with the RdSAP 10 §3.7 default frame factor (0.7) for
   the absent frame cell.

2. Mapper (_is_elmhurst_roof_window): the two "Double pre 2002" rows
   (U 3.1 / 3.4 > 3.0) were reclassified as roof windows by the U-value
   backstop even though both are lodged on an "External wall". A window
   lodged on a wall is vertical by definition; guard the U-value backstop so
   it only fires when location/BP give no roof signal. The backstop's only
   pinned cert (000516 W6) hand-builds its sap_roof_windows and so is
   unaffected.

With both closed: 17 sap_windows, 0 misrouted to sap_roof_windows, 14
draught-proofed — reconstructing Elmhurst's lodged 84% (16/19 = (14 windows
+ 2 doors) / (17 windows + 2 doors)). Full calculator + modelling +
orchestration suites green (1885 pass); the 2 glazing draught-proofing
xfails remain (the overlay recompute is the glazing agent's front).
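
The 84% reconstruction, as arithmetic (illustrative; rounding to the nearest
whole percent is an assumption, and `draught_proofed_percent` is a
hypothetical name).

```python
def draught_proofed_percent(windows_proofed: int, doors_proofed: int,
                            windows: int, doors: int) -> int:
    """Share of openable windows + doors that are draught-proofed."""
    return round(100 * (windows_proofed + doors_proofed) / (windows + doors))


percent = draught_proofed_percent(14, 2, 17, 2)   # 16 / 19
```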

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 14:33:25 +00:00
Jun-te Kim
6778c427bc
Merge pull request #1181 from Hestia-Homes/feature/landlord_data
property override
2026-06-05 15:16:06 +01:00
Daniel Roth
37b5a3a6e5 move domain code out of datatypes/domain 2026-06-05 14:07:28 +00:00
Daniel Roth
b3b4ae2191 Convert Door.width_mm to store actual millimetres (multiply size.x by 1000)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-05 13:30:54 +00:00
Daniel Roth
0211fb8092 Migrate all MagicPlan tests to single new-format fixture 🟪
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-05 12:59:56 +00:00
Jun-te Kim
b07db1ef6b property override 2026-06-05 12:18:13 +00:00
Khalim Conn-Kowlessar
2c36a8e1d6 Merge remote-tracking branch 'origin/main' into feature/bill-derivation
# Conflicts:
#	repositories/property/property_postgres_repository.py
#	tests/orchestration/fakes.py
2026-06-05 11:09:00 +00:00
Daniel Roth
5a582bbff0 Merge branch 'main' into feature/handle-new-magicplan-response-structure 2026-06-05 11:01:28 +00:00
KhalimCK
3bdfa0287c
Merge pull request #1169 from Hestia-Homes/feature/per-cert-mapper-validation
Feature/per cert mapper validation
2026-06-05 11:50:11 +01:00
Daniel Roth
ebd6f1623f Merge branch 'main' into feature/handle-new-magicplan-response-structure 2026-06-05 10:16:14 +00:00
Khalim Conn-Kowlessar
c882cb2c95 review: typehint Optional locals around _parse_thickness_mm call sites
PR feedback (dancafc): `_parse_thickness_mm` handles a None input and
returns Optional[int], so its call-return locals — and the Optional[str]
raws they read from `_local_val` — read clearer when annotated. Annotates
`thickness_raw`/`ins_thickness_raw: Optional[str]` and
`thickness_mm`/`insulation_thickness_mm: Optional[int]` at all four call
sites (_wall_details_from_lines, _alternative_walls_from_lines,
_roof_details_from_lines, _floor_details_from_lines), plus the adjacent
`u_val_raw`/`default_u` Optional pair in _floor_details_from_lines for
consistency. Matches the project convention of typehinting call-return
locals. No behaviour change; pyright clean, 569 parser tests pass.
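
A minimal sketch of the annotation pattern described above. The parser body here is a hypothetical stand-in (first integer in the cell, else None) and `wall_thickness_from_cell` is an invented wrapper; only the Optional-in / Optional-out shape and the local names `thickness_raw` / `thickness_mm` come from the message.

```python
import re
from typing import Optional


def parse_thickness_mm_sketch(raw: Optional[str]) -> Optional[int]:
    """Hypothetical stand-in for _parse_thickness_mm: tolerates None input."""
    if raw is None:
        return None
    match = re.search(r"\d+", raw)
    return int(match.group()) if match else None


def wall_thickness_from_cell(cell: Optional[str]) -> Optional[int]:
    # Call-return locals carry explicit Optional annotations so the
    # None path is visible at the call site.
    thickness_raw: Optional[str] = cell
    thickness_mm: Optional[int] = parse_thickness_mm_sketch(thickness_raw)
    return thickness_mm
```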

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 09:56:06 +00:00
Khalim Conn-Kowlessar
8323d9cf07 Merge branch 'feature/per-cert-mapper-validation' of https://github.com/Hestia-Homes/Model into feature/bill-derivation 2026-06-05 09:38:40 +00:00
Khalim Conn-Kowlessar
8133521c43 S0380.237: map "Secondary glazing - Low emissivity" → SAP 10.2 code 12
Completes the secondary-glazing family. S0380.235 mapped the unknown-data
(7) and normal-emissivity (11) secondary variants; the RdSAP-21.0.1
`glazed_type` enum also defines code 12 "secondary glazing, low
emissivity", whose Elmhurst §11 label "Secondary glazing - Low
emissivity" was unmapped and would strict-raise. Cascade code 12 carries
the same daylight/solar bucket as 7/11 (g_L=0.80, g⊥=0.76); the lodged
manufacturer U/g drive §3/§6. With this the double family (codes 1/2/3/
7/13 via their Elmhurst phrasings) and the secondary family (4/11/12) are
fully covered. Coverage test extended.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 09:35:35 +00:00
Khalim Conn-Kowlessar
ea35bed24c S0380.236: extension party-wall type read independently of "As Main Wall"
RdSAP 10 §3.3: "As Main Wall: Yes" makes an extension inherit the main
dwelling's external wall CONSTRUCTION only — the party wall type is
lodged separately per building part in the Summary §7 block and may
differ. `_extract_extensions` was copying `main_walls.party_wall_type`
into the inherited WallDetails, so every extension reused the main's
party wall U.

On the double_glazing fixture (Summary_001431) the Main lodges party
"CU Cavity masonry unfilled" (SAP10 wall_construction 4 → u_party_wall
0.5) but the 1st Extension lodges "U Unable to determine" (→ 0 → RdSAP
default 0.25). Pre-fix both building parts used 0.5, inflating worksheet
(32) party-wall heat loss by 6.56 W/K (Ext1 26.25 m² × 0.25). After the
fix worksheet (32) is exact: ours 32.573 vs worksheet 32.5725.

Now reads the extension's own "Party Wall Type" from its §7 chunk,
falling back to the main's only when the extension lodges none. Adds a
fixture + test asserting Main=4 / Ext=0 with distinct u_party_wall.
Suite 2413 pass; no cohort regression.
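
The fallback and the 6.56 W/K figure can be illustrated as follows. Function and table names are hypothetical; the table holds only the two codes quoted above (4 → 0.5, and 0 → the RdSAP default 0.25), and the area is the Ext1 26.25 m² from the message.

```python
from typing import Optional

# Only the two party-wall codes quoted above.
U_PARTY_WALL_BY_CODE = {4: 0.5, 0: 0.25}


def extension_party_wall_type(own: Optional[int], main: int) -> int:
    """Prefer the extension's own lodged type; fall back to the main's."""
    # Explicit None check: code 0 ("Unable to determine") is a valid lodging.
    return main if own is None else own


def party_wall_heat_loss(area_m2: float, wall_type: int) -> float:
    """Party-wall heat loss in W/K for one building part."""
    return area_m2 * U_PARTY_WALL_BY_CODE[wall_type]


ext_area_m2 = 26.25
pre_fix = party_wall_heat_loss(ext_area_m2, 4)  # type copied from Main
post_fix = party_wall_heat_loss(
    ext_area_m2, extension_party_wall_type(0, main=4)
)
inflation = pre_fix - post_fix  # 26.25 m² × (0.5 − 0.25) = 6.5625 W/K
```

The `is None` test matters here: a truthiness check would treat the lodged code 0 as missing and silently reintroduce the inherited type.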

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 09:19:43 +00:00
Khalim Conn-Kowlessar
3e45b7fa3b S0380.235: map the remaining Elmhurst §11 glazing labels to SAP 10.2 Table 6b
The double_glazing recommendation fixture (Summary_001431) exercises every
RdSAP-21 §11 glazing lodging in one cert; five labels were missing from
`_ELMHURST_GLAZING_LABEL_TO_SAP10` and strict-raised `UnmappedElmhurstLabel`:

  "Secondary glazing"                     -> 7   (Table 6b "secondary glazing", g_L 0.80)
  "Secondary glazing - Normal emissivity" -> 11  (RdSAP-21 secondary normal-E, g_L 0.80)
  "Triple pre 2002"                       -> 10  (triple pre-2002, g_L 0.70)
  "Triple with unknown install date"      -> 6   (generic triple glazed, g_L 0.70)
  "Single glazing, known data"            -> 15  (single known-data, g_L 0.90)

The glazing code's only cascade effect is the §5 (66)..(67) daylight factor
g_L in `_G_LIGHT_BY_GLAZING_CODE` (single 0.90 / double+secondary 0.80 /
triple 0.70); the lodged manufacturer U-value and solar_transmittance drive
§3 / §6 directly (`_g_perpendicular` prefers the lodged value). Codes are the
semantically-exact RdSAP-21 rows within the correct g_L bucket, kept distinct
for the strict-raise audit trail. Adds a full-coverage test over all 13
distinct labels. Suite 2413 pass.
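
For illustration, the five new labels and their g_L buckets could be laid out as below. This is a sketch, not the mapper module: the dictionaries are cut down to the entries listed above, and the exception is a local stand-in for `UnmappedElmhurstLabel`.

```python
class UnmappedElmhurstLabel(KeyError):
    """Local stand-in for the strict-raise exception named above."""


# The five labels added in this commit and their SAP 10.2 codes.
GLAZING_LABEL_TO_SAP10 = {
    "Secondary glazing": 7,
    "Secondary glazing - Normal emissivity": 11,
    "Triple pre 2002": 10,
    "Triple with unknown install date": 6,
    "Single glazing, known data": 15,
}

# Daylight factor g_L per code, from the buckets quoted above.
G_LIGHT_BY_GLAZING_CODE = {7: 0.80, 11: 0.80, 10: 0.70, 6: 0.70, 15: 0.90}


def glazing_code(label: str) -> int:
    """Map a §11 label to its code, raising on anything unmapped."""
    try:
        return GLAZING_LABEL_TO_SAP10[label]
    except KeyError:
        raise UnmappedElmhurstLabel(label) from None


def daylight_factor(label: str) -> float:
    return G_LIGHT_BY_GLAZING_CODE[glazing_code(label)]
```

Keeping the codes distinct inside one g_L bucket costs nothing in the cascade and preserves which label was actually lodged.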

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 08:15:11 +00:00
Khalim Conn-Kowlessar
9521d52403 S0380.234: PV diverter (Appendix G4) — diverts surplus PV to the cylinder
SAP 10.2 Appendix G4 (PDF p.72-73). A PV diverter routes surplus PV
generation (the would-be export EPV,m × (1 − βm)) to an immersion heater
in the hot-water cylinder. Per G4 step 4:

    SPV,diverter,m = EPV,m × (1 − βm) × 0.8 × fPV,diverter,storageloss

(0.8 = cylinder heat-acceptance; fPV,diverter,storageloss = 0.9 for the
higher storage temperature), clamped to ≤ (62)m + (63a)m, and entered as
the negative worksheet (63b)m (step 5). The β factor is computed on the
PRE-diverter (219) per the §3a note (lines 5485-5486). Effects:
  - (64)m = (62)m + (63b)m → less main-system water-heating fuel (219);
  - export drops to EPV,ex,m = EPV,m(1 − βm) + (63b)m / 0.9 (§4 p.94
    line 5501); the onsite dwelling portion EPV,m × βm is unchanged.

Inclusion (G4 step 1) requires ALL of: a PV system connected to the
dwelling; a cylinder larger than (43) average daily HW use; no solar
water heating; no battery — else the diverter is disregarded.
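
The step-1 inclusion test and the step-4/5 monthly arithmetic above can be sketched as below. All names and parameters are hypothetical, and the sketch follows the formulas exactly as written in this message (including the sign convention that (63b)m is negative); it is not the `_pv_diverter_monthly_kwh` implementation.

```python
from typing import Tuple

HEAT_ACCEPTANCE = 0.8      # cylinder heat-acceptance factor
STORAGE_LOSS_FACTOR = 0.9  # fPV,diverter,storageloss


def diverter_applies(
    has_pv: bool,
    cylinder_litres: float,
    daily_hw_use_43: float,
    has_solar_water_heating: bool,
    has_battery: bool,
) -> bool:
    """G4 step 1: every condition must hold, else the diverter is ignored."""
    return (
        has_pv
        and cylinder_litres > daily_hw_use_43
        and not has_solar_water_heating
        and not has_battery
    )


def diverter_month(
    epv_m: float, beta_m: float, row_62_m: float, row_63a_m: float
) -> Tuple[float, float]:
    """Return ((63b)m, EPV,ex,m) for one month, in kWh."""
    surplus = epv_m * (1.0 - beta_m)  # would-be export before diversion
    s_diverter = surplus * HEAT_ACCEPTANCE * STORAGE_LOSS_FACTOR
    s_diverter = min(s_diverter, row_62_m + row_63a_m)  # clamp to demand
    row_63b_m = -s_diverter  # entered as a negative worksheet row
    export_m = surplus + row_63b_m / STORAGE_LOSS_FACTOR
    return row_63b_m, export_m
```

For example, with EPV,m = 100 kWh, βm = 0.4 and ample demand, the surplus is 60 kWh, of which 60 × 0.8 × 0.9 = 43.2 kWh reaches the cylinder and 60 − 43.2 / 0.9 = 12 kWh is still exported.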

Three layers:
  - extractor reads Summary §19 "Diverter present"; schema 21.0.0/21.0.1
    SapEnergySource gains `pv_diverter` (API `sap_energy_source.pv_diverter`);
  - `Renewables.pv_diverter_present` + domain `SapEnergySource.pv_diverter_present`,
    set in both the Elmhurst and API mapper paths;
  - `_pv_diverter_monthly_kwh` applies the G4 math after the β split;
    `cert_to_inputs` recomputes (219) and the PV export.

On simulated case 19 (electric storage heaters, 7-hour, PV + diverter):
SAP continuous 50.33 → 51.34 (worksheet 51.2221; both round to the
lodged 51), cost (255) 1847.5 → 1812.3 (ws 1816.6), CO2 (272) 3331 →
3120 (ws 3126), with (233a) dwelling 1280.6 (ws 1280.4). The residual
+0.11 SAP is an upstream winter Appendix-M monthly-EPV-shape gap +
fabric (33) +1.0, tracked as the next case-19 cause. Suite: 2412 pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 22:59:12 +00:00
Khalim Conn-Kowlessar
f326e4eb53 mapper: Elmhurst path populates roof_construction (int) for cross-mapper parity
The gov-EPC API mapper sets BOTH roof_construction (int) and
roof_construction_type (str, derived via _API_ROOF_CONSTRUCTION_TO_STR),
but the Elmhurst mapper set only the string — leaving roof_construction
None on every site-notes cert. The SAP cascade reads the STRING (so SAP
cross-mapper parity always held), but consumers of the int (e.g.
domain/sap10_ml/transform.py ML aggregates `main_dwelling_roof_
construction`) silently saw None on the Elmhurst path.

New `_elmhurst_roof_construction_int` maps the Elmhurst roof-type code to
the same SAP10 int the API lodges (F→1, PN→3, PA→4, PS→8, S/A→7),
harvested from the committed Summary fixtures. Unlike the wall map it
returns None (not a strict-raise) for unmapped codes: the int is not
cascade-load-bearing, so an unknown roof must not block the cert (vaulted
5 / thatched 6 / NR omitted until a fixture surfaces them).
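
A rough sketch of the lookup described above. The names are hypothetical, and reading "S/A→7" as two separate codes that both map to 7 is an assumption; the remaining pairs are as listed. The point illustrated is the soft failure: an unknown code yields None rather than raising.

```python
from typing import Optional

# Elmhurst roof-type code -> SAP10 roof_construction int, as listed above.
# Assumption: "S/A" is read as two codes, S and A, both mapping to 7.
ROOF_CODE_TO_SAP10_INT = {"F": 1, "PN": 3, "PA": 4, "PS": 8, "S": 7, "A": 7}


def roof_construction_int_sketch(code: Optional[str]) -> Optional[int]:
    """Return the SAP10 int, or None for a missing or unmapped code."""
    if code is None:
        return None
    # dict.get returns None instead of raising: the int is not
    # load-bearing for the cascade, so an unknown roof must not block.
    return ROOF_CODE_TO_SAP10_INT.get(code.strip().upper())
```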

The 6 hand-built U985 reference fixtures gain the matching
roof_construction int (4/4/3 etc.) so test_from_elmhurst_site_notes_
matches_hand_built_* still asserts structural parity. SAP output is
unchanged (cascade reads the string). §4 suite green (2407 passed); the
two pre-existing stone-§5.6 sap10_ml failures are unrelated/out of scope.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 21:16:20 +00:00