Commit graph

7 commits

Author SHA1 Message Date
Khalim Conn-Kowlessar
2b1f90a7de S0380.199: site-notes "Roof of Room" windows → roof windows (cross-mapper parity with S0380.198)
The Elmhurst extractor crashed parsing simulated-case-6's room-in-roof
window rows: the §11 "Location" cell "Roof of Room in Roof" wraps across
the layout prefix/suffix blocks and leaked into the glazing-type phrase
("Double between 2002 Roof of Room and 2021 in Roof" → UnmappedElmhurst-
Label). Fix (`_parse_window_from_anchors`): detect the roof-of-room
location tokens, strip them from the before/after blocks so the glazing
phrase reconstructs cleanly, and set location="Roof of Room".

Mapper: `_is_elmhurst_roof_window` gains a "Roof of Room" location branch
(highest-confidence rooflight signal, above the BP-roof-type / U>3.0
gates); `_ELMHURST_ROOF_WINDOW_U_BY_GLAZING` gains "Double between 2002
and 2021" → 2.30 (case 6 lodges the already-inclined roof-window U, so
the +0.30 inclination adjustment must not double-apply).

This is the site-notes mirror of S0380.198 (API window_wall_type=4):
both paths now route room-in-roof rooflights to (27a) at the inclined U.
Validated against the case-6 P960 worksheet at abs=1e-4:
  (27)  Windows      = 22.7408 (cascade 22.7407)
  (27a) Roof Windows = 13.0375 (cascade 13.0375, EXACT)
  (31)  ext area     = 336.13

Case 6 is pinned only on the §3 window line refs (new standalone test,
not added to the section-pin `_FIXTURES`) because its DUAL main heating
(51% rads + 49% underfloor, oil) makes the §10/§12 per-system lines
non-comparable to SapResult's aggregated fields — documented in the
fixture module. Summary mirrored to Summary_001431_case6.pdf.

Suite: 2355 passed, 1 skipped. New code: 0 pyright errors.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 12:46:18 +00:00
Khalim Conn-Kowlessar
570df83459 S0380.197: simulated case 5 e2e fixture — detached sandstone RR validates S0380.196 (RdSAP 10 §3.9.1 + Table 4 p.22)
Promotes user-simulated "case 5" (detached, sandstone-walled, room-in-roof
cousin of golden cert 0240) to an e2e worksheet fixture pinning the WHOLE
extractor → mapper → calculator pipeline at abs=1e-4 on all 11 Block-1
line refs. Its worksheet prints the exact RR-gable routing S0380.196
implements, validating that fix against ground truth:

  Roof room Main Gable Wall 1  15.68  U=0.35  (29a)  Exposed → walls @ main-wall U
  Roof room Main remaining area 61.73  U=0.30  (30)  A_RR shell − Σ gables
  External roof Main           14.52  U=0.11  (30)  loft residual
  Roof room Main Gable Wall 2  15.68  U=0.25  (32)  Party → party @ 0.25

gable area = 6.40 × 2.45 (§3.9.1 default RR storey height); A_RR remaining
= 12.5√(83.2/1.5) − 2×15.68 = 93.09 − 31.36 = 61.73 (RdSAP 10 §3.9.1(e)).
Confirms a DETACHED dwelling can lodge a Party RR gable (Table 4 p.22
row 2) — so my S0380.196 mapping (gable_wall_type 0=Party, 1=Exposed) is
correct; do not flip it.

Two extractor/mapper gaps surfaced and fixed (case 5 is the forcing test):
- Sandstone wall label "SS Stone: sandstone or limestone" had no
  `_ELMHURST_WALL_CODE_TO_SAP10` entry (raised UnmappedElmhurstLabel).
  Added "SS" → 2 (WALL_STONE_SANDSTONE), matching 0240's API
  wall_construction=2 (cross-mapper parity).
- Roof "Insulation Thickness 400+ mm" was silently dropped: the four
  thickness parsers used `.split()[0].isdigit()`, which rejects the
  trailing "+" → None → u_roof fell back to the age-J default 0.16
  instead of 0.11 (+1.09 W/K roof, the whole 0.12 SAP gap). Added
  `_parse_thickness_mm` (strips to leading digits) and applied it at all
  four sites (walls / alt-wall / roof / floor). The only existing fixture
  with "400+ mm" (000565 Stud Wall) routes via the RIR regex, unaffected.

Result: case 5 cascade ≡ worksheet at 1e-4 on SAP/ECF/cost/CO2 + every
energy stream. Neither gap affects 0240 (its API path captures both the
sandstone code and "400mm+"); 0240's residual is therefore non-fabric.

Suite: 2353 passed, 1 skipped. New code: 0 pyright errors.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 11:41:16 +00:00
Khalim Conn-Kowlessar
4a21717de6 S0380.195: pin sim case 4 (6035 floor geometry) e2e at 1e-4 — 6035 +19 PE is lodged divergence
Adds the user-simulated case-4 worksheet as e2e fixture `001431_6035` —
reproduces golden cert 6035's full floor geometry (Main ground-floor HLP
15.99 + first-floor HLP 8.32, the asymmetric upper storey) and 8 windows.
All 11 Block-1 line refs pin at abs=1e-4 against the worksheet (SAP 68,
ECF 2.2802, cost 937.2341, CO2 4682.3494, space 15745.3260, main fuel
18744.4357).

This is the 4th independent 1e-4 confirmation across the 6035 archetype
(sim cases 1-4). Case 4 matches 6035 on floors + window areas; the
residual ~50 kWh / £11 cascade delta vs 6035 is two lodged inputs only
(largest window orientation N vs S; meter type "Dual" vs API 2), not
calculator behaviour.

Conclusion: the cascade reproduces the spec engine exactly for 6035's
geometry, so 6035's +19 PE vs the lodged register is lodged-register
divergence (the gov.uk register's rounded value vs the spec-exact
worksheet), NOT a calculator gap. 6035 is a "pin-forever" lodged-only
cert. Bugs surfaced + fixed along the way: S0380.192 (Simplified-RR
remaining area) and S0380.193 (suspended-floor sealed rule).

2341 passed (+11), 0 failed; pyright net-zero.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 09:56:39 +00:00
Khalim Conn-Kowlessar
e7a0c9885e S0380.194: pin sim case 3 (near-exact 6035 replica) e2e at 1e-4
Adds the user-simulated case-3 worksheet as e2e fixture `001431_rr8` —
Main + Extension + Simplified room-in-roof with 8 windows (≈14.15 m²,
reproducing golden cert 6035's glazing) and Main ground-floor HLP 15.99.
All 11 Block-1 line refs pin at abs=1e-4 against the worksheet (SAP 68,
cost 951.3425, CO2 4767.4862, space 16086.3557, main fuel 19150.4235,
HW 3307.2639, lighting 262.0885).

This is the third independent 1e-4 confirmation that the cascade
reproduces the spec engine for the 6035 archetype (after S0380.192
Simplified-RR + S0380.193 suspended-floor). It differs from 6035 in one
input only — the Main first-floor HLP (15.99 here vs 6035's 8.32) — so
6035's +19 PE vs the lodged register is lodged-register divergence, not
a calculator gap. A byte-identical 6035 replica (first-floor HLP 8.32)
would let 6035 itself be pinned directly to close that out.

2330 passed (+11), 0 failed; pyright net-zero.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 09:46:56 +00:00
Khalim Conn-Kowlessar
62fc27a5cc S0380.193: suspended-floor (12) sealed rule fires only on a SUPPLIED U-value
RdSAP 10 §5 (PDF p.29) "Floor infiltration (suspended timber ground
floor only)", age band A-E, splits on whether a floor U-value is
supplied:
  a) [U-value supplied] if floor U-value < 0.5 → "sealed", (12) = 0.1
  b) [no U-value supplied] retro-fitted insulation → "sealed" 0.1;
     otherwise "unsealed", (12) = 0.2

`_has_suspended_timber_floor_per_spec` fed the cascade's COMPUTED default
U into rule (a), so an as-built/uninsulated suspended-timber floor whose
default U happens to be < 0.5 was marked "sealed" (0.1) where Elmhurst
uses "unsealed" (0.2). That dropped (18) infiltration 0.85 → 0.75, (25)
effective ACH, HTC, and understated space heating ~450 kWh.

Fix: gate rule (a) on `floor_u_value_known` — a computed default U is not
a supplied value, so it falls through to (b). Verified against the
cert 001431 sim-case-2 worksheet: floor "As built", U=0.43 (matches the
worksheet's (28a) 0.4300 exactly), (12)=0.2 unsealed. Golden cert 6035
(also a suspended uninsulated floor) is unaffected — its U=0.63 ≥ 0.5
already routed to unsealed.

Promotes sim case 2 to the e2e harness as `001431_rr` (Main + Extension
+ Simplified room-in-roof — the 6035 archetype). All 11 Block-1 line
refs pin at abs=1e-4, locking BOTH this fix and S0380.192 (Simplified-RR
remaining area) end-to-end: SAP 69, cost 920.5046, CO2 4566.7090, space
15269.8593, main fuel 18178.4039. 2319 passed (+11), 0 failed; pyright
net-zero.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 09:16:25 +00:00
Khalim Conn-Kowlessar
896b5740c3 S0380.191: pin simulated 001431 gas-combi end-to-end at 1e-4 (e2e harness)
Adds the user-simulated 001431 case (the cert that drove S0380.189/.190)
as an Elmhurst-only e2e fixture: Summary PDF → extractor → mapper →
calculator, every Block-1 SapResult field pinned against the
P960-0001-001431 worksheet at abs=1e-4. All 11 pins pass with zero
residual — the case is clean, confirming the S0380.190 gas-combi fuel
derivation closes the Summary path natively.

Verified the handover's flagged "+0.0007 SAP" was a target artifact, not
a cascade gap: the worksheet displays ECF (257) rounded to 1.6047 and
integer SAP (258)=78; the cascade's continuous SAP is computed from the
UNROUNDED ECF = (255)*(256)/((4)+45) = 660.9750*0.4200/173.0, giving
77.6147 — which matches the worksheet's own unrounded value. Pinning the
continuous SAP from the display-rounded ECF (→ 77.6144) was the wrong
target. Block-1 line refs all match exactly: (211) 10699.7225, (219)
3327.1592, (231) 86.0, (232) 283.2229, (255) 660.9750, (272) 3000.1664,
Σ(98) 8987.7669.

Summary mirrored into the tracked fixtures dir as
Summary_001431_gas_combi.pdf (distinct name — the corpus reuses cert
001431 across every heating variant); source Summary + worksheet tracked
under sap worksheets/golden fixture debugging/ as the pin ground truth.

2302 passed (+11), 0 failed; pyright net-zero on new/changed files.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 22:44:32 +00:00
Khalim Conn-Kowlessar
d1c87d84b0 Add heating-systems corpus PDF fixtures so CI can run the residual pins
test_heating_systems_corpus.py (and one pcdb-1 cross-check in
test_cert_to_inputs.py) read the 001431 controlled-variable corpus PDFs
directly at runtime from `sap worksheets/heating systems examples/`, but
that directory was never committed — it was supplied locally on
2026-05-30 and only ever existed on dev machines. CI therefore errored
with "no Summary PDF in …" for all 57 corpus variants.

Commit the 82 corpus PDFs (41 populated variant folders × Summary +
P960, 4.7 MB) in place so the cascade-vs-worksheet residual pins run in
CI, matching the existing convention where the U985 / 000565
conformance fixtures are committed under
backend/documents_parser/tests/fixtures/ (31 PDFs already tracked).

Only the .pdf fixtures are added; the stray .DS_Store and a P960 .txt
dump in pcdb 1/ are left untracked.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 15:38:03 +00:00