diff --git a/docs/sap-spec/NEXT_AGENT_PROMPT.md b/docs/sap-spec/NEXT_AGENT_PROMPT.md index 1cf5768d..3e0bfad8 100644 --- a/docs/sap-spec/NEXT_AGENT_PROMPT.md +++ b/docs/sap-spec/NEXT_AGENT_PROMPT.md @@ -1,238 +1,269 @@ -# Handover — API mapper validation via Elmhurst cross-check +# Handover — API mapper at 1e-3 on cert 001479, closing to 1e-4 -You are picking up branch `ara-backend-design-prd`. The end goal of -this workstream is clear and worth re-stating before anything else. +You are picking up branch `ara-backend-design-prd`. The cert 001479 API +path is at SAP delta **+0.0006** (was +3.08); fabric heat loss is +EXACT. The remaining work is closing the sub-1e-3 gap and validating +against more cert pairs. ## The end goal (re-confirmed by the user) -> **Production goal: `API JSON → EpcPropertyDataMapper.from_api_response -> → SAP10 calculator → SAP rating` must match the API-published SAP -> rating to within ±0.5 (the API publishes rounded integer SAPs).** +> **Production goal: `API JSON → EpcPropertyDataMapper.from_api_ +> response → SAP10 calculator → SAP rating` must match the SAP value +> the calculator emitted at lodge time to within 1e-4.** > -> The work in progress facilitates that by giving us an *independent* -> route to the same dwelling's `EpcPropertyData` — `Summary PDF → -> ElmhurstSiteNotesExtractor → EpcPropertyDataMapper.from_elmhurst_ -> site_notes → SAP`. Once both routes produce the same -> `EpcPropertyData` (or a documented superset) for the same cert, -> the API mapper is validated by transitivity. +> The acceptance tolerance is **1e-4 against the worksheet's +> continuous SAP value**, not ±0.5 against the published integer. +> ±0.5 only applies when no worksheet is available (the 8 cohort +> golden certs we have as API-only); when we have both API + worksheet +> (cert 001479), the 1e-4 bar is the bar. 
-The validation cohort is the 6 U985-surveyor certs (000474, 000477, -000480, 000487, 000490, 000516) — each has a hand-built -`EpcPropertyData` fixture that cascades to the worksheet PDF's lodged -SAP at 1e-4. The 7th cert (001479 / API ref `0535-9020-6509-0821-6222`) -is the first with **both** an Elmhurst site-notes lodgement AND a real -GOV.UK API counterpart — making it the load-bearing cross-mapper -parity-test fixture. +The earlier handover stated ±0.5 — that was wrong. The user +emphasised this twice: the calc is mechanical, identical inputs must +produce identical outputs, so when we have the continuous worksheet +value we should hit it exactly. See the conversation thread that led +to Slice 87. -Once both mappers produce equivalent `EpcPropertyData` for cert -001479, running each through the calculator and comparing the SAP -rating against the API-published `69` is the final acceptance test -for the production flow. - -## The workstream layers (current state of each) - -The work is structured as four nested validation layers — each -validates the layer below. Closing the inner-most one first means the -upper layers can rely on it as a reference. 
+## Validation layers (current state) ``` -Layer 4: API mapper validated end-to-end (production goal) +Layer 4: API mapper cascade SAP = worksheet SAP at 1e-4 (production goal) └── Layer 3: API mapper EpcPropertyData ≡ Elmhurst mapper EpcPropertyData - └── Layer 2: Elmhurst mapper EpcPropertyData ≡ hand-built fixture - └── Layer 1: hand-built fixture → cascade SAP at 1e-4 vs worksheet + └── Layer 2: Elmhurst-mapped EpcPropertyData → cascade SAP = worksheet SAP at 1e-4 + └── Layer 1: hand-built EpcPropertyData → cascade SAP = worksheet SAP at 1e-4 ``` -| Layer | Status | Where | +| Layer | Status | +|---|---| +| **1 — hand-built cascade pin** | ✅ 6 cohort certs (000474, 000477, 000480, 000487, 000490, 000516) GREEN at 1e-4; cert 001479 hand-built skeleton (Slice 62) still RED (2 of 11 pins green, hand-built has its own bugs — orthogonal to the production path) | +| **2 — Elmhurst-mapped path** | ✅ **Cert 001479 GREEN at 1e-4** (Slice 89); cohort: 2 GREEN (000477, 000516), 4 RED (000474, 000480, 000487, 000490 — Elmhurst U985 worksheets violate the RdSAP 10 §5 (12) spec; orthogonal to the production goal) | +| **3 — API-mapped ≡ Elmhurst-mapped (field-level)** | 🟡 Cascade outputs match within 1e-3 SAP; field-level diff test not yet written | +| **4 — API path cascade SAP** | 🟡 **Cert 001479 at +0.0006 SAP delta from worksheet** (was +3.08); 9 other golden certs pinned at residual-from-integer at tolerance 0 | + +## Cumulative API SAP delta progression (cert 001479) + +The big breakthrough: implementing the RdSAP 10 §5 (12) spec rule +(`Floor infiltration (suspended timber ground floor only)` — page 29 +of `docs/sap-spec/RdSAP 10 Specification 10-06-2025.pdf`) revealed a +series of API-mapper coverage gaps that all needed fixing for the +spec rule's premise to be met. 
Each slice closed one gap: + +| Slice | Fix | API SAP delta | |---|---|---| -| **1 — hand-built cascade pin** | ✅ 6 cohort certs GREEN at 1e-4; cert 001479 hand-built skeleton at 2/11 pins green (Slice 62 unfinished) | `test_e2e_elmhurst_sap_score.py::test_sap_result_pin` | -| **2 — Elmhurst-mapped ≡ hand-built** | ✅ Cohort 000474 fully GREEN (Slice 70); 5 other cohort certs PENDING; cert 001479 PENDING | `test_summary_pdf_mapper_chain.py::test_from_elmhurst_site_notes_matches_hand_built_NNNNNN` | -| **3 — API-mapped ≡ Elmhurst-mapped** | PENDING — no test exists yet | New file `test_api_vs_elmhurst_parity.py` (or extension of the chain test) | -| **4 — API mapper cascade ±0.5 SAP** | RED — cascade SAP 72.08 vs published 69 (delta +3.08, was +9.7 before slices 58-60); golden-fixtures residual pins green | `test_golden_fixtures.py` for cohort + new entry for `0535-9020-6509-0821-6222` | +| baseline | broken party wall enum, no descriptive strings | **+3.0752** | +| 87 | RdSAP 10 §5 (12) spec rule + Elmhurst-mapper switch to None | — | +| 88 | thread `bp.floor_construction_type` into `u_floor` cascade | — | +| 89 | PS pitched-sloping-ceiling roof area `÷ cos(30°)` (added `roof_construction_type` field on `SapBuildingPart`) | — | +| 90 | API `party_wall_construction` enum → SAP10 `u_party_wall` codes (1→3 Solid, 2→4 Cavity, etc.) 
| +1.5298 | +| 91 | descriptive strings via int→str lookups (`floor_construction_type`, `roof_construction_type`) + pre-1950 PS sloping → thickness=0 + per-bp roof description fix | +1.0970 | +| 92 | upper-floor `room_height_m += 0.25` + `is_exposed_floor` from `floor_heat_loss==1` + `floor_insulation_thickness="NI"→None` | +1.0022 | +| 93 | `window_transmission_details` from `glazing_type` int (code 3 → U=2.8/g=0.76, code 13 → U=1.4/g=0.72) | +1.1846 | +| 94 | `sheltered_sides` from API `built_form` + `floor_type` from `floor_heat_loss==7` | **+0.0006** | -## What's done (slices 54–70 in this branch) +Fabric breakdown for cert 001479 API path is now COMPLETELY EXACT +(all 6 components match worksheet to 4 d.p.): -Cascade-level fixes (help both mappers): -- Slice 58 `e3dc0b28` — secondary fuel cost routes through lodged `secondary_fuel_type` (was hard-coded to electric tariff); closed a 9-SAP-point ECF distortion on gas-secondary certs. -- Slice 59 `175873b4` — `heat_transmission_from_cert` apportions windows per `window_location` per bp (not all-on-Main); load-bearing for multi-bp dwellings with non-uniform wall U. -- Slice 60 `31c01a7e` — thermal bridging `y` is dwelling-wide (primary bp's age band), not per-bp. +| Component | Cascade | Worksheet target | +|---|---|---| +| walls | 39.7652 | 39.7652 ✓ | +| party walls | 17.0700 | 17.0700 ✓ | +| roof | 10.3438 | 10.3438 ✓ | +| floor | 23.1705 | 23.1705 ✓ | +| windows | 43.5962 | 43.5962 ✓ | +| doors | 5.5500 | 5.5500 ✓ | +| **fabric total** | **139.4957** | **139.4957 ✓** | -Elmhurst-mapper fixes (Slice 2 layer): -- Slice 54 `4427b58a` — `extensions_count` from `len(survey.extensions)`. -- Slice 55 `c89206fc` — party-wall code `"CU"` → 4 (cavity unfilled U=0.5). -- Slice 56 `07ed871f` — floor `"E To external air"` → `u_exposed_floor` Table 20. -- Slice 57 `7a9a8b7e` — PS sloping-ceiling + As-Built + pre-1950 age → `thickness=0` → U=2.30. 
-- Slice 66+67 `ca39d072` — `country_code="ENG"`, `has_draught_lobby` gate, plus 5 heating-detail int surfacings (`boiler_flue_type`, `emitter_temperature`, `central_heating_pump_age`, `main_heating_number`, `water_heating_fuel`). -- Slice 68 `6baf66cd` — Elmhurst party-wall `"U"` → 0 sentinel; cohort hand-built `central_heating_pump_age_str="Unknown"`. +## What's left (queue, in priority order) -Hand-built fixture work (Slice 1 layer + parity setup): -- Slice 62 `ee98dbe0` — created `_elmhurst_worksheet_001479.py` skeleton; 2/11 cascade pins green (the rest need iteration; `sap_score_continuous=65.99 vs 69.0094`, gap −3.02 SAP). -- Slice 64 `b5cbfe83` — bulk-update cohort 000474 hand-built with Cat A fields (descriptive strings, ventilation zero counts, top-level booleans); 50 → 14 mapper-vs-hand-built diffs. -- Slice 65 `4997039f` — added `shower_outlets` + `number_baths` to cohort 000474 hand-built. -- Slice 69 `d8a37029` — expanded cohort 000474 windows 5 → 7 (1:1 with §11 table). -- Slice 70 `035d916d` — added window-subfield exclusion to diff helper + `frame_factor=0.7` default in `make_window`. **Cohort 000474 diff GREEN**. +### 1. Close cert 001479's residual 0.0006 SAP gap (1-3 slices) -Diff test infrastructure (Slice 63 `01d234dd`): -- `_LOAD_BEARING_FIELDS` allow-list in `test_summary_pdf_mapper_chain.py` (~40 top-level fields driving cascade or cross-mapper semantics). -- `_NON_LOAD_BEARING_WINDOW_SUBFIELDS` deny-list (descriptive int/str encodings that don't affect cascade). -- `_diff_load_bearing` recursive helper, strict-pyright-clean (`mapped/hand_built: object`, narrowed via isinstance). -- `test_from_elmhurst_site_notes_matches_hand_built_000474` is the tracer-bullet test. - -## What's RED right now +The remaining gap is non-fabric. 
Diff against the Summary path's +intermediate cascade values (which lands at 1e-4 GREEN): ``` -$ git log --oneline -1 backed | head -1 -035d916d Slice 70: cohort 000474 mapper-vs-hand-built diff is GREEN +Σ internal_gains_monthly_w: API 5339.27 Sum 5313.55 delta +25.72 +Σ solar_gains_monthly_w: API 5510.10 Sum 5508.60 delta +1.50 +Σ mean_internal_temp_monthly_c: API 214.87 Sum 213.51 delta +1.35 +Σ monthly_infiltration_ach: API 8.95 Sum 10.91 delta -1.96 +hot_water_kwh_per_yr: API 2365.00 Sum 2358.31 delta +6.69 ``` -Two RED forcing functions on the branch: +Specifically: +- **Infiltration is still under by ~2 ACH/year**. The (12) spec rule + applies on both paths now (after Slice 87), so it's something else + — possibly `has_draught_lobby` (API=None, Summary=False; cascade + treats both as False so it shouldn't matter; verify) or `(13) + draught_lobby_ach`. Or storey count. Probe with + `ventilation_from_cert(api_mapped)` vs `ventilation_from_cert(sum_ + mapped)`. +- **HW kWh +6.7** suggests a small Appendix J §1a occupancy + difference, or a different Tcold series, or shower outlets. +- **Internal gains +25.7 W·months** — probably a pumps_fans count or + lighting bulb count mismatch. -1. `test_summary_001479_full_chain_sap_matches_worksheet_pdf_exactly` — chain pin for cert 001479; cascade SAP `70.20` vs worksheet `69.0094` (delta `1.19`). 9 of 11 `test_sap_result_pin[001479-*]` fail in the same RED state. Closing requires either: - - Completing the 001479 hand-built (`_elmhurst_worksheet_001479.py` is the Slice 62 skeleton) — encode every worksheet input until 11/11 pins hit 1e-4. - - Or finding the remaining `~3 W/K` cascade gap (likely `u_floor` Table 19 for age C + PS sloping-ceiling roof area inclination factor — see prior handover at commit `0e4f4c05`). - -## What's GREEN right now - -- All 66 cohort `test_sap_result_pin[NNNNNN-*]` pins (6 certs × 11 fields) at 1e-4. -- 8 golden-fixture residual pins in `test_golden_fixtures.py` (cohort API certs). 
-- `test_from_elmhurst_site_notes_matches_hand_built_000474` — first parity validation. -- Pyright net-zero on every touched file's baseline. - -## Suggested next moves (in priority order) - -### 1. Parametrize the diff test over the 5 other cohort certs - -The toolchain is in place. For each cert 000477, 000480, 000487, 000490, 000516: - -```python -def test_from_elmhurst_site_notes_matches_hand_built_NNNNNN() -> None: - pages = _summary_pdf_to_textract_style_pages(_SUMMARY_NNNNNN_PDF) - site_notes = ElmhurstSiteNotesExtractor(pages).extract() - mapped = EpcPropertyDataMapper.from_elmhurst_site_notes(site_notes) - hand_built = _wNNNNNN.build_epc() - diffs: list[str] = [] - for field_name in _LOAD_BEARING_FIELDS: - diffs.extend(_diff_load_bearing( - getattr(mapped, field_name, None), - getattr(hand_built, field_name, None), - field_name, - )) - assert not diffs, ( - f"{len(diffs)} load-bearing divergence(s) ...\n " + - "\n ".join(diffs) - ) -``` - -Each will RED initially with a similar diff pattern to 000474. Most diffs should close mechanically by the same bulk-update pattern as Slice 64 (descriptive fields, ventilation zeros, top-level booleans, `wall_thickness_measured`, etc.). The unique-to-cert wrinkles need slice-by-slice attention. Could be parametrize-then-bulk-fix-then-iterate, or one cert at a time. 
- -Run diff probe (substitute `NNNNNN`): +Run the diff probe (the one from the conversation) to localise: ```bash PYTHONPATH=/workspaces/model:/workspaces/model/packages/domain/src python -c " -import sys; sys.path.insert(0, '/workspaces/model') from backend.documents_parser.tests.test_summary_pdf_mapper_chain import _diff_load_bearing, _LOAD_BEARING_FIELDS, _summary_pdf_to_textract_style_pages from backend.documents_parser.elmhurst_extractor import ElmhurstSiteNotesExtractor from datatypes.epc.domain.mapper import EpcPropertyDataMapper -from domain.sap.worksheet.tests import _elmhurst_worksheet_NNNNNN as wHB +import json, dataclasses from pathlib import Path -pages = _summary_pdf_to_textract_style_pages(Path('/workspaces/model/backend/documents_parser/tests/fixtures/Summary_NNNNNN.pdf')) + +api = json.loads(Path('/workspaces/model/packages/domain/src/domain/sap/rdsap/tests/fixtures/golden/0535-9020-6509-0821-6222.json').read_text()) +api_mapped = EpcPropertyDataMapper.from_api_response(api) +pages = _summary_pdf_to_textract_style_pages(Path('/workspaces/model/backend/documents_parser/tests/fixtures/Summary_001479.pdf')) sn = ElmhurstSiteNotesExtractor(pages).extract() -mapped = EpcPropertyDataMapper.from_elmhurst_site_notes(sn) -hb = wHB.build_epc() +sum_mapped = EpcPropertyDataMapper.from_elmhurst_site_notes(sn) diffs = [] for f in _LOAD_BEARING_FIELDS: - diffs.extend(_diff_load_bearing(getattr(mapped, f, None), getattr(hb, f, None), f)) -print(f'diff count: {len(diffs)}') -for d in diffs: print(f' {d}') + diffs.extend(_diff_load_bearing(getattr(api_mapped, f, None), getattr(sum_mapped, f, None), f)) +print(f'{len(diffs)} load-bearing divergences') +for d in diffs[:40]: print(f' {d}') " ``` -### 2. 
Complete cert 001479's hand-built (`_elmhurst_worksheet_001479.py`) +(NB: the original `_diff_load_bearing` was written for cohort +diff tests; the helper signature is `mapped, hand_built, path` — pass +api_mapped as `mapped` and sum_mapped as `hand_built` to surface API +gaps.) -Currently 2/11 cascade pins green. Worksheet target `69.0094`. Cascade output `65.99`. Likely missing inputs (compare against cohort 000490 which has a similar gas-combi+secondary config): -- Hot-water demand routing (Tcold model, occupancy) -- Thermal mass parameter -- Internal gains (appliance + cooking allowance) -- `multiple_glazed_proportion` -- §2 ventilation tuning +### 2. Layer 3 — write the API ≡ Elmhurst diff test (1 slice) -Diagnostic: `python -m pytest packages/domain/src/domain/sap/worksheet/tests/test_e2e_elmhurst_sap_score.py::test_sap_result_pin -k 001479 -v --no-cov` shows each pin's `actual vs expected`. +Add `test_from_api_response_matches_from_elmhurst_site_notes_001479` +in `backend/documents_parser/tests/test_summary_pdf_mapper_chain.py`, +mirroring the cohort `test_from_elmhurst_site_notes_matches_hand_ +built_NNNNNN` pattern. Use `_diff_load_bearing` with `_LOAD_BEARING_ +FIELDS`. This formalises Layer 3 as a 1e-4 gate (zero load-bearing +divergences between the two mapper outputs). -### 3. Add cert 001479 to the diff test (after 001479 hand-built lands 1e-4) +This test will start RED with the residual diffs from step 1; closing +those slices brings it to GREEN. -```python -def test_from_elmhurst_site_notes_matches_hand_built_001479() -> None: - ... -``` +### 3. More cert pairs (user is sourcing — pause for new data) -Likely RED initially. Close diffs the same way as 000474. +The user has agreed to source 2-3 more (Elmhurst worksheet + GOV.UK +API JSON) pairs to validate the mapper isn't 001479-overfit. +Suggested diversity: -### 4. 
API mapper → hand-built diff test (Layer 3) +- **Detached + RR** (would fix cert 0240's -14 residual which has a + Type-1 RR the mapper doesn't extract). +- **Mid-terrace with cavity-filled party walls** (API party_wall_ + construction=3 → spec U=0.2; currently mapped to SAP10 code 4 + which gives U=0.5; needs cascade extension at + `u_party_wall`). +- **Flat / maisonette** (party wall U=0 path; cert 9390 is one but + no worksheet). +- **Different age band** (E, J, K, L) to exercise the (12) spec + rule's age boundaries. -```python -def test_from_api_response_matches_hand_built_001479() -> None: - raw = json.loads(Path("packages/domain/src/domain/sap/rdsap/tests/fixtures/golden/0535-9020-6509-0821-6222.json").read_text()) - mapped = EpcPropertyDataMapper.from_api_response(raw) - hand_built = _w001479.build_epc() - # same _diff_load_bearing pattern -``` +Each new pair lands as a 1e-4 cascade-pin test. Pattern: ~3-5 new +mapper bugs per cert pair (similar to Slice 87-94 on 001479). Each +becomes its own slice. Stage by name; one slice = one commit. -The API JSON is already cached at `packages/domain/src/domain/sap/rdsap/tests/fixtures/golden/0535-9020-6509-0821-6222.json` (Slice 54 era). +### 4. Investigate golden certs with shifted residuals after Slices 87-94 -Diffs here will surface API-mapper coverage gaps. Each one is a slice; the API mapper at `from_api_response` / `from_rdsap_schema_21_0_1` paths needs corresponding extraction. +The Slice 87-94 fixes shifted residuals on 7 of 10 API-only golden +certs. The new residuals are pinned. Outliers that need attention: -### 5.
The production acceptance test +- **0240** (-14): documented RR mapper gap (`'Roof room(s), + insulated (assumed)'` description not parsed; Type-1 RR + gable_wall_lengths not extracted) +- **0390-2954** (-6): large detached, age F, oil — likely a heating + efficiency cascade gap +- **6035** (-6): mid-terrace age A — possibly party wall config or + ventilation issue -Once Layer 3 is green for cert 001479: -- `test_golden_fixtures.py::test_golden_cert_residual_matches_pin[0535-9020-6509-0821-6222]` — add entry. API-mapped EPC cascades to within ±0.5 of API-published `69`. -- And `test_summary_001479_full_chain_sap_matches_worksheet_pdf_exactly` is GREEN at 1e-4. +These are tractable once you have a worksheet for any of them. -That's the production-flow acceptance: API → EpcPropertyData → SAP score within tolerance. +### 5. (deferred) Cohort chain test RED triage -## Conventions you must honour (project memory) +4 cohort chain tests (000474, 000480, 000487, 000490) are RED +because the Elmhurst U985 worksheets emit (12) values that don't +follow RdSAP 10 §5 — see the conversation re: identical Summary §9 +lodgements producing different worksheet (12) for cohort 000477 vs +000480. The cascade is now spec-correct; the Elmhurst tool isn't. +Options: (a) mark as known-Elmhurst-non-spec, (b) add per-cert +override field, (c) wait for more cert pairs to confirm pattern. +**Not blocking the production goal.** -- AAA test convention: every new test uses literal `# Arrange / # Act / # Assert` headers. -- `abs(diff) <= tol` not `pytest.approx` (strict-pyright partially-unknown). -- One slice = one commit; stage by name. -- 1e-4 tolerance for the Elmhurst path; 0.5 for the API path. No widening, no xfail (`feedback_zero_error_strict`). -- Strict pyright net-zero on every commit (per-file baselines: mapper.py 35, heat_transmission.py 13, cert_to_inputs.py 35). -- The 6 cohort cert hand-builts MUST keep cascading to 1e-4. 
If a mapper change breaks one, fix the mapper or update the hand-built to match — don't widen. +## Key conventions (project memory) -## Source-data caveats +- **AAA test convention** — every new test uses literal `# Arrange / + # Act / # Assert` headers. +- **`abs(diff) <= tol`** not `pytest.approx` (strict-pyright partial- + unknown). +- **One slice = one commit** — stage by name (`git add `). +- **1e-4 tolerance** for the worksheet-comparable paths (Elmhurst + Summary + API both have worksheets for cert 001479). No widening, + no xfail. +- **Strict pyright net-zero** per file. Baselines: `mapper.py` 33, + `heat_transmission.py` 13, `cert_to_inputs.py` 35, + `epc_property_data.py` 0. +- **Spec citation in commit messages** — when a slice implements a + spec rule, quote the spec text (RdSAP 10 page reference). User + asked us to confirm against docs. -- **Cert 001479 age band**: Summary §3 says `Ext1: M 2023 onwards`; worksheet header says `Ext1: L`. Assessor data-entry inconsistency. The 001479 hand-built uses `L` (to mirror the worksheet calc inputs); the Elmhurst mapper trusts the Summary `M`. This will surface as a 1-field diff in the eventual `001479` diff test — document and accept (or override per-cert in the hand-built). +## Cached artefacts -## Branch state +- `packages/domain/src/domain/sap/rdsap/tests/fixtures/golden/0535- + 9020-6509-0821-6222.json` — API JSON for cert 001479 (RdSAP-Schema- + 21.0.1). +- `backend/documents_parser/tests/fixtures/Summary_001479.pdf` — + Elmhurst site-notes PDF for cert 001479. +- `sap worksheets/lodged example/P960-0001-001479.pdf` — Domna's + worksheet output for cert 001479 (Continuous SAP 69.0094). +- `sap worksheets/U985-0001-NNNNNN.pdf` × 6 — cohort Elmhurst + worksheets (000474, 000477, 000480, 000487, 000490, 000516). +- `sap worksheets/U985-0001-NNNNNN.txt` × 6 — text exports of above. 
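The AAA and `abs(diff) <= tol` conventions listed under "Key conventions" can be sketched as one pin-style test. This is a minimal illustration, not the project's actual test: `within_tolerance` and the test name are hypothetical, and the cascade value is simply the worksheet's 69.0094 plus the currently reported +0.0006 delta.

```python
# Continuous SAP from the cert 001479 worksheet (value quoted in this handover).
WORKSHEET_SAP = 69.0094
# Illustrative stand-in for the API-path cascade output: worksheet + 0.0006.
API_CASCADE_SAP = WORKSHEET_SAP + 0.0006


def within_tolerance(actual: float, expected: float, tol: float) -> bool:
    """Hypothetical helper: plain abs-diff comparison instead of pytest.approx."""
    return abs(actual - expected) <= tol


def test_api_cascade_sap_is_within_1e3_but_not_yet_1e4() -> None:
    # Arrange
    expected = WORKSHEET_SAP
    actual = API_CASCADE_SAP

    # Act
    passes_1e3 = within_tolerance(actual, expected, 1e-3)
    passes_1e4 = within_tolerance(actual, expected, 1e-4)

    # Assert
    assert passes_1e3
    assert not passes_1e4  # the remaining gap that item 1 of the queue targets


test_api_cascade_sap_is_within_1e3_but_not_yet_1e4()
```

Once the residual gap is closed, the same shape of test would assert at `1e-4` only, with no widening of the tolerance.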
+ +## Recent slice history (Slices 87-94, current branch) ``` -$ git log --oneline -15 -035d916d Slice 70: cohort 000474 mapper-vs-hand-built diff is GREEN -d8a37029 Slice 69: 1:1 windows expansion in cohort 000474 (5 → 7) -6baf66cd Slice 68: party-wall "U Unable" + central_heating_pump_age_str → 1 diff left -ca39d072 Slices 66+67: Elmhurst mapper surfaces country_code + heating ints + has_draught_lobby -4997039f Slice 65: add shower_outlets + number_baths to cohort 000474 hand-built -b5cbfe83 Slice 64: bulk-update cohort 000474 hand-built for Cat A diff parity -01d234dd Slice 63: RED tracer-bullet mapper-vs-hand-built diff test for cohort 000474 -7e1269fc Handover: hand-built fixture skeleton landed (Slice 62); 2/11 pins green -ee98dbe0 Slice 62: hand-built _elmhurst_worksheet_001479.py — skeleton + 11 RED pins -0e4f4c05 Handover: TDD red-green session — 4 more slices (58-60) + RED chain pin -31c01a7e Slice 60: thermal bridging y is dwelling-wide, not per-bp -175873b4 Slice 59: heat_transmission apportions window area per bp via window_location -e3dc0b28 Slice 58: secondary fuel cost routes through lodged secondary_fuel_type -a0d9d094 Handover: 4 cert-001479 slices in (54-57); gap at +7.62 SAP; non-fabric next -7a9a8b7e Slice 57: Pre-1950 Elmhurst sloping-ceiling roofs map to thickness=0 +03203418 Slice 94: API mapper sheltered_sides + floor_type — cert 001479 to 1e-3 +7281b7b3 Slice 93: API mapper window_transmission_details from glazing_type +8e752e57 Slice 92: API mapper floor dimensions (SAP +0.25m + exposed-floor + NI→None) +2cebba28 Slice 91: API mapper descriptive strings + roof description per-bp fix +fbbdca49 Slice 90: API mapper translates party_wall_construction → SAP10 enum +006e9842 Slice 89: PS pitched-sloping-ceiling roof area uses inclined surface +c40679d1 Slice 88: thread bp.floor_construction_type into u_floor cascade +aff331ff Slice 87: implement RdSAP 10 §5 (12) spec rule for suspended timber floor +2d3355ee Slice 86: 1:1 windows expansion in 
cohort 000516 (2 → 5 entries) +f863598d Slice 85: bulk-update cohort 000516 hand-built for Cat A diff parity ``` -## Cached artefacts (don't re-fetch) +Earlier slice context (71-86 closed cohort Layer 2) is in the prior +handover at commit `86eff23f` (`docs/sap-spec/NEXT_AGENT_PROMPT.md` +before this rewrite). -- `packages/domain/src/domain/sap/rdsap/tests/fixtures/golden/0535-9020-6509-0821-6222.json` — API JSON for cert 001479 (Slice 54 era, fetched via `OPEN_EPC_API_TOKEN` from `backend/.env`). -- `backend/documents_parser/tests/fixtures/Summary_001479.pdf` — site-notes PDF. -- `sap worksheets/lodged example/P960-0001-001479.pdf` — Elmhurst worksheet output for cert 001479. +## First action -## Probe scripts (regenerable in `/tmp`) +1. Confirm branch state matches `git log --oneline -1` → + `03203418` Slice 94. +2. Run the full sweep: + ```bash + PYTHONPATH=/workspaces/model:/workspaces/model/packages/domain/src \ + python -m pytest backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \ + packages/domain/src/domain/sap/worksheet/tests/test_e2e_elmhurst_sap_score.py \ + packages/domain/src/domain/sap/rdsap/tests/test_golden_fixtures.py \ + --no-cov -q + ``` + Expect ~75 passed / ~16 failed. The 9 failures on + `test_sap_result_pin[001479-*]` (cohort cascade for the hand-built + skeleton) and 4 cohort chain RED + 3 cohort diff RED are + pre-existing. +3. Run the API → Summary diff probe (script in §1 above) to surface + the remaining sub-1e-3 SAP gap. Likely candidates ranked by impact: + - Infiltration (-2 ACH/yr) → check `ventilation_from_cert()` + intermediate outputs for both paths + - HW kWh (+6.7) → check shower outlet count + Appendix J §1a path + - Internal gains (+25.7 W·months) → check pumps_fans + bulb counts +4. Don't lose sight of Layer 4: **API → SAP within 1e-4 of worksheet + continuous on cert 001479** is the production goal. Currently + delta +0.0006. 
-- `/tmp/probe_000474_handbuilt_diff.py` — diff cohort 000474 mapped vs hand-built (un-filtered). -- `/tmp/probe_000474_load_bearing.py` — diff cohort 000474 mapped vs hand-built (load-bearing scope, pre-filter). -- `/tmp/probe_001479.py` — cross-mapper diff + cascade for cert 001479. -- `/tmp/sensitivity_001479.py` — single-field patch SAP impact probe. -- `/tmp/perbp_001479.py` — per-bp cascade U-value dump. - -Good luck. Keep the end goal at the front of the work: **API → SAP within ±0.5 of published 69 on cert 001479** is the acceptance test. The cohort + Elmhurst diff layers are the trail of breadcrumbs that will get us there with high confidence. +Good luck. The user is sourcing more cert pairs in parallel; when +they arrive, each one will surface 3-5 mapper bugs along the same +pattern as Slices 87-94. The diagnostic methodology that worked here +(diff Summary-mapper vs API-mapper; localise by cascade component; +fix the API mapper to mirror the Summary's surfacing) will work +again.
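The "localise by cascade component" step can be sketched as a small ranking helper. This is an assumption-laden illustration: `Divergence` and `rank_divergences` are hypothetical names, the numbers are the summed intermediates quoted in this handover for cert 001479, and relative size is only a rough proxy for SAP impact, so it does not replace the impact ranking given under "First action".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Divergence:
    """One cascade intermediate compared across the two mapper paths."""
    name: str
    api: float
    summary: float

    @property
    def delta(self) -> float:
        return self.api - self.summary

    @property
    def relative(self) -> float:
        # Guard against a zero reference value on the Summary path.
        return abs(self.delta) / abs(self.summary) if self.summary else float("inf")


def rank_divergences(api: dict[str, float], summary: dict[str, float]) -> list[Divergence]:
    """Return shared intermediates, largest relative divergence first."""
    shared = api.keys() & summary.keys()
    rows = [Divergence(name, api[name], summary[name]) for name in shared]
    return sorted(rows, key=lambda row: row.relative, reverse=True)


# Summed intermediates for cert 001479, copied from the handover text.
api_path = {
    "internal_gains_monthly_w": 5339.27,
    "solar_gains_monthly_w": 5510.10,
    "mean_internal_temp_monthly_c": 214.87,
    "monthly_infiltration_ach": 8.95,
    "hot_water_kwh_per_yr": 2365.00,
}
summary_path = {
    "internal_gains_monthly_w": 5313.55,
    "solar_gains_monthly_w": 5508.60,
    "mean_internal_temp_monthly_c": 213.51,
    "monthly_infiltration_ach": 10.91,
    "hot_water_kwh_per_yr": 2358.31,
}

ranked = rank_divergences(api_path, summary_path)
for row in ranked:
    print(f"{row.name:32s} delta {row.delta:+9.2f}  relative {row.relative:.4f}")
```

On these figures infiltration comes out first by a wide margin, which is consistent with probing `ventilation_from_cert` before the hot-water and internal-gains paths.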