From 86eff23f086644f292731a7712fa29f509d1a7ae Mon Sep 17 00:00:00 2001 From: Khalim Conn-Kowlessar Date: Mon, 25 May 2026 17:35:28 +0000 Subject: [PATCH] Handover: Layer-2 cohort 000474 GREEN; reframe with production end-goal first MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit User reframed the end goal explicitly: the production flow is `API JSON → EpcPropertyDataMapper.from_api_response → SAP calculator` landing within ±0.5 of the API-published SAP. The Elmhurst-site-notes work is the cross-validation route — same dwelling, independent path into EpcPropertyData. Once both routes agree on cert 001479, the API mapper is validated by transitivity. Restructure the handover around four nested validation layers: Layer 1 (hand-built cascade pin): 6 cohort certs GREEN; 001479 partial Layer 2 (Elmhurst ≡ hand-built): cohort 000474 GREEN; 5 others pending Layer 3 (API ≡ Elmhurst): test doesn't exist yet Layer 4 (API cascade ±0.5): 72.08 vs 69 (delta +3.08) Each layer validates the one below. Closing inner-most first means upper layers can lean on it as reference. Documents tools/patterns built in slices 63-70: - `_LOAD_BEARING_FIELDS` allow-list (~40 cascade/semantic fields) - `_NON_LOAD_BEARING_WINDOW_SUBFIELDS` deny-list (descriptive int/str encoding noise) - `_diff_load_bearing` recursive helper (strict-pyright-clean) - `test_from_elmhurst_site_notes_matches_hand_built_NNNNNN` tracer- bullet pattern (000474 is the worked example) Next-step ordering: parametrize over 5 other cohort certs, complete 001479 hand-built (currently 2/11 cascade pins green; gap −3.02 SAP), add cert 001479 to diff test, then add API mapper → hand-built diff test, then the production-flow acceptance pin in test_golden_fixtures for cert 001479. Lists source-data caveats (the M-vs-L Ext1 age discrepancy on 001479). Conventions to honour (AAA, abs(diff)<=tol, one slice=one commit, 1e-4 Elmhurst / 0.5 API, no widening, pyright net-zero). 
Cached artefacts (golden JSON, Summary PDF, worksheet PDF) noted. Co-Authored-By: Claude Opus 4.7 --- docs/sap-spec/NEXT_AGENT_PROMPT.md | 576 ++++++++++------------------- 1 file changed, 196 insertions(+), 380 deletions(-) diff --git a/docs/sap-spec/NEXT_AGENT_PROMPT.md b/docs/sap-spec/NEXT_AGENT_PROMPT.md index 01a7a33b..1cf5768d 100644 --- a/docs/sap-spec/NEXT_AGENT_PROMPT.md +++ b/docs/sap-spec/NEXT_AGENT_PROMPT.md @@ -1,238 +1,217 @@ -# Handover — wire up API↔Elmhurst↔Calculator parity test for cert 0535-9020-6509-0821-6222 +# Handover — API mapper validation via Elmhurst cross-check -You are picking up branch `ara-backend-design-prd` after the 6-fixture -Elmhurst Summary→SAP validation chain landed end-to-end at 1e-4. The -**next workstream** is the project's actual end goal: prove the API -mapper produces the same result as the Elmhurst-site-notes mapper and -both run cleanly through the calculator. +You are picking up branch `ara-backend-design-prd`. The end goal of +this workstream is clear and worth re-stating before anything else. -## The end goal (per the user) +## The end goal (re-confirmed by the user) -> Data from the API → `EpcPropertyData` → SAP10 calculator, matching the -> API-published SAP rating to within ±0.5 (the API publishes rounded -> integer SAPs). +> **Production goal: `API JSON → EpcPropertyDataMapper.from_api_response +> → SAP10 calculator → SAP rating` must match the API-published SAP +> rating to within ±0.5 (the API publishes rounded integer SAPs).** +> +> The work in progress facilitates that by giving us an *independent* +> route to the same dwelling's `EpcPropertyData` — `Summary PDF → +> ElmhurstSiteNotesExtractor → EpcPropertyDataMapper.from_elmhurst_ +> site_notes → SAP`. Once both routes produce the same +> `EpcPropertyData` (or a documented superset) for the same cert, +> the API mapper is validated by transitivity. 
-The Elmhurst Summary→SAP chain is now closed at 1e-4 across 6 fixtures -(`backend/documents_parser/tests/test_summary_pdf_mapper_chain.py` — -all 8 tests green; Slices 47–53). That gives us a **calibrated -alternate route** into `EpcPropertyData` for the same physical -dwelling, which means we can validate the API mapper by **comparing -its `EpcPropertyData` output against the Elmhurst mapper's output for -the same cert**. +The validation cohort is the 6 U985-surveyor certs (000474, 000477, +000480, 000487, 000490, 000516) — each has a hand-built +`EpcPropertyData` fixture that cascades to the worksheet PDF's lodged +SAP at 1e-4. The 7th cert (001479 / API ref `0535-9020-6509-0821-6222`) +is the first with **both** an Elmhurst site-notes lodgement AND a real +GOV.UK API counterpart — making it the load-bearing cross-mapper +parity-test fixture. -## The new resource (cert 001479) +Once both mappers produce equivalent `EpcPropertyData` for cert +001479, running each through the calculator and comparing the SAP +rating against the API-published `69` is the final acceptance test +for the production flow. -A single dwelling now has **all three** artefacts: +## The workstream layers (current state of each) -| Path | What | -|---|---| -| `sap worksheets/lodged example/Summary_001479.pdf` | Elmhurst Summary site-notes PDF | -| `sap worksheets/lodged example/P960-0001-001479.pdf` | Elmhurst Calculator worksheet output | -| GOV.UK EPB API certificate `0535-9020-6509-0821-6222` | The published cert | +The work is structured as four nested validation layers — each +validates the layer below. Closing the inner-most one first means the +upper layers can rely on it as a reference. -- Worksheet PDF lodges unrounded SAP **69.0094** (line "SAP value") → - rating **C 69** (rounded integer published in §11a + the API). -- Summary PDF current SAP rating: **C 69**, Potential **C 76**, Fuel - Bill £1056, Emissions 2.509 tonnes. 
-- Surveyor P960-0001 (Richard Matthew Ratcliff); Inspection 29/10/2025; - processed 31/10/2025; postcode PR1 0LX; UPRN A005608690 (note: starts - with `A`, may be a placeholder); 67 Howick Park Drive, Penwortham, - Preston. -- `Lodgement Required: Yes` — distinguishes this cert from the other - 6 cohort certs (U985 surveyor) where `Lodgement Required: No`. This - one was actually pushed to the GOV.UK EPB API, hence the cert - reference. +``` +Layer 4: API mapper validated end-to-end (production goal) + └── Layer 3: API mapper EpcPropertyData ≡ Elmhurst mapper EpcPropertyData + └── Layer 2: Elmhurst mapper EpcPropertyData ≡ hand-built fixture + └── Layer 1: hand-built fixture → cascade SAP at 1e-4 vs worksheet +``` -There's a separate folder `sap worksheets/extended test case/` with -`Summary_000565.pdf` and `U985-0001-000565.pdf` — those are -**not** the right pair for this workstream (no API counterpart). The -user clarified the source mid-handover; the correct location is -`sap worksheets/lodged example/`. 
+| Layer | Status | Where | +|---|---|---| +| **1 — hand-built cascade pin** | ✅ 6 cohort certs GREEN at 1e-4; cert 001479 hand-built skeleton at 2/11 pins green (Slice 62 unfinished) | `test_e2e_elmhurst_sap_score.py::test_sap_result_pin` | +| **2 — Elmhurst-mapped ≡ hand-built** | ✅ Cohort 000474 fully GREEN (Slice 70); 5 other cohort certs PENDING; cert 001479 PENDING | `test_summary_pdf_mapper_chain.py::test_from_elmhurst_site_notes_matches_hand_built_NNNNNN` | +| **3 — API-mapped ≡ Elmhurst-mapped** | PENDING — no test exists yet | New file `test_api_vs_elmhurst_parity.py` (or extension of the chain test) | +| **4 — API mapper cascade ±0.5 SAP** | RED — cascade SAP 72.08 vs published 69 (delta +3.08, was +9.7 before slices 58-60); golden-fixtures residual pins green | `test_golden_fixtures.py` for cohort + new entry for `0535-9020-6509-0821-6222` | -## The 5-step plan +## What's done (slices 54–70 in this branch) -The user is explicit on the workflow: +Cascade-level fixes (help both mappers): +- Slice 58 `e3dc0b28` — secondary fuel cost routes through lodged `secondary_fuel_type` (was hard-coded to electric tariff); closed a 9-SAP-point ECF distortion on gas-secondary certs. +- Slice 59 `175873b4` — `heat_transmission_from_cert` apportions windows per `window_location` per bp (not all-on-Main); load-bearing for multi-bp dwellings with non-uniform wall U. +- Slice 60 `31c01a7e` — thermal bridging `y` is dwelling-wide (primary bp's age band), not per-bp. -1. **Fetch the API response** for cert `0535-9020-6509-0821-6222`. 
- The existing client is at `backend/epc_client/epc_client_service.py`: - ```python - from backend.epc_client.epc_client_service import EpcClientService - service = EpcClientService(auth_token=os.environ["OPEN_EPC_API_TOKEN"]) - epc_from_api = service.get_by_certificate_number("0535-9020-6509-0821-6222") - ``` - Cache the raw JSON to - `packages/domain/src/domain/sap/rdsap/tests/fixtures/golden/0535-9020-6509-0821-6222.json` - so tests run reproducibly without a token — that's the pattern other - golden fixtures already use (`0240-0200-…`, `0300-2747-…`, etc.). +Elmhurst-mapper fixes (Layer 2): +- Slice 54 `4427b58a` — `extensions_count` from `len(survey.extensions)`. +- Slice 55 `c89206fc` — party-wall code `"CU"` → 4 (cavity unfilled U=0.5). +- Slice 56 `07ed871f` — floor `"E To external air"` → `u_exposed_floor` Table 20. +- Slice 57 `7a9a8b7e` — PS sloping-ceiling + As-Built + pre-1950 age → `thickness=0` → U=2.30. +- Slice 66+67 `ca39d072` — `country_code="ENG"`, `has_draught_lobby` gate, plus 5 heating-detail int surfacings (`boiler_flue_type`, `emitter_temperature`, `central_heating_pump_age`, `main_heating_number`, `water_heating_fuel`). +- Slice 68 `6baf66cd` — Elmhurst party-wall `"U"` → 0 sentinel; cohort hand-built `central_heating_pump_age_str="Unknown"`. -2. **Map the API response to `EpcPropertyData`** via the existing - `EpcPropertyDataMapper.from_api_response(raw_json)`. Only RdSAP- - Schema-21.0.0 / 21.0.1 are supported today; this cert (Elmhurst - RdSAP10, processed Oct 2025) is almost certainly 21.0.1 — verify. +Hand-built fixture work (Layer 1 + parity setup): +- Slice 62 `ee98dbe0` — created `_elmhurst_worksheet_001479.py` skeleton; 2/11 cascade pins green (the rest need iteration; `sap_score_continuous=65.99 vs 69.0094`, gap −3.02 SAP). +- Slice 64 `b5cbfe83` — bulk-update cohort 000474 hand-built with Cat A fields (descriptive strings, ventilation zero counts, top-level booleans); 50 → 14 mapper-vs-hand-built diffs. 
+- Slice 65 `4997039f` — added `shower_outlets` + `number_baths` to cohort 000474 hand-built. +- Slice 69 `d8a37029` — expanded cohort 000474 windows 5 → 7 (1:1 with §11 table). +- Slice 70 `035d916d` — added window-subfield exclusion to diff helper + `frame_factor=0.7` default in `make_window`. **Cohort 000474 diff GREEN**. -3. **Map the Summary PDF to `EpcPropertyData`** via the new chain: - ```python - from backend.documents_parser.elmhurst_extractor import ElmhurstSiteNotesExtractor - from datatypes.epc.domain.mapper import EpcPropertyDataMapper - # use _summary_pdf_to_textract_style_pages helper from - # backend/documents_parser/tests/test_summary_pdf_mapper_chain.py - pages = _summary_pdf_to_textract_style_pages(summary_pdf_path) - site_notes = ElmhurstSiteNotesExtractor(pages).extract() - epc_from_site_notes = EpcPropertyDataMapper.from_elmhurst_site_notes(site_notes) - ``` +Diff test infrastructure (Slice 63 `01d234dd`): +- `_LOAD_BEARING_FIELDS` allow-list in `test_summary_pdf_mapper_chain.py` (~40 top-level fields driving cascade or cross-mapper semantics). +- `_NON_LOAD_BEARING_WINDOW_SUBFIELDS` deny-list (descriptive int/str encodings that don't affect cascade). +- `_diff_load_bearing` recursive helper, strict-pyright-clean (`mapped/hand_built: object`, narrowed via isinstance). +- `test_from_elmhurst_site_notes_matches_hand_built_000474` is the tracer-bullet test. -4. **Compare the two `EpcPropertyData` objects field-by-field.** Any - difference is either (a) a mapper-coverage gap on one side or (b) - data the API doesn't publish (which would be a nightmare — the user - flagged this explicitly). Surface every diff; classify and fix. +## What's RED right now -5. **Pass both through the calculator** and assert: - - `calculate_sap_from_inputs(cert_to_inputs(epc_from_api))`'s - unrounded SAP is within **±0.5** of the API-published rounded - SAP (69). 
The 0.5 tolerance is the API-cert convention — the - published integer is rounded, so half a SAP point is just - rounding noise. - - `calculate_sap_from_inputs(cert_to_inputs(epc_from_site_notes))` - matches the worksheet PDF's unrounded SAP **69.0094** to **1e-4** - (extending the existing - `test_summary_pdf_mapper_chain.py` cohort pattern to this 7th - fixture). - - **The two cascade outputs match each other to ≤ 1e-4** when the - mappers are fully aligned — this is the load-bearing parity - proof the user is after. +``` +$ git log --oneline -1 backed | head -1 +035d916d Slice 70: cohort 000474 mapper-vs-hand-built diff is GREEN +``` -## Existing infra you should lean on +Two RED forcing functions on the branch: -- **`packages/domain/src/domain/sap/rdsap/tests/test_golden_fixtures.py`** - is the canonical API→SAP residual test pattern. It loads - `fixtures/golden/.json`, runs - `from_api_response → cert_to_inputs → calculate_sap_from_inputs`, - and pins the residual `(calc_sap - lodged_sap)`. The new cert - belongs in this file's `_EXPECTATIONS` tuple. +1. `test_summary_001479_full_chain_sap_matches_worksheet_pdf_exactly` — chain pin for cert 001479; cascade SAP `70.20` vs worksheet `69.0094` (delta `1.19`). 9 of 11 `test_sap_result_pin[001479-*]` fail in the same RED state. Closing requires either: + - Completing the 001479 hand-built (`_elmhurst_worksheet_001479.py` is the Slice 62 skeleton) — encode every worksheet input until 11/11 pins hit 1e-4. + - Or finding the remaining `~3 W/K` cascade gap (likely `u_floor` Table 19 for age C + PS sloping-ceiling roof area inclination factor — see prior handover at commit `0e4f4c05`). -- **`backend/documents_parser/tests/test_summary_pdf_mapper_chain.py`** - is the Elmhurst Summary→SAP pinning pattern. All 6 cohort certs use - this. The new cert (001479) needs a 7th `test_summary_001479_full_ - chain_sap_matches_worksheet_pdf_exactly` here pinned at 1e-4 vs - 69.0094. 
+## What's GREEN right now -- **The cross-mapper diff** is genuinely new — there's no existing - test that asserts `from_api_response(json) == from_elmhurst_site_ - notes(pdf)` for the same cert. You'll be writing it from scratch. - Consider a dedicated test file - `backend/documents_parser/tests/test_api_vs_elmhurst_parity.py` - asserting field-level equivalence (and cascade-output equivalence) - for the cert `001479 / 0535-9020-6509-0821-6222`. +- All 66 cohort `test_sap_result_pin[NNNNNN-*]` pins (6 certs × 11 fields) at 1e-4. +- 8 golden-fixture residual pins in `test_golden_fixtures.py` (cohort API certs). +- `test_from_elmhurst_site_notes_matches_hand_built_000474` — first parity validation. +- Pyright net-zero on every touched file's baseline. + +## Suggested next moves (in priority order) + +### 1. Parametrize the diff test over the 5 other cohort certs + +The toolchain is in place. For each cert 000477, 000480, 000487, 000490, 000516: + +```python +def test_from_elmhurst_site_notes_matches_hand_built_NNNNNN() -> None: + pages = _summary_pdf_to_textract_style_pages(_SUMMARY_NNNNNN_PDF) + site_notes = ElmhurstSiteNotesExtractor(pages).extract() + mapped = EpcPropertyDataMapper.from_elmhurst_site_notes(site_notes) + hand_built = _wNNNNNN.build_epc() + diffs: list[str] = [] + for field_name in _LOAD_BEARING_FIELDS: + diffs.extend(_diff_load_bearing( + getattr(mapped, field_name, None), + getattr(hand_built, field_name, None), + field_name, + )) + assert not diffs, ( + f"{len(diffs)} load-bearing divergence(s) ...\n " + + "\n ".join(diffs) + ) +``` + +Each will RED initially with a similar diff pattern to 000474. Most diffs should close mechanically by the same bulk-update pattern as Slice 64 (descriptive fields, ventilation zeros, top-level booleans, `wall_thickness_measured`, etc.). The unique-to-cert wrinkles need slice-by-slice attention. Could be parametrize-then-bulk-fix-then-iterate, or one cert at a time. 
+ +Run diff probe (substitute `NNNNNN`): +```bash +PYTHONPATH=/workspaces/model:/workspaces/model/packages/domain/src python -c " +import sys; sys.path.insert(0, '/workspaces/model') +from backend.documents_parser.tests.test_summary_pdf_mapper_chain import _diff_load_bearing, _LOAD_BEARING_FIELDS, _summary_pdf_to_textract_style_pages +from backend.documents_parser.elmhurst_extractor import ElmhurstSiteNotesExtractor +from datatypes.epc.domain.mapper import EpcPropertyDataMapper +from domain.sap.worksheet.tests import _elmhurst_worksheet_NNNNNN as wHB +from pathlib import Path +pages = _summary_pdf_to_textract_style_pages(Path('/workspaces/model/backend/documents_parser/tests/fixtures/Summary_NNNNNN.pdf')) +sn = ElmhurstSiteNotesExtractor(pages).extract() +mapped = EpcPropertyDataMapper.from_elmhurst_site_notes(sn) +hb = wHB.build_epc() +diffs = [] +for f in _LOAD_BEARING_FIELDS: + diffs.extend(_diff_load_bearing(getattr(mapped, f, None), getattr(hb, f, None), f)) +print(f'diff count: {len(diffs)}') +for d in diffs: print(f' {d}') +" +``` + +### 2. Complete cert 001479's hand-built (`_elmhurst_worksheet_001479.py`) + +Currently 2/11 cascade pins green. Worksheet target `69.0094`. Cascade output `65.99`. Likely missing inputs (compare against cohort 000490 which has a similar gas-combi+secondary config): +- Hot-water demand routing (Tcold model, occupancy) +- Thermal mass parameter +- Internal gains (appliance + cooking allowance) +- `multiple_glazed_proportion` +- §2 ventilation tuning + +Diagnostic: `python -m pytest packages/domain/src/domain/sap/worksheet/tests/test_e2e_elmhurst_sap_score.py::test_sap_result_pin -k 001479 -v --no-cov` shows each pin's `actual vs expected`. + +### 3. Add cert 001479 to the diff test (after 001479 hand-built lands 1e-4) + +```python +def test_from_elmhurst_site_notes_matches_hand_built_001479() -> None: + ... +``` + +Likely RED initially. Close diffs the same way as 000474. + +### 4. 
API mapper → hand-built diff test (Layer 3) + +```python +def test_from_api_response_matches_hand_built_001479() -> None: + raw = json.loads(Path("packages/domain/src/domain/sap/rdsap/tests/fixtures/golden/0535-9020-6509-0821-6222.json").read_text()) + mapped = EpcPropertyDataMapper.from_api_response(raw) + hand_built = _w001479.build_epc() + # same _diff_load_bearing pattern +``` + +The API JSON is already cached at `packages/domain/src/domain/sap/rdsap/tests/fixtures/golden/0535-9020-6509-0821-6222.json` (Slice 54 era). + +Diffs here will surface API-mapper coverage gaps. Each one is a slice; the API mapper at `from_api_response` / `from_rdsap_schema_21_0_1` paths needs corresponding extraction. + +### 5. The production acceptance test + +Once Layer 3 is green for cert 001479: +- `test_golden_fixtures.py::test_golden_cert_residual_matches_pin[0535-9020-6509-0821-6222]` — add entry. API-mapped EPC cascades to within ±0.5 of API-published `69`. +- And `test_summary_001479_full_chain_sap_matches_worksheet_pdf_exactly` is GREEN at 1e-4. + +That's the production-flow acceptance: API → EpcPropertyData → SAP score within tolerance. ## Conventions you must honour (project memory) -- AAA test convention: every new test uses literal `# Arrange / # Act - / # Assert` headers. -- `abs(diff) <= tol` not `pytest.approx` (strict pyright; pytest.approx - is partially-unknown and costs a pyright error). -- One slice = one commit; stage by name. Don't stage `?? non_intrusive_ - photos/`, `?? kwh_client_for_deletion.pkl`, etc. — pre-existing - untracked junk. -- **1e-4 tolerance for the Elmhurst path; 0.5 tolerance for the API - path.** Memory entry `feedback_e2e_validation_philosophy`: component - pins at <1e-3; SAP integer must hit delta=0 vs PDF; **adaptive - ceilings forbidden**. -- Strict pyright net-zero on every commit (35-error baseline across the - Elmhurst+mapper files). -- The user has firmly rejected widening/xfail in the past. 
If the - mappers disagree, fix the underlying gap — don't loosen the test. +- AAA test convention: every new test uses literal `# Arrange / # Act / # Assert` headers. +- `abs(diff) <= tol` not `pytest.approx` (strict-pyright partially-unknown). +- One slice = one commit; stage by name. +- 1e-4 tolerance for the Elmhurst path; 0.5 for the API path. No widening, no xfail (`feedback_zero_error_strict`). +- Strict pyright net-zero on every commit (per-file baselines: mapper.py 35, heat_transmission.py 13, cert_to_inputs.py 35). +- The 6 cohort cert hand-builts MUST keep cascading to 1e-4. If a mapper change breaks one, fix the mapper or update the hand-built to match — don't widen. -## What's already done (Slices 47–53) +## Source-data caveats -The Elmhurst extractor + mapper now handle: +- **Cert 001479 age band**: Summary §3 says `Ext1: M 2023 onwards`; worksheet header says `Ext1: L`. Assessor data-entry inconsistency. The 001479 hand-built uses `L` (to mirror the worksheet calc inputs); the Elmhurst mapper trusts the Summary `M`. This will surface as a 1-field diff in the eventual `001479` diff test — document and accept (or override per-cert in the hand-built). -- Multi-bp dwellings (Main + N extensions); per-bp dimensions, walls, - roofs, floors. -- Room-in-Roof (`SapRoomInRoof.detailed_surfaces`) with §3.10 detailed - Flat Ceiling / Stud Wall / Slope / Gable Wall / Gable-Wall-External - surfaces, Decimal-based round-half-up area rounding. -- Window parser handling 3 §11 layout variants (separate frame_type/ - factor; combined `Wood 0.70`; trailing glazing-type on data line; - unprefixed frame_factor-only line). -- Roof-window separation by U > 3.0 with Table 24 raw-U lookup. -- `window_width × window_height = lodged Area` convention to avoid - W×H reconstruction drift. -- Alternative-wall extraction with "Thickness Unknown" → cascade- - default U routing (TF age B uninsulated → U=1.9 for thin timber). 
-- Secondary heating SAP code from §14.1 Main Heating2 sub-section; - RdSAP §S5 sheltered-sides from built-form; party-wall construction - codes ("U", "S"); suspended-timber-floor heuristic; electric-vs- - mixer shower from outlet_type; `number_baths` lodgement; `main_ - heating_category=2` for pumps_fans; roof "N None" → 0mm thickness. - -If the diff in step 4 surfaces a gap on the **API mapper** side, the -fix may need to mirror one of the above — the API schema fields are -already in `EpcPropertyData` (most paths feed through it), but the -`from_rdsap_schema_21_0_1` mapper may not be wiring everything. - -If the diff surfaces a gap on the **Elmhurst mapper** side, the -recently-landed work probably already covers the analogous field for -one of the 6 cohort fixtures — extend, don't reinvent. - -## Likely outcomes / risks - -- **Best case**: both mappers produce equivalent `EpcPropertyData` for - cert 001479; both cascade to ≈ 69.0094 SAP; the API target (69) is - hit to within 0.5; you write the parity test and ship a clean slice. -- **Likely case**: there are a handful of small mapping divergences - (e.g. one mapper sets a default that the other extracts; one - rounds a 2-d.p. value differently). Each is a slice; close them - systematically using the cohort patterns from Slices 47–53. -- **Worst case (the nightmare the user flagged)**: the API simply - doesn't publish a field that the Elmhurst Summary PDF does (e.g. - measured alt-wall U-values, certain Room-in-Roof gable-type - flags). In that case, document the gap clearly and either accept - the resulting SAP drift (within 0.5) or escalate to the user — - don't paper over with widened tolerances. 
- -## Probe scripts (regenerate in `/tmp` as needed) - -The Elmhurst session used these heavily; you'll want analogues: - -```bash -# Cohort SAP delta — verify nothing has regressed -python /tmp/probe_all.py - -# Field-level cascade-input diff for a single cert -python /tmp/diff_objects.py 000487 -``` - -For the new workflow, you'll want a probe that: -1. Loads the API JSON + Summary PDF for the same cert. -2. Maps both → `EpcPropertyData`. -3. Diffs them field-by-field. -4. Cascades both and prints both unrounded SAPs alongside the - worksheet PDF's lodged value (69.0094). - -## First actions - -1. Read `backend/epc_client/epc_client_service.py` end-to-end. The - `get_by_certificate_number` entry point is the one you want. -2. Fetch cert `0535-9020-6509-0821-6222`. Save the raw JSON to - `packages/domain/src/domain/sap/rdsap/tests/fixtures/golden/`. -3. Inspect the schema_type and confirm `from_api_response` accepts it. -4. Write the probe script described above; capture the cross-mapper - diff. -5. Triage the diff. Each divergence is a slice. Close them in order. -6. Land the three pin tests as forcing functions: - - Summary_001479 → ≤ 1e-4 vs 69.0094 (new entry in `test_summary_ - pdf_mapper_chain.py`). - - API cert 0535-9020-6509-0821-6222 → within 0.5 of 69 (new entry - in `test_golden_fixtures.py`). - - Cross-mapper parity: `from_api_response` and - `from_elmhurst_site_notes` produce equivalent - `EpcPropertyData` for the same cert; cascade outputs match to - ≤ 1e-4 (new file `test_api_vs_elmhurst_parity.py`). 
- -## Branch state at handover +## Branch state ``` -$ git log --oneline -12 +$ git log --oneline -15 +035d916d Slice 70: cohort 000474 mapper-vs-hand-built diff is GREEN +d8a37029 Slice 69: 1:1 windows expansion in cohort 000474 (5 → 7) +6baf66cd Slice 68: party-wall "U Unable" + central_heating_pump_age_str → 1 diff left +ca39d072 Slices 66+67: Elmhurst mapper surfaces country_code + heating ints + has_draught_lobby +4997039f Slice 65: add shower_outlets + number_baths to cohort 000474 hand-built +b5cbfe83 Slice 64: bulk-update cohort 000474 hand-built for Cat A diff parity +01d234dd Slice 63: RED tracer-bullet mapper-vs-hand-built diff test for cohort 000474 +7e1269fc Handover: hand-built fixture skeleton landed (Slice 62); 2/11 pins green ee98dbe0 Slice 62: hand-built _elmhurst_worksheet_001479.py — skeleton + 11 RED pins 0e4f4c05 Handover: TDD red-green session — 4 more slices (58-60) + RED chain pin 31c01a7e Slice 60: thermal bridging y is dwelling-wide, not per-bp @@ -240,183 +219,20 @@ ee98dbe0 Slice 62: hand-built _elmhurst_worksheet_001479.py — skeleton + 11 R e3dc0b28 Slice 58: secondary fuel cost routes through lodged secondary_fuel_type a0d9d094 Handover: 4 cert-001479 slices in (54-57); gap at +7.62 SAP; non-fabric next 7a9a8b7e Slice 57: Pre-1950 Elmhurst sloping-ceiling roofs map to thickness=0 -07ed871f Slice 56: Elmhurst floor exposed to external air routes through u_exposed_floor -c89206fc Slice 55: Elmhurst party-wall code "CU" maps to cavity unfilled -4427b58a Slice 54: Elmhurst mapper sets extensions_count from len(survey.extensions) -a756114a Handover: all 6 Elmhurst Summary→SAP chains closed at 1e-4 -58088c10 Slice 53: Summary_000487 chain pins SAP at 1e-4 — last cohort cert closed ``` -Chain pin `test_summary_001479_full_chain_sap_matches_worksheet_pdf_ -exactly` is committed RED (cascade SAP 70.20 vs worksheet 69.0094, -delta 1.19) as the load-bearing TDD forcing function. All other -chain + golden + heat-transmission tests pass. 
Pyright net-zero on -touched files. +## Cached artefacts (don't re-fetch) -## Resumption notes for cert 001479 (Slices 54–60 in; chain pin RED at delta 1.19) +- `packages/domain/src/domain/sap/rdsap/tests/fixtures/golden/0535-9020-6509-0821-6222.json` — API JSON for cert 001479 (Slice 54 era, fetched via `OPEN_EPC_API_TOKEN` from `backend/.env`). +- `backend/documents_parser/tests/fixtures/Summary_001479.pdf` — site-notes PDF. +- `sap worksheets/lodged example/P960-0001-001479.pdf` — Elmhurst worksheet output for cert 001479. -### What landed across two sessions +## Probe scripts (regenerable in `/tmp`) -**Session 1** (Slices 54-57): fabric mapper gaps from the cross-mapper diff. - -- **Slice 54** — `extensions_count` reads `len(survey.extensions)`. -- **Slice 55** — Elmhurst party-wall code `"CU"` → `WALL_CAVITY=4` - (U=0.5 matching worksheet's `Party walls Main … 0.50`). -- **Slice 56** — Floor location `"E To external air"` routes through - `u_exposed_floor` (Ext2 cantilevered floor at U=1.20). -- **Slice 57** — PS sloping-ceiling roofs at age A-D with "As Built" - thickness map to `thickness=0` → U=2.30 (Ext2 uninsulated roof). - -**Session 2** (Slices 58-60): TDD red-green cycle with the chain pin as -forcing function. Two cascade-level fixes + one mapper fix: - -- **Slice 58** — Secondary fuel cost routing. Mapper derives - `secondary_fuel_type=26` (mains gas) from SAP code 605; cascade - `_fuel_cost` reads `secondary_fuel_type` instead of hardcoding the - electric tariff. Closes a £175/yr ECF distortion ≈ **9 SAP** on - cert 001479. Golden cert 0300-2747 (also mains-gas secondary) - tightens SAP residual −7 → +2 — biggest single golden improvement. -- **Slice 59** — `heat_transmission_from_cert` apportions window - area per `window_location` to each bp's wall deduction (was all-to- - Main). For 001479 Ext1's 6.37 m² window now correctly cuts into - Ext1's wall (U=0.26) instead of Main's (U=0.70). 
Three golden - certs (6035, 7536, 8135) with non-Main windows tighten all - residuals; cohort certs unaffected (uniform per-bp wall U). -- **Slice 60** — Thermal bridging `y` is dwelling-wide (primary bp's - age band) rather than per-bp. Multi-age dwellings like 001479 - (Main=C, Ext1=M, Ext2=C) and golden 7536 (D, L, F) had Ext1 - bridging under-counted at y=0.08 instead of dwelling's y=0.15. - -**Slice 61 ATTEMPTED + REVERTED**: `SapFloorDimension.floor_lodged_ -u_value` override using Elmhurst Summary §9 "Default U-value". The -override matched 001479's worksheet exactly (Main 0.65, Ext1 0.20, -Ext2 1.20) but broke cohort 000474's 1e-4 pin: that cert's cascade -calibration relied on `u_floor` returning 0.77 for age B + 12.68 m², -while Summary lodges 0.75. The 0.02 U drift × 12.68 m² shifted SAP -beyond 1e-4. **Next session needs a different approach** — either -fix `u_floor` Table 19 cascade for age C (currently 0.60, should be -0.65) without breaking age B, or selectively apply the override. - -### Where the chain stands - -| Cascade SAP | Delta to 69.0094 | After | -|---|---|---| -| 63.17 | +5.84 | Initial (pre-this-workstream) | -| 61.39 | +7.62 | Post-Slice 57 (fabric only) | -| 70.64 | −1.63 | Post-Slice 58 (secondary fuel) | -| 70.38 | −1.37 | Post-Slice 59 (window apportionment) | -| **70.20** | **−1.19** | **Post-Slice 60 (single-y bridging)** | - -The chain pin is committed RED at delta 1.19. **Per-bp fabric U-values -all match worksheet exactly** (Main wall 0.70, Ext1 wall 0.26, Ext2 -wall 0.70, etc.). 
The remaining 1.19 SAP overshoot maps to ~3 W/K of -extra HLC that the cascade is still under-counting: - -| Line ref | Cascade | Worksheet | Gap | -|---|---|---|---| -| (29a) walls | 39.77 | 39.77 | ✓ | -| (30) roof | 9.53 | 10.34 | −0.81 (Ext2 sloping-ceiling area) | -| (28a) floor | 21.65 | 23.17 | −1.52 (Main floor U 0.60 vs 0.65) | -| (32) party | 17.07 | 17.07 | ✓ | -| (27) windows | 43.60 | 43.60 | ✓ | -| (26) doors | 5.55 | 5.55 | ✓ | -| (36) bridging | 22.27 | 24.35 | −2.08 (driven by (31) under-count) | -| **(37) total** | **156.62** | **163.84** | **−7.22 W/K** | - -### What likely closes the remaining 1.19 SAP - -1. **`u_floor` Table 19 boundary for age C** (cascade returns 0.60; - worksheet expects 0.65 — same as age B). May be a Table 19 row - boundary miss. Need to read the canonical xlsx Sheet `Table 19` - to confirm correct values. If cascade is wrong, fixing it would - affect cohort but probably in the right direction. -2. **Ext2 roof area for PS sloping ceiling** — cascade uses floor - area (1.92) as roof area; worksheet uses 2.22 (slant length × - width). Factor ≈ 1.156 = sec(30°). Cascade-level: multiply - gross_roof_area by an inclination factor when roof_type starts - with "PS". -3. **`(31)` total external area under-count** of 1.13 m² (drives the - bridging gap). Probably the same Ext2 roof area issue (0.30 m²) - plus other accumulations. Fix #2 likely closes most of this. - -### Source-data caveats - -- **Summary PDF vs worksheet age band on Ext1**: Summary §3 says - `M 2023 onwards`; worksheet header says `Property Age Band C, Ext1: L, - Ext2: C`. Trust Summary (mapper does what data says); chain pin - docstring documents the caveat. - -### Probe scripts in /tmp (regenerable) - -- `/tmp/probe_001479.py` — cross-mapper diff + cascade. -- `/tmp/sensitivity_001479.py` — single-field SAP impact probe. -- `/tmp/perbp_001479.py` — per-bp cascade U-value dump vs worksheet. 
-
-Cached cert JSON: `packages/domain/src/domain/sap/rdsap/tests/
-fixtures/golden/0535-9020-6509-0821-6222.json`. Summary PDF in the
-chain-test fixtures dir.
-
-### Suggested next steps
-
-**User-confirmed plan (rigorous cohort pattern):** the hand-built
-fixture `_elmhurst_worksheet_001479.py` (Slice 62) is the ground-
-truth EpcPropertyData for this cert. Two parallel workstreams now:
-
-1. **Iterate the hand-built to 1e-4 against the worksheet.** Current
-   state: 2/11 cascade pins green (pumps_fans, lighting after the
-   LED/CFL split). The other 9 pins fail with `sap_score_continuous
-   = 65.99 vs 69.0094` (~3 SAP gap). Likely slice candidates from
-   the cascade scalar deltas:
-   - **HW demand routing**: hand-built may be over-counting hot-
-     water demand (combi-vs-cylinder path; Tcold model; Appendix J
-     occupancy). The worksheet's `(219) 2358.31` vs cascade's
-     `hot_water_kwh_per_yr` is one of the highest-impact deltas.
-   - **§2 ventilation tuning**: confirm `open_chimneys_count=0`,
-     `blocked_chimneys_count=0`, `closed_flues_count=0`,
-     `passive_vents_count=0` are all explicitly lodged on
-     `SapVentilation` to match the worksheet's §2 zeros.
-   - **Thermal mass parameter**: worksheet lodges `250.00` — verify
-     the hand-built's default matches.
-   - **`multiple_glazed_proportion`**: cascade reads it for solar
-     gain weighting; hand-built leaves None — check if that path
-     short-circuits to a less-favourable default.
-   - **`secondary_heating_fraction`**: cascade may be reading 0.10
-     (gas+gas) vs Elmhurst's 0.10 — confirm. (215) delta is ~290
-     kWh; worth ~0.2 SAP if mis-routed.
-
-2. **Once 11 pins green: add `test_elmhurst_mapper_matches_hand_
-   built` + `test_api_mapper_matches_hand_built`** parametrized
-   over both the new cert 001479 and the 6 cohort certs. Every
-   field diff is a mapper bug; close them slice-by-slice.
-   The cross-mapper parity test (`test_api_vs_elmhurst_parity`)
-   collapses to "both produce hand-built-equivalent EpcPropertyData
-   for cert 001479".
-
-3. **Current Elmhurst chain pin** (`test_summary_001479_full_chain
-   _sap_matches_worksheet_pdf_exactly`) is RED at delta 1.19 SAP.
-   Once the mapper closes its diff vs the hand-built, the chain pin
-   lands GREEN automatically.
-
-### Probe scripts in /tmp (regenerable)
-
-- `/tmp/probe_001479.py` — cross-mapper diff + cascade (rerun
-  after every cascade change; current diff count: 215 across both
-  mappers).
-- `/tmp/sensitivity_001479.py` — single-field SAP impact probe.
+- `/tmp/probe_000474_handbuilt_diff.py` — diff cohort 000474 mapped vs hand-built (un-filtered).
+- `/tmp/probe_000474_load_bearing.py` — diff cohort 000474 mapped vs hand-built (load-bearing scope, pre-filter).
+- `/tmp/probe_001479.py` — cross-mapper diff + cascade for cert 001479.
+- `/tmp/sensitivity_001479.py` — single-field patch SAP impact probe.
 - `/tmp/perbp_001479.py` — per-bp cascade U-value dump.
-### Cohort cascade scalars probe (helpful for hand-built iteration)
-
-Pin the failing fields against an MCVE probe — easiest workflow:
-
-```python
-from domain.sap.calculator import Sap10Calculator
-from domain.sap.worksheet.tests._elmhurst_worksheet_001479 import build_epc
-r = Sap10Calculator().calculate(build_epc())
-# Inspect r.hot_water_kwh_per_yr, r.main_heating_fuel_kwh_per_yr, etc.
-```
-
-Compare against the cohort cert (e.g. 000490 mains-gas+gas secondary)
-to find what hand-built field is missing.
-
-Good luck.
+Good luck. Keep the end goal at the front of the work: **API → SAP within ±0.5 of published 69 on cert 001479** is the acceptance test. The cohort + Elmhurst diff layers are the trail of breadcrumbs that will get us there with high confidence.
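
The acceptance criterion in the closing line can be sketched as a plain tolerance check in the `abs(diff) <= tol` form the commit message lists under conventions. This is only an illustrative sketch, not code from the repository: `within_tolerance`, `API_TOL`, and `ELMHURST_TOL` are hypothetical names, and the SAP figures are the ones quoted in this handover (API cascade 72.08 against the published 69).

```python
# Hypothetical names throughout; only the tolerances and SAP values
# come from the handover text.
API_TOL = 0.5         # the API publishes rounded integer SAPs
ELMHURST_TOL = 1e-4   # worksheet-level pin tolerance


def within_tolerance(actual: float, expected: float, tol: float) -> bool:
    """Return True when |actual - expected| <= tol."""
    return abs(actual - expected) <= tol


# Values quoted in this handover for cert 001479.
published_sap = 69.0
current_api_cascade_sap = 72.08

# The production-flow pin stays RED until the +3.08 gap closes.
api_pin_green = within_tolerance(current_api_cascade_sap, published_sap, API_TOL)
print(api_pin_green)
```

With the figures above the check reports the pin as not yet met, which matches the Layer 4 status in the commit message. The same helper would serve the 1e-4 Elmhurst pins by passing `ELMHURST_TOL` instead.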