Handover: Layer-2 cohort 000474 GREEN; reframe with production end-goal first

User reframed the end goal explicitly: the production flow is
`API JSON → EpcPropertyDataMapper.from_api_response → SAP calculator`
landing within ±0.5 of the API-published SAP. The Elmhurst-site-notes
work is the cross-validation route — same dwelling, independent path
into EpcPropertyData. Once both routes agree on cert 001479, the API
mapper is validated by transitivity.

Restructure the handover around four nested validation layers:

  Layer 1 (hand-built cascade pin):  6 cohort certs GREEN; 001479 partial
  Layer 2 (Elmhurst ≡ hand-built):   cohort 000474 GREEN; 5 others pending
  Layer 3 (API ≡ Elmhurst):          test doesn't exist yet
  Layer 4 (API cascade ±0.5):        72.08 vs 69 (delta +3.08)

Each layer validates the one below. Closing inner-most first means
upper layers can lean on it as reference.

Documents tools/patterns built in slices 63-70:
- `_LOAD_BEARING_FIELDS` allow-list (~40 cascade/semantic fields)
- `_NON_LOAD_BEARING_WINDOW_SUBFIELDS` deny-list (descriptive int/str
  encoding noise)
- `_diff_load_bearing` recursive helper (strict-pyright-clean)
- `test_from_elmhurst_site_notes_matches_hand_built_NNNNNN` tracer-
  bullet pattern (000474 is the worked example)

Next-step ordering: parametrize over 5 other cohort certs, complete
001479 hand-built (currently 2/11 cascade pins green; gap −3.02 SAP),
add cert 001479 to diff test, then add API mapper → hand-built diff
test, then the production-flow acceptance pin in test_golden_fixtures
for cert 001479.

Lists source-data caveats (the M-vs-L Ext1 age discrepancy on 001479).
Conventions to honour (AAA, abs(diff)<=tol, one slice=one commit,
1e-4 Elmhurst / 0.5 API, no widening, pyright net-zero). Cached
artefacts (golden JSON, Summary PDF, worksheet PDF) noted.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Khalim Conn-Kowlessar 2026-05-25 17:35:28 +00:00
parent 035d916dd6
commit 86eff23f08


@ -1,238 +1,217 @@
# Handover — wire up API↔Elmhurst↔Calculator parity test for cert 0535-9020-6509-0821-6222
# Handover — API mapper validation via Elmhurst cross-check
You are picking up branch `ara-backend-design-prd` after the 6-fixture
Elmhurst Summary→SAP validation chain landed end-to-end at 1e-4. The
**next workstream** is the project's actual end goal: prove the API
mapper produces the same result as the Elmhurst-site-notes mapper and
both run cleanly through the calculator.
You are picking up branch `ara-backend-design-prd`. The end goal of
this workstream is clear and worth re-stating before anything else.
## The end goal (per the user)
## The end goal (re-confirmed by the user)
> Data from the API → `EpcPropertyData` → SAP10 calculator, matching the
> API-published SAP rating to within ±0.5 (the API publishes rounded
> integer SAPs).
> **Production goal: `API JSON → EpcPropertyDataMapper.from_api_response
> → SAP10 calculator → SAP rating` must match the API-published SAP
> rating to within ±0.5 (the API publishes rounded integer SAPs).**
>
> The work in progress facilitates that by giving us an *independent*
> route to the same dwelling's `EpcPropertyData` — `Summary PDF →
> ElmhurstSiteNotesExtractor → EpcPropertyDataMapper.from_elmhurst_
> site_notes → SAP`. Once both routes produce the same
> `EpcPropertyData` (or a documented superset) for the same cert,
> the API mapper is validated by transitivity.
The Elmhurst Summary→SAP chain is now closed at 1e-4 across 6 fixtures
(`backend/documents_parser/tests/test_summary_pdf_mapper_chain.py`
all 8 tests green; Slices 47–53). That gives us a **calibrated
alternate route** into `EpcPropertyData` for the same physical
dwelling, which means we can validate the API mapper by **comparing
its `EpcPropertyData` output against the Elmhurst mapper's output for
the same cert**.
The validation cohort is the 6 U985-surveyor certs (000474, 000477,
000480, 000487, 000490, 000516) — each has a hand-built
`EpcPropertyData` fixture that cascades to the worksheet PDF's lodged
SAP at 1e-4. The 7th cert (001479 / API ref `0535-9020-6509-0821-6222`)
is the first with **both** an Elmhurst site-notes lodgement AND a real
GOV.UK API counterpart — making it the load-bearing cross-mapper
parity-test fixture.
## The new resource (cert 001479)
Once both mappers produce equivalent `EpcPropertyData` for cert
001479, running each through the calculator and comparing the SAP
rating against the API-published `69` is the final acceptance test
for the production flow.
A single dwelling now has **all three** artefacts:
## The workstream layers (current state of each)
| Path | What |
|---|---|
| `sap worksheets/lodged example/Summary_001479.pdf` | Elmhurst Summary site-notes PDF |
| `sap worksheets/lodged example/P960-0001-001479.pdf` | Elmhurst Calculator worksheet output |
| GOV.UK EPB API certificate `0535-9020-6509-0821-6222` | The published cert |
The work is structured as four nested validation layers — each
validates the layer below. Closing the inner-most one first means the
upper layers can rely on it as a reference.
- Worksheet PDF lodges unrounded SAP **69.0094** (line "SAP value") →
rating **C 69** (rounded integer published in §11a + the API).
- Summary PDF current SAP rating: **C 69**, Potential **C 76**, Fuel
Bill £1056, Emissions 2.509 tonnes.
- Surveyor P960-0001 (Richard Matthew Ratcliff); Inspection 29/10/2025;
processed 31/10/2025; postcode PR1 0LX; UPRN A005608690 (note: starts
with `A`, may be a placeholder); 67 Howick Park Drive, Penwortham,
Preston.
- `Lodgement Required: Yes` — distinguishes this cert from the other
6 cohort certs (U985 surveyor) where `Lodgement Required: No`. This
one was actually pushed to the GOV.UK EPB API, hence the cert
reference.
```
Layer 4: API mapper validated end-to-end (production goal)
└── Layer 3: API mapper EpcPropertyData ≡ Elmhurst mapper EpcPropertyData
    └── Layer 2: Elmhurst mapper EpcPropertyData ≡ hand-built fixture
        └── Layer 1: hand-built fixture → cascade SAP at 1e-4 vs worksheet
```
There's a separate folder `sap worksheets/extended test case/` with
`Summary_000565.pdf` and `U985-0001-000565.pdf` — those are
**not** the right pair for this workstream (no API counterpart). The
user clarified the source mid-handover; the correct location is
`sap worksheets/lodged example/`.
| Layer | Status | Where |
|---|---|---|
| **1 — hand-built cascade pin** | ✅ 6 cohort certs GREEN at 1e-4; cert 001479 hand-built skeleton at 2/11 pins green (Slice 62 unfinished) | `test_e2e_elmhurst_sap_score.py::test_sap_result_pin` |
| **2 — Elmhurst-mapped ≡ hand-built** | ✅ Cohort 000474 fully GREEN (Slice 70); 5 other cohort certs PENDING; cert 001479 PENDING | `test_summary_pdf_mapper_chain.py::test_from_elmhurst_site_notes_matches_hand_built_NNNNNN` |
| **3 — API-mapped ≡ Elmhurst-mapped** | PENDING — no test exists yet | New file `test_api_vs_elmhurst_parity.py` (or extension of the chain test) |
| **4 — API mapper cascade ±0.5 SAP** | RED — cascade SAP 72.08 vs published 69 (delta +3.08, was +9.7 before slices 58-60); golden-fixtures residual pins green | `test_golden_fixtures.py` for cohort + new entry for `0535-9020-6509-0821-6222` |
## The 5-step plan
## What's done (slices 54–70 in this branch)
The user is explicit on the workflow:
Cascade-level fixes (help both mappers):
- Slice 58 `e3dc0b28` — secondary fuel cost routes through lodged `secondary_fuel_type` (was hard-coded to electric tariff); closed a 9-SAP-point ECF distortion on gas-secondary certs.
- Slice 59 `175873b4` — `heat_transmission_from_cert` apportions windows per `window_location` per bp (not all-on-Main); load-bearing for multi-bp dwellings with non-uniform wall U.
- Slice 60 `31c01a7e` — thermal bridging `y` is dwelling-wide (primary bp's age band), not per-bp.
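
The effect of the Slice 59 window apportionment can be checked with simple arithmetic. The following is a minimal sketch, assuming wall heat loss per building part is U × (gross wall area − window area). The function name and the gross wall areas are hypothetical; the 6.37 m² Ext1 window and the wall U-values 0.70 (Main) and 0.26 (Ext1) are the cert 001479 figures quoted in this handover.

```python
def wall_heat_loss(u_value: float, gross_area_m2: float, window_area_m2: float) -> float:
    """Wall heat loss in W/K after deducting window openings (assumed model)."""
    return u_value * (gross_area_m2 - window_area_m2)


# Hypothetical gross wall areas; only the window area and U-values come from the handover.
MAIN_GROSS_M2, EXT1_GROSS_M2 = 60.0, 20.0
EXT1_WINDOW_M2 = 6.37
U_MAIN, U_EXT1 = 0.70, 0.26

# Before Slice 59: the Ext1 window is deducted from Main's wall.
before = (wall_heat_loss(U_MAIN, MAIN_GROSS_M2, EXT1_WINDOW_M2)
          + wall_heat_loss(U_EXT1, EXT1_GROSS_M2, 0.0))

# After Slice 59: the window is deducted from Ext1's own wall.
after = (wall_heat_loss(U_MAIN, MAIN_GROSS_M2, 0.0)
         + wall_heat_loss(U_EXT1, EXT1_GROSS_M2, EXT1_WINDOW_M2))

# The shift is independent of the gross areas: 6.37 * (0.70 - 0.26).
shift = after - before
print(f"wall heat-loss shift: {shift:+.2f} W/K")
```

Under these assumptions the gross areas cancel out, so the shift depends only on the misplaced window area and the difference between the two wall U-values.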
1. **Fetch the API response** for cert `0535-9020-6509-0821-6222`.
The existing client is at `backend/epc_client/epc_client_service.py`:
```python
import os

from backend.epc_client.epc_client_service import EpcClientService
service = EpcClientService(auth_token=os.environ["OPEN_EPC_API_TOKEN"])
epc_from_api = service.get_by_certificate_number("0535-9020-6509-0821-6222")
```
Cache the raw JSON to
`packages/domain/src/domain/sap/rdsap/tests/fixtures/golden/0535-9020-6509-0821-6222.json`
so tests run reproducibly without a token — that's the pattern other
golden fixtures already use (`0240-0200-…`, `0300-2747-…`, etc.).
Elmhurst-mapper fixes (Layer 2):
- Slice 54 `4427b58a` — `extensions_count` from `len(survey.extensions)`.
- Slice 55 `c89206fc` — party-wall code `"CU"` → 4 (cavity unfilled U=0.5).
- Slice 56 `07ed871f` — floor `"E To external air"` → `u_exposed_floor` Table 20.
- Slice 57 `7a9a8b7e` — PS sloping-ceiling + As-Built + pre-1950 age → `thickness=0` → U=2.30.
- Slice 66+67 `ca39d072` — `country_code="ENG"`, `has_draught_lobby` gate, plus 5 heating-detail int surfacings (`boiler_flue_type`, `emitter_temperature`, `central_heating_pump_age`, `main_heating_number`, `water_heating_fuel`).
- Slice 68 `6baf66cd` — Elmhurst party-wall `"U"` → 0 sentinel; cohort hand-built `central_heating_pump_age_str="Unknown"`.
2. **Map the API response to `EpcPropertyData`** via the existing
`EpcPropertyDataMapper.from_api_response(raw_json)`. Only RdSAP-
Schema-21.0.0 / 21.0.1 are supported today; this cert (Elmhurst
RdSAP10, processed Oct 2025) is almost certainly 21.0.1 — verify.
Hand-built fixture work (Layer 1 + parity setup):
- Slice 62 `ee98dbe0` — created `_elmhurst_worksheet_001479.py` skeleton; 2/11 cascade pins green (the rest need iteration; `sap_score_continuous=65.99 vs 69.0094`, gap −3.02 SAP).
- Slice 64 `b5cbfe83` — bulk-update cohort 000474 hand-built with Cat A fields (descriptive strings, ventilation zero counts, top-level booleans); 50 → 14 mapper-vs-hand-built diffs.
- Slice 65 `4997039f` — added `shower_outlets` + `number_baths` to cohort 000474 hand-built.
- Slice 69 `d8a37029` — expanded cohort 000474 windows 5 → 7 (1:1 with §11 table).
- Slice 70 `035d916d` — added window-subfield exclusion to diff helper + `frame_factor=0.7` default in `make_window`. **Cohort 000474 diff GREEN**.
3. **Map the Summary PDF to `EpcPropertyData`** via the new chain:
```python
from backend.documents_parser.elmhurst_extractor import ElmhurstSiteNotesExtractor
from datatypes.epc.domain.mapper import EpcPropertyDataMapper
# use _summary_pdf_to_textract_style_pages helper from
# backend/documents_parser/tests/test_summary_pdf_mapper_chain.py
pages = _summary_pdf_to_textract_style_pages(summary_pdf_path)
site_notes = ElmhurstSiteNotesExtractor(pages).extract()
epc_from_site_notes = EpcPropertyDataMapper.from_elmhurst_site_notes(site_notes)
```
Diff test infrastructure (Slice 63 `01d234dd`):
- `_LOAD_BEARING_FIELDS` allow-list in `test_summary_pdf_mapper_chain.py` (~40 top-level fields driving cascade or cross-mapper semantics).
- `_NON_LOAD_BEARING_WINDOW_SUBFIELDS` deny-list (descriptive int/str encodings that don't affect cascade).
- `_diff_load_bearing` recursive helper, strict-pyright-clean (`mapped/hand_built: object`, narrowed via isinstance).
- `test_from_elmhurst_site_notes_matches_hand_built_000474` is the tracer-bullet test.
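
For orientation, a recursive allow-list/deny-list diff of this kind might look roughly like the sketch below. This is not the code in `test_summary_pdf_mapper_chain.py`: the function `diff_values`, the `Window` and `Dwelling` dataclasses, and the field values are all hypothetical stand-ins, and the real helper may treat nesting and exclusions differently.

```python
from dataclasses import dataclass, fields, is_dataclass


@dataclass
class Window:  # hypothetical stand-in for a nested cert sub-object
    area: float
    frame_factor: float
    description: str


@dataclass
class Dwelling:  # hypothetical stand-in for EpcPropertyData
    windows: list[Window]
    extensions_count: int


def diff_values(mapped: object, hand_built: object, path: str,
                skip: frozenset[str] = frozenset()) -> list[str]:
    """Return one readable line per divergence found under `path`."""
    if isinstance(mapped, (list, tuple)) and isinstance(hand_built, (list, tuple)):
        if len(mapped) != len(hand_built):
            return [f"{path}: length {len(mapped)} != {len(hand_built)}"]
        out: list[str] = []
        for i, (m, h) in enumerate(zip(mapped, hand_built)):
            out.extend(diff_values(m, h, f"{path}[{i}]", skip))
        return out
    if (is_dataclass(mapped) and not isinstance(mapped, type)
            and type(mapped) is type(hand_built)):
        out = []
        for f in fields(mapped):
            if f.name in skip:  # deny-listed descriptive subfield
                continue
            out.extend(diff_values(getattr(mapped, f.name),
                                   getattr(hand_built, f.name),
                                   f"{path}.{f.name}", skip))
        return out
    return [] if mapped == hand_built else [f"{path}: {mapped!r} != {hand_built!r}"]


LOAD_BEARING = ("windows", "extensions_count")  # allow-list of top-level fields
SKIP = frozenset({"description"})               # deny-list of noisy subfields

mapped = Dwelling([Window(1.5, 0.7, "uPVC")], extensions_count=2)
hand_built = Dwelling([Window(1.5, 0.7, "PVC-U")], extensions_count=1)

diffs: list[str] = []
for name in LOAD_BEARING:
    diffs.extend(diff_values(getattr(mapped, name), getattr(hand_built, name), name, SKIP))
print(diffs)  # ['extensions_count: 2 != 1']
```

The allow-list bounds which top-level fields are compared at all, while the deny-list removes descriptive subfields that differ between encodings without affecting the cascade.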
4. **Compare the two `EpcPropertyData` objects field-by-field.** Any
difference is either (a) a mapper-coverage gap on one side or (b)
data the API doesn't publish (which would be a nightmare — the user
flagged this explicitly). Surface every diff; classify and fix.
## What's RED right now
5. **Pass both through the calculator** and assert:
- `calculate_sap_from_inputs(cert_to_inputs(epc_from_api))`'s
unrounded SAP is within **±0.5** of the API-published rounded
SAP (69). The 0.5 tolerance is the API-cert convention — the
published integer is rounded, so half a SAP point is just
rounding noise.
- `calculate_sap_from_inputs(cert_to_inputs(epc_from_site_notes))`
matches the worksheet PDF's unrounded SAP **69.0094** to **1e-4**
(extending the existing
`test_summary_pdf_mapper_chain.py` cohort pattern to this 7th
fixture).
- **The two cascade outputs match each other to ≤ 1e-4** when the
mappers are fully aligned — this is the load-bearing parity
proof the user is after.
```
$ git log --oneline -1 backed | head -1
035d916d Slice 70: cohort 000474 mapper-vs-hand-built diff is GREEN
```
## Existing infra you should lean on
Two RED forcing functions on the branch:
- **`packages/domain/src/domain/sap/rdsap/tests/test_golden_fixtures.py`**
is the canonical API→SAP residual test pattern. It loads
`fixtures/golden/<cert_number>.json`, runs
`from_api_response → cert_to_inputs → calculate_sap_from_inputs`,
and pins the residual `(calc_sap - lodged_sap)`. The new cert
belongs in this file's `_EXPECTATIONS` tuple.
1. `test_summary_001479_full_chain_sap_matches_worksheet_pdf_exactly` — chain pin for cert 001479; cascade SAP `70.20` vs worksheet `69.0094` (delta `1.19`). 9 of 11 `test_sap_result_pin[001479-*]` fail in the same RED state. Closing requires either:
- Completing the 001479 hand-built (`_elmhurst_worksheet_001479.py` is the Slice 62 skeleton) — encode every worksheet input until 11/11 pins hit 1e-4.
- Or finding the remaining `~3 W/K` cascade gap (likely `u_floor` Table 19 for age C + PS sloping-ceiling roof area inclination factor — see prior handover at commit `0e4f4c05`).
- **`backend/documents_parser/tests/test_summary_pdf_mapper_chain.py`**
is the Elmhurst Summary→SAP pinning pattern. All 6 cohort certs use
this. The new cert (001479) needs a 7th `test_summary_001479_full_
chain_sap_matches_worksheet_pdf_exactly` here pinned at 1e-4 vs
69.0094.
## What's GREEN right now
- **The cross-mapper diff** is genuinely new — there's no existing
test that asserts `from_api_response(json) == from_elmhurst_site_
notes(pdf)` for the same cert. You'll be writing it from scratch.
Consider a dedicated test file
`backend/documents_parser/tests/test_api_vs_elmhurst_parity.py`
asserting field-level equivalence (and cascade-output equivalence)
for the cert `001479 / 0535-9020-6509-0821-6222`.
- All 66 cohort `test_sap_result_pin[NNNNNN-*]` pins (6 certs × 11 fields) at 1e-4.
- 8 golden-fixture residual pins in `test_golden_fixtures.py` (cohort API certs).
- `test_from_elmhurst_site_notes_matches_hand_built_000474` — first parity validation.
- Pyright net-zero on every touched file's baseline.
## Suggested next moves (in priority order)
### 1. Parametrize the diff test over the 5 other cohort certs
The toolchain is in place. For each cert 000477, 000480, 000487, 000490, 000516:
```python
def test_from_elmhurst_site_notes_matches_hand_built_NNNNNN() -> None:
    pages = _summary_pdf_to_textract_style_pages(_SUMMARY_NNNNNN_PDF)
    site_notes = ElmhurstSiteNotesExtractor(pages).extract()
    mapped = EpcPropertyDataMapper.from_elmhurst_site_notes(site_notes)
    hand_built = _wNNNNNN.build_epc()
    diffs: list[str] = []
    for field_name in _LOAD_BEARING_FIELDS:
        diffs.extend(_diff_load_bearing(
            getattr(mapped, field_name, None),
            getattr(hand_built, field_name, None),
            field_name,
        ))
    assert not diffs, (
        f"{len(diffs)} load-bearing divergence(s) ...\n " +
        "\n ".join(diffs)
    )
```
Each will RED initially with a similar diff pattern to 000474. Most diffs should close mechanically by the same bulk-update pattern as Slice 64 (descriptive fields, ventilation zeros, top-level booleans, `wall_thickness_measured`, etc.). The unique-to-cert wrinkles need slice-by-slice attention. Could be parametrize-then-bulk-fix-then-iterate, or one cert at a time.
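
If the parametrize-then-bulk-fix route is taken, a first pass could simply gather the per-cert diff counts before deciding the order of work. The sketch below is illustrative only: `collect_cohort_diffs` and `fake_diff` are hypothetical names, and the diff lines returned by `fake_diff` are made-up placeholders rather than real probe output. In the real test module the same cert list could feed `pytest.mark.parametrize`.

```python
from typing import Callable

# The five cohort certs still pending at Layer 2.
COHORT_PENDING = ("000477", "000480", "000487", "000490", "000516")


def collect_cohort_diffs(
    certs: tuple[str, ...],
    diff_one: Callable[[str], list[str]],
) -> dict[str, list[str]]:
    """Run the per-cert diff and keep only the certs that still diverge."""
    report: dict[str, list[str]] = {}
    for cert in certs:
        diffs = diff_one(cert)
        if diffs:
            report[cert] = diffs
    return report


def fake_diff(cert: str) -> list[str]:
    """Placeholder for the real mapper-vs-hand-built diff (made-up output)."""
    canned = {
        "000477": ["wall_thickness_measured: None != False"],
        "000490": ["number_baths: None != 1", "shower_outlets: None != 1"],
    }
    return canned.get(cert, [])


report = collect_cohort_diffs(COHORT_PENDING, fake_diff)

# Triage order: fewest divergences first, so the mechanical fixes land early.
for cert, diffs in sorted(report.items(), key=lambda item: len(item[1])):
    print(f"{cert}: {len(diffs)} diff(s)")
```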
Run diff probe (substitute `NNNNNN`):
```bash
PYTHONPATH=/workspaces/model:/workspaces/model/packages/domain/src python -c "
import sys; sys.path.insert(0, '/workspaces/model')
from backend.documents_parser.tests.test_summary_pdf_mapper_chain import _diff_load_bearing, _LOAD_BEARING_FIELDS, _summary_pdf_to_textract_style_pages
from backend.documents_parser.elmhurst_extractor import ElmhurstSiteNotesExtractor
from datatypes.epc.domain.mapper import EpcPropertyDataMapper
from domain.sap.worksheet.tests import _elmhurst_worksheet_NNNNNN as wHB
from pathlib import Path
pages = _summary_pdf_to_textract_style_pages(Path('/workspaces/model/backend/documents_parser/tests/fixtures/Summary_NNNNNN.pdf'))
sn = ElmhurstSiteNotesExtractor(pages).extract()
mapped = EpcPropertyDataMapper.from_elmhurst_site_notes(sn)
hb = wHB.build_epc()
diffs = []
for f in _LOAD_BEARING_FIELDS:
    diffs.extend(_diff_load_bearing(getattr(mapped, f, None), getattr(hb, f, None), f))
print(f'diff count: {len(diffs)}')
for d in diffs: print(f' {d}')
"
```
### 2. Complete cert 001479's hand-built (`_elmhurst_worksheet_001479.py`)
Currently 2/11 cascade pins green. Worksheet target `69.0094`. Cascade output `65.99`. Likely missing inputs (compare against cohort 000490 which has a similar gas-combi+secondary config):
- Hot-water demand routing (Tcold model, occupancy)
- Thermal mass parameter
- Internal gains (appliance + cooking allowance)
- `multiple_glazed_proportion`
- §2 ventilation tuning
Diagnostic: `python -m pytest packages/domain/src/domain/sap/worksheet/tests/test_e2e_elmhurst_sap_score.py::test_sap_result_pin -k 001479 -v --no-cov` shows each pin's `actual vs expected`.
### 3. Add cert 001479 to the diff test (after 001479 hand-built lands 1e-4)
```python
def test_from_elmhurst_site_notes_matches_hand_built_001479() -> None:
    ...
```
Likely RED initially. Close diffs the same way as 000474.
### 4. API mapper → hand-built diff test (Layer 3)
```python
def test_from_api_response_matches_hand_built_001479() -> None:
    raw = json.loads(Path("packages/domain/src/domain/sap/rdsap/tests/fixtures/golden/0535-9020-6509-0821-6222.json").read_text())
    mapped = EpcPropertyDataMapper.from_api_response(raw)
    hand_built = _w001479.build_epc()
    # same _diff_load_bearing pattern
```
The API JSON is already cached at `packages/domain/src/domain/sap/rdsap/tests/fixtures/golden/0535-9020-6509-0821-6222.json` (Slice 54 era).
Diffs here will surface API-mapper coverage gaps. Each one is a slice; the API mapper at `from_api_response` / `from_rdsap_schema_21_0_1` paths needs corresponding extraction.
### 5. The production acceptance test
Once Layer 3 is green for cert 001479:
- `test_golden_fixtures.py::test_golden_cert_residual_matches_pin[0535-9020-6509-0821-6222]` — add entry. API-mapped EPC cascades to within ±0.5 of API-published `69`.
- And `test_summary_001479_full_chain_sap_matches_worksheet_pdf_exactly` is GREEN at 1e-4.
That's the production-flow acceptance: API → EpcPropertyData → SAP score within tolerance.
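
The ±0.5 acceptance check itself is small. Here is a minimal sketch, assuming the residual is defined as calculated SAP minus published SAP (the convention the golden-fixture pins use); the helper names `residual` and `within_api_tolerance` are hypothetical. The inputs are the figures from this handover: the current API-path cascade (72.08), the published rating (69), and the worksheet's unrounded 69.0094 as the value a fully aligned cascade would be expected to approach.

```python
API_TOLERANCE = 0.5  # the published SAP is a rounded integer


def residual(calc_sap: float, published_sap: int) -> float:
    """Calculated SAP minus the API-published integer SAP."""
    return calc_sap - published_sap


def within_api_tolerance(calc_sap: float, published_sap: int,
                         tol: float = API_TOLERANCE) -> bool:
    """True when the cascade lands within rounding distance of the published SAP."""
    return abs(residual(calc_sap, published_sap)) <= tol


# Current Layer 4 state: RED.
print(round(residual(72.08, 69), 2))      # 3.08
print(within_api_tolerance(72.08, 69))    # False

# Where the cascade needs to land: GREEN.
print(within_api_tolerance(69.0094, 69))  # True
```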
## Conventions you must honour (project memory)
- AAA test convention: every new test uses literal `# Arrange / # Act
/ # Assert` headers.
- `abs(diff) <= tol` not `pytest.approx` (strict pyright; pytest.approx
is partially-unknown and costs a pyright error).
- One slice = one commit; stage by name. Don't stage `?? non_intrusive_
photos/`, `?? kwh_client_for_deletion.pkl`, etc. — pre-existing
untracked junk.
- **1e-4 tolerance for the Elmhurst path; 0.5 tolerance for the API
path.** Memory entry `feedback_e2e_validation_philosophy`: component
pins at <1e-3; SAP integer must hit delta=0 vs PDF; **adaptive
ceilings forbidden**.
- Strict pyright net-zero on every commit (35-error baseline across the
Elmhurst+mapper files).
- The user has firmly rejected widening/xfail in the past. If the
mappers disagree, fix the underlying gap — don't loosen the test.
- AAA test convention: every new test uses literal `# Arrange / # Act / # Assert` headers.
- `abs(diff) <= tol` not `pytest.approx` (strict-pyright partially-unknown).
- One slice = one commit; stage by name.
- 1e-4 tolerance for the Elmhurst path; 0.5 for the API path. No widening, no xfail (`feedback_zero_error_strict`).
- Strict pyright net-zero on every commit (per-file baselines: mapper.py 35, heat_transmission.py 13, cert_to_inputs.py 35).
- The 6 cohort cert hand-builts MUST keep cascading to 1e-4. If a mapper change breaks one, fix the mapper or update the hand-built to match — don't widen.
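
As a reminder of what the first two conventions look like together, here is a minimal sketch of an AAA-structured pin that uses `abs(diff) <= tol` in place of `pytest.approx`. The function `cascade_sap_stub` is a hypothetical stand-in for the real cascade call and returns a fixed illustrative value; only the 69.0094 target and the 1e-4 tolerance come from this handover.

```python
ELMHURST_TOLERANCE = 1e-4


def cascade_sap_stub() -> float:
    """Hypothetical stand-in for the real cascade; returns an illustrative value."""
    return 69.00941


def test_cascade_sap_matches_worksheet() -> None:
    # Arrange
    expected = 69.0094
    tol = ELMHURST_TOLERANCE

    # Act
    actual = cascade_sap_stub()

    # Assert
    diff = actual - expected
    assert abs(diff) <= tol, f"SAP {actual} vs worksheet {expected} (diff {diff:+.6f})"


# Runs under pytest; called directly here so the sketch is self-checking.
test_cascade_sap_matches_worksheet()
```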
## What's already done (Slices 47–53)
## Source-data caveats
The Elmhurst extractor + mapper now handle:
- **Cert 001479 age band**: Summary §3 says `Ext1: M 2023 onwards`; worksheet header says `Ext1: L`. Assessor data-entry inconsistency. The 001479 hand-built uses `L` (to mirror the worksheet calc inputs); the Elmhurst mapper trusts the Summary `M`. This will surface as a 1-field diff in the eventual `001479` diff test — document and accept (or override per-cert in the hand-built).
- Multi-bp dwellings (Main + N extensions); per-bp dimensions, walls,
roofs, floors.
- Room-in-Roof (`SapRoomInRoof.detailed_surfaces`) with §3.10 detailed
Flat Ceiling / Stud Wall / Slope / Gable Wall / Gable-Wall-External
surfaces, Decimal-based round-half-up area rounding.
- Window parser handling 3 §11 layout variants (separate frame_type/
factor; combined `Wood 0.70`; trailing glazing-type on data line;
unprefixed frame_factor-only line).
- Roof-window separation by U > 3.0 with Table 24 raw-U lookup.
- `window_width × window_height = lodged Area` convention to avoid
W×H reconstruction drift.
- Alternative-wall extraction with "Thickness Unknown" → cascade-
default U routing (TF age B uninsulated → U=1.9 for thin timber).
- Secondary heating SAP code from §14.1 Main Heating2 sub-section;
RdSAP §S5 sheltered-sides from built-form; party-wall construction
codes ("U", "S"); suspended-timber-floor heuristic; electric-vs-
mixer shower from outlet_type; `number_baths` lodgement; `main_
heating_category=2` for pumps_fans; roof "N None" → 0mm thickness.
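
On the Decimal-based round-half-up point: Python's built-in `round` works on binary floats and rounds exact ties to even, so it can disagree with a worksheet that rounds halves up. The sketch below shows the difference; the helper name `round_half_up` is hypothetical and the mapper's actual rounding code is not reproduced here.

```python
from decimal import ROUND_HALF_UP, Decimal


def round_half_up(value: str, places: int = 2) -> Decimal:
    """Round a decimal string half-up to `places` decimal places."""
    quantum = Decimal(10) ** -places  # Decimal("0.01") for places=2
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)


# Passing the value as a string keeps the lodged digits exact.
print(round_half_up("2.675"))  # 2.68
print(round(2.675, 2))         # 2.67 (the float is stored slightly below 2.675)

print(round_half_up("0.125"))  # 0.13
print(round(0.125, 2))         # 0.12 (exact tie, rounded to even)
```

A 0.01 m² difference in a single surface area is small, but it is the kind of drift that can matter when the pin tolerance is 1e-4.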
If the diff in step 4 surfaces a gap on the **API mapper** side, the
fix may need to mirror one of the above — the API schema fields are
already in `EpcPropertyData` (most paths feed through it), but the
`from_rdsap_schema_21_0_1` mapper may not be wiring everything.
If the diff surfaces a gap on the **Elmhurst mapper** side, the
recently-landed work probably already covers the analogous field for
one of the 6 cohort fixtures — extend, don't reinvent.
## Likely outcomes / risks
- **Best case**: both mappers produce equivalent `EpcPropertyData` for
cert 001479; both cascade to ≈ 69.0094 SAP; the API target (69) is
hit to within 0.5; you write the parity test and ship a clean slice.
- **Likely case**: there are a handful of small mapping divergences
(e.g. one mapper sets a default that the other extracts; one
rounds a 2-d.p. value differently). Each is a slice; close them
systematically using the cohort patterns from Slices 47–53.
- **Worst case (the nightmare the user flagged)**: the API simply
doesn't publish a field that the Elmhurst Summary PDF does (e.g.
measured alt-wall U-values, certain Room-in-Roof gable-type
flags). In that case, document the gap clearly and either accept
the resulting SAP drift (within 0.5) or escalate to the user —
don't paper over with widened tolerances.
## Probe scripts (regenerate in `/tmp` as needed)
The Elmhurst session used these heavily; you'll want analogues:
```bash
# Cohort SAP delta — verify nothing has regressed
python /tmp/probe_all.py
# Field-level cascade-input diff for a single cert
python /tmp/diff_objects.py 000487
```
For the new workflow, you'll want a probe that:
1. Loads the API JSON + Summary PDF for the same cert.
2. Maps both → `EpcPropertyData`.
3. Diffs them field-by-field.
4. Cascades both and prints both unrounded SAPs alongside the
worksheet PDF's lodged value (69.0094).
## First actions
1. Read `backend/epc_client/epc_client_service.py` end-to-end. The
`get_by_certificate_number` entry point is the one you want.
2. Fetch cert `0535-9020-6509-0821-6222`. Save the raw JSON to
`packages/domain/src/domain/sap/rdsap/tests/fixtures/golden/`.
3. Inspect the schema_type and confirm `from_api_response` accepts it.
4. Write the probe script described above; capture the cross-mapper
diff.
5. Triage the diff. Each divergence is a slice. Close them in order.
6. Land the three pin tests as forcing functions:
- Summary_001479 → ≤ 1e-4 vs 69.0094 (new entry in `test_summary_
pdf_mapper_chain.py`).
- API cert 0535-9020-6509-0821-6222 → within 0.5 of 69 (new entry
in `test_golden_fixtures.py`).
- Cross-mapper parity: `from_api_response` and
`from_elmhurst_site_notes` produce equivalent
`EpcPropertyData` for the same cert; cascade outputs match to
≤ 1e-4 (new file `test_api_vs_elmhurst_parity.py`).
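
For actions 1 and 2, a load-or-fetch helper keeps the tests token-free once the JSON is cached. This is a rough sketch under stated assumptions: `load_golden` is a hypothetical name, and `fetch` is a placeholder for whatever callable returns the raw API JSON as a dict (the exact return type of `get_by_certificate_number` is not confirmed here).

```python
import json
from pathlib import Path
from typing import Any, Callable, Optional


def load_golden(
    cert_number: str,
    golden_dir: Path,
    fetch: Optional[Callable[[str], dict[str, Any]]] = None,
) -> dict[str, Any]:
    """Return the cached raw JSON for a cert, fetching and caching it once if absent."""
    path = golden_dir / f"{cert_number}.json"
    if path.exists():
        # Cached: no token or network access needed.
        return json.loads(path.read_text(encoding="utf-8"))
    if fetch is None:
        raise FileNotFoundError(f"no cached fixture at {path} and no fetcher supplied")
    raw = fetch(cert_number)
    golden_dir.mkdir(parents=True, exist_ok=True)
    # Stable key order keeps later re-fetch diffs readable in git.
    path.write_text(json.dumps(raw, indent=2, sort_keys=True), encoding="utf-8")
    return raw
```

Tests would call `load_golden(cert, golden_dir)` with no fetcher, so a missing fixture fails loudly instead of silently reaching for the network.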
## Branch state at handover
## Branch state
```
$ git log --oneline -12
$ git log --oneline -15
035d916d Slice 70: cohort 000474 mapper-vs-hand-built diff is GREEN
d8a37029 Slice 69: 1:1 windows expansion in cohort 000474 (5 → 7)
6baf66cd Slice 68: party-wall "U Unable" + central_heating_pump_age_str → 1 diff left
ca39d072 Slices 66+67: Elmhurst mapper surfaces country_code + heating ints + has_draught_lobby
4997039f Slice 65: add shower_outlets + number_baths to cohort 000474 hand-built
b5cbfe83 Slice 64: bulk-update cohort 000474 hand-built for Cat A diff parity
01d234dd Slice 63: RED tracer-bullet mapper-vs-hand-built diff test for cohort 000474
7e1269fc Handover: hand-built fixture skeleton landed (Slice 62); 2/11 pins green
ee98dbe0 Slice 62: hand-built _elmhurst_worksheet_001479.py — skeleton + 11 RED pins
0e4f4c05 Handover: TDD red-green session — 4 more slices (58-60) + RED chain pin
31c01a7e Slice 60: thermal bridging y is dwelling-wide, not per-bp
@ -240,183 +219,20 @@ ee98dbe0 Slice 62: hand-built _elmhurst_worksheet_001479.py — skeleton + 11 R
e3dc0b28 Slice 58: secondary fuel cost routes through lodged secondary_fuel_type
a0d9d094 Handover: 4 cert-001479 slices in (54-57); gap at +7.62 SAP; non-fabric next
7a9a8b7e Slice 57: Pre-1950 Elmhurst sloping-ceiling roofs map to thickness=0
07ed871f Slice 56: Elmhurst floor exposed to external air routes through u_exposed_floor
c89206fc Slice 55: Elmhurst party-wall code "CU" maps to cavity unfilled
4427b58a Slice 54: Elmhurst mapper sets extensions_count from len(survey.extensions)
a756114a Handover: all 6 Elmhurst Summary→SAP chains closed at 1e-4
58088c10 Slice 53: Summary_000487 chain pins SAP at 1e-4 — last cohort cert closed
```
Chain pin `test_summary_001479_full_chain_sap_matches_worksheet_pdf_
exactly` is committed RED (cascade SAP 70.20 vs worksheet 69.0094,
delta 1.19) as the load-bearing TDD forcing function. All other
chain + golden + heat-transmission tests pass. Pyright net-zero on
touched files.
## Cached artefacts (don't re-fetch)
## Resumption notes for cert 001479 (Slices 54–60 in; chain pin RED at delta 1.19)
- `packages/domain/src/domain/sap/rdsap/tests/fixtures/golden/0535-9020-6509-0821-6222.json` — API JSON for cert 001479 (Slice 54 era, fetched via `OPEN_EPC_API_TOKEN` from `backend/.env`).
- `backend/documents_parser/tests/fixtures/Summary_001479.pdf` — site-notes PDF.
- `sap worksheets/lodged example/P960-0001-001479.pdf` — Elmhurst worksheet output for cert 001479.
### What landed across two sessions
## Probe scripts (regenerable in `/tmp`)
**Session 1** (Slices 54-57): fabric mapper gaps from the cross-mapper diff.
- **Slice 54** — `extensions_count` reads `len(survey.extensions)`.
- **Slice 55** — Elmhurst party-wall code `"CU"` → `WALL_CAVITY=4`
(U=0.5 matching worksheet's `Party walls Main … 0.50`).
- **Slice 56** — Floor location `"E To external air"` routes through
`u_exposed_floor` (Ext2 cantilevered floor at U=1.20).
- **Slice 57** — PS sloping-ceiling roofs at age A-D with "As Built"
thickness map to `thickness=0` → U=2.30 (Ext2 uninsulated roof).
**Session 2** (Slices 58-60): TDD red-green cycle with the chain pin as
forcing function. Two cascade-level fixes + one mapper fix:
- **Slice 58** — Secondary fuel cost routing. Mapper derives
`secondary_fuel_type=26` (mains gas) from SAP code 605; cascade
`_fuel_cost` reads `secondary_fuel_type` instead of hardcoding the
electric tariff. Closes a £175/yr ECF distortion ≈ **9 SAP** on
cert 001479. Golden cert 0300-2747 (also mains-gas secondary)
tightens SAP residual −7 → +2 — biggest single golden improvement.
- **Slice 59** — `heat_transmission_from_cert` apportions window
area per `window_location` to each bp's wall deduction (was all-to-
Main). For 001479 Ext1's 6.37 m² window now correctly cuts into
Ext1's wall (U=0.26) instead of Main's (U=0.70). Three golden
certs (6035, 7536, 8135) with non-Main windows tighten all
residuals; cohort certs unaffected (uniform per-bp wall U).
- **Slice 60** — Thermal bridging `y` is dwelling-wide (primary bp's
age band) rather than per-bp. Multi-age dwellings like 001479
(Main=C, Ext1=M, Ext2=C) and golden 7536 (D, L, F) had Ext1
bridging under-counted at y=0.08 instead of dwelling's y=0.15.
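
The Slice 60 change can be illustrated with a toy calculation. This is a minimal sketch, assuming the bridging heat loss is y × exposed area; the `BuildingPart` class, both function names, and the exposed areas are hypothetical, while the y values (0.15 for the age C parts, 0.08 for Ext1) follow the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildingPart:  # hypothetical stand-in for a cert building part
    name: str
    exposed_area_m2: float
    y: float  # bridging factor implied by this part's own age band


def bridging_per_bp(parts: list[BuildingPart]) -> float:
    """Old behaviour: each part uses its own y."""
    return sum(p.exposed_area_m2 * p.y for p in parts)


def bridging_dwelling_wide(parts: list[BuildingPart]) -> float:
    """Slice 60 behaviour: the primary part's y applies to the whole dwelling."""
    primary_y = parts[0].y
    return primary_y * sum(p.exposed_area_m2 for p in parts)


# Illustrative areas only; parts[0] is treated as the primary building part.
parts = [
    BuildingPart("Main", 100.0, 0.15),
    BuildingPart("Ext1", 30.0, 0.08),
    BuildingPart("Ext2", 10.0, 0.15),
]

old = bridging_per_bp(parts)
new = bridging_dwelling_wide(parts)
print(f"per-bp: {old:.2f} W/K, dwelling-wide: {new:.2f} W/K, shift: {new - old:+.2f} W/K")
```

With these made-up areas the whole shift comes from Ext1: 30 m² × (0.15 − 0.08).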
**Slice 61 ATTEMPTED + REVERTED**: `SapFloorDimension.floor_lodged_
u_value` override using Elmhurst Summary §9 "Default U-value". The
override matched 001479's worksheet exactly (Main 0.65, Ext1 0.20,
Ext2 1.20) but broke cohort 000474's 1e-4 pin: that cert's cascade
calibration relied on `u_floor` returning 0.77 for age B + 12.68 m²,
while Summary lodges 0.75. The 0.02 U drift × 12.68 m² shifted SAP
beyond 1e-4. **Next session needs a different approach** — either
fix `u_floor` Table 19 cascade for age C (currently 0.60, should be
0.65) without breaking age B, or selectively apply the override.
### Where the chain stands
| Cascade SAP | Delta to 69.0094 | After |
|---|---|---|
| 63.17 | +5.84 | Initial (pre-this-workstream) |
| 61.39 | +7.62 | Post-Slice 57 (fabric only) |
| 70.64 | −1.63 | Post-Slice 58 (secondary fuel) |
| 70.38 | −1.37 | Post-Slice 59 (window apportionment) |
| **70.20** | **−1.19** | **Post-Slice 60 (single-y bridging)** |
The chain pin is committed RED at delta 1.19. **Per-bp fabric U-values
all match worksheet exactly** (Main wall 0.70, Ext1 wall 0.26, Ext2
wall 0.70, etc.). The remaining 1.19 SAP overshoot maps to ~3 W/K of
extra HLC that the cascade is still under-counting:
| Line ref | Cascade | Worksheet | Gap |
|---|---|---|---|
| (29a) walls | 39.77 | 39.77 | ✓ |
| (30) roof | 9.53 | 10.34 | 0.81 (Ext2 sloping-ceiling area) |
| (28a) floor | 21.65 | 23.17 | 1.52 (Main floor U 0.60 vs 0.65) |
| (32) party | 17.07 | 17.07 | ✓ |
| (27) windows | 43.60 | 43.60 | ✓ |
| (26) doors | 5.55 | 5.55 | ✓ |
| (36) bridging | 22.27 | 24.35 | 2.08 (driven by (31) under-count) |
| **(37) total** | **156.62** | **163.84** | **7.22 W/K** |
### What likely closes the remaining 1.19 SAP
1. **`u_floor` Table 19 boundary for age C** (cascade returns 0.60;
worksheet expects 0.65 — same as age B). May be a Table 19 row
boundary miss. Need to read the canonical xlsx Sheet `Table 19`
to confirm correct values. If cascade is wrong, fixing it would
affect cohort but probably in the right direction.
2. **Ext2 roof area for PS sloping ceiling** — cascade uses floor
area (1.92) as roof area; worksheet uses 2.22 (slant length ×
width). Factor 2.22 / 1.92 ≈ 1.156, close to sec(30°) ≈ 1.155. Cascade-level: multiply
gross_roof_area by an inclination factor when roof_type starts
with "PS".
3. **`(31)` total external area under-count** of 1.13 m² (drives the
bridging gap). Probably the same Ext2 roof area issue (0.30 m²)
plus other accumulations. Fix #2 likely closes most of this.
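
The inclination factor in item 2 can be sanity-checked numerically. A minimal sketch, assuming a 30° pitch (inferred from the factor, not read from the cert); the variable names are illustrative, and 1.92 and 2.22 are the Ext2 areas quoted above.

```python
import math

FLOOR_AREA_M2 = 1.92            # Ext2 plan area currently used as roof area
WORKSHEET_ROOF_AREA_M2 = 2.22   # Ext2 roof area lodged on the worksheet
PITCH_DEG = 30.0                # assumed pitch, inferred from the ratio

# Sloping area = plan area / cos(pitch), i.e. plan area * sec(pitch).
inclination_factor = 1.0 / math.cos(math.radians(PITCH_DEG))
sloped_area = FLOOR_AREA_M2 * inclination_factor

print(f"sec(30 deg)     = {inclination_factor:.4f}")                      # 1.1547
print(f"worksheet ratio = {WORKSHEET_ROOF_AREA_M2 / FLOOR_AREA_M2:.4f}")  # 1.1562
print(f"sloped area     = {sloped_area:.2f} m2")                          # 2.22
```

The worksheet ratio and sec(30°) differ in the third decimal place, but both reproduce 2.22 m² at two decimal places, which is consistent with the worksheet area having been rounded.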
### Source-data caveats
- **Summary PDF vs worksheet age band on Ext1**: Summary §3 says
`M 2023 onwards`; worksheet header says `Property Age Band C, Ext1: L,
Ext2: C`. Trust Summary (mapper does what data says); chain pin
docstring documents the caveat.
### Probe scripts in /tmp (regenerable)
- `/tmp/probe_001479.py` — cross-mapper diff + cascade.
- `/tmp/sensitivity_001479.py` — single-field SAP impact probe.
- `/tmp/perbp_001479.py` — per-bp cascade U-value dump vs worksheet.
Cached cert JSON: `packages/domain/src/domain/sap/rdsap/tests/
fixtures/golden/0535-9020-6509-0821-6222.json`. Summary PDF in the
chain-test fixtures dir.
### Suggested next steps
**User-confirmed plan (rigorous cohort pattern):** the hand-built
fixture `_elmhurst_worksheet_001479.py` (Slice 62) is the ground-
truth EpcPropertyData for this cert. Two parallel workstreams now:
1. **Iterate the hand-built to 1e-4 against the worksheet.** Current
state: 2/11 cascade pins green (pumps_fans, lighting after the
LED/CFL split). The other 9 pins fail with `sap_score_continuous
= 65.99 vs 69.0094` (~3 SAP gap). Likely slice candidates from
the cascade scalar deltas:
- **HW demand routing**: hand-built may be over-counting hot-
water demand (combi-vs-cylinder path; Tcold model; Appendix J
occupancy). The worksheet's `(219) 2358.31` vs cascade's
`hot_water_kwh_per_yr` is one of the highest-impact deltas.
- **§2 ventilation tuning**: confirm `open_chimneys_count=0`,
`blocked_chimneys_count=0`, `closed_flues_count=0`,
`passive_vents_count=0` are all explicitly lodged on
`SapVentilation` to match the worksheet's §2 zeros.
- **Thermal mass parameter**: worksheet lodges `250.00` — verify
the hand-built's default matches.
- **`multiple_glazed_proportion`**: cascade reads it for solar
gain weighting; hand-built leaves None — check if that path
short-circuits to a less-favourable default.
- **`secondary_heating_fraction`**: cascade may be reading 0.10
(gas+gas) vs Elmhurst's 0.10 — confirm. (215) delta is ~290
kWh; worth ~0.2 SAP if mis-routed.
2. **Once 11 pins green: add `test_elmhurst_mapper_matches_hand_
built` + `test_api_mapper_matches_hand_built`** parametrized
over both the new cert 001479 and the 6 cohort certs. Every
field diff is a mapper bug; close them slice-by-slice. The
cross-mapper parity test (`test_api_vs_elmhurst_parity`)
collapses to "both produce hand-built-equivalent EpcPropertyData
for cert 001479".
3. **Current Elmhurst chain pin** (`test_summary_001479_full_chain
_sap_matches_worksheet_pdf_exactly`) is RED at delta 1.19 SAP.
Once the mapper closes its diff vs the hand-built, the chain pin
lands GREEN automatically.
### Probe scripts in /tmp (regenerable)
- `/tmp/probe_001479.py` — cross-mapper diff + cascade (rerun
after every cascade change; current diff count: 215 across both
mappers).
- `/tmp/sensitivity_001479.py` — single-field SAP impact probe.
- `/tmp/probe_000474_handbuilt_diff.py` — diff cohort 000474 mapped vs hand-built (un-filtered).
- `/tmp/probe_000474_load_bearing.py` — diff cohort 000474 mapped vs hand-built (load-bearing scope, pre-filter).
- `/tmp/probe_001479.py` — cross-mapper diff + cascade for cert 001479.
- `/tmp/sensitivity_001479.py` — single-field patch SAP impact probe.
- `/tmp/perbp_001479.py` — per-bp cascade U-value dump.
### Cohort cascade scalars probe (helpful for hand-built iteration)
Pin the failing fields against an MCVE probe — easiest workflow:
```python
from domain.sap.calculator import Sap10Calculator
from domain.sap.worksheet.tests._elmhurst_worksheet_001479 import build_epc
r = Sap10Calculator().calculate(build_epc())
# Inspect r.hot_water_kwh_per_yr, r.main_heating_fuel_kwh_per_yr, etc.
```
Compare against the cohort cert (e.g. 000490 mains-gas+gas secondary)
to find what hand-built field is missing.
Good luck.
Good luck. Keep the end goal at the front of the work: **API → SAP within ±0.5 of published 69 on cert 001479** is the acceptance test. The cohort + Elmhurst diff layers are the trail of breadcrumbs that will get us there with high confidence.