docs: handover for per-cert mapper validation workflow

Rewrites the cert 001479 closure handover into a forward-looking
brief for the new workstream: validating the API EpcPropertyDataMapper
against 9 newly-staged (Summary + worksheet + API) cert triples.

Key contents:

- User's stated workflow (verbatim): Summary path proves itself
  against the worksheet → becomes canonical reference for API parity.
- Folder-structure changes since the prior handover were written
  (packages/domain/ removed; sap10_calculator + sap10_ml now at the
  repo root under a PEP 420 namespace; docs/sap-spec/ moved into
  domain/sap10_calculator/docs/; PCDB data into tables/pcdb/data/).
- New test data layout: `sap worksheets/Additional data with api/
  <cert-ref>/{Summary_NNNNNN.pdf, dr87-0001-NNNNNN.pdf}`.
- Cert reference table with heating type, PCDB index, worksheet SAP,
  TFA, bp count, dwelling type for all 9 triples.
- Major scope discovery: 7 of 9 are Air Source Heat Pumps (PCDB
  104568 / 102421). The mapper has never been validated against HPs;
  cert 0380 pilot showed catastrophic deltas (Summary -70 / API -18
  SAP vs worksheet). Recommended deferring HP certs until boiler
  workflow is proven.
- Cert 0330 (mid-terrace gas boiler) pilot status: fixtures staged
  uncommitted; Summary path +0.47 SAP, API path +2.15 SAP vs
  worksheet 61.5993. Cascade-component diff localises 2 specific
  gaps (windows HLC +6.71 W/K likely from glazing_type=14 missing
  from Slice 93's transmission map; HW kWh +1060 needs §4
  subsystem probe).
- Tooling shortcut: use OPEN_EPC_API_TOKEN (not EPC_AUTH_TOKEN) in
  backend/.env with EpcClientService._fetch_certificate(cert_ref)
  to fetch raw JSON.
- First actions for next agent: confirm baseline, commit cert 0330
  fixtures, add RED Layer 2 test, iterate.

Lesson preserved: cohort hand-builts encode non-spec quirks
(e.g. has_suspended_timber_floor=False to override §(12) spec
inference and match the non-spec worksheet). Cross-check against
spec-inferred mapper output before trusting hand-built fields.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
Khalim Conn-Kowlessar 2026-05-26 17:36:56 +00:00
parent 7fba27a791
commit c783a15ff1

View file

@ -1,301 +1,365 @@
# Handover — API mapper at 1e-4 on cert 001479; investigating goldens
# Handover — Per-cert validation workflow, 9 new triples staged
You are picking up branch `ara-backend-design-prd`. The cert 001479 API
path now hits the worksheet's continuous SAP 69.0094 **at < 1e-4**
(Slice 95). Layer 4 production goal is MET. Remaining work: investigate
golden cert residual outliers (especially cert 0240's -15 SAP) and
process any new (Summary + API) cert pairs the user sources.
You are picking up branch `feature/per-cert-mapper-validation`
(off main at `7fba27a7`, where the prior `ara-backend-design-prd`
work was merged via PR #1123). The user has shifted focus from
"close cert 001479 to 1e-4" (done — Slice 95) to "validate the
API mapper against more cert pairs to surface remaining mapping
gaps". 9 new (Summary + worksheet + API) triples have been
provided. The mapping is acknowledged-incomplete; expect many
mapper-completion slices.
## The end goal (re-confirmed by the user)
## The user's stated workflow (verbatim)
> **Production goal: `API JSON → EpcPropertyDataMapper.from_api_
> response → SAP10 calculator → SAP rating` must match the SAP value
> the calculator emitted at lodge time to within 1e-4.**
> we pick one [cert], we then pass the Elmhurst summary document to
> `EpcPropertyDataMapper` to map the site notes data to
> `EpcPropertyData`, we then pass to the SAP calculator. If the
> output of the SAP calculator matches the SAP worksheet correctly,
> we know we have correctly mapped the EpcPropertyData. We then get
> the API response, map to `EpcPropertyData` using
> `EpcPropertyDataMapper`, then check if we have the same
> `EpcPropertyData` as the summary report (or same for the fields we
> care about). We also check we get the same result.
>
> The acceptance tolerance is **1e-4 against the worksheet's
> continuous SAP value**, not ±0.5 against the published integer.
> ±0.5 only applies when no worksheet is available (the 8 cohort
> golden certs we have as API-only); when we have both API + worksheet
> (cert 001479), the 1e-4 bar is the bar.
> The `EpcPropertyData` objects matching is our signal that we've
> done things correctly. So this validates our mapping.
The earlier handover stated ±0.5 — that was wrong. The user
emphasised this twice: the calc is mechanical, identical inputs must
produce identical outputs, so when we have the continuous worksheet
value we should hit it exactly. See the conversation thread that led
to Slice 87.
Translation: Summary path proves itself against the worksheet →
becomes the canonical reference for the API path. This is Layer 2 +
Layer 3 + Layer 4 of the validation stack.
## Validation layers (current state)
## State at session start (this handover's baseline)
Most recent commits (`sap10_calculator` + `sap10_ml` are now at the
repo root; `packages/domain/src/domain/` was removed):
```
Layer 4: API mapper cascade SAP = worksheet SAP at 1e-4 (production goal)
└── Layer 3: API mapper EpcPropertyData ≡ Elmhurst mapper EpcPropertyData
└── Layer 2: Elmhurst-mapped EpcPropertyData → cascade SAP = worksheet SAP at 1e-4
└── Layer 1: hand-built EpcPropertyData → cascade SAP = worksheet SAP at 1e-4
6dc11e4d fix: resolve 10 remaining test_summary_pdf_mapper_chain failures
09fb6f1b fix: address 22 project-wide test failures from previous sweep
a7b08a4e refactor: move docs/sap-spec/ contents into domain/sap10_calculator/
960130b0 deleted redundant packages folder
68401c51 refactor: lift-and-shift packages/domain/src/domain/ml → domain/sap10_ml
29ac35cc refactor: lift-and-shift packages/domain/src/domain/sap → domain/sap10_calculator
... (87b6045c "fixed merge conflicts from main", 168e7f18, 94975f3b deletions)
a75052dc chore: commit cert 001479 fixture + RdSAP/PCDF spec PDFs
f502db8c Slice 95: API mapper TFA from per-bp dims + window area 2dp rounding
```
| Layer | Status |
|---|---|
| **1 — hand-built cascade pin** | ✅ 6 cohort certs (000474, 000477, 000480, 000487, 000490, 000516) GREEN at 1e-4; cert 001479 hand-built skeleton (Slice 62) still RED (2 of 11 pins green, hand-built has its own bugs — orthogonal to the production path) |
| **2 — Elmhurst-mapped path** | ✅ **Cert 001479 GREEN at 1e-4** (Slice 89); cohort: 2 GREEN (000477, 000516), 4 RED (000474, 000480, 000487, 000490 — Elmhurst U985 worksheets violate the RdSAP 10 §5 (12) spec; orthogonal to the production goal) |
| **3 — API-mapped ≡ Elmhurst-mapped (field-level)** | 🟡 Cascade outputs match at 1e-4 (Slice 95); field-level diff test not yet written but lower priority since cascade-output gate exists |
| **4 — API path cascade SAP** | ✅ **Cert 001479 GREEN at 1e-4** (Slice 95). `test_api_001479_full_chain_sap_matches_worksheet_pdf_exactly` formalises the gate. 8 other golden certs pinned at residual-from-integer at tolerance 0 |
## Cumulative API SAP delta progression (cert 001479)
The big breakthrough: implementing the RdSAP 10 §5 (12) spec rule
(`Floor infiltration (suspended timber ground floor only)` — page 29
of `domain/sap10_calculator/docs/specs/RdSAP 10 Specification 10-06-2025.pdf`) revealed a
series of API-mapper coverage gaps that all needed fixing for the
spec rule's premise to be met. Each slice closed one gap:
| Slice | Fix | API SAP delta |
|---|---|---|
| baseline | broken party wall enum, no descriptive strings | **+3.0752** |
| 87 | RdSAP 10 §5 (12) spec rule + Elmhurst-mapper switch to None | — |
| 88 | thread `bp.floor_construction_type` into `u_floor` cascade | — |
| 89 | PS pitched-sloping-ceiling roof area `÷ cos(30°)` (added `roof_construction_type` field on `SapBuildingPart`) | — |
| 90 | API `party_wall_construction` enum → SAP10 `u_party_wall` codes (1→3 Solid, 2→4 Cavity, etc.) | +1.5298 |
| 91 | descriptive strings via int→str lookups (`floor_construction_type`, `roof_construction_type`) + pre-1950 PS sloping → thickness=0 + per-bp roof description fix | +1.0970 |
| 92 | upper-floor `room_height_m += 0.25` + `is_exposed_floor` from `floor_heat_loss==1` + `floor_insulation_thickness="NI"→None` | +1.0022 |
| 93 | `window_transmission_details` from `glazing_type` int (code 3 → U=2.8/g=0.76, code 13 → U=1.4/g=0.72) | +1.1846 |
| 94 | `sheltered_sides` from API `built_form` + `floor_type` from `floor_heat_loss==7` | +0.0006 |
| 95 | API mapper `total_floor_area_m2` = Σ per-bp dims (worksheet-precise 68.51 not lodged-rounded 69) + RdSAP 10 §15 p.66 window 2dp area rounding in solar_gains/internal_gains | **< 1e-4** |
Fabric breakdown for cert 001479 API path is now COMPLETELY EXACT
(all 6 components match worksheet to 4 d.p.):
| Component | Cascade | Worksheet target |
|---|---|---|
| walls | 39.7652 | 39.7652 ✓ |
| party walls | 17.0700 | 17.0700 ✓ |
| roof | 10.3438 | 10.3438 ✓ |
| floor | 23.1705 | 23.1705 ✓ |
| windows | 43.5962 | 43.5962 ✓ |
| doors | 5.5500 | 5.5500 ✓ |
| **fabric total** | **139.4957** | **139.4957 ✓** |
## What's left (queue, in priority order)
### 1. Close cert 001479's residual 0.0006 SAP gap (1-3 slices)
The remaining gap is non-fabric. Diff against the Summary path's
intermediate cascade values (which lands at 1e-4 GREEN):
Folder structure post-migration:
```
Σ internal_gains_monthly_w: API 5339.27 Sum 5313.55 delta +25.72
Σ solar_gains_monthly_w: API 5510.10 Sum 5508.60 delta +1.50
Σ mean_internal_temp_monthly_c: API 214.87 Sum 213.51 delta +1.35
Σ monthly_infiltration_ach: API 8.95 Sum 10.91 delta -1.96
hot_water_kwh_per_yr: API 2365.00 Sum 2358.31 delta +6.69
domain/ (PEP 420 namespace; no __init__.py)
├── addresses/, postcode.py, tasks/
├── sap10_calculator/ ← was packages/domain/src/domain/sap/
│ ├── calculator.py, climate/, rdsap/, tables/, validation/, worksheet/
│ ├── docs/ ← was docs/sap-spec/
│ │ ├── HANDOVER_NEXT.md, SAP_CALCULATOR.md
│ │ ├── NEXT_AGENT_PROMPT.md ← this file
│ │ └── specs/ ← RdSAP 10, SAP 10.2 + 10.3, PCDF spec PDFs
│ └── tables/pcdb/data/ ← pcdb10.dat + 7× pcdb_table_*.jsonl
└── sap10_ml/ ← was packages/domain/src/domain/ml/
```
Specifically:
- **Infiltration is still under by ~2 ACH/year**. The (12) spec rule
applies on both paths now (after Slice 87), so it's something else
— possibly `has_draught_lobby` (API=None, Summary=False; cascade
treats both as False so it shouldn't matter; verify) or `(13)
draught_lobby_ach`. Or storey count. Probe with
`ventilation_from_cert(api_mapped)` vs `ventilation_from_cert(sum_
mapped)`.
- **HW kWh +6.7** suggests a small Appendix J §1a occupancy
difference, or a different Tcold series, or shower outlets.
- **Internal gains +25.7 W·months** — probably a pumps_fans count or
lighting bulb count mismatch.
`Path(__file__).parents[N]` indices were rebased through the move
(delta of 3); see `Dockerfile.test` (poppler-utils now installed for
test_summary_pdf_mapper_chain.py).
## Test baselines you should see at HEAD `6dc11e4d`
Run the diff probe (the one from the conversation) to localise:
```bash
PYTHONPATH=/workspaces/model:/workspaces/model/packages/domain/src python -c "
from backend.documents_parser.tests.test_summary_pdf_mapper_chain import _diff_load_bearing, _LOAD_BEARING_FIELDS, _summary_pdf_to_textract_style_pages
from backend.documents_parser.elmhurst_extractor import ElmhurstSiteNotesExtractor
from datatypes.epc.domain.mapper import EpcPropertyDataMapper
import json, dataclasses
from pathlib import Path
api = json.loads(Path('/workspaces/model/domain/sap10_calculator/rdsap/tests/fixtures/golden/0535-9020-6509-0821-6222.json').read_text())
api_mapped = EpcPropertyDataMapper.from_api_response(api)
pages = _summary_pdf_to_textract_style_pages(Path('/workspaces/model/backend/documents_parser/tests/fixtures/Summary_001479.pdf'))
sn = ElmhurstSiteNotesExtractor(pages).extract()
sum_mapped = EpcPropertyDataMapper.from_elmhurst_site_notes(sn)
diffs = []
for f in _LOAD_BEARING_FIELDS:
diffs.extend(_diff_load_bearing(getattr(api_mapped, f, None), getattr(sum_mapped, f, None), f))
print(f'{len(diffs)} load-bearing divergences')
for d in diffs[:40]: print(f' {d}')
"
PYTHONPATH=/workspaces/model python -m pytest \
backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
--no-cov -q
# Expect: 17/0 in mapper-chain + Layer 1 baseline + golden residual baseline
```
(NB: the original `_diff_load_bearing` was written for cohort
diff tests; the helper signature is `mapped, hand_built, path` — pass
api_mapped as `mapped` and sum_mapped as `hand_built` to surface API
gaps.)
Wider domain sweep (1654 / 20 baseline): 9 hand-built 001479
skeleton + 10 cohort Layer 1 pins + 1 heat_transmission edge case
= 20 RED, all pre-existing and orthogonal to mapper work.
### 2. Layer 3 — write the API ≡ Elmhurst diff test (1 slice)
**Layer 4 production gate**:
`test_api_001479_full_chain_sap_matches_worksheet_pdf_exactly`
**GREEN at < 1e-4**. Keep it green.
Add `test_from_api_response_matches_from_elmhurst_site_notes_001479`
in `backend/documents_parser/tests/test_summary_pdf_mapper_chain.py`,
mirroring the cohort `test_from_elmhurst_site_notes_matches_hand_
built_NNNNNN` pattern. Use `_diff_load_bearing` with `_LOAD_BEARING_
FIELDS`. This formalises Layer 3 as a 1e-4 gate (zero load-bearing
divergences between the two mapper outputs).
## The new test data
This test will start RED with the residual diffs from step 1; closing
those slices brings it to GREEN.
Location: `sap worksheets/Additional data with api/<cert-ref>/`
### 3. More cert pairs (user is sourcing — pause for new data)
Each folder is named by the GOV.UK EPB certificate number. Contains:
The user has agreed to source 2-3 more (Elmhurst worksheet + GOV.UK
API JSON) pairs to validate the mapper isn't 001479-overfit.
Suggested diversity:
- `Summary_NNNNNN.pdf` — Elmhurst-format site notes
- `dr87-0001-NNNNNN.pdf` — worksheet (`dr87-` prefix is a Domna-tool
variant; same shape as the `P960-` worksheet for cert 001479)
- **Detached + RR** (would fix cert 0240's -14 residual which has a
Type-1 RR the mapper doesn't extract).
- **Mid-terrace with cavity-filled party walls** (API party_wall_
construction=3 → spec U=0.2; currently mapped to SAP10 code 4
which gives U=0.5; needs cascade extension at
`u_party_wall`).
- **Flat / maisonette** (party wall U=0 path; cert 9390 is one but
no worksheet).
- **Different age band** (E, J, K, L) to exercise the (12) spec
rule's age boundaries.
The API JSON is **not** in the folder — fetch from GOV.UK EPB using
the cert-ref:
Each new pair lands as a 1e-4 cascade-pin test. Pattern: ~3-5 new
mapper bugs per cert pair (similar to Slice 87-94 on 001479). Each
becomes its own slice. Stage by name; one slice = one commit.
### 4. Investigate goldens with shifted residuals after Slices 87-95
Slices 87-94 shifted residuals on 7 of 10 API-only golden certs;
Slice 95 (precise TFA + window 2dp area rounding) shifted 5 more
(0240, 6035, 8135, 2130, 0390-2254). All residuals are re-pinned.
Current outliers and what we now know:
- **0240** (-15 SAP, +17.8 PE): Detached age J + RR + 11 windows. The
earlier handover claim of "RR mapper gap" is **partly stale**:
- `room_in_roof_type_1.gable_wall_length_1/2` ARE extracted by the
21.0.1 mapper (see mapper.py:1349-1369 — must have landed in
Slices 71-86). Cert 0240's RR cascades through with floor_area=
83.2, gables 6.4 + 6.4, age J → U_RR = 0.30 W/m²K.
- `'Roof room(s), insulated (assumed)'` description NOT parsed —
but the spec basis for parsing it is unclear: age J's Table 18
col(4) default already models insulation (U=0.30), and unlike
the regular-roof "insulated (assumed)" → 50 mm bucket rule
(RdSAP §5.11.4), no equivalent rule for RR has been identified.
- The -15 SAP residual is a mix, not a single RR gap. Subsystem
breakdown for cert 0240 (via cert_to_inputs cascade):
- walls 22.95, party 0, roof 76.93 (incl RR ~18.5), floor 29.43,
windows 41.55, doors 11.10, bridging 39.64; total HLC 221.6 W/K
- **windows_w_per_k = 41.55 is the most leverageable**: 11
windows × 18.28 m² × U_default ≈ 2.27 W/m²K. Cert lodges
`glazing_type=2` for all windows but Slice 93's
`_API_GLAZING_TYPE_TO_TRANSMISSION` only covers codes 3 and 13;
surfacing code 2 would land a measurable U (likely ~1.8-2.0)
and close several W/K of fabric loss.
- Other potential gains: BP[0] non-RR ceiling lodges "Pitched,
400+ mm loft insulation" (should U ~0.10); verify cascade
gives it that.
- **Net**: cert 0240 is not a single-slice fix; it's 3-5
progressive mapper improvements (glazing_type 2 surfacing,
possibly more glazing codes, possibly RR description nuance).
- **0390-2954** (-6 SAP, -26.5 PE): large detached F (TFA 360), oil
PCDB-listed. Undocumented. PE going more negative than SAP suggests
the cost cascade is hitting harder than energy — possibly oil
price/efficiency interaction.
- **6035** (-6 SAP, +49.5 PE): mid-terrace age A + RR. Probably has
the same glazing_type-default-U issue as 0240 plus an age-A-
specific gap.
### 5. (deferred) Cohort chain test RED triage
4 cohort chain tests (000474, 000480, 000487, 000490) are RED
because the Elmhurst U985 worksheets emit (12) values that don't
follow RdSAP 10 §5 — see the conversation re: identical Summary §9
lodgements producing different worksheet (12) for cohort 000477 vs
000480. The cascade is now spec-correct; the Elmhurst tool isn't.
Options: (a) mark as known-Elmhurst-non-spec, (b) add per-cert
override field, (c) wait for more cert pairs to confirm pattern.
**Not blocking the production goal.**
## Key conventions (project memory)
- **AAA test convention** — every new test uses literal `# Arrange /
# Act / # Assert` headers.
- **`abs(diff) <= tol`** not `pytest.approx` (strict-pyright partial-
unknown).
- **One slice = one commit** — stage by name (`git add <path>`).
- **1e-4 tolerance** for the worksheet-comparable paths (Elmhurst
Summary + API both have worksheets for cert 001479). No widening,
no xfail.
- **Strict pyright net-zero** per file. Baselines: `mapper.py` 33,
`heat_transmission.py` 13, `cert_to_inputs.py` 35,
`epc_property_data.py` 0.
- **Spec citation in commit messages** — when a slice implements a
spec rule, quote the spec text (RdSAP 10 page reference). User
asked us to confirm against docs.
## Cached artefacts
- `domain/sap10_calculator/rdsap/tests/fixtures/golden/0535-
9020-6509-0821-6222.json` — API JSON for cert 001479 (RdSAP-Schema-
21.0.1).
- `backend/documents_parser/tests/fixtures/Summary_001479.pdf`
Elmhurst site-notes PDF for cert 001479.
- `sap worksheets/lodged example/P960-0001-001479.pdf` — Domna's
worksheet output for cert 001479 (Continuous SAP 69.0094).
- `sap worksheets/U985-0001-NNNNNN.pdf` × 6 — cohort Elmhurst
worksheets (000474, 000477, 000480, 000487, 000490, 000516).
- `sap worksheets/U985-0001-NNNNNN.txt` × 6 — text exports of above.
## Recent slice history (Slices 87-95, current branch)
```
f502db8c Slice 95: API mapper TFA from per-bp dims + window area 2dp rounding — cert 001479 to 1e-4
03203418 Slice 94: API mapper sheltered_sides + floor_type — cert 001479 to 1e-3
7281b7b3 Slice 93: API mapper window_transmission_details from glazing_type
8e752e57 Slice 92: API mapper floor dimensions (SAP +0.25m + exposed-floor + NI→None)
2cebba28 Slice 91: API mapper descriptive strings + roof description per-bp fix
fbbdca49 Slice 90: API mapper translates party_wall_construction → SAP10 enum
006e9842 Slice 89: PS pitched-sloping-ceiling roof area uses inclined surface
c40679d1 Slice 88: thread bp.floor_construction_type into u_floor cascade
aff331ff Slice 87: implement RdSAP 10 §5 (12) spec rule for suspended timber floor
2d3355ee Slice 86: 1:1 windows expansion in cohort 000516 (2 → 5 entries)
f863598d Slice 85: bulk-update cohort 000516 hand-built for Cat A diff parity
```python
from backend.epc_client.epc_client_service import EpcClientService
from dotenv import load_dotenv
import os
load_dotenv('/workspaces/model/backend/.env') # OPEN_EPC_API_TOKEN
svc = EpcClientService(auth_token=os.environ['OPEN_EPC_API_TOKEN'])
raw = svc._fetch_certificate('<cert-ref>') # raw JSON dict
```
Earlier slice context (71-86 closed cohort Layer 2) is in the prior
handover at commit `86eff23f` (`domain/sap10_calculator/docs/NEXT_AGENT_PROMPT.md`
before this rewrite).
Note: use `OPEN_EPC_API_TOKEN` not `EPC_AUTH_TOKEN` (the latter is
for a different/legacy API).
## First action
### 9 cert references + heating type + worksheet SAP
1. Confirm branch state — Slice 95 (`f502db8c`) closed cert 001479 to
< 1e-4 (was +0.0006 after Slice 94). Layer 4 is GREEN.
2. Run the full sweep:
| Cert ref | Worksheet | Heating | PCDB idx | Worksheet SAP | TFA | bps | Dwelling |
|---|---|---|---|---|---|---|---|
| `0330-2249-8150-2326-4121` | 000897 | **Mains gas boiler** | 10241 | 61.5993 | 69.14 | 2 | Mid-terrace house |
| `0350-2968-2650-2796-5255` | 000903 | ASHP | 104568 | 84.1367 | 90.54 | 2 | Mid-terrace house |
| `0380-2471-3250-2596-8761` | 000899 | ASHP | 104568 | 88.5104 | 60.43 | 1 | Semi-detached bungalow |
| `2225-3062-8205-2856-7204` | 000900 | ASHP | 104568 | 88.7921 | 82.49 | 1 | End-terrace house |
| `2636-0525-2600-0401-2296` | 000901 | ASHP | 104568 | 86.2641 | 82.10 | 1 | Mid-terrace house |
| `3800-8515-0922-3398-3563` | 000898 | ASHP | 104568 | 86.1458 | 81.34 | 2 | Mid-terrace house |
| `9285-3062-0205-7766-7200` | 000902 | ASHP | 104568 | 84.1369 | 85.90 | 1 | End-terrace house |
| `9418-3062-8205-3566-7200` | 000896 | ASHP | 102421 | 84.6305 | 74.37 | 3 | End-terrace house |
| `9501-3059-8202-7356-0204` | (RR cert — newest, added late in session) | **Mains gas boiler** | 19007 | (not measured) | — | — | Top-floor flat |
**Heating-type split**:
- 2 mains gas boilers: 0330, 9501 (validated mapper territory)
- 7 ASHPs: 0350, 0380, 2225, 2636, 3800, 9285, 9418 (**brand-new
mapper territory — never validated**)
One earlier mismatch — cert 0330's folder originally held the wrong
property's Summary/worksheet (17 vs 21 Summerfield Road); the user
fixed mid-session and Summary_000897/dr87-0001-000897 now match
cert ref 0330 correctly. The other 8 were audited and match.
## Major scope discovery — Heat Pumps
7 of the 9 new certs are Air Source Heat Pumps (predominantly PCDB
index 104568, one model 102421). The mapper has never been
validated against a heat-pump cert — cohort certs + cert 001479 are
all mains-gas boilers.
**Cert 0380 (initial pilot attempt) showed catastrophic failures**:
| Path | Cascade SAP | Δ vs worksheet 88.5104 |
|---|---|---|
| Summary mapper | 18.08 | **-70.43** |
| API mapper | 70.14 | **-18.37** |
Diff: Summary identified the heat pump as an 80%-efficient boiler
(catastrophic); API correctly identified it as a heat pump with
COP=2.3 but cascade output still 18 SAP below worksheet (fabric
HLC 104 vs probably ~50 needed). The Summary mapper is
fundamentally broken on heat pumps; the API mapper is
partially-broken.
**Recommendation**: defer the heat-pump certs until the boiler
workflow is proven. Closing 7 ASHP certs is plausibly a 15-30 slice
workstream (new mapper plumbing for PCDB COP, electric tariff
costing for HW + space heating, Appendix N heat-pump efficiency
adjustments, etc.). Cert 0380 (smallest TFA bungalow, single bp)
is the pilot HP cert once boiler workflow is proven.
## Pilot status — cert 0330 (mains-gas mid-terrace boiler)
Same shape as cert 001479 (proven). API JSON staged at
`domain/sap10_calculator/rdsap/tests/fixtures/golden/
0330-2249-8150-2326-4121.json` (**uncommitted**). Summary PDF
copied to
`backend/documents_parser/tests/fixtures/Summary_000897.pdf`
(**uncommitted**).
### Cascade SAP comparison
| Path | Cascade SAP | Δ vs worksheet 61.5993 |
|---|---|---|
| Summary mapper | 62.0660 | **+0.4667** (just over 0.5) |
| API mapper | 63.7446 | **+2.1453** (≥2 SAP off) |
| Δ API↔Summary | +1.6786 | (mapper paths disagree) |
### Cascade-component diff (API vs Summary)
```
TFA: 90.56 = 90.56 ✓
storeys: 2 = 2 ✓
HLC walls: 113.535 ≈ 113.520 (Δ +0.015 — negligible)
HLC roof: 7.323 = 7.323 ✓
HLC floor: 30.705 = 30.705 ✓
HLC windows: 36.455 vs 29.741 (Δ +6.71 ← BIG)
HLC doors: 11.100 = 11.100 ✓
HLC party: 11.357 = 11.357 ✓
HLC bridge: 28.347 = 28.347 ✓
HLC total: 238.822 vs 232.093 (Δ +6.73 — all from windows)
Inf ACH: 0.7382 = 0.7382 ✓
HW kWh: 3172.65 vs 2112.00 (Δ +1060 ← BIG)
Lighting kWh: 207.92 = 207.92 ✓
Main eff: 0.8850 = 0.8850 ✓
```
Two specific gaps to investigate as separate slices:
1. **Windows HLC +6.71 W/K** — likely `glazing_type=14` (cert 0330)
not in Slice 93's `_API_GLAZING_TYPE_TO_TRANSMISSION` (only codes
3 and 13 are mapped). Same shape as cert 001479's
`glazing_type=2` issue; extending the dict should close this.
Affects multiple certs that use code 14.
2. **HW kWh +1060 (API 3172 vs Summary 2112)** — substantial
divergence in §4 hot water cascade. Needs probe of which
subsystem (occupancy N, shower outlets, electric_shower_count,
cylinder, etc.) the API mapper is reading wrong. Cert 0330
doesn't have the +0.5m upper-storey adjustment quirk cert 001479
needed (Slice 92), so different root cause likely.
(The user observed: "the mapping is very much incomplete (hence we
have some non 0 matches to elmhurst summary matches)" — non-1e-4
matches are expected and tractable.)
### 116 field-level divergences (API vs Summary)
Most are cascade-equivalent surfacing differences (Slice 91-era
descriptive strings + int/None vs explicit-bool patterns) — the
same shape `_is_excluded_path` already handles for the cohort
certs. New specific concrete diffs that DO affect the cascade:
- `sap_windows[*].window_transmission_details` — Summary has
explicit U/g/data_source; API has None for `glazing_type=14`
(cascade falls back to default U → too high)
- `sap_windows[*].frame_factor` — Summary 0.7, API None
- `sap_windows[*].window_width / window_height` — same w*h area
rounding pattern as cert 001479 (handled in Slice 95)
## Workflow recommendation for next slice queue
For each new cert (after cert 0330 pilot lands):
1. **Stage**: fetch API JSON, copy Summary PDF into fixtures
2. **Probe**: run the cascade-component diff (recreate the inline
pattern; the probe takes both `summary_epc` and `api_epc`, lowers
via `cert_to_inputs`, diffs each subsystem)
3. **Localise** the biggest cascade-component delta
4. **Fix** the mapper to close it; one fix = one slice
5. **Add Layer 4 1e-4 test** when both Summary and API paths hit
worksheet at 1e-4 (cert may pass Summary path first, then
iterate API mapper to catch up)
6. **Commit**: stage by name (`git add <path>`), cite spec page
when implementing a spec rule
### Cohort-style fixture pattern
If a cert benefits from a hand-built fixture (Layer 1), mirror the
cohort pattern at
`domain/sap10_calculator/worksheet/tests/_elmhurst_worksheet_NNNNNN.py`
— with prefix `_dr87_worksheet_NNNNNN.py` for the new Domna-tool
worksheet variant.
**WARNING (lesson from previous session)**: the cohort hand-builts
encode non-spec quirks (e.g. `has_suspended_timber_floor=False` to
mirror the worksheet's non-spec §(12) behaviour for 4 certs). Don't
blindly trust the hand-builts as spec-correct; cross-check against
the mapper's spec-inference output before committing.
## Conventions (preserved from previous handover)
- **One slice = one commit** — stage by name.
- **AAA test convention** — literal `# Arrange / # Act / # Assert`
headers in every new test.
- **`abs(diff) <= tol`** not `pytest.approx` (strict-pyright clean).
- **1e-4 worksheet tolerance** when worksheet is available; ±0.5
fallback only for API-only goldens.
- **Spec citation** in commit messages when a slice implements a
spec rule (quote RdSAP 10 / SAP 10.2/10.3 page reference).
- **Pyright net-zero per file**. Baselines (re-verify at session
start):
- `datatypes/epc/domain/mapper.py`: 33
- `domain/sap10_calculator/worksheet/heat_transmission.py`: 13
- `domain/sap10_calculator/rdsap/cert_to_inputs.py`: 35
- `datatypes/epc/domain/epc_property_data.py`: 0
## First actions for the next agent
1. Confirm HEAD: `git log --oneline -1``6dc11e4d`.
2. Re-baseline:
```bash
PYTHONPATH=/workspaces/model:/workspaces/model/packages/domain/src \
python -m pytest backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
--no-cov -q
PYTHONPATH=/workspaces/model python -m pytest \
backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
--no-cov -q
```
Expect **99 passed / 19 failed**. All 19 failures pre-existing:
9× hand-built 001479 skeleton (`test_sap_result_pin[001479-*]`),
6× cohort diff (`test_from_elmhurst_site_notes_matches_hand_built_*`),
4× cohort chain (000474/000480/000487/000490 — Elmhurst non-spec).
3. Production goal is met for cert 001479. Next work focuses on the
golden cert residual outliers (§4 above) and new (Summary + API)
cert pairs from the user. The diff-probe methodology from Slice 95
(cascade-component diff API vs Summary path; localise; fix mapper)
works for any new (Summary + API) pair — worksheet not required
when Summary path is established as canonical.
4. Don't lose sight of Layer 4: **API → SAP within 1e-4 of worksheet
continuous on cert 001479** is the production goal. **MET as of
Slice 95** — `test_api_001479_full_chain_sap_matches_worksheet_pdf_
exactly` formalises this gate.
3. Pick up cert 0330 pilot. Either continue from where I left off
(fixtures staged uncommitted, 2 specific gaps identified above)
OR pivot to a different boiler cert if 0330 turns out
problematic (cert 9501 is the other boiler — top-floor flat with
PCDB idx 19007).
4. Commit cert 0330's fixtures (API JSON + Summary PDF) as the
foundation slice before working any mapper fixes:
```bash
git add domain/sap10_calculator/rdsap/tests/fixtures/golden/0330-2249-8150-2326-4121.json
git add backend/documents_parser/tests/fixtures/Summary_000897.pdf
git commit -m "chore: stage cert 0330 fixtures (boiler pilot, worksheet SAP 61.5993)"
```
5. Add a RED Layer 2 test (Summary mapper cascade SAP at 1e-4
vs 61.5993) — establishes the failing target. Then fix the
Summary path mapper bugs slice-by-slice.
6. Once Summary path is GREEN, do the same for the API path (Layer
4). The API mapper may need additional fixes Summary doesn't
need — they're independent paths into the same `EpcPropertyData`
shape.
7. After cert 0330 lands as a clean Layer 4 1e-4 pin, repeat for
cert 9501 (the other boiler). 2 boiler certs proven is much
stronger evidence than 1.
8. Then plan the heat-pump workstream. The 7 ASHP certs share a
PCDB index (104568) so much of the fix is likely shared. Write
a follow-up handover for that workstream specifically.
The user is sourcing more cert pairs in parallel; when they arrive,
each one will surface ~3-5 mapper bugs along the same pattern as
Slices 87-95. The diagnostic methodology (diff Summary-mapper vs
API-mapper; localise by cascade component; fix the API mapper to
mirror the Summary's surfacing) works for any new (Summary + API)
pair — worksheet not required when Summary path is canonical (cert
001479 proves it is).
## Heat-pump workstream sketch (deferred)
When the user gives the go-ahead, work order:
1. **API mapper**: surface `main_heating_index_number`, set
`main_heating_category` for HPs, `main_fuel_type=29` (electric
heat pump).
2. **Cascade**: ensure `cert_to_inputs._main_heating_efficiency`
reads PCDB HP COP correctly. Investigate Table 4a/4b vs PCDB
precedence for HPs.
3. **Fuel cost**: HW + space heating on electricity tariffs
(Table 12) — check if the cascade has electric-tariff fuel-cost
plumbing wired up.
4. **Appendix N**: HP-specific efficiency adjustments (climate +
flow temperature). Likely the biggest cascade-side gap.
5. **Summary mapper**: separate slice — needs to identify HPs from
the Summary PDF's heating section.
## Open items / known gaps not yet addressed
- 8 API-only golden cert residuals still range from 0 to -15 SAP
delta (cert 0240 is the outlier — see prior handover §4 and
`test_golden_fixtures.py` notes). The user's stated end goal is
<0.5 SAP error on all goldens; cert 0240 needs RR-description
parsing (or Room-in-Roof mapping investigation) + glazing_type=2
surfacing.
- Layer 3 field-parity test
(`test_from_api_response_matches_from_elmhurst_site_notes_001479`)
still not written. Lower priority since cascade-output Layer 4
already gates parity.
- The 4 cohort chain tests for non-spec §(12) certs were deleted
this session; if the user later sources spec-compliant
worksheets for 000474/000480/000487/000490, those tests can be
restored (with the spec-correct hand-builts).
## Tooling shortcuts
- **EPC fetch**: `OPEN_EPC_API_TOKEN` (NOT `EPC_AUTH_TOKEN`) in
`backend/.env`. `EpcClientService._fetch_certificate(cert_ref)`
returns the raw JSON dict.
- **Worksheet SAP extract**: `pdftotext -layout <worksheet.pdf> -`
then `grep -E "SAP value\s+[0-9]+\.[0-9]+"`. Works for all
`dr87-`, `P960-`, and `U985-` worksheet variants.
- **Cascade-component probe template**: see the cert-0330 probe
inline above; same shape as the cert-001479 probe.
Good luck. The methodology is proven on cert 001479 and partially
on cert 0330 (boiler pilot 95% closed). Each new cert pair should
land in 1-5 mapper slices. Stage by name; one slice = one commit;
cite spec when implementing a spec rule.