mirror of
https://github.com/Hestia-Homes/Model.git
synced 2026-06-08 11:17:27 +00:00
docs: handover for per-cert mapper validation workflow
Rewrites the cert 001479 closure handover into a forward-looking
brief for the new workstream: validating the API EpcPropertyDataMapper
against 9 newly-staged (Summary + worksheet + API) cert triples.
Key contents:
- User's stated workflow (verbatim): Summary path proves itself
against the worksheet → becomes canonical reference for API parity.
- Folder-structure changes since the prior handover were written
(packages/domain/ removed; sap10_calculator + sap10_ml now at the
repo root under a PEP 420 namespace; docs/sap-spec/ moved into
domain/sap10_calculator/docs/; PCDB data into tables/pcdb/data/).
- New test data layout: `sap worksheets/Additional data with api/
<cert-ref>/{Summary_NNNNNN.pdf, dr87-0001-NNNNNN.pdf}`.
- Cert reference table with heating type, PCDB index, worksheet SAP,
TFA, bp count, dwelling type for all 9 triples.
- Major scope discovery: 7 of 9 are Air Source Heat Pumps (PCDB
104568 / 102421). The mapper has never been validated against HPs;
cert 0380 pilot showed catastrophic deltas (Summary -70 / API -18
SAP vs worksheet). Recommended deferring HP certs until boiler
workflow is proven.
- Cert 0330 (mid-terrace gas boiler) pilot status: fixtures staged
uncommitted; Summary path +0.47 SAP, API path +2.15 SAP vs
worksheet 61.5993. Cascade-component diff localises 2 specific
gaps (windows HLC +6.71 W/K likely from glazing_type=14 missing
from Slice 93's transmission map; HW kWh +1060 needs §4
subsystem probe).
- Tooling shortcut: use OPEN_EPC_API_TOKEN (not EPC_AUTH_TOKEN) in
backend/.env with EpcClientService._fetch_certificate(cert_ref)
to fetch raw JSON.
- First actions for next agent: confirm baseline, commit cert 0330
fixtures, add RED Layer 2 test, iterate.
Lesson preserved: cohort hand-builts encode non-spec quirks
(e.g. has_suspended_timber_floor=False to override §(12) spec
inference and match the non-spec worksheet). Cross-check against
spec-inferred mapper output before trusting hand-built fields.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
parent
7fba27a791
commit
c783a15ff1
1 changed files with 333 additions and 269 deletions
|
|
@@ -1,301 +1,365 @@
|
|||
# Handover — API mapper at 1e-4 on cert 001479; investigating goldens
|
||||
# Handover — Per-cert validation workflow, 9 new triples staged
|
||||
|
||||
You are picking up branch `ara-backend-design-prd`. The cert 001479 API
|
||||
path now hits the worksheet's continuous SAP 69.0094 **at < 1e-4**
|
||||
(Slice 95). Layer 4 production goal is MET. Remaining work: investigate
|
||||
golden cert residual outliers (especially cert 0240's -15 SAP) and
|
||||
process any new (Summary + API) cert pairs the user sources.
|
||||
You are picking up branch `feature/per-cert-mapper-validation`
|
||||
(off main at `7fba27a7`, where the prior `ara-backend-design-prd`
|
||||
work was merged via PR #1123). The user has shifted focus from
|
||||
"close cert 001479 to 1e-4" (done — Slice 95) to "validate the
|
||||
API mapper against more cert pairs to surface remaining mapping
|
||||
gaps". 9 new (Summary + worksheet + API) triples have been
|
||||
provided. The mapping is acknowledged-incomplete; expect many
|
||||
mapper-completion slices.
|
||||
|
||||
## The end goal (re-confirmed by the user)
|
||||
## The user's stated workflow (verbatim)
|
||||
|
||||
> **Production goal: `API JSON → EpcPropertyDataMapper.from_api_
|
||||
> response → SAP10 calculator → SAP rating` must match the SAP value
|
||||
> the calculator emitted at lodge time to within 1e-4.**
|
||||
> we pick one [cert], we then pass the Elmhurst summary document to
|
||||
> `EpcPropertyDataMapper` to map the site notes data to
|
||||
> `EpcPropertyData`, we then pass to the SAP calculator. If the
|
||||
> output of the SAP calculator matches the SAP worksheet correctly,
|
||||
> we know we have correctly mapped the EpcPropertyData. We then get
|
||||
> the API response, map to `EpcPropertyData` using
|
||||
> `EpcPropertyDataMapper`, then check if we have the same
|
||||
> `EpcPropertyData` as the summary report (or same for the fields we
|
||||
> care about). We also check we get the same result.
|
||||
>
|
||||
> The acceptance tolerance is **1e-4 against the worksheet's
|
||||
> continuous SAP value**, not ±0.5 against the published integer.
|
||||
> ±0.5 only applies when no worksheet is available (the 8 cohort
|
||||
> golden certs we have as API-only); when we have both API + worksheet
|
||||
> (cert 001479), the 1e-4 bar is the bar.
|
||||
> The `EpcPropertyData` objects matching is our signal that we've
|
||||
> done things correctly. So this validates our mapping.
|
||||
|
||||
The earlier handover stated ±0.5 — that was wrong. The user
|
||||
emphasised this twice: the calc is mechanical, identical inputs must
|
||||
produce identical outputs, so when we have the continuous worksheet
|
||||
value we should hit it exactly. See the conversation thread that led
|
||||
to Slice 87.
|
||||
Translation: Summary path proves itself against the worksheet →
|
||||
becomes the canonical reference for the API path. This is Layer 2 +
|
||||
Layer 3 + Layer 4 of the validation stack.
|
||||
|
||||
## Validation layers (current state)
|
||||
## State at session start (this handover's baseline)
|
||||
|
||||
Most recent commits (`sap10_calculator` + `sap10_ml` are now at the
|
||||
repo root; `packages/domain/src/domain/` was removed):
|
||||
|
||||
```
|
||||
Layer 4: API mapper cascade SAP = worksheet SAP at 1e-4 (production goal)
|
||||
└── Layer 3: API mapper EpcPropertyData ≡ Elmhurst mapper EpcPropertyData
|
||||
└── Layer 2: Elmhurst-mapped EpcPropertyData → cascade SAP = worksheet SAP at 1e-4
|
||||
└── Layer 1: hand-built EpcPropertyData → cascade SAP = worksheet SAP at 1e-4
|
||||
6dc11e4d fix: resolve 10 remaining test_summary_pdf_mapper_chain failures
|
||||
09fb6f1b fix: address 22 project-wide test failures from previous sweep
|
||||
a7b08a4e refactor: move docs/sap-spec/ contents into domain/sap10_calculator/
|
||||
960130b0 deleted redundant packages folder
|
||||
68401c51 refactor: lift-and-shift packages/domain/src/domain/ml → domain/sap10_ml
|
||||
29ac35cc refactor: lift-and-shift packages/domain/src/domain/sap → domain/sap10_calculator
|
||||
... (87b6045c "fixed merge conflicts from main", 168e7f18, 94975f3b deletions)
|
||||
a75052dc chore: commit cert 001479 fixture + RdSAP/PCDF spec PDFs
|
||||
f502db8c Slice 95: API mapper TFA from per-bp dims + window area 2dp rounding
|
||||
```
|
||||
|
||||
| Layer | Status |
|---|---|
| **1 — hand-built cascade pin** | ✅ 6 cohort certs (000474, 000477, 000480, 000487, 000490, 000516) GREEN at 1e-4; cert 001479 hand-built skeleton (Slice 62) still RED (2 of 11 pins green, hand-built has its own bugs — orthogonal to the production path) |
| **2 — Elmhurst-mapped path** | ✅ **Cert 001479 GREEN at 1e-4** (Slice 89); cohort: 2 GREEN (000477, 000516), 4 RED (000474, 000480, 000487, 000490 — Elmhurst U985 worksheets violate the RdSAP 10 §5 (12) spec; orthogonal to the production goal) |
| **3 — API-mapped ≡ Elmhurst-mapped (field-level)** | 🟡 Cascade outputs match at 1e-4 (Slice 95); field-level diff test not yet written but lower priority since cascade-output gate exists |
| **4 — API path cascade SAP** | ✅ **Cert 001479 GREEN at 1e-4** (Slice 95). `test_api_001479_full_chain_sap_matches_worksheet_pdf_exactly` formalises the gate. 8 other golden certs pinned at residual-from-integer at tolerance 0 |
|
||||
|
||||
## Cumulative API SAP delta progression (cert 001479)
|
||||
|
||||
The big breakthrough: implementing the RdSAP 10 §5 (12) spec rule
|
||||
(`Floor infiltration (suspended timber ground floor only)` — page 29
|
||||
of `domain/sap10_calculator/docs/specs/RdSAP 10 Specification 10-06-2025.pdf`) revealed a
|
||||
series of API-mapper coverage gaps that all needed fixing for the
|
||||
spec rule's premise to be met. Each slice closed one gap:
|
||||
|
||||
| Slice | Fix | API SAP delta |
|---|---|---|
| baseline | broken party wall enum, no descriptive strings | **+3.0752** |
| 87 | RdSAP 10 §5 (12) spec rule + Elmhurst-mapper switch to None | — |
| 88 | thread `bp.floor_construction_type` into `u_floor` cascade | — |
| 89 | PS pitched-sloping-ceiling roof area `÷ cos(30°)` (added `roof_construction_type` field on `SapBuildingPart`) | — |
| 90 | API `party_wall_construction` enum → SAP10 `u_party_wall` codes (1→3 Solid, 2→4 Cavity, etc.) | +1.5298 |
| 91 | descriptive strings via int→str lookups (`floor_construction_type`, `roof_construction_type`) + pre-1950 PS sloping → thickness=0 + per-bp roof description fix | +1.0970 |
| 92 | upper-floor `room_height_m += 0.25` + `is_exposed_floor` from `floor_heat_loss==1` + `floor_insulation_thickness="NI"→None` | +1.0022 |
| 93 | `window_transmission_details` from `glazing_type` int (code 3 → U=2.8/g=0.76, code 13 → U=1.4/g=0.72) | +1.1846 |
| 94 | `sheltered_sides` from API `built_form` + `floor_type` from `floor_heat_loss==7` | +0.0006 |
| 95 | API mapper `total_floor_area_m2` = Σ per-bp dims (worksheet-precise 68.51 not lodged-rounded 69) + RdSAP 10 §15 p.66 window 2dp area rounding in solar_gains/internal_gains | **< 1e-4** |
|
||||
|
||||
Fabric breakdown for cert 001479 API path is now COMPLETELY EXACT
|
||||
(all 6 components match worksheet to 4 d.p.):
|
||||
|
||||
| Component | Cascade | Worksheet target |
|---|---|---|
| walls | 39.7652 | 39.7652 ✓ |
| party walls | 17.0700 | 17.0700 ✓ |
| roof | 10.3438 | 10.3438 ✓ |
| floor | 23.1705 | 23.1705 ✓ |
| windows | 43.5962 | 43.5962 ✓ |
| doors | 5.5500 | 5.5500 ✓ |
| **fabric total** | **139.4957** | **139.4957 ✓** |
|
||||
|
||||
## What's left (queue, in priority order)
|
||||
|
||||
### 1. Close cert 001479's residual 0.0006 SAP gap (1-3 slices)
|
||||
|
||||
The remaining gap is non-fabric. Diff against the Summary path's
|
||||
intermediate cascade values (which lands at 1e-4 GREEN):
|
||||
Folder structure post-migration:
|
||||
|
||||
```
|
||||
Σ internal_gains_monthly_w: API 5339.27 Sum 5313.55 delta +25.72
|
||||
Σ solar_gains_monthly_w: API 5510.10 Sum 5508.60 delta +1.50
|
||||
Σ mean_internal_temp_monthly_c: API 214.87 Sum 213.51 delta +1.35
|
||||
Σ monthly_infiltration_ach: API 8.95 Sum 10.91 delta -1.96
|
||||
hot_water_kwh_per_yr: API 2365.00 Sum 2358.31 delta +6.69
|
||||
domain/ (PEP 420 namespace; no __init__.py)
|
||||
├── addresses/, postcode.py, tasks/
|
||||
├── sap10_calculator/ ← was packages/domain/src/domain/sap/
|
||||
│ ├── calculator.py, climate/, rdsap/, tables/, validation/, worksheet/
|
||||
│ ├── docs/ ← was docs/sap-spec/
|
||||
│ │ ├── HANDOVER_NEXT.md, SAP_CALCULATOR.md
|
||||
│ │ ├── NEXT_AGENT_PROMPT.md ← this file
|
||||
│ │ └── specs/ ← RdSAP 10, SAP 10.2 + 10.3, PCDF spec PDFs
|
||||
│ └── tables/pcdb/data/ ← pcdb10.dat + 7× pcdb_table_*.jsonl
|
||||
└── sap10_ml/ ← was packages/domain/src/domain/ml/
|
||||
```
|
||||
|
||||
Specifically:
|
||||
- **Infiltration is still under by ~2 ACH/year**. The (12) spec rule
|
||||
applies on both paths now (after Slice 87), so it's something else
|
||||
— possibly `has_draught_lobby` (API=None, Summary=False; cascade
|
||||
treats both as False so it shouldn't matter; verify) or `(13)
|
||||
draught_lobby_ach`. Or storey count. Probe with
|
||||
`ventilation_from_cert(api_mapped)` vs `ventilation_from_cert(sum_
|
||||
mapped)`.
|
||||
- **HW kWh +6.7** suggests a small Appendix J §1a occupancy
|
||||
difference, or a different Tcold series, or shower outlets.
|
||||
- **Internal gains +25.7 W·months** — probably a pumps_fans count or
|
||||
lighting bulb count mismatch.
|
||||
`Path(__file__).parents[N]` indices were rebased through the move
|
||||
(delta of 3); see `Dockerfile.test` (poppler-utils now installed for
|
||||
test_summary_pdf_mapper_chain.py).
|
||||
|
||||
## Test baselines you should see at HEAD `6dc11e4d`
|
||||
|
||||
Run the diff probe (the one from the conversation) to localise:
|
||||
```bash
|
||||
PYTHONPATH=/workspaces/model:/workspaces/model/packages/domain/src python -c "
|
||||
from backend.documents_parser.tests.test_summary_pdf_mapper_chain import _diff_load_bearing, _LOAD_BEARING_FIELDS, _summary_pdf_to_textract_style_pages
|
||||
from backend.documents_parser.elmhurst_extractor import ElmhurstSiteNotesExtractor
|
||||
from datatypes.epc.domain.mapper import EpcPropertyDataMapper
|
||||
import json, dataclasses
|
||||
from pathlib import Path
|
||||
|
||||
api = json.loads(Path('/workspaces/model/domain/sap10_calculator/rdsap/tests/fixtures/golden/0535-9020-6509-0821-6222.json').read_text())
|
||||
api_mapped = EpcPropertyDataMapper.from_api_response(api)
|
||||
pages = _summary_pdf_to_textract_style_pages(Path('/workspaces/model/backend/documents_parser/tests/fixtures/Summary_001479.pdf'))
|
||||
sn = ElmhurstSiteNotesExtractor(pages).extract()
|
||||
sum_mapped = EpcPropertyDataMapper.from_elmhurst_site_notes(sn)
|
||||
diffs = []
|
||||
for f in _LOAD_BEARING_FIELDS:
|
||||
diffs.extend(_diff_load_bearing(getattr(api_mapped, f, None), getattr(sum_mapped, f, None), f))
|
||||
print(f'{len(diffs)} load-bearing divergences')
|
||||
for d in diffs[:40]: print(f' {d}')
|
||||
"
|
||||
PYTHONPATH=/workspaces/model python -m pytest \
|
||||
backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
|
||||
domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
|
||||
domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
|
||||
--no-cov -q
|
||||
# Expect: 17/0 in mapper-chain + Layer 1 baseline + golden residual baseline
|
||||
```
|
||||
|
||||
(NB: the original `_diff_load_bearing` was written for cohort
|
||||
diff tests; the helper signature is `mapped, hand_built, path` — pass
|
||||
api_mapped as `mapped` and sum_mapped as `hand_built` to surface API
|
||||
gaps.)
|
||||
Wider domain sweep (1654 / 20 baseline): 9 hand-built 001479
|
||||
skeleton + 10 cohort Layer 1 pins + 1 heat_transmission edge case
|
||||
= 20 RED, all pre-existing and orthogonal to mapper work.
|
||||
|
||||
### 2. Layer 3 — write the API ≡ Elmhurst diff test (1 slice)
|
||||
**Layer 4 production gate**:
|
||||
`test_api_001479_full_chain_sap_matches_worksheet_pdf_exactly` —
|
||||
**GREEN at < 1e-4**. Keep it green.
|
||||
|
||||
Add `test_from_api_response_matches_from_elmhurst_site_notes_001479`
|
||||
in `backend/documents_parser/tests/test_summary_pdf_mapper_chain.py`,
|
||||
mirroring the cohort `test_from_elmhurst_site_notes_matches_hand_
|
||||
built_NNNNNN` pattern. Use `_diff_load_bearing` with `_LOAD_BEARING_
|
||||
FIELDS`. This formalises Layer 3 as a 1e-4 gate (zero load-bearing
|
||||
divergences between the two mapper outputs).
|
||||
## The new test data
|
||||
|
||||
This test will start RED with the residual diffs from step 1; closing
|
||||
those slices brings it to GREEN.
|
||||
Location: `sap worksheets/Additional data with api/<cert-ref>/`
|
||||
|
||||
### 3. More cert pairs (user is sourcing — pause for new data)
|
||||
Each folder is named by the GOV.UK EPB certificate number. Contains:
|
||||
|
||||
The user has agreed to source 2-3 more (Elmhurst worksheet + GOV.UK
|
||||
API JSON) pairs to validate the mapper isn't 001479-overfit.
|
||||
Suggested diversity:
|
||||
- `Summary_NNNNNN.pdf` — Elmhurst-format site notes
|
||||
- `dr87-0001-NNNNNN.pdf` — worksheet (`dr87-` prefix is a Domna-tool
|
||||
variant; same shape as the `P960-` worksheet for cert 001479)
|
||||
|
||||
- **Detached + RR** (would fix cert 0240's -14 residual which has a
|
||||
Type-1 RR the mapper doesn't extract).
|
||||
- **Mid-terrace with cavity-filled party walls** (API party_wall_
|
||||
construction=3 → spec U=0.2; currently mapped to SAP10 code 4
|
||||
which gives U=0.5; needs cascade extension at
|
||||
`u_party_wall`).
|
||||
- **Flat / maisonette** (party wall U=0 path; cert 9390 is one but
|
||||
no worksheet).
|
||||
- **Different age band** (E, J, K, L) to exercise the (12) spec
|
||||
rule's age boundaries.
|
||||
The API JSON is **not** in the folder — fetch from GOV.UK EPB using
|
||||
the cert-ref:
|
||||
|
||||
Each new pair lands as a 1e-4 cascade-pin test. Pattern: ~3-5 new
|
||||
mapper bugs per cert pair (similar to Slice 87-94 on 001479). Each
|
||||
becomes its own slice. Stage by name; one slice = one commit.
|
||||
|
||||
### 4. Investigate goldens with shifted residuals after Slices 87-95
|
||||
|
||||
Slices 87-94 shifted residuals on 7 of 10 API-only golden certs;
|
||||
Slice 95 (precise TFA + window 2dp area rounding) shifted 5 more
|
||||
(0240, 6035, 8135, 2130, 0390-2254). All residuals are re-pinned.
|
||||
Current outliers and what we now know:
|
||||
|
||||
- **0240** (-15 SAP, +17.8 PE): Detached age J + RR + 11 windows. The
|
||||
earlier handover claim of "RR mapper gap" is **partly stale**:
|
||||
- `room_in_roof_type_1.gable_wall_length_1/2` ARE extracted by the
|
||||
21.0.1 mapper (see mapper.py:1349-1369 — must have landed in
|
||||
Slices 71-86). Cert 0240's RR cascades through with floor_area=
|
||||
83.2, gables 6.4 + 6.4, age J → U_RR = 0.30 W/m²K.
|
||||
- `'Roof room(s), insulated (assumed)'` description NOT parsed —
|
||||
but the spec basis for parsing it is unclear: age J's Table 18
|
||||
col(4) default already models insulation (U=0.30), and unlike
|
||||
the regular-roof "insulated (assumed)" → 50 mm bucket rule
|
||||
(RdSAP §5.11.4), no equivalent rule for RR has been identified.
|
||||
- The -15 SAP residual is a mix, not a single RR gap. Subsystem
|
||||
breakdown for cert 0240 (via cert_to_inputs cascade):
|
||||
- walls 22.95, party 0, roof 76.93 (incl RR ~18.5), floor 29.43,
|
||||
windows 41.55, doors 11.10, bridging 39.64; total HLC 221.6 W/K
|
||||
- **windows_w_per_k = 41.55 is the most leverageable**: 11
|
||||
windows totalling 18.28 m² at U_default ≈ 2.27 W/m²K (41.55 ÷ 18.28). Cert lodges
|
||||
`glazing_type=2` for all windows but Slice 93's
|
||||
`_API_GLAZING_TYPE_TO_TRANSMISSION` only covers codes 3 and 13;
|
||||
surfacing code 2 would land a measurable U (likely ~1.8-2.0)
|
||||
and close several W/K of fabric loss.
|
||||
- Other potential gains: BP[0] non-RR ceiling lodges "Pitched,
|
||||
400+ mm loft insulation" (should U ~0.10); verify cascade
|
||||
gives it that.
|
||||
- **Net**: cert 0240 is not a single-slice fix; it's 3-5
|
||||
progressive mapper improvements (glazing_type 2 surfacing,
|
||||
possibly more glazing codes, possibly RR description nuance).
|
||||
- **0390-2954** (-6 SAP, -26.5 PE): large detached F (TFA 360), oil
|
||||
PCDB-listed. Undocumented. PE going more negative than SAP suggests
|
||||
the cost cascade is hitting harder than energy — possibly oil
|
||||
price/efficiency interaction.
|
||||
- **6035** (-6 SAP, +49.5 PE): mid-terrace age A + RR. Probably has
|
||||
the same glazing_type-default-U issue as 0240 plus an age-A-
|
||||
specific gap.
|
||||
|
||||
### 5. (deferred) Cohort chain test RED triage
|
||||
|
||||
4 cohort chain tests (000474, 000480, 000487, 000490) are RED
|
||||
because the Elmhurst U985 worksheets emit (12) values that don't
|
||||
follow RdSAP 10 §5 — see the conversation re: identical Summary §9
|
||||
lodgements producing different worksheet (12) for cohort 000477 vs
|
||||
000480. The cascade is now spec-correct; the Elmhurst tool isn't.
|
||||
Options: (a) mark as known-Elmhurst-non-spec, (b) add per-cert
|
||||
override field, (c) wait for more cert pairs to confirm pattern.
|
||||
**Not blocking the production goal.**
|
||||
|
||||
## Key conventions (project memory)
|
||||
|
||||
- **AAA test convention** — every new test uses literal `# Arrange /
|
||||
# Act / # Assert` headers.
|
||||
- **`abs(diff) <= tol`** not `pytest.approx` (strict-pyright partial-
|
||||
unknown).
|
||||
- **One slice = one commit** — stage by name (`git add <path>`).
|
||||
- **1e-4 tolerance** for the worksheet-comparable paths (Elmhurst
|
||||
Summary + API both have worksheets for cert 001479). No widening,
|
||||
no xfail.
|
||||
- **Strict pyright net-zero** per file. Baselines: `mapper.py` 33,
|
||||
`heat_transmission.py` 13, `cert_to_inputs.py` 35,
|
||||
`epc_property_data.py` 0.
|
||||
- **Spec citation in commit messages** — when a slice implements a
|
||||
spec rule, quote the spec text (RdSAP 10 page reference). User
|
||||
asked us to confirm against docs.
|
||||
|
||||
## Cached artefacts
|
||||
|
||||
- `domain/sap10_calculator/rdsap/tests/fixtures/golden/0535-
|
||||
9020-6509-0821-6222.json` — API JSON for cert 001479 (RdSAP-Schema-
|
||||
21.0.1).
|
||||
- `backend/documents_parser/tests/fixtures/Summary_001479.pdf` —
|
||||
Elmhurst site-notes PDF for cert 001479.
|
||||
- `sap worksheets/lodged example/P960-0001-001479.pdf` — Domna's
|
||||
worksheet output for cert 001479 (Continuous SAP 69.0094).
|
||||
- `sap worksheets/U985-0001-NNNNNN.pdf` × 6 — cohort Elmhurst
|
||||
worksheets (000474, 000477, 000480, 000487, 000490, 000516).
|
||||
- `sap worksheets/U985-0001-NNNNNN.txt` × 6 — text exports of above.
|
||||
|
||||
## Recent slice history (Slices 87-95, current branch)
|
||||
|
||||
```
|
||||
f502db8c Slice 95: API mapper TFA from per-bp dims + window area 2dp rounding — cert 001479 to 1e-4
|
||||
03203418 Slice 94: API mapper sheltered_sides + floor_type — cert 001479 to 1e-3
|
||||
7281b7b3 Slice 93: API mapper window_transmission_details from glazing_type
|
||||
8e752e57 Slice 92: API mapper floor dimensions (SAP +0.25m + exposed-floor + NI→None)
|
||||
2cebba28 Slice 91: API mapper descriptive strings + roof description per-bp fix
|
||||
fbbdca49 Slice 90: API mapper translates party_wall_construction → SAP10 enum
|
||||
006e9842 Slice 89: PS pitched-sloping-ceiling roof area uses inclined surface
|
||||
c40679d1 Slice 88: thread bp.floor_construction_type into u_floor cascade
|
||||
aff331ff Slice 87: implement RdSAP 10 §5 (12) spec rule for suspended timber floor
|
||||
2d3355ee Slice 86: 1:1 windows expansion in cohort 000516 (2 → 5 entries)
|
||||
f863598d Slice 85: bulk-update cohort 000516 hand-built for Cat A diff parity
|
||||
```python
|
||||
from backend.epc_client.epc_client_service import EpcClientService
|
||||
from dotenv import load_dotenv
|
||||
import os
|
||||
load_dotenv('/workspaces/model/backend/.env') # OPEN_EPC_API_TOKEN
|
||||
svc = EpcClientService(auth_token=os.environ['OPEN_EPC_API_TOKEN'])
|
||||
raw = svc._fetch_certificate('<cert-ref>') # raw JSON dict
|
||||
```
|
||||
|
||||
Earlier slice context (71-86 closed cohort Layer 2) is in the prior
|
||||
handover at commit `86eff23f` (`domain/sap10_calculator/docs/NEXT_AGENT_PROMPT.md`
|
||||
before this rewrite).
|
||||
Note: use `OPEN_EPC_API_TOKEN` not `EPC_AUTH_TOKEN` (the latter is
|
||||
for a different/legacy API).
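
Once `raw` has been fetched it needs to land next to the other golden fixtures. A minimal sketch of that step, assuming `raw` is a plain JSON-serialisable dict; `stage_golden_fixture` is a hypothetical helper (not something in the repo), and `golden_dir` is whichever golden-fixture folder you are staging into:

```python
import json
from pathlib import Path
from typing import Any


def stage_golden_fixture(raw: dict[str, Any], cert_ref: str, golden_dir: Path) -> Path:
    """Write a raw API payload to <golden_dir>/<cert_ref>.json and return the path."""
    # Hypothetical helper: creates the folder if needed, then writes one file per cert.
    golden_dir.mkdir(parents=True, exist_ok=True)
    target = golden_dir / f"{cert_ref}.json"
    # Indentation keeps later fixture diffs reviewable; key order is left as returned.
    target.write_text(json.dumps(raw, indent=2) + "\n", encoding="utf-8")
    return target
```

The file name follows the `<cert-ref>.json` pattern already used for the cert 001479 fixture, so the same loading code can read it back with `json.loads(path.read_text())`.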
|
||||
|
||||
## First action
|
||||
### 9 cert references + heating type + worksheet SAP
|
||||
|
||||
1. Confirm branch state — Slice 95 (`f502db8c`) closed cert 001479 to
|
||||
< 1e-4 (was +0.0006 after Slice 94). Layer 4 is GREEN.
|
||||
2. Run the full sweep:
|
||||
| Cert ref | Worksheet | Heating | PCDB idx | Worksheet SAP | TFA | bps | Dwelling |
|---|---|---|---|---|---|---|---|
| `0330-2249-8150-2326-4121` | 000897 | **Mains gas boiler** | 10241 | 61.5993 | 69.14 | 2 | Mid-terrace house |
| `0350-2968-2650-2796-5255` | 000903 | ASHP | 104568 | 84.1367 | 90.54 | 2 | Mid-terrace house |
| `0380-2471-3250-2596-8761` | 000899 | ASHP | 104568 | 88.5104 | 60.43 | 1 | Semi-detached bungalow |
| `2225-3062-8205-2856-7204` | 000900 | ASHP | 104568 | 88.7921 | 82.49 | 1 | End-terrace house |
| `2636-0525-2600-0401-2296` | 000901 | ASHP | 104568 | 86.2641 | 82.10 | 1 | Mid-terrace house |
| `3800-8515-0922-3398-3563` | 000898 | ASHP | 104568 | 86.1458 | 81.34 | 2 | Mid-terrace house |
| `9285-3062-0205-7766-7200` | 000902 | ASHP | 104568 | 84.1369 | 85.90 | 1 | End-terrace house |
| `9418-3062-8205-3566-7200` | 000896 | ASHP | 102421 | 84.6305 | 74.37 | 3 | End-terrace house |
| `9501-3059-8202-7356-0204` | (RR cert — newest, added late in session) | **Mains gas boiler** | 19007 | (not measured) | — | — | Top-floor flat |
|
||||
|
||||
**Heating-type split**:
|
||||
- 2 mains gas boilers: 0330, 9501 (validated mapper territory)
|
||||
- 7 ASHPs: 0350, 0380, 2225, 2636, 3800, 9285, 9418 (**brand-new
|
||||
mapper territory — never validated**)
|
||||
|
||||
One earlier mismatch — cert 0330's folder originally held the wrong
|
||||
property's Summary/worksheet (17 vs 21 Summerfield Road); the user
|
||||
fixed this mid-session, and Summary_000897/dr87-0001-000897 now match
|
||||
cert ref 0330 correctly. The other 8 were audited and match.
|
||||
|
||||
## Major scope discovery — Heat Pumps
|
||||
|
||||
7 of the 9 new certs are Air Source Heat Pumps (predominantly PCDB
|
||||
index 104568, one model 102421). The mapper has never been
|
||||
validated against a heat-pump cert — cohort certs + cert 001479 are
|
||||
all mains-gas boilers.
|
||||
|
||||
**Cert 0380 (initial pilot attempt) showed catastrophic failures**:
|
||||
|
||||
| Path | Cascade SAP | Δ vs worksheet 88.5104 |
|---|---|---|
| Summary mapper | 18.08 | **-70.43** |
| API mapper | 70.14 | **-18.37** |
|
||||
|
||||
Diff: Summary identified the heat pump as an 80%-efficient boiler
|
||||
(catastrophic); API correctly identified it as a heat pump with
|
||||
COP=2.3 but cascade output still −18 SAP below worksheet (fabric
|
||||
HLC 104 vs probably ~50 needed). The Summary mapper is
|
||||
fundamentally broken on heat pumps; the API mapper is
|
||||
partially-broken.
|
||||
|
||||
**Recommendation**: defer the heat-pump certs until the boiler
|
||||
workflow is proven. Closing 7 ASHP certs is plausibly a 15-30 slice
|
||||
workstream (new mapper plumbing for PCDB COP, electric tariff
|
||||
costing for HW + space heating, Appendix N heat-pump efficiency
|
||||
adjustments, etc.). Cert 0380 (smallest TFA bungalow, single bp)
|
||||
is the pilot HP cert once boiler workflow is proven.
|
||||
|
||||
## Pilot status — cert 0330 (mains-gas mid-terrace boiler)
|
||||
|
||||
Same shape as cert 001479 (proven). API JSON staged at
|
||||
`domain/sap10_calculator/rdsap/tests/fixtures/golden/
|
||||
0330-2249-8150-2326-4121.json` (**uncommitted**). Summary PDF
|
||||
copied to
|
||||
`backend/documents_parser/tests/fixtures/Summary_000897.pdf`
|
||||
(**uncommitted**).
|
||||
|
||||
### Cascade SAP comparison
|
||||
|
||||
| Path | Cascade SAP | Δ vs worksheet 61.5993 |
|---|---|---|
| Summary mapper | 62.0660 | **+0.4667** (just under 0.5, far outside 1e-4) |
| API mapper | 63.7446 | **+2.1453** (≥2 SAP off) |
| Δ API↔Summary | +1.6786 | (mapper paths disagree) |
|
||||
|
||||
### Cascade-component diff (API vs Summary)
|
||||
|
||||
```
TFA:          90.56    =  90.56    ✓
storeys:      2        =  2        ✓
HLC walls:    113.535  ≈  113.520  (Δ +0.015 — negligible)
HLC roof:     7.323    =  7.323    ✓
HLC floor:    30.705   =  30.705   ✓
HLC windows:  36.455   vs 29.741   (Δ +6.71 ← BIG)
HLC doors:    11.100   =  11.100   ✓
HLC party:    11.357   =  11.357   ✓
HLC bridge:   28.347   =  28.347   ✓
HLC total:    238.822  vs 232.093  (Δ +6.73 — all from windows)
Inf ACH:      0.7382   =  0.7382   ✓
HW kWh:       3172.65  vs 2112.00  (Δ +1060 ← BIG)
Lighting kWh: 207.92   =  207.92   ✓
Main eff:     0.8850   =  0.8850   ✓
```
|
||||
|
||||
Two specific gaps to investigate as separate slices:
|
||||
|
||||
1. **Windows HLC +6.71 W/K** — likely `glazing_type=14` (cert 0330)
|
||||
not in Slice 93's `_API_GLAZING_TYPE_TO_TRANSMISSION` (only codes
|
||||
3 and 13 are mapped). Same shape as cert 0240's
|
||||
`glazing_type=2` issue; extending the dict should close this.
|
||||
Affects multiple certs that use code 14.
|
||||
|
||||
2. **HW kWh +1060 (API 3172 vs Summary 2112)** — substantial
|
||||
divergence in §4 hot water cascade. Needs probe of which
|
||||
subsystem (occupancy N, shower outlets, electric_shower_count,
|
||||
cylinder, etc.) the API mapper is reading wrong. Cert 0330
|
||||
doesn't have the +0.25 m upper-floor room-height quirk cert 001479
|
||||
needed (Slice 92), so different root cause likely.
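
Before extending the glazing lookup, it helps to list which lodged codes currently fall through to the default U. A minimal sketch, assuming a plain int-keyed dict: the real structure of `_API_GLAZING_TYPE_TO_TRANSMISSION` is not shown in this handover, so `GLAZING_TYPE_TO_TRANSMISSION`, `transmission_for` and `unmapped_glazing_codes` are hypothetical stand-ins. Only the two (U, g) pairs recorded for Slice 93 are used; no values for codes 2 or 14 are assumed.

```python
from typing import Iterable, Optional

# Illustrative stand-in: (U-value in W/m²K, g-value) pairs recorded for Slice 93.
GLAZING_TYPE_TO_TRANSMISSION: dict[int, tuple[float, float]] = {
    3: (2.8, 0.76),
    13: (1.4, 0.72),
}


def transmission_for(glazing_type: int) -> Optional[tuple[float, float]]:
    """Return (U, g) for a mapped code, or None so the caller can fall back."""
    return GLAZING_TYPE_TO_TRANSMISSION.get(glazing_type)


def unmapped_glazing_codes(codes: Iterable[int]) -> list[int]:
    """Return the distinct lodged codes that have no transmission entry, sorted."""
    return sorted({c for c in codes if c not in GLAZING_TYPE_TO_TRANSMISSION})
```

Running `unmapped_glazing_codes` over every window's `glazing_type` in a staged fixture gives the list of codes that need a spec-sourced entry before the windows HLC can close.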
|
||||
|
||||
(The user observed: "the mapping is very much incomplete (hence we
|
||||
have some non 0 matches to elmhurst summary matches)" — non-1e-4
|
||||
matches are expected and tractable.)
|
||||
|
||||
### 116 field-level divergences (API vs Summary)
|
||||
|
||||
Most are cascade-equivalent surfacing differences (Slice 91-era
|
||||
descriptive strings + int/None vs explicit-bool patterns) — the
|
||||
same shape `_is_excluded_path` already handles for the cohort
|
||||
certs. New specific concrete diffs that DO affect the cascade:
|
||||
|
||||
- `sap_windows[*].window_transmission_details` — Summary has
|
||||
explicit U/g/data_source; API has None for `glazing_type=14`
|
||||
(cascade falls back to default U → too high)
|
||||
- `sap_windows[*].frame_factor` — Summary 0.7, API None
|
||||
- `sap_windows[*].window_width / window_height` — same w*h area
|
||||
rounding pattern as cert 001479 (handled in Slice 95)
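
A field-level comparison of this kind can be sketched as a recursive walk over the two mapped objects. This is an illustration only: `diff_fields` is a hypothetical function and is not the repo's `_diff_load_bearing`, it has no exclusion list like `_is_excluded_path`, and the `Window` / `Epc` classes are toy stand-ins for the real `EpcPropertyData` shape.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Optional


def diff_fields(a: Any, b: Any, path: str = "") -> list[str]:
    """Return one 'path: left != right' entry per leaf where a and b disagree."""
    if is_dataclass(a) and is_dataclass(b) and type(a) is type(b):
        out: list[str] = []
        for f in fields(a):
            child = f"{path}.{f.name}" if path else f.name
            out.extend(diff_fields(getattr(a, f.name), getattr(b, f.name), child))
        return out
    if isinstance(a, list) and isinstance(b, list):
        if len(a) != len(b):
            return [f"{path}: length {len(a)} != {len(b)}"]
        out = []
        for i, (x, y) in enumerate(zip(a, b)):
            out.extend(diff_fields(x, y, f"{path}[{i}]"))
        return out
    # Leaf values (or mismatched types) are compared directly.
    return [] if a == b else [f"{path}: {a!r} != {b!r}"]


@dataclass
class Window:  # toy stand-in for one sap_windows entry
    frame_factor: Optional[float]
    glazing_type: int


@dataclass
class Epc:  # toy stand-in for EpcPropertyData
    sap_windows: list[Window]


summary_epc = Epc(sap_windows=[Window(frame_factor=0.7, glazing_type=14)])
api_epc = Epc(sap_windows=[Window(frame_factor=None, glazing_type=14)])
for line in diff_fields(summary_epc, api_epc):
    print(line)
```

Emitting dotted paths such as `sap_windows[0].frame_factor` keeps the output comparable with the bullet list above, and makes it easy to filter out the cascade-equivalent differences by path prefix.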
|
||||
|
||||
## Workflow recommendation for next slice queue
|
||||
|
||||
For each new cert (after cert 0330 pilot lands):
|
||||
|
||||
1. **Stage**: fetch API JSON, copy Summary PDF into fixtures
|
||||
2. **Probe**: run the cascade-component diff (recreate the inline
|
||||
pattern; the probe takes both `summary_epc` and `api_epc`, lowers
|
||||
via `cert_to_inputs`, diffs each subsystem)
|
||||
3. **Localise** the biggest cascade-component delta
|
||||
4. **Fix** the mapper to close it; one fix = one slice
|
||||
5. **Add Layer 4 1e-4 test** when both Summary and API paths hit
|
||||
worksheet at 1e-4 (cert may pass Summary path first, then
|
||||
iterate API mapper to catch up)
|
||||
6. **Commit**: stage by name (`git add <path>`), cite spec page
|
||||
when implementing a spec rule
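
The probe and localise steps reduce to ranking per-component deltas between the two paths. A minimal sketch, assuming the subsystem values have already been pulled out into plain dicts; `component_deltas` and the key names are hypothetical, and the real probe lowers both `EpcPropertyData` objects via `cert_to_inputs` first. The figures are copied from the cert 0330 cascade-component diff.

```python
def component_deltas(
    api: dict[str, float], summary: dict[str, float], tol: float = 1e-4
) -> list[tuple[str, float]]:
    """Return (component, api - summary) pairs beyond tol, largest |delta| first."""
    shared = api.keys() & summary.keys()
    deltas = [(name, api[name] - summary[name]) for name in shared]
    flagged = [(name, d) for name, d in deltas if abs(d) > tol]
    return sorted(flagged, key=lambda item: abs(item[1]), reverse=True)


# Values from the cert 0330 pilot; key names are illustrative only.
api_path = {"hlc_walls": 113.535, "hlc_windows": 36.455, "hlc_roof": 7.323, "hw_kwh": 3172.65}
summary_path = {"hlc_walls": 113.520, "hlc_windows": 29.741, "hlc_roof": 7.323, "hw_kwh": 2112.00}

for name, delta in component_deltas(api_path, summary_path):
    print(f"{name:12s} {delta:+.3f}")
```

With these inputs the hot-water delta ranks first, then windows, then the small walls difference; the roof is dropped because it matches within tolerance. Ranking by raw magnitude mixes units (kWh and W/K), so treat the order as a prompt for where to look rather than a measure of SAP impact.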
|
||||
|
||||
### Cohort-style fixture pattern
|
||||
|
||||
If a cert benefits from a hand-built fixture (Layer 1), mirror the
|
||||
cohort pattern at
|
||||
`domain/sap10_calculator/worksheet/tests/_elmhurst_worksheet_NNNNNN.py`
|
||||
— with prefix `_dr87_worksheet_NNNNNN.py` for the new Domna-tool
|
||||
worksheet variant.
|
||||
|
||||
**WARNING (lesson from previous session)**: the cohort hand-builts
|
||||
encode non-spec quirks (e.g. `has_suspended_timber_floor=False` to
|
||||
mirror the worksheet's non-spec §(12) behaviour for 4 certs). Don't
|
||||
blindly trust the hand-builts as spec-correct; cross-check against
|
||||
the mapper's spec-inference output before committing.
|
||||
|
||||
## Conventions (preserved from previous handover)
|
||||
|
||||
- **One slice = one commit** — stage by name.
|
||||
- **AAA test convention** — literal `# Arrange / # Act / # Assert`
|
||||
headers in every new test.
|
||||
- **`abs(diff) <= tol`** not `pytest.approx` (strict-pyright clean).
|
||||
- **1e-4 worksheet tolerance** when worksheet is available; ±0.5
|
||||
fallback only for API-only goldens.
|
||||
- **Spec citation** in commit messages when a slice implements a
|
||||
spec rule (quote RdSAP 10 / SAP 10.2/10.3 page reference).
|
||||
- **Pyright net-zero per file**. Baselines (re-verify at session
|
||||
start):
|
||||
- `datatypes/epc/domain/mapper.py`: 33
|
||||
- `domain/sap10_calculator/worksheet/heat_transmission.py`: 13
|
||||
- `domain/sap10_calculator/rdsap/cert_to_inputs.py`: 35
|
||||
- `datatypes/epc/domain/epc_property_data.py`: 0
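
The AAA and `abs(diff) <= tol` conventions combine as in the sketch below. `sap_within_tol` and the test name are hypothetical; the two SAP figures are the cert 0330 worksheet value and Summary-path cascade value from the pilot table, hard-coded here so the example runs without the repo.

```python
TOL = 1e-4  # worksheet tolerance used when a continuous worksheet SAP exists


def sap_within_tol(cascade_sap: float, worksheet_sap: float, tol: float = TOL) -> bool:
    """Plain absolute comparison, used in place of pytest.approx."""
    diff = cascade_sap - worksheet_sap
    return abs(diff) <= tol


def test_sap_within_tol_rejects_cert_0330_summary_pilot() -> None:
    # Arrange
    worksheet_sap = 61.5993
    cascade_sap = 62.0660

    # Act
    ok = sap_within_tol(cascade_sap, worksheet_sap)

    # Assert
    assert not ok  # +0.4667 is well outside 1e-4, so this gate is still RED
```

In a real Layer 2 or Layer 4 test the Act step would run the mapper and calculator rather than use a literal, and the assertion would be `abs(cascade_sap - worksheet_sap) <= 1e-4` directly.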
|
||||
|
||||
## First actions for the next agent

1. Confirm HEAD: `git log --oneline -1` → `6dc11e4d`.
2. Re-baseline:
   ```bash
   PYTHONPATH=/workspaces/model python -m pytest \
     backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
     domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
     domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
     --no-cov -q
   ```
   Expect **99 passed / 19 failed**. All 19 failures pre-existing:
   9× hand-built 001479 skeleton (`test_sap_result_pin[001479-*]`),
   6× cohort diff (`test_from_elmhurst_site_notes_matches_hand_built_*`),
   4× cohort chain (000474/000480/000487/000490; Elmhurst non-spec).
3. Production goal is met for cert 001479. Next work focuses on the
   golden cert residual outliers (§4 above) and new (Summary + API)
   cert pairs from the user. The diff-probe methodology from Slice 95
   (cascade-component diff API vs Summary path; localise; fix mapper)
   works for any new (Summary + API) pair; the worksheet is not
   required when the Summary path is established as canonical.
4. Don't lose sight of Layer 4: **API → SAP within 1e-4 of worksheet
   continuous on cert 001479** is the production goal. **MET as of
   Slice 95**:
   `test_api_001479_full_chain_sap_matches_worksheet_pdf_exactly`
   formalises this gate.
5. Pick up the cert 0330 pilot. Either continue from where I left off
   (fixtures staged uncommitted, 2 specific gaps identified above)
   OR pivot to a different boiler cert if 0330 turns out
   problematic (cert 9501 is the other boiler: top-floor flat with
   PCDB idx 19007).
6. Commit cert 0330's fixtures (API JSON + Summary PDF) as the
   foundation slice before working any mapper fixes:
   ```bash
   git add domain/sap10_calculator/rdsap/tests/fixtures/golden/0330-2249-8150-2326-4121.json
   git add backend/documents_parser/tests/fixtures/Summary_000897.pdf
   git commit -m "chore: stage cert 0330 fixtures (boiler pilot, worksheet SAP 61.5993)"
   ```
7. Add a RED Layer 2 test (Summary mapper cascade SAP at 1e-4
   vs 61.5993) to establish the failing target. Then fix the
   Summary path mapper bugs slice-by-slice.
8. Once the Summary path is GREEN, do the same for the API path
   (Layer 4). The API mapper may need additional fixes that the
   Summary mapper doesn't; they're independent paths into the same
   `EpcPropertyData` shape.
9. After cert 0330 lands as a clean Layer 4 1e-4 pin, repeat for
   cert 9501 (the other boiler). 2 boiler certs proven is much
   stronger evidence than 1.
10. Then plan the heat-pump workstream. The 7 ASHP certs share a
    PCDB index (104568) so much of the fix is likely shared. Write
    a follow-up handover for that workstream specifically.

The user is sourcing more cert pairs in parallel; when they arrive,
each one is likely to surface ~3-5 mapper bugs along the same pattern
as Slices 87-95. The diagnostic methodology (diff Summary-mapper vs
API-mapper; localise by cascade component; fix the API mapper to
mirror the Summary's surfacing) works for any new (Summary + API)
pair; the worksheet is not required when the Summary path is
canonical (cert 001479 proves it is).

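The "localise by cascade component" step can be sketched as a ranked diff over per-component outputs from the two paths. This is an illustrative sketch rather than the probe used in Slice 95: `rank_component_deltas` and the component names are hypothetical, and the absolute values are made up so that the deltas echo the two cert 0330 gaps (+6.71 W/K windows HLC, +1060 HW kWh).

```python
def rank_component_deltas(
    summary: dict[str, float],
    api: dict[str, float],
    tol: float = 1e-4,
) -> list[tuple[str, float, float]]:
    """Return (component, api - summary, relative delta) for components
    that differ by more than tol, largest relative delta first."""
    rows: list[tuple[str, float, float]] = []
    for name in sorted(summary.keys() & api.keys()):
        delta = api[name] - summary[name]
        if abs(delta) <= tol:
            continue
        base = abs(summary[name])
        relative = abs(delta) / base if base > 0 else float("inf")
        rows.append((name, delta, relative))
    # Components carry different units (W/K, kWh), so rank on relative size.
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


# Illustrative values only; component names are assumptions.
summary_path = {"windows_hlc_w_per_k": 30.0, "hot_water_kwh": 2000.0, "walls_hlc_w_per_k": 45.0}
api_path = {"windows_hlc_w_per_k": 36.71, "hot_water_kwh": 3060.0, "walls_hlc_w_per_k": 45.0}

for name, delta, relative in rank_component_deltas(summary_path, api_path):
    print(f"{name}: {delta:+.2f} ({relative:.1%})")
```

Components that match within tolerance drop out, so the output is a short worklist of where the API mapper diverges from the Summary path.
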
## Heat-pump workstream sketch (deferred)

When the user gives the go-ahead, work order:

1. **API mapper**: surface `main_heating_index_number`, set
   `main_heating_category` for HPs, `main_fuel_type=29` (electric
   heat pump).
2. **Cascade**: ensure `cert_to_inputs._main_heating_efficiency`
   reads PCDB HP COP correctly. Investigate Table 4a/4b vs PCDB
   precedence for HPs.
3. **Fuel cost**: HW + space heating on electricity tariffs
   (Table 12); check if the cascade has electric-tariff fuel-cost
   plumbing wired up.
4. **Appendix N**: HP-specific efficiency adjustments (climate +
   flow temperature). Likely the biggest cascade-side gap.
5. **Summary mapper**: separate slice; needs to identify HPs from
   the Summary PDF's heating section.

## Open items / known gaps not yet addressed

- 8 API-only golden cert residuals still range from 0 to -15 SAP
  delta (cert 0240 is the outlier; see prior handover §4 and
  `test_golden_fixtures.py` notes). The user's stated end goal is
  <0.5 SAP error on all goldens; cert 0240 needs RR-description
  parsing (or Room-in-Roof mapping investigation) + glazing_type=2
  surfacing.
- Layer 3 field-parity test
  (`test_from_api_response_matches_from_elmhurst_site_notes_001479`)
  still not written. Lower priority since cascade-output Layer 4
  already gates parity.
- The 4 cohort chain tests for non-spec §(12) certs were deleted
  this session; if the user later sources spec-compliant
  worksheets for 000474/000480/000487/000490, those tests can be
  restored (with the spec-correct hand-builts).

## Tooling shortcuts

- **EPC fetch**: `OPEN_EPC_API_TOKEN` (NOT `EPC_AUTH_TOKEN`) in
  `backend/.env`. `EpcClientService._fetch_certificate(cert_ref)`
  returns the raw JSON dict.
- **Worksheet SAP extract**: `pdftotext -layout <worksheet.pdf> -`
  then `grep -E "SAP value\s+[0-9]+\.[0-9]+"`. Works for all
  `dr87-`, `P960-`, and `U985-` worksheet variants.
- **Cascade-component probe template**: see the cert-0330 probe
  inline above; same shape as the cert-001479 probe.

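For use inside a test or probe script, the worksheet SAP extract above can be sketched in Python with the same `pdftotext -layout` call and the same regular expression. The helper names are hypothetical, `pdftotext` is assumed to be on `PATH`, and the sample text is an invented layout used only to exercise the regex.

```python
import re
import subprocess

# Same pattern as the grep shortcut, with a capture group for the number.
SAP_VALUE_RE = re.compile(r"SAP value\s+([0-9]+\.[0-9]+)")


def extract_sap_values(layout_text: str) -> list[float]:
    """Return every 'SAP value' figure found in pdftotext -layout output."""
    return [float(match) for match in SAP_VALUE_RE.findall(layout_text)]


def worksheet_sap_values(pdf_path: str) -> list[float]:
    """Run pdftotext on a worksheet PDF and extract its SAP value lines."""
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        check=True,
        capture_output=True,
        text=True,
    )
    return extract_sap_values(result.stdout)


# Invented sample text; real worksheets will have a different layout.
sample = "Energy cost factor        1.23\nSAP value                 61.5993\n"
print(extract_sap_values(sample))
# → [61.5993]
```

Returning a list rather than a single float keeps the helper honest if a worksheet prints the value more than once; the caller decides which occurrence to pin.
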
Good luck. The methodology is proven on cert 001479 and partially
on cert 0330 (boiler pilot 95% closed). Each new cert pair should
land in 1-5 mapper slices. Stage by name; one slice = one commit;
cite spec when implementing a spec rule.