From 92fc4f4f16c8b1ea7c74c3fadbd6540a89004246 Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar
Date: Wed, 27 May 2026 22:22:13 +0000
Subject: [PATCH] =?UTF-8?q?docs:=20handover=20=E2=80=94=20Summary=20+=20AP?=
 =?UTF-8?q?I=20cohort=20expansion=20to=2038=20additional=20certs?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Hands off the next workstream: the 38 cert subdirs at
`sap worksheets/additional with api 2/`. Each subdir is named after the
20-digit EPC cert reference and contains a Summary PDF + dr87 worksheet
PDF. API JSONs are NOT in the dataset but ARE fetchable via the existing
`EpcClientService` (token in `backend/.env` as `OPEN_EPC_API_TOKEN`).

User's stated ordering: Elmhurst Summary mapping FIRST, API path SECOND.
Folder names = cert refs; need to verify the matching before bulk-pinning
(any mis-filed PDF would silently invalidate slice work).

Handover ships with verified dataset and first-attempt baselines:

- Folder-vs-cert sweep: **38/38 match** at handover (postcode parity
  check between Summary PDF and Open EPC API).
- First-attempt Summary-path probe across 38 certs:
    24 ✅ closed at ±0.07 (first-try, zero new slices needed)
     9 ~ small gap (<1 SAP) — likely 1 slice each
     3 ✗ big gap (>1 SAP) — multi-slice investigation
     2 RAISES UnmappedElmhurstLabel: cylinder_size='Normal'

The two `Normal` cylinder raises are the immediate Phase 1 slice —
Slice S0380.15's strict-enum pattern paid off on its first new cohort by
surfacing the gap at extraction time instead of as a downstream SAP delta.

Workstream phases documented in the handover:

  Phase 0: folder-vs-cert sweep (already done — 38/38)
  Phase 1: fix 'Normal' cylinder unmapped-label raise
  Phase 2: bulk-pin the 24 first-try-closures as chain tests
  Phase 3: close the 9 small-gap certs one slice each
  Phase 4: investigate the 3 big-gap certs (likely HP-routing)
  Phase 5: fetch + persist API JSON for all 38, run API path tests
  Phase 6: cross-mapper EPC parity (Summary EPC ≡ API EPC) — the
           user's stated north-star

Includes:
- Paste-able diagnostic probe scripts (Summary path + folder-vs-cert
  sweep + .env loader + EpcClientService usage example).
- Full table of first-attempt deltas per cert with classifications.
- All 15 prior-session slice commits indexed.
- Memory references to the slicing / methodology conventions.
- Per-cert diagnostic recipe template.

Co-Authored-By: Claude Opus 4.7
---
 .../docs/HANDOVER_38_CERT_COHORT_EXPANSION.md | 448 ++++++++++++++++++
 1 file changed, 448 insertions(+)
 create mode 100644 domain/sap10_calculator/docs/HANDOVER_38_CERT_COHORT_EXPANSION.md

diff --git a/domain/sap10_calculator/docs/HANDOVER_38_CERT_COHORT_EXPANSION.md b/domain/sap10_calculator/docs/HANDOVER_38_CERT_COHORT_EXPANSION.md
new file mode 100644
index 00000000..a5509106
--- /dev/null
+++ b/domain/sap10_calculator/docs/HANDOVER_38_CERT_COHORT_EXPANSION.md
@@ -0,0 +1,448 @@
+# Handover — Summary + API cohort expansion to 38 additional certs
+
+Branch `feature/per-cert-mapper-validation`. Previous session shipped 15 slices
+(S0380.1 → S0380.15) closing the 7-cert ASHP cohort Summary path at the ±0.07
+Appendix N3.6 PSR-precision floor and establishing the strict-enum pattern.
+This handover opens the **38-cert cohort expansion** workstream.
+
+**HEAD at handover start:** `d7ca179e` (Slice S0380.15: strict-enum raising
+on unmapped cylinder labels).
+
+## User's stated goal (preserved verbatim)
+
+> Awesome - could you write a handover for a new agent to pick this up.
+> I've added some more test cases, in the same format, in here:
+> `sap worksheets/additional with api 2`
+> We should check that the Elmhurst mapping works and then the api
+
+> the folder name is the certificate number. We can use the EPC api to get
+> the api responses. We should check I've matched correctly. The api token
+> is in backend/.env and is OPEN_EPC_API_TOKEN
+
+**Ordering:** Elmhurst Summary mapping FIRST (Summary PDFs + dr87 worksheets
+ship in each folder), API path SECOND (fetched live via `EpcClientService`).
+Along the way: **verify the folder name actually matches the cert** (the full
+38-cert postcode-parity sweep reported below came back clean; re-run it if the
+dataset changes, before mapping work compounds errors on a mis-filed cert).
+
+## The new dataset
+
+`/workspaces/model/sap worksheets/additional with api 2/` — 38 cert subdirs.
+Each subdir is named after the **20-digit EPC certificate reference** (e.g.
+`0036-6325-1100-0063-1226`) and contains:
+
+  - `Summary_NNNNNN.pdf` — Elmhurst Summary PDF (drives the Summary path)
+  - `dr87-0001-NNNNNN.pdf` — dr87 worksheet PDF (spec anchor; lodges
+    `SAP value` + every cascade line ref)
+
+The 6-digit suffix is the Elmhurst worksheet number, NOT the cert ref.
+
+**Folder-name verification — full 38-cert sweep at handover time: 38/38 ✅**
+All postcode-extracted-from-Summary-PDF values match the Open EPC API
+postcode for the folder-name cert reference. Dataset is clean.
+
+(Caveat: the sweep iterator picked up a `.DS_Store` macOS metadata file.
+Skip non-directory entries in your iterators: `for cd in sorted(p for p in src.iterdir() if p.is_dir() and not p.name.startswith('.')):`.)
+
+## First-attempt Summary-path probe (run at HEAD `d7ca179e`)
+
+24 of 38 certs (63%) close first-try at ±0.07 — strong validation that the
+ASHP-cohort mapper work amortizes. Distribution:
+
+| Status | Count | Disposition |
+|---|---|---|
+| ✅ Closed at ±0.07 | **24** | Add chain tests; zero new slices needed |
+| ~ Small gap (<1 SAP) | 9 | 1–2 slices each, similar to certs 0350 / 2225 |
+| ✗ Big gap (>1 SAP) | 3 | Multi-slice investigation per cert |
+| RAISES UnmappedElmhurstLabel | **2** | First strict-enum catches — fix immediately |
+
+### Detailed first-attempt Summary deltas
+
+```
+cert                        WS SAP    Summary     delta  result
+  0036-6325-1100-0063-1226  62.7471   62.3734   -0.3737  ~ small
+  0100-5141-0522-4696-3463  85.8332   85.8668   +0.0336  ✅
+  0200-3155-0122-2602-3563  80.8674   80.8674   -0.0000  ✅
+  0300-2403-2650-2206-0235  76.6541   76.6541   +0.0000  ✅
+  0310-2763-5450-2506-3501  78.3593   77.6061   -0.7532  ~ small
+  0320-2126-2150-2326-6161  71.7224   71.7224   +0.0000  ✅
+  0320-2756-8640-2296-1101  89.9458   89.9879   +0.0421  ✅
+  0330-2257-3640-2196-3145  84.6541   84.6966   +0.0425  ✅
+  0360-2266-5650-2106-8285  80.4680   80.4680   +0.0000  ✅
+  0380-2530-6150-2326-4161  65.7795   65.7795   +0.0000  ✅
+  0390-2066-4250-2026-4555  65.3253   64.9942   -0.3311  ~ small
+  0464-3032-0205-4276-3204  80.4533   79.9249   -0.5284  ~ small
+  0652-3022-1205-2826-1200  70.9577   72.8813   +1.9236  ✗ big
+  1536-9325-5100-0433-1226  65.8928   65.8928   -0.0000  ✅
+  2007-3011-9205-8136-3204  68.3914   68.3914   -0.0000  ✅
+  2031-3007-0205-1296-3204  64.1734   64.1734   +0.0000  ✅
+  2102-3018-0205-7886-5204  63.8732   48.0657  -15.8075  ✗ big (HW or HP?)
+  2130-3018-4205-4686-5204  71.3158   71.3158   +0.0000  ✅
+  2336-3124-3600-0517-1292  83.4955   83.5381   +0.0426  ✅
+  2536-2525-0600-0788-2292  79.7264   RAISES Unmapped: cylinder_size='Normal'
+  2590-3025-7205-9066-0200  65.9194   65.9194   -0.0000  ✅
+  2699-3025-5205-8066-0200  68.7535   68.7535   +0.0000  ✅
+  2800-7999-0322-4594-3563  78.1408   78.1665   +0.0257  ✅
+  3136-7925-4500-0246-6202  77.8872   77.1341   -0.7531  ~ small
+  3336-2825-9400-0512-8292  78.3739   78.4413   +0.0674  ✅
+  4536-5424-8600-0109-1226  82.4974   82.5412   +0.0438  ✅
+  4536-8325-3100-0409-1222  65.6000   65.1680   -0.4320  ~ small
+  4800-3992-0422-0599-3563  86.7192   86.7688   +0.0496  ✅
+  6835-3920-2509-0933-5226  80.1977   65.6387  -14.5590  ✗ big (HW or HP?)
+  7700-3362-0922-7022-3563  63.4425   63.0024   -0.4401  ~ small
+  7800-1501-0922-7127-3563  64.7504   64.5072   -0.2432  ~ small
+  7836-3125-0600-0526-2202  80.1792   80.1389   -0.0403  ✅
+  9036-0824-3500-0420-8222  84.2727   84.3227   +0.0500  ✅
+  9370-3060-1205-3546-4204  87.8687   87.8946   +0.0259  ✅
+  9380-2957-7490-2595-3141  74.5902   74.6175   +0.0273  ✅
+  9421-3045-3205-1646-6200  87.4495   RAISES Unmapped: cylinder_size='Normal'
+  9796-3058-6205-0346-9200  90.1318   90.6983   +0.5665  ~ small
+  9836-7525-9500-0575-1202  75.2223   75.2203   -0.0020  ✅
+```
+
+Run the probe yourself to confirm the baseline before slicing — script in
+"Diagnostic probe script" below.
+
+## API path is fetchable, not deferred
+
+The Open EPC API is reachable via the existing client
+[`backend/epc_client/epc_client_service.py`](../../../backend/epc_client/epc_client_service.py).
+Token sits in `backend/.env` as `OPEN_EPC_API_TOKEN`. Minimal example
+(confirmed working at handover time):
+
+```python
+import os
+from pathlib import Path
+# Load .env (no python-dotenv assumption — manual parse works)
+for line in Path('/workspaces/model/backend/.env').read_text().splitlines():
+    line = line.strip()
+    if not line or line.startswith('#') or '=' not in line: continue
+    k, v = line.split('=', 1)
+    os.environ[k.strip()] = v.strip().strip('"').strip("'")
+
+from backend.epc_client.epc_client_service import EpcClientService
+svc = EpcClientService(auth_token=os.environ["OPEN_EPC_API_TOKEN"])
+
+# Returns the raw API JSON dict (the same shape that
+# `EpcPropertyDataMapper.from_api_response` consumes):
+raw_json = svc._fetch_certificate("0036-6325-1100-0063-1226")
+
+# Or skip straight to the mapped EPC:
+epc = svc.get_by_certificate_number("0036-6325-1100-0063-1226")
+```
+
+For the 38-cert sweep, persist the raw JSON to disk so future runs are
+offline + deterministic:
+
+```bash
+mkdir -p /workspaces/model/domain/sap10_calculator/rdsap/tests/fixtures/golden
+# write each `raw_json` to .json — matches the existing
+# golden/.json convention used by the 7-cert ASHP cohort.
+```
+
+Rate-limit caveat: the client raises `EpcRateLimitError` with a
+`retry_after` hint on HTTP 429. The existing `call_with_retry` wrapper at
+`backend/epc_client/_retry.py` handles backoff. Be polite — sleep 0.5s
+between fetches on the bulk sweep.
+
+## Recommended workstream order
+
+### Phase 0 — Folder-vs-cert sweep (already done at handover time — clean)
+
+Already run at handover: **38/38 match**. Re-run if the dataset has
+changed since handover. Fail loudly on any new mismatch. If mismatches
+exist, audit the cert dir (likely a typo'd folder name or a misplaced
+PDF) before sinking slice work into a wrong-cert mapping.
+
+```python
+# (uses the .env loader + svc from above)
+import re
+from pathlib import Path
+src = Path('/workspaces/model/sap worksheets/additional with api 2')
+from backend.documents_parser.tests.test_summary_pdf_mapper_chain import _summary_pdf_to_textract_style_pages
+mismatches = []
+for cd in sorted(p for p in src.iterdir() if p.is_dir() and not p.name.startswith('.')):
+    cert_ref = cd.name
+    sp = next(cd.glob("Summary_*.pdf"), None)
+    if sp is None:
+        mismatches.append((cert_ref, "no Summary PDF"))
+        continue
+    text = "\n".join(_summary_pdf_to_textract_style_pages(sp))
+    m = re.search(r"\b([A-Z]{1,2}[0-9][0-9A-Z]?\s?[0-9][A-Z]{2})\b", text)
+    pdf_pc = (m.group(1) if m else "").replace(" ","").upper()
+    try:
+        api_pc = (svc._fetch_certificate(cert_ref).get("postcode","") or "").replace(" ","").upper()
+        if pdf_pc != api_pc:
+            mismatches.append((cert_ref, f"PDF={pdf_pc!r} vs API={api_pc!r}"))
+    except Exception as e:
+        mismatches.append((cert_ref, f"API ERROR: {type(e).__name__}"))
+print(f"{len(mismatches)} mismatches:", mismatches)
+```
+
+### Phase 1 — Strict-enum catches (immediate, lowest-investigation)
+
+**First slice:** `cylinder_size='Normal'` → cascade code. Two certs raise
+on this label (2536, 9421). Look up the worksheet `Cylinder Volume` for
+cert 2536 (`sap worksheets/additional with api 2/2536-2525-0600-0788-2292/dr87-0001-NNNNNN.pdf`)
+to determine the correct cascade enum. The cascade lookup is at
+[`domain/sap10_calculator/rdsap/cert_to_inputs.py:1878`](../../../domain/sap10_calculator/rdsap/cert_to_inputs.py#L1878):
+`_CYLINDER_SIZE_CODE_TO_LITRES: Final[dict[int, float]] = {3: 160.0, 4: 210.0}`.
+If 'Normal' maps to a volume not in this dict, the cascade itself needs an
+entry too — but most likely 'Normal' is a different size band the cascade
+already knows about (check RdSAP cylinder-size enums: Small/Normal/Medium/
+Large/Very Large). After the fix, the
+`test_all_seven_ashp_cohort_certs_extract_without_unmapped_label_raise`
+test should be extended to include the new cohort certs.
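
A minimal sketch of the strict-enum lookup that the 'Normal' fix would extend, under stated assumptions: `LABEL_TO_CODE`, `UnmappedLabel`, and `map_cylinder_size` are hypothetical stand-ins for the real mapper dict and `UnmappedElmhurstLabel`, whose actual definitions are not shown in this handover. Only the `"Large"` → 4 entry comes from the slice history (S0380.14); the code for 'Normal' is deliberately absent because it has to be read from the worksheet first.

```python
class UnmappedLabel(ValueError):
    """Hypothetical stand-in for UnmappedElmhurstLabel (carries field + value)."""

    def __init__(self, field: str, value: str) -> None:
        super().__init__(f"unmapped Elmhurst label: {field}={value!r}")
        self.field = field
        self.value = value


# Assumption: only "Large" -> 4 is known from this handover (Slice S0380.14).
# The 'Normal' entry is added only once the worksheet confirms its code.
LABEL_TO_CODE: dict[str, int] = {"Large": 4}


def map_cylinder_size(label: str) -> int:
    """Return the cascade code for a label, raising instead of guessing."""
    try:
        return LABEL_TO_CODE[label]
    except KeyError:
        # Fail at extraction time so the gap never becomes a silent SAP delta.
        raise UnmappedLabel("cylinder_size", label) from None
```

Raising rather than falling back to a default is what made the two 'Normal' certs show up as explicit raises in the probe output instead of as unexplained deltas.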
+
+### Phase 2 — Bulk-pin the 24 already-closed certs
+
+Add `test_summary__full_chain_sap_within_spec_floor_of_worksheet`
+tests for all 24 first-try-closures. Mostly mechanical: copy Summary PDFs
+to `backend/documents_parser/tests/fixtures/Summary_NNNNNN.pdf`, add
+path constants, register chain tests using
+`_ASHP_COHORT_CHAIN_TOLERANCE = 0.07`. Probably 2–3 slices grouped by batch.
+
+Chain-test body pattern — see
+[`backend/documents_parser/tests/test_summary_pdf_mapper_chain.py`](../../../backend/documents_parser/tests/test_summary_pdf_mapper_chain.py)
+`test_summary_3800_full_chain_sap_within_spec_floor_of_worksheet`
+(zero-slice closure precedent).
+
+### Phase 3 — Close the 9 small-gap certs
+
+In delta order (smallest first, easier to debug):
+- 7836 (Δ -0.04) is already inside ±0.07 and counted among the 24 ✅
+  above; pin it in Phase 2 rather than here.
+- 7800 (Δ -0.24), 0390 (Δ -0.33), 0036 (Δ -0.37), 4536-8325 (Δ -0.43),
+  7700 (Δ -0.44), 0464 (Δ -0.53), 9796 (Δ +0.57), 3136 (Δ -0.75),
+  0310 (Δ -0.75); likely 1 fix each per the cohort precedent.
+
+For each, follow the [[feedback-worksheet-not-api-reference]] methodology:
+extract worksheet line refs (26)..(39), (64), (216) for the cert, diff
+against Summary cascade output. The dominant residual line ref points to
+the missing mapper field.
+
+### Phase 4 — Investigate the 3 big-gap certs
+
+- **cert 2102** (Δ -15.81) and **cert 6835** (Δ -14.56) — both ~-15 SAP.
+  For comparison, cert 0380's starting point pre-Slice 2 (HP mis-routing)
+  was -54 SAP, so -15 SAP suggests partial HP mis-routing or a major
+  HW/cylinder mis-config. Probe `main_heating_index_number` /
+  `main_heating_category` on the Summary EPC first.
+- **cert 0652** (Δ +1.92) — moderate over-prediction. Could be PV
+  multi-array / extension / unusual fabric variant.
+
+### Phase 5 — API path closure
+
+Once Elmhurst is closed for all 38, run the **same** chain tests against
+the API path:
+
+1. Fetch raw JSON for each cert (see `_fetch_certificate` snippet above).
+2. Persist to `domain/sap10_calculator/rdsap/tests/fixtures/golden/.json`.
+3. Run the API path: `EpcPropertyDataMapper.from_api_response(json) →
+   cert_to_inputs → calculate_sap_from_inputs`.
+4. Pin against worksheet at ±0.07 (HPs) or 1e-4 (boilers).
+5. Follow the existing `test_api__full_chain_sap_within_spec_floor_of_worksheet`
+   pattern; those tests live in the same `test_summary_pdf_mapper_chain.py`
+   file (confusing, but that's where the slice 102f-prep series put them).
+
+Per the prior session's prediction memory: many API-path certs should
+close first-try because Elmhurst's first pass paid down most
+cascade-side gaps. Per-cert convergence should be ≤1 slice each for the
+API path once Elmhurst is done.
+
+### Phase 6 — Cross-mapper parity (Summary EPC ≡ API EPC)
+
+The user's longstanding north-star ("the EPC objects matching is our
+signal that we've done things correctly"). For each cert with both
+Summary + API EPCs, diff load-bearing fields. Existing pattern:
+`test_from_elmhurst_site_notes_matches_hand_built_*` family. Extend or
+adapt to compare Summary EPC vs API EPC directly. Any divergence is
+either (a) a mapper gap on one side or (b) a real Summary-vs-API source
+discrepancy worth flagging.
+
+## Methodology — preserved conventions
+
+All from prior session memory:
+
+- **Worksheet, not API, is the target** ([[feedback-worksheet-not-api-reference]]).
+  The dr87 worksheet's `SAP value` line is the pin. The API path is a
+  *signal* (useful for "what should the EPC field look like?") but never
+  the target.
+- **One slice = one commit; stage by name** ([[feedback-commit-per-slice]]).
+- **AAA test convention** with literal `# Arrange / # Act / # Assert`
+  headers ([[feedback-aaa-test-convention]]).
+- **`abs(diff) <= tol`** not `pytest.approx` ([[feedback-abs-diff-over-pytest-approx]]).
+- **±0.07 spec-floor tolerance** for HP cohort chain tests; **1e-4** for
+  boiler cohort chain tests.
+- **Spec citation in commit messages** ([[feedback-spec-citation-in-commits]]).
+- **Pyright net-zero per file**.
+- **Worksheet-shape fidelity** ([[feedback-worksheet-shape-fidelity]]) when
+  adding new dataclass fields — mirror existing patterns, full structure
+  even without immediate consumer.
+- **Strict-enum raises on unmapped labels** (Slice S0380.15 — currently
+  only cylinder helpers; extend to other label-mapping helpers as their
+  dicts get exercised). Exception is `UnmappedElmhurstLabel` from
+  `datatypes.epc.domain.mapper`.
+
+## Diagnostic probe script
+
+Paste-able first-attempt probe (run from repo root):
+
+```bash
+PYTHONPATH=/workspaces/model python <<'PY'
+import re, subprocess
+from pathlib import Path
+from backend.documents_parser.tests.test_summary_pdf_mapper_chain import _summary_pdf_to_textract_style_pages
+from backend.documents_parser.elmhurst_extractor import ElmhurstSiteNotesExtractor
+from datatypes.epc.domain.mapper import EpcPropertyDataMapper, UnmappedElmhurstLabel
+from domain.sap10_calculator.rdsap.cert_to_inputs import cert_to_inputs, SAP_10_2_SPEC_PRICES
+from domain.sap10_calculator.calculator import calculate_sap_from_inputs
+
+src_root = Path('/workspaces/model/sap worksheets/additional with api 2')
+for cd in sorted(src_root.iterdir()):
+    summary_pdfs = list(cd.glob("Summary_*.pdf"))
+    ws_pdfs = list(cd.glob("dr87-*.pdf"))
+    if not (summary_pdfs and ws_pdfs):
+        continue
+    out = subprocess.run(["pdftotext", str(ws_pdfs[0]), "-"], capture_output=True, text=True).stdout
+    m = re.search(r"SAP value\s*\n?\s*([\d.]+)", out)
+    ws_sap = float(m.group(1)) if m else None
+    try:
+        sn = ElmhurstSiteNotesExtractor(_summary_pdf_to_textract_style_pages(summary_pdfs[0])).extract()
+        epc = EpcPropertyDataMapper.from_elmhurst_site_notes(sn)
+        r = calculate_sap_from_inputs(cert_to_inputs(epc, prices=SAP_10_2_SPEC_PRICES))
+        d = r.sap_score_continuous - ws_sap if ws_sap is not None else float("nan")
+        tag = "✅" if abs(d) < 0.07 else "✗"  # a missing worksheet SAP (nan) never tags as closed
+        print(f"  {cd.name:26s} ws={ws_sap} summary={r.sap_score_continuous:.4f} delta={d:+.4f} {tag}")
+    except UnmappedElmhurstLabel as e:
+        print(f"  {cd.name:26s} ws={ws_sap} RAISES {e.field}={e.value!r}")
+    except Exception as e:
+        print(f"  {cd.name:26s} ERROR {type(e).__name__}: {e}")
+PY
+```
+
+Worksheet line-ref grep (for any cert's HLC table):
+
+```bash
+pdftotext "/workspaces/model/sap worksheets/additional with api 2//dr87-0001-.pdf" - | sed -n '380,475p'
+```
+
+## Per-cert diagnostic recipe
+
+When a Summary chain test fails, the worksheet-anchored diff at HLC line refs
+is the canonical first step:
+
+```python
+# (paste in a probe shell after running cert_to_inputs/calculate)
+ws = {
+    "doors_w_per_k": 4.4400,  # (26) — pull from worksheet PDF
+    "windows_w_per_k": 6.8011,  # (27)
+    "walls_w_per_k": 11.6150,  # (29a) Main + Ext sum
+    "party_walls_w_per_k": 3.9050,  # (32) Main + Ext sum
+    "heat_transfer_coefficient_w_per_k": 127.1578,  # (39) avg
+}
+for k, w in ws.items():
+    v = r.intermediate.get(k); print(f"  {k:36s} {v:.4f} vs ws {w:.4f} d={v-w:+.4f}")
+```
+
+If fabric all matches and SAP is still off, the gap is in HW (line refs
+(64)/(216)), internal gains (66..73), or HP path (Appendix N3.6 PSR).
+Compare against the API path as a *signal* (not a target) — the previous
+session's Slice 6 work has a worked example.
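
The "dominant residual line ref" idea from the recipe can be sketched as a small ranking helper. This is an illustration only: `rank_residuals` is a hypothetical name, it works on plain dicts rather than the real `r.intermediate` object, and the cascade numbers in the usage lines are made up to show the ordering, not taken from any cert.

```python
from typing import Optional


def rank_residuals(
    worksheet: dict[str, float],
    cascade: dict[str, float],
    tol: float = 1e-4,
) -> list[tuple[str, Optional[float]]]:
    """List line refs that disagree with the worksheet, worst first.

    Keys missing from the cascade output come first (diff is None),
    followed by the remaining keys ordered by absolute residual.
    """
    rows: list[tuple[str, Optional[float]]] = []
    for key, ws_value in worksheet.items():
        got = cascade.get(key)
        if got is None:
            rows.append((key, None))
            continue
        diff = got - ws_value
        if abs(diff) > tol:
            rows.append((key, diff))
    rows.sort(key=lambda row: (row[1] is not None, -abs(row[1] or 0.0)))
    return rows


# Illustrative values only: the worksheet side reuses numbers from the
# recipe above, the cascade side is invented to demonstrate the ranking.
worksheet = {"doors_w_per_k": 4.4400, "windows_w_per_k": 6.8011, "walls_w_per_k": 11.6150}
cascade = {"doors_w_per_k": 4.4400, "windows_w_per_k": 6.9011, "walls_w_per_k": 12.6150}
for key, diff in rank_residuals(worksheet, cascade):
    print(key, "missing" if diff is None else f"{diff:+.4f}")
```

Listing missing keys first avoids the formatting error the inline one-liner would hit when `r.intermediate.get(k)` returns `None`.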
+
+## Test baselines at HEAD
+
+```bash
+PYTHONPATH=/workspaces/model python -m pytest \
+  backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
+  backend/documents_parser/tests/test_elmhurst_extractor.py \
+  backend/documents_parser/tests/test_elmhurst_end_to_end.py \
+  domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
+  domain/sap10_calculator/worksheet/tests/test_water_heating.py \
+  domain/sap10_calculator/worksheet/tests/test_mean_internal_temperature.py \
+  domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
+  domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
+  domain/sap10_calculator/tests/test_pcdb_table_362_lookup.py \
+  domain/sap10_ml/tests/test_rdsap_uvalues.py \
+  datatypes/epc/schema/tests/test_schema_loading.py \
+  --no-cov -q
+```
+
+Expected: **689 pass + 10 pre-existing fails** (9 cert 001479 Layer 1
+hand-built skeleton + 1 pre-existing FEE).
+
+Pyright per-file baselines (unchanged across this session's slices):
+
+- `datatypes/epc/domain/mapper.py`: 32
+- `domain/sap10_calculator/worksheet/heat_transmission.py`: 13
+- `domain/sap10_calculator/rdsap/cert_to_inputs.py`: 35
+- `backend/documents_parser/elmhurst_extractor.py`: 0
+- `datatypes/epc/surveys/elmhurst_site_notes.py`: 0
+- `backend/documents_parser/tests/test_summary_pdf_mapper_chain.py`: 0
+
+## Cohort closure status (carried forward)
+
+15 slices shipped in the previous session (S0380.1 → S0380.15), all on
+branch `feature/per-cert-mapper-validation`:
+
+| Slice | Commit | What |
+|---|---|---|
+| S0380.1 | dca2ff09 | RED pin: chain test for cert 0380 vs worksheet 88.5104 |
+| S0380.2 | b1a1bb8d | main_heating_category=4 for PCDB Table 362 heat pumps |
+| S0380.3 | 575cdd53 | wall_insulation_type=6 for "FE Filled Cavity + External" |
+| S0380.4 | 2d15951b | wall_insulation_thickness from Summary §7.0 (mapper+extractor+dataclass) |
+| S0380.5 | d4d0aa24 | insulated_door_u_value from Summary §10 "Average U-value" |
+| S0380.6 | 16fe2262 | Full §15.1 cylinder block (size+insulation+thickness+thermostat) |
+| S0380.7 | b6ae18f3 | Re-pin chain test to ±0.07 spec-floor tolerance |
+| S0380.8 | 4c06865f | "As Main Wall" extension inheritance copies insulation_thickness_mm |
+| S0380.9 | 43a86d66 | Multi-array PV refactor (Renewables.pv_arrays list) |
+| S0380.10 | f546bd5d | Chain tests for first-try closures (certs 3800, 9285) |
+| S0380.11 | 5de41d58 | Zero-shower lodgings resolve to explicit 0 counts |
+| S0380.12 | 2f5e70e3 | Alt-wall window-location parses pre-data slice |
+| S0380.13 | 7f099d98 | Cantilever gate accepts "House" descriptive form |
+| S0380.14 | f878bf51 | "Large" cylinder → cascade code 4 (closes Daikin cert 9418) |
+| S0380.15 | d7ca179e | Strict-enum raising on unmapped cylinder labels |
+
+All 7 original ASHP cohort certs closed at ±0.07. Mean residual +0.044.
+
+## Memory references
+
+- [[project-summary-path-cohort-closure]] — cohort closure status table
+  and convergence trend.
+- [[feedback-worksheet-not-api-reference]] — Summary-path targets pin to
+  the dr87 worksheet PDF, not the API EPC.
+- [[feedback-cascade-pin-methodology]] — test the actual cascade against
+  PDF line refs at 1e-4 (or ±0.07 for the HP precision floor).
+- [[feedback-zero-error-strict]] — every line ref of every output for
+  every fixture must pin against PDF at abs=1e-4 unless documented.
+- [[feedback-commit-per-slice]] / [[feedback-aaa-test-convention]] /
+  [[feedback-abs-diff-over-pytest-approx]] / [[feedback-spec-citation-in-commits]]
+  / [[feedback-worksheet-shape-fidelity]] — slicing + test conventions.
+- [[reference-rdsap10-worksheet-xlsx]] — canonical SAP 10.2 calculator
+  spreadsheet at repo root (`2026-05-19-17-18 RdSap10Worksheet.xlsx`)
+  for spec-conformance cross-checks.
+
+## First concrete actions
+
+1. **Folder-vs-cert sweep** is already 38/38 ✅ at handover. Re-run if
+   the dataset has changed.
+2. **Run the Summary-path diagnostic probe** to confirm the baseline
+   reproduces (24 ✅, 9 small, 3 big, 2 raises).
+3. **Fix the 'Normal' cylinder raise** as Slice 1 (lowest-investigation
+   start). Look at the worksheet `Cylinder Volume` for cert 2536, decide
+   the cascade enum, extend `_ELMHURST_CYLINDER_SIZE_LABEL_TO_SAP10`,
+   add a unit test + chain test for both raising certs.
+4. **Bulk-pin the 24 first-try-closures** as Slice 2 (or split into a
+   couple of batches by 6-digit suffix range).
+5. **Iterate on the 9 small-gap certs** one by one, worksheet-anchored
+   diagnostic each time.
+6. **Tackle the 3 big-gap certs** with deeper investigation (likely
+   HP-routing or HW-cascade gaps).
+7. **Fetch + persist API JSON for all 38** (`_fetch_certificate` →
+   `golden/.json`). Then mirror the Summary closure tests on the
+   API path.
+8. **Add cross-mapper EPC parity tests** for the load-bearing fields
+   per the user's longstanding north-star.
+
+Good luck. The first concrete action is the folder-vs-cert sweep —
+confirm the dataset is clean before starting any mapper slice.