# Handover: precision floors closed, only cantilever residual + cohort-2 tail remain

Branch `feature/per-cert-mapper-validation`. This session shipped **5 slices (S0380.26 → S0380.30)** that closed the entire "spec-precision floor" cluster the prior handover ([HANDOVER_COHORT_2_PRECISION_FLOOR.md](HANDOVER_COHORT_2_PRECISION_FLOOR.md)) described. Two of those (the η interpolation bug and the glazing code table) were real spec-citation cascade bugs, not vendor precision drift. The user's [[feedback-one-e-minus-4-across-the-board]] posture (skeptical of "precision floor" framing) was correct on both.

**HEAD at handover start:** `faf116bd` (Slice S0380.30).

## User's stated goal (carried forward verbatim)

> I've added some more test cases, in the same format, in here:
> `sap worksheets/additional with api 2`
> We should check that the Elmhurst mapping works and then the api

Target: **1e-4 across the board** for every cert per [[feedback-one-e-minus-4-across-the-board]], HPs included.

## Slices shipped this session

| Slice | Commit | What |
|---|---|---|
| **S0380.26** | `c144d444` | RdSAP10 §5.8 + Table 14 dry-lining R=0.17 adjustment on alt walls. Closes cert 7700 -0.44 → +5e-5. New `AlternativeWall.dry_lined: bool`, Elmhurst extractor reads "Alternative Wall N Dry-lining: Yes/No", mapper threads `wall_dry_lined="Y"`, `u_wall(dry_lined=True)` applies §5.8 R=0.17 at as-built bucket only. |
| **S0380.27** | `012cbd18` | Thread `floor_construction_type` into `_main_floor_u_value` per heat_transmission's `effective_floor_description` rule. Closes cert 9796 +0.55 → +0.00174. Cert 8135 golden PE -4.96 → -0.07 kWh/m² (same broken-helper mechanism). |
| **S0380.28** | `081bb8fd` | SAP 10.2 Appendix N footnote 43 (PDF p.101 line 7053) **reciprocal-linear** PSR η interpolation: `1/η = (1−t)/η_low + t/η_high`. Cascade was using linear-on-η directly. Closes the +0.03..+0.06 ASHP cluster across cohort-1 + cohort-2. |
| **S0380.29** | `e27b923b` | Tighten `_ASHP_COHORT_CHAIN_TOLERANCE` 0.07 → 0.04 (~30% headroom over worst residual). |
| **S0380.30** | `faf116bd` | Extend `_G_LIGHT_BY_GLAZING_CODE` + `_G_PERPENDICULAR_BY_GLAZING_TYPE` to cover RdSAP 21 codes 8-15 (per `datatypes/epc/domain/epc_codes.csv`). Closes the cohort-1 API path +0.014..+0.031 cluster (5 of 6 certs to <1e-4). The cohort uses code 14 (triple 2022+), which pre-slice fell to the DG default. |

All on branch `feature/per-cert-mapper-validation`. Each slice includes unit tests and is pyright net-zero on touched files.

## Cohort distributions at HEAD

### Cohort-2 (38-cert dataset, Summary path)

| Bucket (\|Δ\|) | Session start | Now | Δ |
|---|---|---|---|
| exact (<1e-4) | 22 | **33** | **+11** |
| 1e-4..0.07 | 14 | **5** | -9 |
| 0.07..0.5 | 1 | **0** | -1 |
| 0.5..1 | 1 | **0** | -1 |
| 1..5 | 0 | 0 | = |
| >5 | 0 | 0 | = |
| RAISES | 0 | 0 | = |

Cohort-2 ≤0.07 residuals remaining:

| Cert | Δ SAP | Pattern |
|---|---|---|
| `2536-2525-0600-0788-2292` | +0.00072 | Shared 3-cert +0.0007 pattern |
| `2800-7999-0322-4594-3563` | +0.00068 | (same) |
| `4800-3992-0422-0599-3563` | +0.00068 | (same) |
| `6835-3920-2509-0933-5226` | +0.01453 | PV cert (slices S0380.23+S0380.25 closed bulk; tail remains) |
| `9380-2957-7490-2595-3141` | +0.02732 | Gas cert; unrelated to ASHP cluster |

### Cohort-1 ASHP cohort (7-cert dataset, Summary + API paths)

| Cert | Summary delta | API delta | Notes |
|---|---|---|---|
| 0380 | +1e-6 | +9e-7 | EXACT both paths |
| 0350 | +2.2e-5 | +2.2e-5 | EXACT both paths |
| 2225 | -4.8e-5 | -4.8e-5 | EXACT both paths |
| 2636 | **-0.01495** | **-0.01495** | Cantilever fixture, same residual on both paths |
| 3800 | -2e-5 | -2e-5 | EXACT both paths |
| 9285 | -3.4e-5 | -3.4e-5 | EXACT both paths |
| 9418 | -4e-7 | -4e-7 | EXACT both paths |

**Summary EPC ≡ API EPC** for the cascade outputs on 6 of 7 ASHP cohort certs (cross-mapper parity validated end-to-end).
Cert 2636 shows the same residual on both paths, so the bug is path-agnostic and sits in the cantilever cascade.

## ★ Open threads with diagnoses (priority order)

### 1. Cert 2636 cantilever residual (-0.01495 SAP, both paths)

**Setup**: Mid-Terrace house age D, alt-wall + **cantilever** (3.74 m² / 9.5% of ground floor, first-floor-over-passageway). PCDB 104568 ASHP. This mid-terrace cantilever is the most complex geometry in the ASHP cohort. Worksheet "SAP value" 86.2641.

**Diagnosis (NOT done this session; fresh investigation needed):**

Cohort-1 ASHP cohort closes to <1e-4 on 6 of 7 certs after S0380.28 (reciprocal η) + S0380.30 (glazing codes). Cert 2636 stays at -0.015 on **both paths identically**: the cascade outputs are the same on Summary EPC and API EPC. So:

- This is NOT a mapper bug (path-symmetric).
- This is NOT η interpolation (PSR matches worksheet).
- This is NOT a glazing-code bug (S0380.30 already closed that cluster).

Likely candidates (worth probing in order):

1. **Cantilever exposed-floor U-value**: Table 20 lookup at cert 2636's geometry (3.74 m² cantilever / age D ground floor). Slice 102f-prep.9 added RdSAP cantilever exposed-floor detection; verify Table 20 row + insulation thickness routing.
2. **Cantilever in (31) total external area**: used for thermal bridging. The 3.74 m² should add to (31) once (heat_transmission.py:828-837 includes `cantilever_area` in `part_external_area`).
3. **Alt-wall window allocation**: cert 2636's §11 has the 1.19 m² alt-wall window (S0380.12 closed the window-location parser). Verify the area deduction lands on the alt wall, not the main wall.

**Probe recipe** (analogous to the cert 9796 / cert 3336 probes earlier this session):

```python
# Compare cascade line-by-line vs worksheet for cert 2636
# heat_transmission components (33)/(31)/(36)/(37), monthly (38)/(39)/(40),
# (94) η_whole, (98)m space heating, and trace where the -0.015 enters.
# If a non-zero delta appears between cascade and worksheet for any single
# section line ref, that's the gap. If every component matches at 1e-4,
# the residual must come from the η_main_heating step (post-N3.6 in-use
# factor or similar).
```

### 2. Cohort-2 cert 9380 (+0.027) and cert 6835 (+0.015)

Both gas certs (no ASHP precision-floor mechanism). Likely cohort-2-specific mapper details surfaced after the ASHP cluster closed.

- Cert 6835 had two prior slices (S0380.23 PV %-of-roof, S0380.25 SAP code 2111/2113 control type). Remaining +0.015 may be a small lighting/HW detail.
- Cert 9380 hasn't had a dedicated slice yet. First places to look: Summary §11 windows lodgement, §14 heating controls, §15 thermal mass.

Standard probe: compare cascade end-state (SAP, ECF, total_fuel_cost, main_heating_fuel_kwh, hot_water_kwh, lighting_kwh) vs worksheet section 1 readouts → isolate which line ref diverges.

### 3. Cohort-2 certs 2536 / 2800 / 4800 (+0.0007 shared pattern)

Three certs at +0.00068..+0.00072 SAP, which is suspiciously consistent. Likely a shared small artifact (rounding step, fuel-cost decimal precision, internal gains rounding, etc.). Could close as one slice if the shared cause is found.

### 4. API path closure for cohort-2 (all 38 certs)

Longstanding goal from the prior handover, NOT addressed this session. Process:

1. Fetch + persist JSON via `EpcClientService._fetch_certificate` (token in `backend/.env` as `OPEN_EPC_API_TOKEN`).
2. Mirror Summary chain tests on the API path. Pattern: see `backend/documents_parser/tests/test_summary_pdf_mapper_chain.py` `test_api_*` family.
3. Cross-mapper EPC parity (Summary EPC ≡ API EPC for load-bearing fields), the user's longstanding north star. **After S0380.30, the cohort-1 ASHP cohort already passes this parity at <1e-4 cascade output on 6 of 7 certs.** Cohort-2 should be similar but needs verification.

### 5. Tighten `_ASHP_COHORT_CHAIN_TOLERANCE` 0.04 → smaller

Once cert 2636 closes (thread 1) the tolerance can drop to ~0.001 or similar. The current 0.04 is about 2.7× cert 2636's 0.015 residual; the ~30% headroom recorded for S0380.29 referred to the worst residual when that slice landed.

## Test baseline at HEAD

```bash
PYTHONPATH=/workspaces/model python -m pytest \
  backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
  backend/documents_parser/tests/test_elmhurst_extractor.py \
  backend/documents_parser/tests/test_elmhurst_end_to_end.py \
  domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
  domain/sap10_calculator/worksheet/tests/test_water_heating.py \
  domain/sap10_calculator/worksheet/tests/test_mean_internal_temperature.py \
  domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
  domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
  domain/sap10_calculator/tests/test_pcdb_table_362_lookup.py \
  domain/sap10_ml/tests/test_rdsap_uvalues.py \
  datatypes/epc/schema/tests/test_schema_loading.py \
  --no-cov -q
```

Expected: **711 pass + 10 pre-existing fails** (9 × cert 001479 Layer 1 hand-built skeleton + 1 × pre-existing FEE round-trip).
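
The line-by-line probe used in the open threads (compare cascade values against worksheet line refs and isolate which ones diverge) could be sketched roughly as below. This is a minimal illustration, not code from the repo: `diverging_line_refs` is a hypothetical helper, and the line-ref values in the usage example are placeholder numbers, not readouts from any cert.

```python
from typing import Mapping


def diverging_line_refs(
    cascade: Mapping[str, float],
    worksheet: Mapping[str, float],
    tol: float = 1e-4,
) -> list[tuple[str, float]]:
    """Return (line_ref, cascade - worksheet) for each shared line ref
    whose absolute difference exceeds tol, in worksheet order.

    Hypothetical helper: both mappings are assumed to be keyed by the
    worksheet line ref, e.g. "(31)" or "(37)".
    """
    diverging = []
    for ref, ws_value in worksheet.items():
        if ref not in cascade:
            continue  # line ref not produced by the cascade; skip it
        diff = cascade[ref] - ws_value
        if not abs(diff) <= tol:  # same abs(diff) <= tol convention as the tests
            diverging.append((ref, diff))
    return diverging


if __name__ == "__main__":
    # Placeholder values for illustration only.
    cascade = {"(31)": 100.0, "(33)": 50.0, "(36)": 15.0, "(37)": 65.0}
    worksheet = {"(31)": 100.0, "(33)": 50.0, "(36)": 15.2, "(37)": 65.2}
    for ref, diff in diverging_line_refs(cascade, worksheet):
        print(f"{ref}: {diff:+.5f}")
```

The earliest line ref in the result is usually the one to chase, since later line refs tend to inherit its delta.
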
## Diagnostic probe script

Cohort-2 Summary path sweep (full distribution):

```bash
PYTHONPATH=/workspaces/model python <<'PY'
import re, subprocess
from collections import defaultdict
from pathlib import Path

from backend.documents_parser.tests.test_summary_pdf_mapper_chain import _summary_pdf_to_textract_style_pages
from backend.documents_parser.elmhurst_extractor import ElmhurstSiteNotesExtractor
from datatypes.epc.domain.mapper import EpcPropertyDataMapper, UnmappedElmhurstLabel
from domain.sap10_calculator.rdsap.cert_to_inputs import (
    cert_to_inputs, SAP_10_2_SPEC_PRICES, UnresolvedPcdbCombiLoss,
)
from domain.sap10_calculator.calculator import calculate_sap_from_inputs

src_root = Path('/workspaces/model/sap worksheets/additional with api 2')
buckets = defaultdict(list)

def bucket(d):
    # Classify |delta SAP| into the distribution buckets used in this handover.
    a = abs(d)
    if a < 1e-4: return "exact"
    if a < 0.07: return "<=0.07"
    if a < 0.5: return "0.07..0.5"
    if a < 1: return "0.5..1"
    if a < 5: return "1..5"
    return "5+"

for cd in sorted(src_root.iterdir()):
    if not cd.is_dir() or cd.name.startswith('.'):
        continue
    sp = next(cd.glob("Summary_*.pdf"), None)
    ws_pdf = next(cd.glob("dr87-*.pdf"), None)
    if not (sp and ws_pdf):
        continue
    # Read the worksheet's "SAP value" readout from the PDF text.
    out = subprocess.run(["pdftotext", str(ws_pdf), "-"], capture_output=True, text=True).stdout
    m = re.search(r"SAP value\s*\n?\s*([\d.]+)", out)
    if m is None:
        print(f"skip {cd.name}: no SAP value found in worksheet text")
        continue
    ws_sap = float(m.group(1))
    try:
        sn = ElmhurstSiteNotesExtractor(_summary_pdf_to_textract_style_pages(sp)).extract()
        epc = EpcPropertyDataMapper.from_elmhurst_site_notes(sn)
        r = calculate_sap_from_inputs(cert_to_inputs(epc, prices=SAP_10_2_SPEC_PRICES))
        d = r.sap_score_continuous - ws_sap
        buckets[bucket(d)].append((cd.name, d))
    except UnresolvedPcdbCombiLoss as e:
        buckets["RAISES (Pcdb)"].append((cd.name, e.pcdf_index))
    except UnmappedElmhurstLabel as e:
        buckets["RAISES (Elm)"].append((cd.name, str(e)))

for b in ("exact", "<=0.07", "0.07..0.5", "0.5..1", "1..5", "5+", "RAISES (Pcdb)", "RAISES (Elm)"):
    if b in buckets:
        print(f"\n[{b}] {len(buckets[b])}:")
        for c, d in buckets[b]:
            print(f"  {c} {d}")
PY
```

## Methodology: preserved conventions

Carried forward unchanged from prior sessions:

- **1e-4 across the board** ([[feedback-one-e-minus-4-across-the-board]])
- **Worksheet, not API, is the target** ([[feedback-worksheet-not-api-reference]])
- **One slice = one commit; stage by name** ([[feedback-commit-per-slice]])
- **AAA test convention** with literal `# Arrange / # Act / # Assert` ([[feedback-aaa-test-convention]])
- **`abs(diff) <= tol`** not `pytest.approx` ([[feedback-abs-diff-over-pytest-approx]])
- **Spec citation in commit messages** ([[feedback-spec-citation-in-commits]])
- **Strict-enum raises on unmapped labels / unresolved cascade dispatch**
- **Pyright net-zero per file**

## Method that worked this session (verbatim)

The "spec-precision floor" framing from the prior handover was wrong on both bugs found this session. The pattern that worked:

1. **Pick the worst-residual cert** in the open thread.
2. **Probe cascade vs worksheet line-by-line** for every numbered line ref in the path (section 2 ventilation, section 3 fabric, section 7 MIT/η, section 8 space heating, section 9 fuel, section 10 cost). When every line matches except one, that line's input is the gap.
3. **Back-solve the worksheet to identify the implied parameter** (cert 3336: cascade η_space=237.31 vs ws-implied 236.74 → linear vs reciprocal interpolation; cert 9796: cascade (12)=0.1 vs ws (12)=0.2 → sealed vs unsealed verdict).
4. **Verify against spec** before claiming a fix. Both S0380.26 (RdSAP10 §5.8 + Table 14) and S0380.28 (SAP 10.2 Appendix N fn 43) found explicit spec citations matching the worksheet behavior; neither was reverse-engineering vendor implementation.

The prior handover claimed "no public spec or BRE data field would distinguish [the +0.04 cluster]". That was wrong: SAP 10.2 footnote 43 is explicit about reciprocal interpolation.
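
To make the linear-vs-reciprocal distinction concrete, here is a small sketch of the two interpolation rules, using the formula quoted for S0380.28 (`1/η = (1−t)/η_low + t/η_high`). The function names and the sample numbers are illustrative assumptions, not the cascade's actual helpers or any cert's PCDB data.

```python
def interpolate_eta_linear(eta_low: float, eta_high: float, t: float) -> float:
    """Linear-on-eta interpolation (the behaviour the cascade had before S0380.28)."""
    return (1.0 - t) * eta_low + t * eta_high


def interpolate_eta_reciprocal(eta_low: float, eta_high: float, t: float) -> float:
    """Reciprocal-linear interpolation: 1/eta = (1 - t)/eta_low + t/eta_high."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if eta_low <= 0.0 or eta_high <= 0.0:
        raise ValueError("efficiencies must be positive")
    return 1.0 / ((1.0 - t) / eta_low + t / eta_high)


if __name__ == "__main__":
    # Illustrative bracket values only (efficiencies in percent).
    eta_low, eta_high, t = 230.0, 250.0, 0.4
    linear = interpolate_eta_linear(eta_low, eta_high, t)
    reciprocal = interpolate_eta_reciprocal(eta_low, eta_high, t)
    print(f"linear     = {linear:.4f}")
    print(f"reciprocal = {reciprocal:.4f}")
    print(f"gap        = {linear - reciprocal:.4f}")
```

The two rules agree at the bracket endpoints, and strictly between distinct endpoints the reciprocal form gives the lower η (a weighted harmonic mean versus a weighted arithmetic mean). That is the same direction as the cert 3336 back-solve above, where the worksheet-implied η sat below the cascade's linear value.
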
**Be skeptical of "spec precision floor" framing.**

## Pyright baselines (post-S0380.30; net-zero per slice)

- `datatypes/epc/domain/mapper.py`: 32
- `datatypes/epc/surveys/elmhurst_site_notes.py`: 0
- `backend/documents_parser/elmhurst_extractor.py`: 0
- `backend/documents_parser/tests/test_summary_pdf_mapper_chain.py`: 0
- `domain/sap10_calculator/rdsap/cert_to_inputs.py`: 35
- `domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py`: 12
- `domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py`: 1
- `domain/sap10_calculator/tables/pcdb/parser.py`: 0
- `domain/sap10_calculator/tests/test_pcdb_table_362_lookup.py`: 0
- `domain/sap10_calculator/worksheet/heat_transmission.py`: 13
- `domain/sap10_calculator/worksheet/internal_gains.py`: 0
- `domain/sap10_calculator/worksheet/solar_gains.py`: 0
- `domain/sap10_calculator/worksheet/tests/test_heat_transmission.py`: 71
- `domain/sap10_calculator/worksheet/tests/test_solar_gains.py`: 22
- `domain/sap10_calculator/worksheet/tests/test_water_heating.py`: 94
- `domain/sap10_ml/rdsap_uvalues.py`: 0
- `domain/sap10_ml/tests/test_rdsap_uvalues.py`: 66

## Memory references

Cross-session memories load automatically. Key ones for this work:

- [[feedback-one-e-minus-4-across-the-board]]: user target is 1e-4 for HPs too.
- [[feedback-worksheet-not-api-reference]]: Summary path pins to worksheet, not API.
- [[feedback-cascade-pin-methodology]]: test the actual cascade against PDF line refs.
- [[reference-sap10-spec-docs]]: full BRE technical paper set at `domain/sap10_calculator/docs/specs/`.
- [[feedback-commit-per-slice]] / [[feedback-aaa-test-convention]] / [[feedback-abs-diff-over-pytest-approx]] / [[feedback-spec-citation-in-commits]] / [[feedback-worksheet-shape-fidelity]] / [[feedback-zero-error-strict]]: slicing + test conventions.
- [[project-cohort-2-summary-path-closure]]: pre-S0380.26 cohort-2 state (now superseded by this handover).
- [[project-summary-path-cohort-closure]]: cohort-1 ASHP closure context.

## First concrete actions for next agent

1. **Re-run the diagnostic probe** to confirm baseline reproduces (33 exact + 5 ≤0.07 + 0 elsewhere + 0 RAISES on cohort-2; 6/7 ASHP cohort at <1e-4 both paths; cert 2636 -0.015 both paths).
2. **Investigate cert 2636 cantilever residual** (thread 1):
   - Probe line-by-line cascade vs worksheet for cert 2636. The fact that Summary EPC and API EPC produce the same cascade output means this is in the cascade itself, not the mapper.
   - First section to check: `(28b)` / `(31)` cantilever floor area contribution → thermal bridging factor `y × (31)` → (36) → (37).
   - Second: alt-wall window allocation (cert 2636's §11 lodges one alt-wall window per S0380.12).
3. **Cohort-2 tail closure** (threads 2-3):
   - Cert 9380 +0.027: fresh cert, hasn't had a dedicated slice.
   - Cert 6835 +0.015: partially closed by S0380.23/S0380.25; tail remains.
   - Certs 2536/2800/4800 +0.0007 shared pattern: likely single shared cause.
4. **API path** for cohort-2 (thread 4): fetch + persist 38 cert JSON, mirror Summary chain tests, add cross-mapper parity probes.

Good luck. The Summary-path cohort is in excellent shape (33/38 exact at 1e-4). The ASHP cohort is essentially closed at the cascade level (6/7 both paths at <1e-4). The remaining work is small cohort-2 residuals + cert 2636 cantilever + API-path closure for cohort-2.