# Handover: Cohort-2 API path 38/38 closed; golden-residuals front next

Branch `feature/per-cert-mapper-validation`. This session shipped **5 slices** (S0380.39 → S0380.43) that closed the **entire cohort-2 API-path cluster**. The branch is now at **750 pass + 0 fail**: the 3-cert +0.42..+0.44 cluster (0300/9380/1536) closed via two spec citations plus the Decimal HALF_UP pattern, and cert 2102's -6.30 residual closed via the SAP 4a heating-type → spec fuel dispatch.

**HEAD at handover start:** `6dccb15b` (Slice S0380.43).

## User's stated goal carried forward (from prior handover)

> Tackle Thread 4 — API-path closure for cohort-2. … Tolerance: 1e-4
> vs each cert's worksheet SAP value. … Bigger slices are appropriate
> here. … Drive golden-fixture residuals to ~0.

Thread 4 (cohort-2 API path closure) is **DONE**. The next thread, **golden-fixture residuals → ~0**, is now the open front.

## Slices shipped this session (handover-doc → HEAD)

| Slice | Commit | Closes | Spec citation |
|---|---|---|---|
| **S0380.39** | `22ae6f4d` | Bulk-fetched 38 cohort-2 API JSONs via `scripts/fetch_cohort2_api_jsons.py` | (infra) |
| **S0380.40** | `ff25746f` | Parametrized API-path chain test mirroring Summary sweep; 34/38 immediate | (test infra) |
| **S0380.41** | `a96e6765` | Closed 0300/9380 (+0.43/+0.42 → <1e-4); 1536 partial close | RdSAP-Schema-21.0.0 glazed_type=1 = "DG installed before 2002 EAW" → SAP 10.2 Table 6b cascade code 2 (DG pre-2002, g_L=0.80, NOT single 0.90). RdSAP 10 Table 24 row 2 (PVC/wooden, 16+) → U=2.7 |
| **S0380.42** | `e1b7b30c` | Cert 1536 +0.0015 → -1e-6 | RdSAP 10 §15 p.66: Decimal HALF_UP per-window area at the 0.005 boundary (0.65 × 0.70 = 0.4550 exactly; the float product 0.45499... rounds down to 0.45) |
| **S0380.43** | `6dccb15b` | Cert 2102 -6.30 → +5e-5 | SAP 10.2 Appendix M Table 4a code 631 ("Open fire in grate") + BS EN 13229:2001 inset-appliance class (solid fuel); Elmhurst Summary maps to Table 32 code 11 (House coal) |

All on branch `feature/per-cert-mapper-validation`. Each includes a spec citation in the commit message, unit-level diff probes, the AAA test convention, and pyright net-zero per touched file.

## Cohort distributions at HEAD `6dccb15b`

### Cohort-2 (38-cert dataset, API path)

| Bucket (\|Δ\|) | Session start | Now | Δ |
|---|---|---|---|
| exact (<1e-4) | 34 | **38** | **+4** |
| 1e-4..0.07 | 0 | **0** | = |
| 0.07..0.5 | 3 | **0** | -3 |
| 0.5..1 | 0 | **0** | = |
| 1..5 | 0 | **0** | = |
| >5 | 1 | **0** | -1 |
| RAISES | 0 | **0** | = |

### Cohort-2 Summary path (unchanged)

38/38 < 1e-4, closed in the prior session's S0380.31..38.

### Cohort-1 ASHP (9 certs, both paths)

9/9 < 1e-4 on both paths. Worst residual: cert 2225 -4.8e-5 (binding constraint on `_ASHP_COHORT_CHAIN_TOLERANCE` tightening; see below).

## Cross-mapper parity at the cascade: established

[[feedback-cross-mapper-parity-via-cascade]] now holds for all 38 cohort-2 certs: API and Summary paths both produce SAP within 1e-4 of each other AND of the worksheet, at the cascade output. The underlying EpcPropertyData may differ structurally between mappers (noise on cosmetic fields, schema-version int/str encoding), but the cascade output is the load-bearing equivalence check, and it's fully agreed.

## Tolerance tightening: deferred

The prior handover proposed tightening `_ASHP_COHORT_CHAIN_TOLERANCE` from 1e-4 to ~1e-5. **Not viable at HEAD.** The cohort-wide worst residuals are:

- Cohort-1 ASHP API path: cert 2225 -4.8e-5
- Cohort-2 Summary path: cert 2102 -4.9e-5 (same magnitude as the API path)
- Cohort-2 API path: cert 2102 +4.9e-5

So 1e-5 has no headroom. Realistic next floor is ~5e-5 (binding on cert 2225's -4.8e-5).
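
As a quick check on the headroom figures discussed in this section, the sketch below computes relative headroom (candidate tolerance minus worst residual, divided by the worst residual). The residual is the cert 2225 magnitude quoted above; the helper name `headroom` and the list of candidate tolerances are illustrative and not repo code.

```python
# Worst cohort-1 ASHP residual quoted above (cert 2225), as a magnitude.
WORST_ASHP_RESIDUAL = 4.8e-5


def headroom(tolerance: float, worst_residual: float = WORST_ASHP_RESIDUAL) -> float:
    """Relative slack between a candidate tolerance and the worst residual.

    A negative result means the candidate tolerance would already fail.
    """
    return (tolerance - worst_residual) / worst_residual


if __name__ == "__main__":
    # Hypothetical candidates: the proposed 1e-5, the two floors, and the current 1e-4.
    for candidate in (1e-5, 5e-5, 6e-5, 1e-4):
        print(f"tol={candidate:.0e}  headroom={headroom(candidate):+.1%}")
```

A negative value for 1e-5 is what "no headroom" means here; the 5e-5 and 6e-5 candidates are the two floors weighed in this section.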
Tightening to 5e-5 gives ~4% headroom, which is too thin to be robust to unrelated cascade drift. Tightening to ~6e-5 gives ~25% headroom but is an awkward number.

**Decision:** leave `_ASHP_COHORT_CHAIN_TOLERANCE = 1e-4` and the cohort-2 strict tests at inline `1e-4`. Tightening below 1e-4 requires closing cert 2225 specifically (per-cert investigation).

## ★ Open front: golden-residuals → ~0

[`test_golden_cert_residual_matches_pin`](../rdsap/tests/test_golden_fixtures.py) pins **PE Δ and CO2 Δ** vs the gov.uk-lodged values (NOT the worksheet; this is a different reference point from the chain tests). Pins currently sit at:

| Cert | actual_sap | sap_resid | pe_resid (kWh/m²) | co2_resid (t/yr) | Notes |
|---|---:|---:|---:|---:|---|
| 0240 | 73 | -14 | +12.49 | +0.70 | RR extraction, multi-subsystem gaps |
| 0300 | 78 | 0 | +8.28 | -0.25 | DSP showers + flue (closed at HEAD) |
| 0390 | 60 | -7 | -26.01 | -2.52 | Firebird oil combi PCDF 9005 |
| 0535 | ... | ... | ... | ... | cert 001479 fixture |
| 2130 | ... | ... | -38.63 | +0.30 | Largest pre-existing residual |
| 6035 | ... | ... | +46.76 | +1.07 | Largest pre-existing residual |
| **ASHP cohort (the highest-value cluster)** | | | | | |
| 0350 | 88 | 0 | -7.78 | +0.17 | Mitsubishi PUZ-WM50VHA |
| 0380 | 88 | 0 | -14.60 | +0.28 | Mitsubishi PUZ-WM50VHA |
| 2225 | 89 | 0 | -11.77 | +0.26 | Mitsubishi PUZ-WM50VHA |
| 2636 | 86 | 0 | -9.65 | +0.22 | Mitsubishi PUZ-WM50VHA |
| 3800 | 86 | 0 | -9.61 | +0.26 | Mitsubishi PUZ-WM50VHA |
| 9285 | 84 | 0 | -7.96 | +0.16 | Mitsubishi PUZ-WM50VHA |
| 9418 | 84 | 0 | -7.30 | +0.16 | Daikin EDLQ05CAV3 |

The ASHP cluster shape:

- All 7 certs hit `sap_resid=0` (chain-test work closed this).
- PE residual: -7..-15 kWh/m² UNDER-count (cascade < lodged).
- CO2 residual: +0.16..+0.28 t/yr OVER-count (cascade > lodged).
- Similar magnitudes across all 7 certs (6 of which share the same Mitsubishi PUZ-WM50VHA heat pump) strongly suggest a single shared gap in the PE/CO2 factor cascade for ASHP electricity.
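
The cluster shape in the bullets above can be checked mechanically from the pinned values in the table. The sketch below hard-codes the seven ASHP rows and confirms the sign pattern and ranges; `ASHP_PINS` and `cluster_shape` are illustrative names, not part of the test suite.

```python
# (cert, pe_resid kWh/m², co2_resid t/yr), copied from the ASHP rows of the pin table.
ASHP_PINS = [
    ("0350", -7.78, 0.17),
    ("0380", -14.60, 0.28),
    ("2225", -11.77, 0.26),
    ("2636", -9.65, 0.22),
    ("3800", -9.61, 0.26),
    ("9285", -7.96, 0.16),
    ("9418", -7.30, 0.16),
]


def cluster_shape(pins):
    """Summarise the residual ranges and whether the signs are uniform."""
    pe = [pe_resid for _, pe_resid, _ in pins]
    co2 = [co2_resid for _, _, co2_resid in pins]
    return {
        "pe_range": (min(pe), max(pe)),
        "co2_range": (min(co2), max(co2)),
        "pe_all_under": all(v < 0 for v in pe),    # cascade < lodged on every cert
        "co2_all_over": all(v > 0 for v in co2),   # cascade > lodged on every cert
    }


if __name__ == "__main__":
    print(cluster_shape(ASHP_PINS))
```

If a future re-pin breaks either uniform-sign flag, the single-root-cause reading of the cluster deserves a second look.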
### Diagnostic probe for cert 0380 at HEAD

```
Cert 0380 (60.43 m² TFA):
  Lodged PE: 56 kWh/m²   CO2: 0.3 t/yr
  Calc demand: PE=41.40 kWh/m²  CO2=0.578 t/yr
  PE residual: -14.60   CO2 residual: +0.28
  Main fuel: 29 (Electricity, mains)
  Main heating category: 4 (Heat pump)
  Secondary fuel: 29 (Electricity)
  Secondary heating: 691 (Portable electric heater default)
```

### Hypotheses

The user's prior diagnosis (from earlier handover):

> This smells like a single cascade gap in either the SAP 10.2
> Appendix L1 primary-energy lookup for electricity (likely a missing
> distribution-loss factor or wrong tariff routing) or in the §12
> Table 12d monthly electricity factor cascade for heat pumps.

Additional shape evidence:

- PE under-count + CO2 over-count for the same fuel is structurally unusual. If both were driven by one shared quantity (e.g. electricity consumption), they would move in the same direction. The split direction suggests the lodged values are using **different factors** than the cascade (possibly an older SAP factor vs current SAP 10.2).
- 14.6 kWh/m² × 60.43 m² = **882 kWh/yr** PE shortfall on cert 0380.
- 0.28 t/yr × 1000 = **280 kg/yr** CO2 over-count.

### Slice plan for the ASHP PE cluster

**Probe 1: Inspect the SAP 10.2 Table 12 PE factor lookup.** Find where the cascade resolves the PE factor for electricity (likely in `internal_gains.py`, or in `cert_to_inputs.py` as `_effective_monthly_pe_factor` or similar). Verify the factor used matches the lodged EPC's expected value (1.501 standard / 1.500 SAP 2012 / etc).

**Probe 2: Diff cert 0380 calc vs PCDB-listed heat-pump efficiency.** The heat pump (Mitsubishi PUZ-WM50VHA PCDB 104568) has a documented SPF (seasonal performance factor). Check whether the cascade applies the correct SPF and whether the lodged-vs-cascade electricity-consumption delta accounts for the PE shortfall.

**Probe 3: Worksheet PE check.** The cert 0380 worksheet PDF (likely `dr87-0001-000899.pdf` in the cohort-2 dir) reports the worksheet's PE value at the bottom.
Compare cascade PE to worksheet PE: if they agree, the lodgement is wrong (gov.uk computed differently); if they disagree, the cascade has a real gap.

### Pre-existing large residuals (lower priority)

- Cert 6035 PE +46.76: the prior handover attributes this to multi-subsystem gaps; not the same cluster cause as ASHP.
- Cert 2130 PE -38.63: also pre-existing; likely RR + PV + electricity.

These should be closed AFTER the ASHP cluster (which likely has a single clean root cause).

## Conventions preserved (carry forward)

- **1e-4 across the board** ([[feedback-one-e-minus-4-across-the-board]])
- **Worksheet, not API, is the target** for chain tests ([[feedback-worksheet-not-api-reference]]), except for the golden fixtures, which pin against gov.uk-lodged PE/CO2.
- **Cross-mapper parity via cascade equivalence** ([[feedback-cross-mapper-parity-via-cascade]]). Now fully established for cohort-2.
- **Spec-floor skepticism** ([[feedback-spec-floor-skepticism]]).
- **Bigger slices OK for uniform-cohort work** ([[feedback-bigger-slices-for-uniform-work]]).
- **Golden residuals → ~0** ([[feedback-golden-residuals-near-zero]]). The 0.01 PE / 0.001 CO2 absolute tolerances stay; what changes is the **expected residual itself** (pinning at the actual delta vs zero).
- **AAA test convention** with literal `# Arrange / # Act / # Assert` ([[feedback-aaa-test-convention]]).
- **`abs(diff) <= tol`** not `pytest.approx` ([[feedback-abs-diff-over-pytest-approx]]).
- **Spec citation in commit messages** ([[feedback-spec-citation-in-commits]]).
- **One slice = one commit; stage by name** ([[feedback-commit-per-slice]]).
- **Strict-enum raises** on unmapped labels / unresolved dispatch.
- **Pyright net-zero per touched file**.

## Lesson learned: GOV.UK RdSAP 21 enum ≠ cascade enum

The cascade's `_G_LIGHT_BY_GLAZING_CODE` table in `internal_gains.py` is keyed on the SAP 10.2 Table 6b enum that the **Elmhurst extractor** produces (`_ELMHURST_GLAZING_LABEL_TO_SAP10`).
The API mapper currently passes the raw GOV.UK RdSAP 21 enum straight through. For codes 2/3/13/14 this coincidentally works (both enums agree on g_L for those codes); for code 1 it doesn't (GOV.UK 1 = DG pre-2002, SAP 10.2 1 = single).

Slice S0380.41 added `_API_TO_SAP10_CASCADE_GLAZING_CODE` to remap RdSAP 21 codes to SAP 10.2 codes for the `SapWindow.glazing_type` field that drives daylight g_L. Currently only code 1 remaps; other codes pass through. **Future cert lodgements may surface analogous divergences** (e.g. RdSAP 21 code 5 = single, but cascade code 5 gets 0.80, a similar mismatch waiting to happen). Add remap entries as those codes appear in fixtures.

## Lesson learned: Decimal HALF_UP extends to per-window areas

S0380.34/35 closed the Σ-then-round Decimal pattern (gross wall, party wall, kWp, living area). S0380.42 closed the round-per-then-Σ pattern for per-window areas: `_decimal_round_half_up_product` was added at three cascade sites (heat_transmission's windows_w_per_k + per-bp window-area accumulation; internal_gains' daylight g_L; solar_gains' window solar).

Any future +0.0007-scale residual in per-window areas, or analogous Decimal boundary cases for OTHER elements (doors, alt-walls, RR sub-areas), is the same class of bug, fixed the same way.

## Lesson learned: SAP heating-type → spec fuel dispatch

S0380.43 added `_API_SECONDARY_HEATING_SPEC_FUEL` for SAP 631 ("Open fire in grate"). The pattern is incremental: a per-code dispatch dict that overrides the lodged fuel ONLY when (a) the heating type implies a specific fuel category, AND (b) the lodged fuel is incompatible (electric for a solid-fuel heater). Future cohort certs surfacing other inconsistencies (e.g. SAP 632 "Open fire" with electric fuel) can extend the dispatch without touching the routing logic.
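
The boundary case behind the S0380.42 lesson above (per-window area 0.65 × 0.70) can be reproduced in isolation. The sketch below is a minimal stand-in for the idea behind `_decimal_round_half_up_product`; the function name `round_half_up_product`, its signature, and the two-decimal-place default are assumptions for illustration, not the cascade's actual helper.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up_product(width: str, height: str, places: str = "0.01") -> Decimal:
    """Multiply two decimal strings exactly, then round HALF_UP to `places`."""
    # Decimal arithmetic on string inputs is exact here: 0.65 * 0.70 == 0.4550.
    exact = Decimal(width) * Decimal(height)
    return exact.quantize(Decimal(places), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    # Decimal path: the exact product sits on the 0.005 boundary and rounds up.
    print(round_half_up_product("0.65", "0.70"))
    # Float path: the binary product lands just below 0.455 and rounds down.
    print(round(0.65 * 0.70, 2))
```

Passing the dimensions as strings matters in this sketch: building `Decimal(0.65)` from a float would carry the binary representation error into the product and defeat the purpose.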
## Test baseline at HEAD

```bash
PYTHONPATH=/workspaces/model python -m pytest \
  backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
  backend/documents_parser/tests/test_elmhurst_extractor.py \
  backend/documents_parser/tests/test_elmhurst_end_to_end.py \
  domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
  domain/sap10_calculator/worksheet/tests/test_water_heating.py \
  domain/sap10_calculator/worksheet/tests/test_mean_internal_temperature.py \
  domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
  domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
  domain/sap10_calculator/tests/test_pcdb_table_362_lookup.py \
  domain/sap10_ml/tests/test_rdsap_uvalues.py \
  datatypes/epc/schema/tests/test_schema_loading.py \
  --no-cov -q
```

Expected: **750 pass + 0 fail**.

## First concrete actions for the next agent

1. **Re-run the test baseline and the diagnostic probe** to confirm the baseline reproduces (38/38 cohort-2 both paths < 1e-4; 9/9 ASHP cohort-1 < 1e-4; 750 pass + 0 fail).
2. **Probe 1 (PE factor lookup):** find the cascade's PE-factor resolution for electricity heat pumps. The most likely entry points: search `cert_to_inputs.py` for `primary_energy`, `pe_factor`, `effective_monthly_pe_factor`. Compare the resolved factor against SAP 10.2 Table 12 "Standard electricity" (PE = 1.501) and ASHP-specific entries.
3. **Probe 3 (worksheet vs cascade PE):** extract the PE value from cert 0380's worksheet PDF (`dr87-0001-000899.pdf` under `sap worksheets/additional with api 2/0380-2530-6150-2326-4161/`). Compare against cascade output 41.40 kWh/m² and lodged 56 kWh/m². This isolates "cascade vs spec" from "lodgement vs spec".
4. **CO2 factor probe:** run a similar probe for the CO2 factor cascade. The cluster's +0.16..+0.28 t/yr over-count is as uniform across the cluster as the PE under-count, suggesting both come from the same factor lookup.
5. **If the cluster has a single root cause** (likely per the uniform shape), close it in ONE slice. Re-pin all 7 ASHP fixture `expected_pe_resid_kwh_per_m2` and `expected_co2_resid_tonnes_per_yr` values to the new residuals (which should drop to ~0.01).
6. **Then move to the pre-existing residual cluster** (certs 6035, 2130, 0240). These have multi-subsystem gaps that need per-cert investigation and are less uniform than the ASHP cluster.

Good luck. The cohort-2 API closure is COMPLETE; the chain-test infrastructure is robust and battle-tested across 38 + 9 certs spanning gas/oil/heat-pump main heating, all RdSAP 21 schema variants, and multiple lodgement-source quirks. The golden-residuals front is the next high-value workstream, and the ASHP cluster is the cleanest single thread.