From 73fedc0ecde27fc61fda67c008ca6243ac941b4d Mon Sep 17 00:00:00 2001 From: Khalim Conn-Kowlessar Date: Thu, 28 May 2026 10:33:17 +0000 Subject: [PATCH] docs: handover for cohort-2 closure + precision-floor next steps MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Captures 5 slices shipped this session (S0380.21..25): - Table 3a rows 1+4 + PCDB keep-hot dispatch - Per-BP roof exposure (Ext1 flat roof on flats) - RdSAP §11.1 b) % of roof area PV synthesis - SAP code 631 → house coal secondary fuel - SAP codes 2111/2113 → control type 2 Cohort-2 outcome: 22/38 exact (<1e-4), max residual ±0.55 SAP, 0 RAISES, 0 big-gaps. All structural cascade gaps closed. Open threads diagnosed in detail: 1. Cert 7700 -0.44 SAP — wall U code conflict (_WALL_INSULATION_NONE=4 vs Elmhurst "As Built"=4). Wider than a single slice; needs regression testing. 2. Cert 9796 +0.55 SAP — MIT precision floor (Mid-Terrace bungalow + HP, +0.06°C across all months). Same mechanism as cohort-1 HP-COP residuals. 3. API-path closure for all 38 certs (deferred). 4. Tighten cohort-1 chain tests to 1e-4 once thread 2 closes. Co-Authored-By: Claude Opus 4.7 --- .../docs/HANDOVER_COHORT_2_PRECISION_FLOOR.md | 313 ++++++++++++++++++ 1 file changed, 313 insertions(+) create mode 100644 domain/sap10_calculator/docs/HANDOVER_COHORT_2_PRECISION_FLOOR.md diff --git a/domain/sap10_calculator/docs/HANDOVER_COHORT_2_PRECISION_FLOOR.md b/domain/sap10_calculator/docs/HANDOVER_COHORT_2_PRECISION_FLOOR.md new file mode 100644 index 00000000..98bf9a43 --- /dev/null +++ b/domain/sap10_calculator/docs/HANDOVER_COHORT_2_PRECISION_FLOOR.md @@ -0,0 +1,313 @@ +# Handover — cohort-2 closure (5 slices shipped) + precision-floor next steps + +Branch `feature/per-cert-mapper-validation`. This session shipped +**5 slices** (S0380.21 → S0380.25) closing the bulk of the cohort-2 +residuals. All RAISES are gone, all ±5+ big-gaps closed. Picks up +from `HANDOVER_TABLE_3A_NO_KEEP_HOT.md`. + +**HEAD at handover start:** `36a3219d` (Slice S0380.25: SAP codes +2111/2113 are control type 2, not type 3 — closes certs 0652 + 6835). + +## User's stated goal (carried forward verbatim) + +> I've added some more test cases, in the same format, in here: +> `sap worksheets/additional with api 2` +> We should check that the Elmhurst mapping works and then the api + +Target: **1e-4 across the board** for every cert per +[[feedback-one-e-minus-4-across-the-board]] — HPs included. + +API-path closure (cohort-2 API JSON fetch + chain tests + cross-mapper +EPC parity) is **still deferred** — Summary path is shippable and +well-instrumented; the API path is fetchable but not yet mirrored. + +## Slices shipped this session + +| Slice | Commit | What | +|---|---|---| +| S0380.21 | `0d3fb980` | Table 3a row 1 + row 4 + PCDB keep-hot dispatch. Closes 9 of 11 cohort-2 RAISES exactly. Re-adds cert `0390-2954-3640-2196-4175` to the golden cohort. | +| S0380.22 | `1a25ea67` | Per-BP roof exposure — `roof_construction_type` containing "another dwelling above" suppresses that BP's roof regardless of dwelling-level flag. Closes cert `0036-6325-1100-0063-1226` Ext1 flat roof (+0.30 → -6e-6). | +| S0380.23 | `8dee1918` | RdSAP 10 §11.1 b) "% of roof area" PV synthesis — kWp = 0.12 × roof_area_for_heat_loss × pct / cos(35° for pitched). Closes cert `6835-3920-2509-0933-5226` -13.37 → +0.72. | +| S0380.24 | `c145953f` | SAP code 631 ("Open fire in grate") → house coal secondary fuel (Table 12 code 11, 3.67 p/kWh). Closes cert `2102-3018-0205-7886-5204` -15.81 → +5e-5. Also narrows gas range to 601-613 per spec. | +| S0380.25 | `36a3219d` | SAP codes 2111 ("TRVs and bypass") and 2113 ("Room thermostat and TRVs") are **control type 2** per SAP 10.2 spec page 171 Table 4e, not type 3. Closes certs `0652-3022-1205-2826-1200` (+1.93 → -1e-5) and `6835-3920-2509-0933-5226` (+0.72 → +0.015). | + +All on branch `feature/per-cert-mapper-validation`. Each slice +includes unit tests, pyright net-zero on touched files. + +## Cohort-2 distribution at HEAD + +Cohort-2 (38-cert dataset) Summary-path probe: + +| Bucket (\|Δ\|) | Pre-session | Now | Δ | +|---|---|---|---| +| exact (<1e-4) | 10 | **22** | **+12** | +| 1e-4..0.07 | 13 | **14** | +1 | +| 0.07..0.5 | 2 | **1** | -1 | +| 0.5..1 | 1 | **1** | = | +| 1..5 | 0 | **0** | = | +| >5 | 1 | **0** | -1 | +| **RAISES (PCDB)** | 11 | **0** | **-11** | + +Cohort-1 (7-ASHP + 2 newer) untouched: all still at ±0.04 SAP. No +regressions from any slice. + +## ★ Open threads with diagnoses (priority order) + +### 1. Cert 7700-3362-0922-7022-3563 (-0.44 SAP, gas PCDF 17741) + +**Diagnosed root cause — code conflict:** + +`heat_transmission.py:88` defines `_WALL_INSULATION_NONE = 4` — +heat_transmission treats `wall_insulation_type = 4` as "no insulation +present" (cascade routes through `u_wall` uninsulated branch). + +But `mapper.py:2064-2073` maps Elmhurst `"A As Built"` insulation code +to SAP10 enum value **4** ("As built / assumed (default cascade)") — +the mapper's intent is "use cascade defaults for age-band + +construction" (which for an OLD cavity wall means uninsulated → U=1.50 +age C). The two interpretations happen to agree for cavity walls but +disagree for solid + other constructions. + +For cert 7700's alt wall (cavity + "As Built"): +- Mapper sets `wall_insulation_type = 4` (intent: use defaults) +- Cascade interprets 4 as "no insulation" → `u_wall` returns 1.50 +- Worksheet uses U=1.20 for the same wall (Table 16 cavity intermediate + thickness OR an Elmhurst-specific midpoint) + +Cascade walls = 75.62 W/K; worksheet (29a) sum = 71.29 W/K; Δ +4.33. +That's almost the entire fabric (33) gap (148.72 - 144.38 = +4.34). +And the entire +0.44 SAP residual. + +**Why this is wider than a single slice:** + +`_WALL_INSULATION_NONE = 4` is also used at line 568 for the MAIN BP +walls path (not just alt). Changing the enum mapping touches both the +main + alt wall paths. Cohort-1 + cohort-2 certs may rely on the +current behavior (e.g. cert 0036 closes exactly with the current +mapping, so its main wall + alt wall both happen to fall in the +right branches). + +**Suggested approach:** +- Audit Table 6 / Table 16 for cavity walls — what's the spec-correct + U for "As Built, age C, no measured thickness"? Worksheet's 1.20 + isn't an obvious Table 16 row. +- Consider adding a separate `is_as_built: bool` flag on + `SapAlternativeWall` rather than overloading + `wall_insulation_type=4` for two meanings. +- Or: rename the constant to `_WALL_INSULATION_AS_BUILT = 4` and + verify cohort 1 + cohort 2 regressions. +- Cert 7700's main wall U (cascade 0.53 vs worksheet 0.70) is ALSO + off — same root cause likely. + +### 2. Cert 9796-3058-6205-0346-9200 (+0.55 SAP, ASHP PCDF 104568) + +**Diagnosed — no single bug:** + +Cascade matches worksheet exactly on: +- Fabric heat loss (33) = 62.03 W/K ✓ +- Ventilation (38) = 47.87 W/K Jan ✓ +- Internal gains (73) = 429.85 W Jan ✓ (full cert_to_inputs path) +- Solar gains (83) = 65.44 W Jan ✓ +- PV generation = 1493.88 vs worksheet 1492.33 (Δ <0.1%) + +But MIT (92) Jan: cascade **18.51** vs worksheet **18.45** → Δ ++0.06°C. Consistent +0.05..+0.09°C offset across all months. + +This is the "Appendix N3.6 PSR-precision floor" residual the older +handover described — except the user rejects that framing per +[[feedback-one-e-minus-4-across-the-board]]. Cohort-1 ASHP certs hit ++0.001..+0.04 SAP with similar mechanism; cert 9796 is at +0.55. + +**Why cert 9796 is an outlier:** + +It's the only **Mid-Terrace bungalow** with PCDF 104568 in the cohort. +Other PCDF 104568 certs (4800, 2800, 3336) are End-Terrace bungalows +and close to <0.04 SAP. Possibly the residual scales with party-wall +count or some interaction with extended-heating allocation. Worth +checking whether the cascade's `_zone_mean_temp_with_per_zone_eta` η +calculation drifts at this particular HLC/PSR/storey combination. + +**Suggested next step:** Pin η for cert 9796 line-by-line against +worksheet (86)/(89) — η_living + η_elsewhere — and trace where the +~0.005 difference enters. + +### 3. HP-COP residual on 10 triple-glazed HP certs (+0.001..+0.04 SAP) + +Same precision-floor mechanism as cert 9796 but smaller. Cohort-1 ASHP +chain tests are currently pinned at `_ASHP_COHORT_CHAIN_TOLERANCE += 0.07`. Tightening to 1e-4 requires closing the MIT precision floor. + +**Suggested approach:** Once cert 9796 root cause is found, the same +fix likely tightens these. + +### 4. API-path closure for all 38 cohort-2 certs + +User's longstanding goal. Process: +1. Fetch + persist JSON via `EpcClientService._fetch_certificate` (token in + `backend/.env` as `OPEN_EPC_API_TOKEN`). +2. Mirror Summary chain tests on the API path + (`backend/documents_parser/tests/test_summary_pdf_mapper_chain.py` + pattern). +3. Cross-mapper EPC parity (Summary EPC ≡ API EPC for load-bearing + fields) — user's longstanding north star. + +### 5. Tighten cohort-1 ASHP chain tests to 1e-4 + +Once thread 3 closes, drop the ±0.07 tolerance pin in +`backend/documents_parser/tests/test_summary_pdf_mapper_chain.py +::_ASHP_COHORT_CHAIN_TOLERANCE`. + +## Methodology — preserved conventions + +Carried forward unchanged from prior sessions: + +- **1e-4 across the board** ([[feedback-one-e-minus-4-across-the-board]]) + — HP certs target the same precision as boilers; reject any + "calculator precision floor" framing. +- **Worksheet, not API, is the target** ([[feedback-worksheet-not-api-reference]]). +- **One slice = one commit; stage by name** ([[feedback-commit-per-slice]]). +- **AAA test convention** with literal `# Arrange / # Act / # Assert` + ([[feedback-aaa-test-convention]]). +- **`abs(diff) <= tol`** not `pytest.approx` ([[feedback-abs-diff-over-pytest-approx]]). +- **Spec citation in commit messages** ([[feedback-spec-citation-in-commits]]). +- **Strict-enum raises on unmapped labels / unresolved cascade dispatch** + (Slices S0380.15, S0380.17, S0380.20 established the pattern). +- **Pyright net-zero per file**. + +## Test baseline at HEAD + +```bash +PYTHONPATH=/workspaces/model python -m pytest \ + backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \ + backend/documents_parser/tests/test_elmhurst_extractor.py \ + backend/documents_parser/tests/test_elmhurst_end_to_end.py \ + domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \ + domain/sap10_calculator/worksheet/tests/test_water_heating.py \ + domain/sap10_calculator/worksheet/tests/test_mean_internal_temperature.py \ + domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \ + domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \ + domain/sap10_calculator/tests/test_pcdb_table_362_lookup.py \ + domain/sap10_ml/tests/test_rdsap_uvalues.py \ + datatypes/epc/schema/tests/test_schema_loading.py \ + --no-cov -q +``` + +Expected: **704 pass + 10 pre-existing fails** (9 × cert 001479 Layer 1 +hand-built skeleton + 1 × pre-existing FEE round-trip). + +Pyright per-file baselines (touched files; net-zero on each): +- `datatypes/epc/domain/mapper.py`: 32 +- `datatypes/epc/surveys/elmhurst_site_notes.py`: 0 +- `backend/documents_parser/elmhurst_extractor.py`: 0 +- `backend/documents_parser/tests/test_summary_pdf_mapper_chain.py`: 0 +- `domain/sap10_calculator/rdsap/cert_to_inputs.py`: 35 +- `domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py`: 13 +- `domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py`: 1 +- `domain/sap10_calculator/worksheet/water_heating.py`: 1 +- `domain/sap10_calculator/worksheet/heat_transmission.py`: 13 +- `domain/sap10_calculator/worksheet/tests/test_water_heating.py`: 94 +- `domain/sap10_calculator/worksheet/tests/test_heat_transmission.py`: 71 + +## Diagnostic probe script (carried forward from prior handover) + +```bash +PYTHONPATH=/workspaces/model python <<'PY' +import re, subprocess +from collections import defaultdict +from pathlib import Path +from backend.documents_parser.tests.test_summary_pdf_mapper_chain import _summary_pdf_to_textract_style_pages +from backend.documents_parser.elmhurst_extractor import ElmhurstSiteNotesExtractor +from datatypes.epc.domain.mapper import EpcPropertyDataMapper, UnmappedElmhurstLabel +from domain.sap10_calculator.rdsap.cert_to_inputs import ( + cert_to_inputs, SAP_10_2_SPEC_PRICES, UnresolvedPcdbCombiLoss, +) +from domain.sap10_calculator.calculator import calculate_sap_from_inputs + +src_root = Path('/workspaces/model/sap worksheets/additional with api 2') +buckets = defaultdict(list) +def bucket(d): + a = abs(d) + if a < 1e-4: return "exact" + if a < 0.07: return "<=0.07" + if a < 0.5: return "0.07..0.5" + if a < 1: return "0.5..1" + if a < 5: return "1..5" + return "5+" +for cd in sorted(src_root.iterdir()): + if not cd.is_dir() or cd.name.startswith('.'): continue + sp = next(cd.glob("Summary_*.pdf"), None) + ws_pdf = next(cd.glob("dr87-*.pdf"), None) + if not (sp and ws_pdf): continue + out = subprocess.run(["pdftotext", str(ws_pdf), "-"], capture_output=True, text=True).stdout + m = re.search(r"SAP value\s*\n?\s*([\d.]+)", out) + ws_sap = float(m.group(1)) if m else None + try: + sn = ElmhurstSiteNotesExtractor(_summary_pdf_to_textract_style_pages(sp)).extract() + epc = EpcPropertyDataMapper.from_elmhurst_site_notes(sn) + r = calculate_sap_from_inputs(cert_to_inputs(epc, prices=SAP_10_2_SPEC_PRICES)) + d = r.sap_score_continuous - ws_sap + buckets[bucket(d)].append((cd.name, d)) + except UnresolvedPcdbCombiLoss as e: + buckets["RAISES (Pcdb)"].append((cd.name, e.pcdf_index)) + except UnmappedElmhurstLabel as e: + buckets["RAISES (Elm)"].append((cd.name, str(e))) + +for b in ("exact", "<=0.07", "0.07..0.5", "0.5..1", "1..5", "5+", "RAISES (Pcdb)", "RAISES (Elm)"): + if b in buckets: + print(f"\n[{b}] {len(buckets[b])}:") + for c, d in buckets[b]: + print(f" {c} {d}") +PY +``` + +Mirror against `/workspaces/model/sap worksheets/Additional data with api` +for cohort-1 cross-checks. + +## Memory references + +Cross-session memories load automatically. Key ones for this work: + +- [[feedback-one-e-minus-4-across-the-board]] — user target is 1e-4 for HPs too. +- [[project-instantaneous-shower-cascade-gap]] — closed by S0380.21. +- [[project-summary-path-cohort-closure]] — original 7-cert ASHP cohort context. +- [[feedback-worksheet-not-api-reference]] — Summary path pins to worksheet, not API. +- [[feedback-cascade-pin-methodology]] — test the actual cascade against PDF line refs. +- [[reference-sap10-spec-docs]] — full BRE technical paper set at + `domain/sap10_calculator/docs/specs/`. +- [[feedback-commit-per-slice]] / [[feedback-aaa-test-convention]] / + [[feedback-abs-diff-over-pytest-approx]] / [[feedback-spec-citation-in-commits]] / + [[feedback-worksheet-shape-fidelity]] / [[feedback-zero-error-strict]] — + slicing + test conventions. + +## First concrete actions for next agent + +1. **Re-run the diagnostic probe** to confirm baseline reproduces + (22 exact + 14 ≤±0.07 + 1 ±0.07..0.5 + 1 ±0.5..1 + 0 RAISES). + +2. **Investigate cert 7700 wall-U code conflict** (thread 1). + Concrete steps: + - Read `heat_transmission.py:80-95` (constant block) + + `heat_transmission.py:560-580` (main wall path) + + `heat_transmission.py:878-905` (`_alt_wall_w_per_k`). + - Read `mapper.py:2064-2073` (insulation enum) + + `mapper.py:2866-2887` (`_map_elmhurst_alternative_wall`). + - Probe the worksheet's U=1.20 for cert 7700 alt wall against + RdSAP 10 spec Table 16 (cavity walls) — figure out which row + matches and why the cascade picks 1.50. + - Probe cert 7700 main wall U=0.70 (cascade) vs worksheet 0.70 — does + the main path have a similar precision issue? + - **Critically**: run the full diagnostic probe with any proposed + fix to confirm cohort-1 + the 22 exact cohort-2 certs don't + regress. + +3. **Investigate cert 9796 MIT precision residual** (thread 2). Likely + needs line-by-line η pinning at the Mid-Terrace-bungalow scale. + +4. **API path** — fetch + persist the 38-cert JSON via + `EpcClientService._fetch_certificate`. Pattern follows + `domain/sap10_calculator/rdsap/tests/fixtures/golden/*.json`. Token + in `backend/.env` as `OPEN_EPC_API_TOKEN`. + +Good luck. The Summary-path cohort is in very strong shape (22/38 +exact; max residual ±0.55 SAP). The remaining residuals are +precision-floor concerns rather than structural cascade bugs.