diff --git a/domain/sap10_calculator/docs/HANDOVER_POST_S0380_152.md b/domain/sap10_calculator/docs/HANDOVER_POST_S0380_152.md
new file mode 100644
index 00000000..7fa9aea1
--- /dev/null
+++ b/domain/sap10_calculator/docs/HANDOVER_POST_S0380_152.md
@@ -0,0 +1,276 @@
+# Handover — post Slices S0380.150..152
+
+Branch: `feature/per-cert-mapper-validation`. **HEAD `d4f6ff0f`**.
+Predecessor: [`HANDOVER_POST_S0380_149.md`](HANDOVER_POST_S0380_149.md).
+
+## TL;DR
+
+Three slices landed. The session pivoted partway through from
+incremental fixes to a **spec-led cluster audit** (the user pushed
+back that we were spinning wheels). The audit identified three
+distinct clusters; two were closed.
+
+| Slice | Commit | Spec rule closed |
+|---|---|---|
+| S0380.150 | `a658f736` | SAP 10.2 §12 / Appendix F2 — 18-hour tariff: pumps + lighting bill at 18-hour HIGH rate (13.67 p/kWh) not standard (13.19) |
+| S0380.151 | `fb173cdf` | RdSAP 10 §4.1 Table 5 — extract-fans age-band default (`max(lodged, table_5_default)`) |
+| S0380.152 | `d4f6ff0f` | SAP 10.2 Table 3 — primary loss for ANY wet boiler + cylinder + WHC=901 (not just Table 4b gas/oil) |
+
+Extended handover suite at HEAD: **896 pass, 0 fail.** Pyright
+net-zero (43 → 43).
+
+## The mid-session pivot — read this before doing anything
+
+The user explicitly called out "spinning wheels" partway through.
+I'd shipped S0380.150 (18-hour tariff fix) which closed ~£2/variant
+uniformly across the cohort, but several variants got *worse*. The
+user asked for a **spec-led picture** of where the actual gaps were
+across the open variants, not more incremental fixes.
+
+The audit produced this categorisation:
+
+**Cluster A** — cohort-wide systematic ~-1.2% SH USEFUL kWh deficit
+across 18 of 25 variants. Same property, same magnitude on every
+variant. Root cause: RdSAP 10 Table 5 extract-fans default missing
+(lodged 0 was being trusted verbatim instead of `max(lodged, default)`).
+**Closed in S0380.151.**
+
+**Cluster B** — three variants overshoot by +2.3% (solid fuel 2/3,
+electric 5). My audit hypothesised this was a Table 9c step 12 sign
+convention for low-R systems. **This was wrong.** When I probed
+solid fuel 2's monthly MIT, it was actually 0.035°C LOWER than the
+worksheet (not higher), yet had MORE SH demand. The decomposition
+showed the entire 73 W gain gap was in (72) water-heating gains —
+because cascade (59) primary loss was 0 while worksheet was ~505
+kWh/yr. **Partially closed in S0380.152** — SF3 fully
+(+1.31 → +0.30), SF2 partially (+2.77 → +2.06).
+
+**Cluster C** — HW kWh mismatch on 4 specific variants (gshp,
+electric 2, solid fuel 2/3). Different spec rules per variant.
+
+The audit doc lives at the top of the conversation. The key
+discipline: don't form a spec hypothesis from headline residuals;
+walk the per-line cascade against the worksheet PDF, find which
+line ref diverges, then look up the spec rule that produces that
+line. My Cluster B hypothesis didn't survive contact with the
+data — see [[feedback-spec-floor-skepticism]] for the discipline
+that cuts both ways.
+
+## Current residual state at HEAD `d4f6ff0f`
+
+### Cascade-OK tier (25 variants on pin grid)
+
+Sorted by |ΔSAP_c|:
+
+| Variant | ΔSAP_c | Δcost | ΔPE | Cluster | Notes |
+|---|---:|---:|---:|:--|---|
+| oil 1 | **+0.0000** | **+0.0000** | **+0.0000** | — | EXACT |
+| oil pcdb 1/2 | **+0.0000** | **+0.0000** | **+0.0000** | — | EXACT |
+| oil pcdb 3 | **+0.0000** | **+0.0000** | **-0.0000** | — | EXACT |
+| electric 1 | **-0.0000** | **-0.0000** | +48.66 | — | SAP exact, PE +49 kWh follow-up |
+| solid fuel 5 | **+0.0000** | **+0.0000** | +48.66 | — | SAP exact |
+| solid fuel 6 | **+0.0000** | **+0.0000** | +48.66 | — | SAP exact |
+| solid fuel 7 | **-0.0000** | **+0.0000** | +48.66 | — | SAP exact |
+| solid fuel 8 | **-0.0000** | **+0.0000** | +48.66 | — | SAP exact |
+| pcdb 1 | -0.0108 | +£0.24 | +5.70 | — | basically exact |
+| ashp | -0.024 | +£0.55 | +36.34 | — | basically exact |
+| solid fuel 4 | +0.085 | -£1.96 | -5.78 | — | close |
+| solid fuel 11 | +0.0912 | -£2.10 | -0.74 | — | close |
+| electric 8 | +0.0941 | -£2.17 | +6.58 | — | close |
+| electric 7 | +0.1017 | -£2.34 | +3.10 | — | close |
+| electric 6 | +0.1081 | -£2.49 | +0.16 | — | close |
+| solid fuel 9 | +0.1072 | -£2.47 | -5.07 | — | close |
+| solid fuel 10 | +0.1134 | -£2.61 | -13.91 | — | close |
+| electric 9 | +0.1199 | -£2.76 | -4.51 | — | close |
+| electric 3 | +0.1215 | -£2.80 | -5.99 | — | close |
+| **solid fuel 3** | **+0.2968** | **-£6.84** | **-214.25** | B (~done) | **closed by .152** |
+| **electric 2** | **-0.4584** | **+£10.56** | **+443.13** | C | warm-air ASHP HW cascade |
+| **gshp** | **+0.9373** | **-£21.60** | **-418.92** | C | HP DHW Appendix N3 |
+| **electric 5** | **-1.1759** | **+£27.09** | **+438.03** | B (open) | storage code 402, R=0.40 — distinct cause |
+| **solid fuel 2** | **+2.0649** | **-£47.58** | **-754.09** | B (partial) | needs `_separately_timed_dhw=False` |
+
+Σ |ΔSAP_c| across 25 variants ≈ **6.4 SAP points** (was ~14.5
+pre-session,
+~6.4 now = ~55% reduction across 3 slices).
+
+### Blocked tier (16 variants — `MissingMainFuelType`)
+
+Unchanged. Community heating × 5, electric storage 11-14, no
+system, oil 2-6, pcdb 3.
+
+## Open fronts ranked by leverage
+
+### 1. **SF2 separately-timed-DHW for solid-fuel back-boilers** — +2.06 SAP
+
+The cascade post-S0380.152 applies primary loss year-round (h=3
+winter / h=3 summer via `_separately_timed_dhw=True`). Worksheet
+applies winter-only (h=5 winter / 0 summer). Daily-rate diff = the
+ENTIRE remaining SF2 residual.
+
+Spec hint: `_separately_timed_dhw` at line 3765 currently returns
+True for cylinder + non-electric HW fuel. For solid-fuel
+back-boilers the HW timing is *tied to the room fire* (no separate
+programmer) — the cascade should return False here, switching the
+formula to (h=5, h=3). And then there's still the summer-zero
+question — possibly a separate rule for "back-boiler doesn't run in
+summer".
+
+Compare SF2 to SF3 (both code 158/160 + WHC=901): SF3 has Jun-Sep
+non-zero (~42 kWh/month) while SF2 has Jun-Sep = 0. Same property,
+same boiler type. Probably a lodging difference (cylinder thermostat
+or DHW timing). Worth a 30-min probe before coding.
+
+### 2. **Cluster C — gshp HW cascade** — +0.94 SAP / -419 PE
+
+Cascade HW = 841 kWh vs worksheet 1138 kWh — under by 26%.
+Spec: SAP 10.2 Appendix N3.6 / N3.7 (PDF p.107-109) — HP DHW
+efficiency cascade. The current cascade may be applying the wrong
+in-use factor (Table N8) or PSR interpolation. Cohort-1 ASHP closed
+via Appendix N N3.6 reciprocal interpolation in S0380.28 — the gshp
+fix may share a path.
+
+### 3. **Cluster C — electric 2 (warm-air HP) HW cascade** — -0.46 SAP / +443 PE
+
+Cascade HW = 2849 kWh vs worksheet 2384 = OVER by 19%. Different
+direction from gshp. Code 524 (warm-air ASHP). Probably wrong
+water_heating efficiency dispatch.
+
+### 4. **electric 5** — -1.18 SAP / +438 PE
+
+Storage heater code 402 (R=0.40, +0.4 K Table 4e adjustment).
+Worsened by S0380.145 (then was net-zero from offsetting bugs)
+and by S0380.151 (lighting now correctly billed). Cascade SH
+USEFUL was +196 kWh OVER worksheet pre-cluster-A. After Cluster A
+and now the secondary cascade fixes, the residual is the *real*
+spec gap. Need to probe MIT cascade for electric 5 specifically.
+
+### 5. **Lighting-only PE +48.66 cohort cluster** — 5 variants
+
+Variants where SAP / cost are EXACT but PE is +48.66 kWh/yr (and
+CO2 +11.94 kg/yr). Identical offset across electric 1, solid fuel
+5/6/7/8. This is suspicious — same exact value. Probably a Table
+12e PE factor mismatch on the added extract fan kWh.
+
+Diagnostic: 48.66 / (10 m³/h × something) = ? — back-solve for the
+per-kWh PE factor diff. Then check `_pumps_fans_pe_factor`.
+
+## Slice history (this session)
+
+| Slice | HEAD | Scope |
+|---|---|---|
+| S0380.150 | `a658f736` | SAP 10.2 §12 (p.45) + Appendix F2 (p.63) — 18-hour tariff non-heating uses bill at 18-hour high rate (13.67 not 13.19 p/kWh). New `_other_fuel_cost_gbp_per_kwh` branch for `Tariff.EIGHTEEN_HOUR` returning the Table 32 code 38 high rate. Closures: oil 1 -£9.31→-£6.69, all 25 variants shift £1.35-£2.62. |
+| S0380.151 | `fb173cdf` | RdSAP 10 §4.1 Table 5 (PDF p.28) — extract-fans default when lodged is unknown/zero. New `_rdsap_extract_fans_default(age_band, habitable_rooms, *, is_park_home)` helper + `max(lodged, default)` wiring in `ventilation_from_cert`. Cohort: 8 variants → EXACT, 11 → ±0.02-0.12. Golden cert 0240 PE +2.18→+5.80, cert 0390-2954 PE -28.27→-27.97. |
+| S0380.152 | `d4f6ff0f` | SAP 10.2 Table 3 (PDF p.160) — primary circuit loss applies to ANY heat generator + cylinder via primary pipework, not just Table 4b. `_primary_loss_applies(...)` gains optional `water_heating_code` parameter + new branch using `_is_wet_boiler_main(main)` + WHC ∈ {901, 902, 914}. Closures: solid fuel 3 +1.31→+0.30, solid fuel 2 +2.77→+2.06 (partial; needs separately-timed-DHW fix). |
+
+## Standard slice workflow (unchanged)
+
+1. Read spec page + identify rule
+2. Probe one cluster variant; verify diagnosis via monkey-patch / direct walk
+3. Write failing AAA test (literal `# Arrange / # Act / # Assert`)
+4. Implement helper / dispatch entry / mapper extension
+5. Re-pin affected variants (DO NOT widen tolerance)
+6. Run extended handover suite (command below)
+7. Pyright net-zero check (`git stash` → pyright → `git stash pop` → pyright)
+8. Commit with spec citation + `Co-Authored-By: Claude Opus 4.7 `
+9. Update `project-heating-systems-corpus` + `MEMORY.md` index
+
+**Bonus discipline from this session**: when forming a spec
+hypothesis, dump the per-line worksheet values for the variant and
+walk them against the cascade output BEFORE writing the slice. My
+Cluster B narrative had the wrong spec section entirely — what
+looked like Table 9c was Table 3. The data caught it; the audit
+narrative didn't.
+
+## Test baseline at HEAD `d4f6ff0f`
+
+```bash
+PYTHONPATH=/workspaces/model python -m pytest \
+  backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
+  backend/documents_parser/tests/test_heating_systems_corpus.py \
+  backend/documents_parser/tests/test_elmhurst_extractor.py \
+  backend/documents_parser/tests/test_elmhurst_end_to_end.py \
+  domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
+  domain/sap10_calculator/worksheet/tests/test_heat_transmission.py \
+  domain/sap10_calculator/worksheet/tests/test_internal_gains.py \
+  domain/sap10_calculator/worksheet/tests/test_solar_gains.py \
+  domain/sap10_calculator/worksheet/tests/test_dimensions.py \
+  domain/sap10_calculator/worksheet/tests/test_rating.py \
+  domain/sap10_calculator/worksheet/tests/test_ventilation.py \
+  domain/sap10_calculator/worksheet/tests/test_appendix_h_solar.py \
+  domain/sap10_calculator/worksheet/tests/test_mev.py \
+  domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
+  domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
+  domain/sap10_calculator/tests/test_pcdb_table_322_lookup.py \
+  domain/sap10_calculator/tests/test_pcdb_table_329_lookup.py \
+  domain/sap10_calculator/tests/test_table_12a.py \
+  --no-cov -q
+```
+
+Expected: **896 pass, 0 fail.**
+
+## Memories to load (in order)
+
+```
+project-heating-systems-corpus          # HEAD d4f6ff0f
+feedback-sap-10-2-only-never-10-3       # CRITICAL — never reference SAP 10.3
+feedback-software-no-special-handling   # CRITICAL — apply spec uniformly, no empirical gates
+feedback-worksheet-not-api-reference
+feedback-spec-citation-in-commits
+feedback-verify-handover-claims
+feedback-zero-error-strict              # TARGET: ΔSAP_c < 1e-4 vs worksheet
+feedback-commit-per-slice
+feedback-aaa-test-convention
+feedback-e2e-validation-philosophy
+feedback-abs-diff-over-pytest-approx
+feedback-spec-floor-skepticism          # CUTS BOTH WAYS — be skeptical of your own narrative
+feedback-golden-residuals-near-zero
+feedback-one-e-minus-4-across-the-board
+reference-unmapped-sap-code
+reference-unmapped-api-code
+project-oil-price-spec-divergence
+```
+
+## What NOT to do
+
+- **Don't reference SAP 10.3** — track 10.2 deliberately
+- **Don't widen pin tolerances** — re-pin smaller or find the spec gap
+- **Don't add empirical gates** to keep cohort pins stable when a
+  spec rule clearly applies
+- **Don't re-investigate Slices .91..152** — all settled
+- **Don't add new helpers to `domain/sap10_ml/`** — on deprecation
+  path; `domain/sap10_calculator/tables/` is the canonical home
+- **Don't treat ΔSAP=0.07 as "closed"** — target is <1e-4 vs worksheet
+- **Don't form a spec hypothesis without per-line data** — walk the
+  worksheet line-by-line for the failing variant first, then look up
+  the spec rule. Headline residuals tell you a gap exists; only the
+  per-line walk tells you which section of the spec it lives in.
+
+## Spec source quick-reference
+
+All under `domain/sap10_calculator/docs/specs/`:
+
+- **SAP 10.2 full spec**: `sap-10-2-full-specification-2025-03-14.pdf`
+  - **§4** (p.135-137) — water heating worksheet (45..65)
+  - **§9** (p.155+) — MIT calc, Tables 9/9a/9b/9c
+  - **§9.4.11** (p.30) — Boiler interlock: -5pp to BOTH SH and DHW
+  - **§12** (p.45) — Electricity tariff types (7/10/18/24-hour rules)
+  - **§A.2.2** (~p.189) — Forced-secondary set
+  - **Appendix D §D2.1 (2)** (p.57) — Eq D1 monthly water eff cascade
+  - **Appendix F2** (p.63) — 18-hour CPSU: high rate for all other uses
+  - **Appendix N3** (p.107-109) — Heat pump DHW efficiency cascade
+  - **Table 3** (p.160) — Primary circuit loss; zero-loss list. **Slice .152**
+    extended this to all wet boilers + cylinder + WHC=901.
+  - **Table 4a** (p.163-170) — heating systems incl. R column
+  - **Table 4b** (p.168) — gas/liquid boilers seasonal efficiency
+  - **Table 4f** (p.174) — pumps + fans
+  - **Table 9c** (p.184) — MIT cascade (step 8 = Table 4e adj wired)
+  - **Table 11** (p.188) — secondary heating fraction
+  - **Table 12** (p.191) — SAP rating fuel prices + standing charges
+  - **Table 12a** (p.191) — high/low-rate fraction by system × tariff
+- **RdSAP 10 spec**: `RdSAP 10 Specification 10-06-2025.pdf`
+  - **§4.1 Table 5** (p.28) — Ventilation parameters incl. **extract fans
+    age-band default** (slice .151)
+  - **§5** (p.29) — Floor infiltration spec rule
+  - **§10.11 Table 29** (p.56) — Heating/HW parameters; inaccessible cylinder
+  - **§19 Table 32** (p.95) — RdSAP10 fuel prices / CO2 / PE
+
+## Good luck.
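
The handover's key discipline is to walk the cascade against the worksheet line by line and find which line ref diverges before reaching for a spec rule. A minimal sketch of that comparison follows, assuming both sides can be dumped as mappings from worksheet line ref to value. The function name `diverging_lines`, the mapping shape, and the `(45)` sample value are hypothetical and not taken from the repository; only the `(59)` pair mirrors the solid fuel 2 finding quoted in the handover (cascade 0 vs worksheet ~505 kWh/yr).

```python
from typing import Mapping


def diverging_lines(
    cascade: Mapping[str, float],
    worksheet: Mapping[str, float],
    tol: float = 1e-4,
) -> list[tuple[str, float, float]]:
    """Return (line_ref, cascade_value, worksheet_value) for every line ref
    present on both sides whose absolute difference exceeds tol.

    Rows come back in worksheet order, so the first row is the earliest
    line at which the cascade departs from the worksheet.
    """
    rows = []
    for ref, expected in worksheet.items():
        if ref not in cascade:
            continue  # line missing from the cascade dump; review separately
        actual = cascade[ref]
        if abs(actual - expected) > tol:
            rows.append((ref, actual, expected))
    return rows


# Illustrative values only: (45) is made up, (59) mirrors the SF2 finding.
worksheet = {"(45)": 2000.0, "(59)": 505.0}
cascade = {"(45)": 2000.0, "(59)": 0.0}

for ref, actual, expected in diverging_lines(cascade, worksheet):
    delta = actual - expected
    print(f"{ref}: cascade={actual:.2f} worksheet={expected:.2f} delta={delta:+.2f}")
```

The default tolerance matches the handover's stated 1e-4 target, and the comparison uses a plain absolute difference rather than a relative one so that small residuals are not masked on large line values.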