Three slices closed:
- S0380.150 18-hour tariff for pumps+lighting (§12 + App F2)
- S0380.151 RdSAP 10 §4.1 Table 5 extract-fans default
- S0380.152 Table 3 primary loss for solid-fuel back-boilers
Cluster A closed; Cluster B partial (SF3 done, SF2 partial); Cluster C open. Σ|ΔSAP| 14.5 → 6.4 across the 25 cascade-OK cohort variants.
Mid-session pivot documented: my Cluster B hypothesis was wrong (Table 9c step 12), the actual gap was Table 3 primary loss for solid-fuel boilers. Discipline added: dump per-line worksheet data before forming a spec hypothesis.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Handover — post Slices S0380.150..152
Branch: feature/per-cert-mapper-validation. HEAD d4f6ff0f.
Predecessor: HANDOVER_POST_S0380_149.md.
TL;DR
Three slices landed. Partway through, the session pivoted from incremental fixes to a spec-led cluster audit after the user pushed back that we were spinning wheels. The audit identified three distinct clusters: A is closed, B is partially closed, and C remains open.
| Slice | Commit | Spec rule closed |
|---|---|---|
| S0380.150 | a658f736 | SAP 10.2 §12 / Appendix F2, 18-hour tariff: pumps + lighting bill at 18-hour HIGH rate (13.67 p/kWh) not standard (13.19) |
| S0380.151 | fb173cdf | RdSAP 10 §4.1 Table 5: extract-fans age-band default (max(lodged, table_5_default)) |
| S0380.152 | d4f6ff0f | SAP 10.2 Table 3: primary loss for ANY wet boiler + cylinder + WHC=901 (not just Table 4b gas/oil) |
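A minimal sketch of the S0380.150 rule as a rate dispatch, for orientation only. It is not the repository's _other_fuel_cost_gbp_per_kwh: the Tariff.STANDARD member, the function names, and the flat fall-through to the standard rate are assumptions. Only the two rates (13.67 and 13.19 p/kWh) come from the slice.

```python
from enum import Enum


class Tariff(Enum):
    STANDARD = "standard"        # hypothetical member, for illustration
    EIGHTEEN_HOUR = "18-hour"


STANDARD_RATE_P_PER_KWH = 13.19
EIGHTEEN_HOUR_HIGH_RATE_P_PER_KWH = 13.67


def other_uses_rate_p_per_kwh(tariff: Tariff) -> float:
    """Rate for non-heating uses (pumps, fans, lighting) in pence per kWh."""
    if tariff is Tariff.EIGHTEEN_HOUR:
        # On the 18-hour tariff all other uses bill at the high rate.
        return EIGHTEEN_HOUR_HIGH_RATE_P_PER_KWH
    return STANDARD_RATE_P_PER_KWH


def other_uses_cost_gbp(kwh: float, tariff: Tariff) -> float:
    """Annual cost in pounds for the given non-heating electricity use."""
    return kwh * other_uses_rate_p_per_kwh(tariff) / 100.0
```

The per-kWh difference is 0.48 p, so the cost shift scales linearly with each variant's pumps + lighting kWh.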
Extended handover suite at HEAD: 896 pass, 0 fail. Pyright net-zero (43 → 43).
The mid-session pivot — read this before doing anything
The user explicitly called out "spinning wheels" partway through. I'd shipped S0380.150 (18-hour tariff fix) which closed ~£2/variant uniformly across the cohort, but several variants got worse. The user asked for a spec-led picture of where the actual gaps were across the open variants, not more incremental fixes.
The audit produced this categorisation:
Cluster A — cohort-wide systematic ~-1.2% SH USEFUL kWh deficit
across 18 of 25 variants. Same property, same magnitude on every
variant. Root cause: RdSAP 10 Table 5 extract-fans default missing
(lodged 0 was being trusted verbatim instead of max(lodged, default)).
Closed in S0380.151.
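The Cluster A fix boils down to a floor on the lodged count. A minimal sketch, assuming the Table 5 default has already been looked up by the caller; the function name is hypothetical and the real wiring lives in ventilation_from_cert:

```python
from typing import Optional


def effective_extract_fans(lodged: Optional[int], table_5_default: int) -> int:
    """Never let a lodged (or missing) count undercut the age-band default."""
    # A lodged 0 or None is treated as "unknown", so the default wins.
    return max(lodged or 0, table_5_default)
```

A lodged value above the default still passes through unchanged, which is why variants that lodged a real fan count did not move.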
Cluster B — three variants overshoot by +2.3% (solid fuel 2/3, electric 5). My audit hypothesised this was a Table 9c step 12 sign convention for low-R systems. This was wrong. When I probed solid fuel 2's monthly MIT, it was actually 0.035°C LOWER than the worksheet (not higher), yet had MORE SH demand. The decomposition showed the entire 73 W gain gap was in (72) water-heating gains — because cascade (59) primary loss was 0 while worksheet was ~505 kWh/yr. Partially closed in S0380.152 — SF3 fully (+1.31 → +0.30), SF2 partially (+2.77 → +2.06).
Cluster C — HW kWh mismatch on 4 specific variants (gshp, electric 2, solid fuel 2/3). Different spec rules per variant.
The audit doc lives at the top of the conversation. The key discipline: don't form a spec hypothesis from headline residuals; walk the per-line cascade against the worksheet PDF, find which line ref diverges, then look up the spec rule that produces that line. My Cluster B hypothesis didn't survive contact with the data — see feedback-spec-floor-skepticism for the discipline that cuts both ways.
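A rough sketch of the per-line walk, assuming both sides can be dumped as line-ref → value mappings in worksheet order. The function name and the (45) value are illustrative placeholders; the (59) pair mirrors the solid fuel 2 finding (cascade 0 vs worksheet ~505 kWh/yr).

```python
from typing import Dict, List, Optional, Tuple


def diverging_lines(
    worksheet: Dict[str, float],
    cascade: Dict[str, float],
    tol: float = 1e-4,
) -> List[Tuple[str, float, Optional[float]]]:
    """Return (line_ref, worksheet_value, cascade_value) for every line
    that is missing from the cascade or differs by more than tol,
    in worksheet order."""
    out: List[Tuple[str, float, Optional[float]]] = []
    for ref, expected in worksheet.items():
        actual = cascade.get(ref)
        if actual is None or abs(actual - expected) > tol:
            out.append((ref, expected, actual))
    return out


# Illustrative data only: (45) is a made-up matching line.
worksheet_lines = {"(45)": 2000.0, "(59)": 505.0}
cascade_lines = {"(45)": 2000.0, "(59)": 0.0}
first_gap = diverging_lines(worksheet_lines, cascade_lines)[0]
```

The first entry is the line ref to take to the spec; everything downstream of it is a consequence, not a separate gap.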
Current residual state at HEAD d4f6ff0f
Cascade-OK tier (25 variants on pin grid)
Sorted by |ΔSAP_c|:
| Variant | ΔSAP_c | Δcost | ΔPE | Cluster | Notes |
|---|---|---|---|---|---|
| oil 1 | +0.0000 | +0.0000 | +0.0000 | — | EXACT |
| oil pcdb 1/2 | +0.0000 | +0.0000 | +0.0000 | — | EXACT |
| oil pcdb 3 | +0.0000 | +0.0000 | -0.0000 | — | EXACT |
| electric 1 | -0.0000 | -0.0000 | +48.66 | — | SAP exact, PE +49 kWh follow-up |
| solid fuel 5 | +0.0000 | +0.0000 | +48.66 | — | SAP exact |
| solid fuel 6 | +0.0000 | +0.0000 | +48.66 | — | SAP exact |
| solid fuel 7 | -0.0000 | +0.0000 | +48.66 | — | SAP exact |
| solid fuel 8 | -0.0000 | +0.0000 | +48.66 | — | SAP exact |
| pcdb 1 | -0.0108 | +£0.24 | +5.70 | — | basically exact |
| ashp | -0.024 | +£0.55 | +36.34 | — | basically exact |
| solid fuel 4 | +0.085 | -£1.96 | -5.78 | — | close |
| solid fuel 11 | +0.0912 | -£2.10 | -0.74 | — | close |
| electric 8 | +0.0941 | -£2.17 | +6.58 | — | close |
| electric 7 | +0.1017 | -£2.34 | +3.10 | — | close |
| solid fuel 9 | +0.1072 | -£2.47 | -5.07 | — | close |
| electric 6 | +0.1081 | -£2.49 | +0.16 | — | close |
| solid fuel 10 | +0.1134 | -£2.61 | -13.91 | — | close |
| electric 9 | +0.1199 | -£2.76 | -4.51 | — | close |
| electric 3 | +0.1215 | -£2.80 | -5.99 | — | close |
| solid fuel 3 | +0.2968 | -£6.84 | -214.25 | B (~done) | closed by .152 |
| electric 2 | -0.4584 | +£10.56 | +443.13 | C | warm-air ASHP HW cascade |
| gshp | +0.9373 | -£21.60 | -418.92 | C | HP DHW Appendix N3 |
| electric 5 | -1.1759 | +£27.09 | +438.03 | B (open) | storage code 402, R=0.40 — distinct cause |
| solid fuel 2 | +2.0649 | -£47.58 | -754.09 | B (partial) | needs _separately_timed_dhw=False |
Σ |ΔSAP_c| across 25 variants ≈ 6.4 SAP points (down from ~14.5 pre-session, a ~55% reduction across 3 slices).
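In the spirit of feedback-verify-handover-claims, the headline can be re-derived from the table. The sketch below simply sums the ΔSAP_c column as printed (the "oil pcdb 1/2" row stands for two exact variants). On these rows the total comes to about 5.91, slightly below the quoted ≈ 6.4, so one of the two may be stale; re-run against the live pins before relying on either.

```python
# ΔSAP_c per variant, copied from the table above.
DELTA_SAP_C = {
    "oil 1": 0.0, "oil pcdb 1/2": 0.0, "oil pcdb 3": 0.0,
    "electric 1": 0.0, "solid fuel 5": 0.0, "solid fuel 6": 0.0,
    "solid fuel 7": 0.0, "solid fuel 8": 0.0,
    "pcdb 1": -0.0108, "ashp": -0.024, "solid fuel 4": 0.085,
    "solid fuel 11": 0.0912, "electric 8": 0.0941, "electric 7": 0.1017,
    "electric 6": 0.1081, "solid fuel 9": 0.1072, "solid fuel 10": 0.1134,
    "electric 9": 0.1199, "electric 3": 0.1215, "solid fuel 3": 0.2968,
    "electric 2": -0.4584, "gshp": 0.9373, "electric 5": -1.1759,
    "solid fuel 2": 2.0649,
}

total_abs = sum(abs(v) for v in DELTA_SAP_C.values())

# Share of the total carried by the four largest residuals.
top4 = sorted((abs(v) for v in DELTA_SAP_C.values()), reverse=True)[:4]
top4_share = sum(top4) / total_abs
```

The four largest residuals (solid fuel 2, electric 5, gshp, electric 2) carry roughly 78% of the total, which is why the open fronts are ranked the way they are.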
Blocked tier (16 variants — MissingMainFuelType)
Unchanged. Community heating × 5, electric storage 11-14, no system, oil 2-6, pcdb 3.
Open fronts ranked by leverage
1. SF2 separately-timed-DHW for solid-fuel back-boilers — +2.06 SAP
The cascade post-S0380.152 applies primary loss year-round (h=3 winter / h=3 summer via _separately_timed_dhw=True). Worksheet applies winter-only (h=5 winter / 0 summer). Daily-rate diff = the ENTIRE remaining SF2 residual.
Spec hint: _separately_timed_dhw at line 3765 currently returns True for cylinder + non-electric HW fuel. For solid-fuel back-boilers the HW timing is tied to the room fire (no separate programmer), so the cascade should return False here, switching the formula to h=5 winter / h=3 summer. That still leaves the summer-zero question: the worksheet shows 0 in summer rather than h=3, which possibly points to a separate rule for "back-boiler doesn't run in summer".
Compare SF2 to SF3 (both code 158/160 + WHC=901): SF3 has Jun-Sep non-zero (~42 kWh/month) while SF2 has Jun-Sep = 0. Same property, same boiler type. Probably a lodging difference (cylinder thermostat or DHW timing). Worth a 30-min probe before coding.
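A quick sanity sketch of the Table 3 arithmetic behind these numbers. The coefficients are the primary-loss formula as I recall it from Table 3 (verify against p.160 before relying on them); insulated_fraction=0 (uninsulated primary pipework) and the Jun-Sep summer set are assumptions, and the function names are hypothetical. Under those assumptions h=3 over a 30-day month gives ~41.9 kWh, in line with SF3's ~42 kWh/month, and h=5 winter with zero summer gives ~506 kWh/yr, close to the ~505 seen on the SF2 worksheet.

```python
DAYS_IN_MONTH = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
SUMMER_MONTHS = frozenset({6, 7, 8, 9})  # Jun-Sep (assumed summer set)


def primary_loss_kwh(days: int, hours_per_day: float,
                     insulated_fraction: float = 0.0) -> float:
    """Primary circuit loss for a period of `days` days, in kWh."""
    p = insulated_fraction
    return days * 14.0 * (
        (0.0091 * p + 0.0245 * (1.0 - p)) * hours_per_day + 0.0263
    )


def annual_primary_loss_kwh(winter_hours: float, summer_hours: float,
                            insulated_fraction: float = 0.0,
                            summer_off: bool = False) -> float:
    """Sum monthly losses; summer_off drops Jun-Sep entirely."""
    total = 0.0
    for month, days in enumerate(DAYS_IN_MONTH, start=1):
        if month in SUMMER_MONTHS:
            if summer_off:
                continue
            total += primary_loss_kwh(days, summer_hours, insulated_fraction)
        else:
            total += primary_loss_kwh(days, winter_hours, insulated_fraction)
    return total


sf3_summer_month = primary_loss_kwh(30, 3.0)                 # ~41.9
cascade_year_round = annual_primary_loss_kwh(3.0, 3.0)       # ~510
worksheet_winter_only = annual_primary_loss_kwh(5.0, 0.0, summer_off=True)  # ~506
```

Under these assumptions the two patterns land within a few kWh of each other annually, so the difference to chase is the monthly distribution (winter daily rate and summer zero), not the yearly total.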
2. Cluster C — gshp HW cascade — +0.94 SAP / -419 PE
Cascade HW = 841 kWh vs worksheet 1138 kWh — under by 26%. Spec: SAP 10.2 Appendix N3.6 / N3.7 (PDF p.107-109) — HP DHW efficiency cascade. The current cascade may be applying the wrong in-use factor (Table N8) or PSR interpolation. Cohort-1 ASHP closed via Appendix N N3.6 reciprocal interpolation in S0380.28 — the gshp fix may share a path.
3. Cluster C — electric 2 (warm-air HP) HW cascade — -0.46 SAP / +443 PE
Cascade HW = 2849 kWh vs worksheet 2384 = OVER by 19%. Different direction from gshp. Code 524 (warm-air ASHP). Probably wrong water_heating efficiency dispatch.
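The two Cluster C percentages, reproduced from the quoted kWh figures so the sign convention is unambiguous (negative = cascade under worksheet). The helper name is hypothetical.

```python
def relative_gap(cascade_kwh: float, worksheet_kwh: float) -> float:
    """Signed cascade deviation as a fraction of the worksheet value."""
    return (cascade_kwh - worksheet_kwh) / worksheet_kwh


gshp_gap = relative_gap(841.0, 1138.0)         # about -0.261
electric_2_gap = relative_gap(2849.0, 2384.0)  # about +0.195
```

Opposite signs on the two heat-pump variants are the reason they are listed as separate fronts rather than one shared fix.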
4. electric 5 — -1.18 SAP / +438 PE
Storage heater code 402 (R=0.40, +0.4 K Table 4e adjustment). Worsened by S0380.145 (it had been net-zero from offsetting bugs) and by S0380.150 (lighting now correctly billed). Cascade SH USEFUL was +196 kWh OVER worksheet pre-cluster-A. With Cluster A and the secondary cascade fixes in place, what remains is the real spec gap. Need to probe the MIT cascade for electric 5 specifically.
5. Lighting-only PE +48.66 cohort cluster — 5 variants
Variants where SAP / cost are EXACT but PE is +48.66 kWh/yr (and CO2 +11.94 kg/yr). Identical offset across electric 1, solid fuel 5/6/7/8. This is suspicious — same exact value. Probably a Table 12e PE factor mismatch on the added extract fan kWh.
Diagnostic: 48.66 / (10 m³/h × something) = ? — back-solve for the
per-kWh PE factor diff. Then check _pumps_fans_pe_factor.
Slice history (this session)
| Slice | HEAD | Scope |
|---|---|---|
| S0380.150 | a658f736 | SAP 10.2 §12 (p.45) + Appendix F2 (p.63): 18-hour tariff non-heating uses bill at 18-hour high rate (13.67 not 13.19 p/kWh). New _other_fuel_cost_gbp_per_kwh branch for Tariff.EIGHTEEN_HOUR returning the Table 32 code 38 high rate. Closures: oil 1 -£9.31→-£6.69, all 25 variants shift £1.35-£2.62. |
| S0380.151 | fb173cdf | RdSAP 10 §4.1 Table 5 (PDF p.28): extract-fans default when lodged is unknown/zero. New _rdsap_extract_fans_default(age_band, habitable_rooms, *, is_park_home) helper + max(lodged, default) wiring in ventilation_from_cert. Cohort: 8 variants → EXACT, 11 → ±0.02-0.12. Golden cert 0240 PE +2.18→+5.80, cert 0390-2954 PE -28.27→-27.97. |
| S0380.152 | d4f6ff0f | SAP 10.2 Table 3 (PDF p.160): primary circuit loss applies to ANY heat generator + cylinder via primary pipework, not just Table 4b. _primary_loss_applies(...) gains optional water_heating_code parameter + new branch using _is_wet_boiler_main(main) + WHC ∈ {901, 902, 914}. Closures: solid fuel 3 +1.31→+0.30, solid fuel 2 +2.77→+2.06 (partial; needs separately-timed-DHW fix). |
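For orientation, a simplified sketch of the new S0380.152 branch only. The real _primary_loss_applies(...) has other branches (including the existing Table 4b path) that are omitted here; the keyword names and the standalone function are assumptions, and only the WHC set {901, 902, 914} and the wet-boiler condition come from the slice.

```python
from typing import Optional

# Water-heating codes named in the slice.
PRIMARY_LOSS_WATER_HEATING_CODES = frozenset({901, 902, 914})


def primary_loss_applies_new_branch(
    *,
    has_cylinder: bool,
    is_wet_boiler_main: bool,
    water_heating_code: Optional[int] = None,
) -> bool:
    """True when a wet boiler feeds a cylinder through primary pipework."""
    if not has_cylinder:
        # No cylinder means no primary circuit to lose heat from.
        return False
    return (
        is_wet_boiler_main
        and water_heating_code in PRIMARY_LOSS_WATER_HEATING_CODES
    )
```

Keeping water_heating_code optional means existing callers that never pass it fall through to False on this branch and keep their previous behaviour.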
Standard slice workflow (unchanged)
- Read spec page + identify rule
- Probe one cluster variant; verify diagnosis via monkey-patch / direct walk
- Write failing AAA test (literal # Arrange / # Act / # Assert)
- Implement helper / dispatch entry / mapper extension
- Re-pin affected variants (DO NOT widen tolerance)
- Run extended handover suite (command below)
- Pyright net-zero check (git stash → pyright → git stash pop → pyright)
- Commit with spec citation + Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Update project-heating-systems-corpus + MEMORY.md index
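A skeleton of what a pin test in this workflow can look like, combining the literal AAA markers with a plain abs-diff assertion at the 1e-4 target. The cascade_sap_for_variant stand-in and the 61.2345 value are placeholders, not real pins; in the repo the Act step would call the actual cascade.

```python
TOLERANCE = 1e-4  # target: |ΔSAP_c| < 1e-4 vs worksheet


def cascade_sap_for_variant(name: str) -> float:
    """Hypothetical stand-in for the real cascade entry point."""
    return {"oil 1": 61.2345}[name]  # placeholder value


def test_oil_1_sap_matches_worksheet() -> None:
    # Arrange
    worksheet_sap = 61.2345  # placeholder for the worksheet PDF value

    # Act
    cascade_sap = cascade_sap_for_variant("oil 1")

    # Assert
    assert abs(cascade_sap - worksheet_sap) < TOLERANCE
```

The explicit abs() comparison keeps the tolerance visible at the assertion site, so a widened pin shows up in review as a changed constant.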
Bonus discipline from this session: when forming a spec hypothesis, dump the per-line worksheet values for the variant and walk them against the cascade output BEFORE writing the slice. My Cluster B narrative had the wrong spec section entirely — what looked like Table 9c was Table 3. The data caught it; the audit narrative didn't.
Test baseline at HEAD d4f6ff0f
PYTHONPATH=/workspaces/model python -m pytest \
backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
backend/documents_parser/tests/test_heating_systems_corpus.py \
backend/documents_parser/tests/test_elmhurst_extractor.py \
backend/documents_parser/tests/test_elmhurst_end_to_end.py \
domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
domain/sap10_calculator/worksheet/tests/test_heat_transmission.py \
domain/sap10_calculator/worksheet/tests/test_internal_gains.py \
domain/sap10_calculator/worksheet/tests/test_solar_gains.py \
domain/sap10_calculator/worksheet/tests/test_dimensions.py \
domain/sap10_calculator/worksheet/tests/test_rating.py \
domain/sap10_calculator/worksheet/tests/test_ventilation.py \
domain/sap10_calculator/worksheet/tests/test_appendix_h_solar.py \
domain/sap10_calculator/worksheet/tests/test_mev.py \
domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
domain/sap10_calculator/tests/test_pcdb_table_322_lookup.py \
domain/sap10_calculator/tests/test_pcdb_table_329_lookup.py \
domain/sap10_calculator/tests/test_table_12a.py \
--no-cov -q
Expected: 896 pass, 0 fail.
Memories to load (in order)
project-heating-systems-corpus # HEAD d4f6ff0f
feedback-sap-10-2-only-never-10-3 # CRITICAL — never reference SAP 10.3
feedback-software-no-special-handling # CRITICAL — apply spec uniformly, no empirical gates
feedback-worksheet-not-api-reference
feedback-spec-citation-in-commits
feedback-verify-handover-claims
feedback-zero-error-strict # TARGET: ΔSAP_c < 1e-4 vs worksheet
feedback-commit-per-slice
feedback-aaa-test-convention
feedback-e2e-validation-philosophy
feedback-abs-diff-over-pytest-approx
feedback-spec-floor-skepticism # CUTS BOTH WAYS — be skeptical of your own narrative
feedback-golden-residuals-near-zero
feedback-one-e-minus-4-across-the-board
reference-unmapped-sap-code
reference-unmapped-api-code
project-oil-price-spec-divergence
What NOT to do
- Don't reference SAP 10.3 — track 10.2 deliberately
- Don't widen pin tolerances — re-pin smaller or find the spec gap
- Don't add empirical gates to keep cohort pins stable when a spec rule clearly applies
- Don't re-investigate Slices .91..152 — all settled
- Don't add new helpers to domain/sap10_ml/ (on deprecation path); domain/sap10_calculator/tables/ is the canonical home
- Don't treat ΔSAP=0.07 as "closed": target is <1e-4 vs worksheet
- Don't form a spec hypothesis without per-line data — walk the worksheet line-by-line for the failing variant first, then look up the spec rule. Headline residuals tell you a gap exists; only the per-line walk tells you which section of the spec it lives in.
Spec source quick-reference
All under domain/sap10_calculator/docs/specs/:
- SAP 10.2 full spec: sap-10-2-full-specification-2025-03-14.pdf
  - §4 (p.135-137): water heating worksheet (45..65)
  - §9 (p.155+): MIT calc, Tables 9/9a/9b/9c
  - §9.4.11 (p.30): boiler interlock, -5pp to BOTH SH and DHW
  - §12 (p.45): electricity tariff types (7/10/18/24-hour rules)
  - §A.2.2 (~p.189): forced-secondary set
  - Appendix D §D2.1 (2) (p.57): Eq D1 monthly water eff cascade
  - Appendix F2 (p.63): 18-hour CPSU, high rate for all other uses
  - Appendix N3 (p.107-109): heat pump DHW efficiency cascade
  - Table 3 (p.160): primary circuit loss; zero-loss list. Slice .152 extended this to all wet boilers + cylinder + WHC=901.
  - Table 4a (p.163-170): heating systems incl. R column
  - Table 4b (p.168): gas/liquid boilers seasonal efficiency
  - Table 4f (p.174): pumps + fans
  - Table 9c (p.184): MIT cascade (step 8 = Table 4e adj wired)
  - Table 11 (p.188): secondary heating fraction
  - Table 12 (p.191): SAP rating fuel prices + standing charges
  - Table 12a (p.191): high/low-rate fraction by system × tariff
- RdSAP 10 spec: RdSAP 10 Specification 10-06-2025.pdf
  - §4.1 Table 5 (p.28): ventilation parameters incl. extract fans age-band default (slice .151)
  - §5 (p.29): floor infiltration spec rule
  - §10.11 Table 29 (p.56): heating/HW parameters; inaccessible cylinder
  - §19 Table 32 (p.95): RdSAP10 fuel prices / CO2 / PE