Model/domain/sap10_calculator/docs/HANDOVER_POST_S0380_152.md
Khalim Conn-Kowlessar 3a44ca89fb docs: handover post S0380.150..152
Three slices closed:
- S0380.150 18-hour tariff for pumps+lighting (§12 + App F2)
- S0380.151 RdSAP 10 §4.1 Table 5 extract-fans default
- S0380.152 Table 3 primary loss for solid-fuel back-boilers

Cluster A closed; Cluster B partial (SF3 done, SF2 partial); Cluster
C open. Σ|ΔSAP| 14.5 → 6.4 across the 25 cascade-OK cohort variants.

Mid-session pivot documented: my Cluster B hypothesis was wrong
(Table 9c step 12), the actual gap was Table 3 primary loss for
solid-fuel boilers. Discipline added: dump per-line worksheet data
before forming a spec hypothesis.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-01 13:03:55 +00:00


Handover — post Slices S0380.150..152

Branch: feature/per-cert-mapper-validation. HEAD d4f6ff0f. Predecessor: HANDOVER_POST_S0380_149.md.

TL;DR

Three slices landed. The session pivoted partway through from incremental fixes to a spec-led cluster audit (the user pushed back that we were spinning wheels). The audit identified three distinct clusters; two were closed.

| Slice | Commit | Spec rule closed |
|---|---|---|
| S0380.150 | a658f736 | SAP 10.2 §12 / Appendix F2, 18-hour tariff: pumps + lighting bill at 18-hour HIGH rate (13.67 p/kWh) not standard (13.19) |
| S0380.151 | fb173cdf | RdSAP 10 §4.1 Table 5: extract-fans age-band default (max(lodged, table_5_default)) |
| S0380.152 | d4f6ff0f | SAP 10.2 Table 3: primary loss for ANY wet boiler + cylinder + WHC=901 (not just Table 4b gas/oil) |

Extended handover suite at HEAD: 896 pass, 0 fail. Pyright net-zero (43 → 43).

The mid-session pivot — read this before doing anything

The user explicitly called out "spinning wheels" partway through. I'd shipped S0380.150 (18-hour tariff fix) which closed ~£2/variant uniformly across the cohort, but several variants got worse. The user asked for a spec-led picture of where the actual gaps were across the open variants, not more incremental fixes.

The audit produced this categorisation:

Cluster A — cohort-wide systematic ~-1.2% SH USEFUL kWh deficit across 18 of 25 variants. Same property, same magnitude on every variant. Root cause: RdSAP 10 Table 5 extract-fans default missing (lodged 0 was being trusted verbatim instead of max(lodged, default)). Closed in S0380.151.

Cluster B — three variants overshoot by +2.3% (solid fuel 2/3, electric 5). My audit hypothesised this was a Table 9c step 12 sign convention for low-R systems. This was wrong. When I probed solid fuel 2's monthly MIT, it was actually 0.035°C LOWER than the worksheet (not higher), yet had MORE SH demand. The decomposition showed the entire 73 W gain gap was in (72) water-heating gains — because cascade (59) primary loss was 0 while worksheet was ~505 kWh/yr. Partially closed in S0380.152 — SF3 fully (+1.31 → +0.30), SF2 partially (+2.77 → +2.06).

Cluster C — HW kWh mismatch on 4 specific variants (gshp, electric 2, solid fuel 2/3). Different spec rules per variant.

The audit doc lives at the top of the conversation. The key discipline: don't form a spec hypothesis from headline residuals; walk the per-line cascade against the worksheet PDF, find which line ref diverges, then look up the spec rule that produces that line. My Cluster B hypothesis didn't survive contact with the data — see feedback-spec-floor-skepticism for the discipline that cuts both ways.
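
The per-line walk can be sketched as a small diff over worksheet line refs. This is a minimal illustration, not the project's tooling: `first_divergences`, the dict-of-line-refs shape, and the sample numbers are all hypothetical, chosen only to mirror the Cluster B finding where line (59) was 0 in the cascade and about 505 kWh/yr in the worksheet.

```python
from typing import Mapping


def first_divergences(
    cascade: Mapping[str, float],
    worksheet: Mapping[str, float],
    *,
    abs_tol: float = 1e-4,
) -> list[tuple[str, float, float, float]]:
    """Return (line_ref, cascade, worksheet, delta) for each diverging line.

    Rows come back in worksheet order, so the first row is the earliest
    line ref that disagrees. Line refs missing from the cascade are
    treated as 0.0 so that they still show up in the report.
    """
    rows = []
    for ref, expected in worksheet.items():
        actual = cascade.get(ref, 0.0)
        delta = actual - expected
        if abs(delta) > abs_tol:
            rows.append((ref, actual, expected, delta))
    return rows


# Hypothetical annual values keyed by worksheet line ref (illustrative only).
worksheet_lines = {"(45)": 2100.0, "(59)": 505.0, "(64)": 2900.0}
cascade_lines = {"(45)": 2100.0, "(59)": 0.0, "(64)": 2395.0}

for ref, actual, expected, delta in first_divergences(cascade_lines, worksheet_lines):
    print(f"{ref}: cascade={actual:.2f} worksheet={expected:.2f} delta={delta:+.2f}")
```

Reading the report top-down points at the earliest diverging line ref, which is the one to take to the spec; later lines often differ only because they inherit the first gap.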

Current residual state at HEAD d4f6ff0f

Cascade-OK tier (25 variants on pin grid)

Sorted by |ΔSAP_c|:

| Variant | ΔSAP_c | Δcost | ΔPE | Cluster | Notes |
|---|---|---|---|---|---|
| oil 1 | +0.0000 | +0.0000 | +0.0000 | | EXACT |
| oil pcdb 1/2 | +0.0000 | +0.0000 | +0.0000 | | EXACT |
| oil pcdb 3 | +0.0000 | +0.0000 | -0.0000 | | EXACT |
| electric 1 | -0.0000 | -0.0000 | +48.66 | | SAP exact, PE +49 kWh follow-up |
| solid fuel 5 | +0.0000 | +0.0000 | +48.66 | | SAP exact |
| solid fuel 6 | +0.0000 | +0.0000 | +48.66 | | SAP exact |
| solid fuel 7 | -0.0000 | +0.0000 | +48.66 | | SAP exact |
| solid fuel 8 | -0.0000 | +0.0000 | +48.66 | | SAP exact |
| pcdb 1 | -0.0108 | +£0.24 | +5.70 | | basically exact |
| ashp | -0.024 | +£0.55 | +36.34 | | basically exact |
| solid fuel 4 | +0.085 | -£1.96 | -5.78 | | close |
| solid fuel 11 | +0.0912 | -£2.10 | -0.74 | | close |
| electric 8 | +0.0941 | -£2.17 | +6.58 | | close |
| electric 7 | +0.1017 | -£2.34 | +3.10 | | close |
| solid fuel 9 | +0.1072 | -£2.47 | -5.07 | | close |
| electric 6 | +0.1081 | -£2.49 | +0.16 | | close |
| solid fuel 10 | +0.1134 | -£2.61 | -13.91 | | close |
| electric 9 | +0.1199 | -£2.76 | -4.51 | | close |
| electric 3 | +0.1215 | -£2.80 | -5.99 | | close |
| solid fuel 3 | +0.2968 | -£6.84 | -214.25 | B (~done) | closed by .152 |
| electric 2 | -0.4584 | +£10.56 | +443.13 | C | warm-air ASHP HW cascade |
| gshp | +0.9373 | -£21.60 | -418.92 | C | HP DHW Appendix N3 |
| electric 5 | -1.1759 | +£27.09 | +438.03 | B (open) | storage code 402, R=0.40; distinct cause |
| solid fuel 2 | +2.0649 | -£47.58 | -754.09 | B (partial) | needs _separately_timed_dhw=False |

Σ|ΔSAP_c| across the 25 variants ≈ 6.4 SAP points (was ~14.5 pre-session, a ~55% reduction across 3 slices).
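
The headline sum is easy to recompute from the table above, and doing so also shows how concentrated the residual is. The snippet below is a sketch that hard-codes the printed |ΔSAP_c| values (the dict name and layout are mine). Summing the rows exactly as printed gives about 5.91, a little under the ≈6.4 quoted, so the headline figure is worth re-deriving from the live pins rather than from this table.

```python
# |ΔSAP_c| per variant as printed in the residual table. The nine EXACT /
# SAP-exact rows contribute 0.0 and are omitted.
abs_delta_sap = {
    "pcdb 1": 0.0108,
    "ashp": 0.024,
    "solid fuel 4": 0.085,
    "solid fuel 11": 0.0912,
    "electric 8": 0.0941,
    "electric 7": 0.1017,
    "solid fuel 9": 0.1072,
    "electric 6": 0.1081,
    "solid fuel 10": 0.1134,
    "electric 9": 0.1199,
    "electric 3": 0.1215,
    "solid fuel 3": 0.2968,
    "electric 2": 0.4584,
    "gshp": 0.9373,
    "electric 5": 1.1759,
    "solid fuel 2": 2.0649,
}

total = sum(abs_delta_sap.values())

# The four largest residuals (the open Cluster B / C variants).
worst_four = sorted(abs_delta_sap.items(), key=lambda kv: kv[1], reverse=True)[:4]
worst_four_total = sum(value for _, value in worst_four)

print(f"sum |dSAP_c| = {total:.4f}")
print(f"top four     = {worst_four_total:.4f} ({worst_four_total / total:.0%} of total)")
for name, value in worst_four:
    print(f"  {name}: {value:.4f}")
```

On these numbers the four open Cluster B / C variants account for most of the remaining total, which is consistent with the ranking of open fronts below.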

Blocked tier (16 variants — MissingMainFuelType)

Unchanged. Community heating × 5, electric storage 11-14, no system, oil 2-6, pcdb 3.

Open fronts ranked by leverage

1. SF2 separately-timed-DHW for solid-fuel back-boilers — +2.06 SAP

The cascade post-S0380.152 applies primary loss year-round (h=3 winter / h=3 summer via _separately_timed_dhw=True). Worksheet applies winter-only (h=5 winter / 0 summer). Daily-rate diff = the ENTIRE remaining SF2 residual.

Spec hint: _separately_timed_dhw at line 3765 currently returns True for cylinder + non-electric HW fuel. For solid-fuel back-boilers the HW timing is tied to the room fire (no separate programmer), so the cascade should return False here, switching the hours to (h=5 winter, h=3 summer). That still leaves the summer-zero question: possibly a separate rule for "back-boiler doesn't run in summer".

Compare SF2 to SF3 (both code 158/160 + WHC=901): SF3 has Jun-Sep non-zero (~42 kWh/month) while SF2 has Jun-Sep = 0. Same property, same boiler type. Probably a lodging difference (cylinder thermostat or DHW timing). Worth a 30-min probe before coding.
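
To size the timing question before the probe, the Table 3 daily-loss expression can be evaluated for the h values mentioned above. The sketch below is written from memory of the Table 3 formula (coefficients 0.0091, 0.0245, 0.0263 and the factor 14), so treat every constant as an assumption to be checked against PDF p.160. `primary_loss_kwh`, `annual_primary_loss_kwh`, and the fully uninsulated pipework fraction (p = 0) are likewise illustrative choices, not values read from the certs.

```python
from typing import Optional

DAYS_IN_MONTH = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)  # Jan..Dec
SUMMER_MONTHS = {5, 6, 7, 8}  # zero-based indices for Jun..Sep


def primary_loss_kwh(
    days: int, hours_per_day: float, insulated_fraction: float = 0.0
) -> float:
    """Primary circuit loss for a period (ASSUMED Table 3 form, verify vs spec)."""
    p = insulated_fraction
    per_day = 14.0 * ((0.0091 * p + 0.0245 * (1.0 - p)) * hours_per_day + 0.0263)
    return days * per_day


def annual_primary_loss_kwh(winter_h: float, summer_h: Optional[float]) -> float:
    """Sum monthly losses; summer_h=None models 'no primary loss in summer'."""
    total = 0.0
    for month, days in enumerate(DAYS_IN_MONTH):
        if month in SUMMER_MONTHS:
            if summer_h is not None:
                total += primary_loss_kwh(days, summer_h)
        else:
            total += primary_loss_kwh(days, winter_h)
    return total


print(f"June at h=3:         {primary_loss_kwh(30, 3):.1f} kWh")
print(f"h=3 / h=3 (current): {annual_primary_loss_kwh(3, 3):.1f} kWh/yr")
print(f"h=5 / h=3:           {annual_primary_loss_kwh(5, 3):.1f} kWh/yr")
print(f"h=5 / none:          {annual_primary_loss_kwh(5, None):.1f} kWh/yr")
```

Under those assumptions a 30-day month at h=3 comes to about 42 kWh and a winter-only h=5 profile to about 506 kWh/yr, in line with the SF3 monthly figure and the ~505 kWh/yr worksheet value quoted earlier. That supports reading the SF2 worksheet as h=5 in winter with no summer loss.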

2. Cluster C — gshp HW cascade — +0.94 SAP / -419 PE

Cascade HW = 841 kWh vs worksheet 1138 kWh, under by 26%. Spec: SAP 10.2 Appendix N3.6 / N3.7 (PDF p.107-109), the HP DHW efficiency cascade. The current cascade may be applying the wrong in-use factor (Table N8) or PSR interpolation. Cohort-1 ASHP was closed via the Appendix N3.6 reciprocal interpolation in S0380.28, so the gshp fix may share a path.

3. Cluster C — electric 2 (warm-air HP) HW cascade — -0.46 SAP / +443 PE

Cascade HW = 2849 kWh vs worksheet 2384 = OVER by 19%. Different direction from gshp. Code 524 (warm-air ASHP). Probably wrong water_heating efficiency dispatch.

4. electric 5 — -1.18 SAP / +438 PE

Storage heater code 402 (R=0.40, +0.4 K Table 4e adjustment). Worsened by S0380.145 (then was net-zero from offsetting bugs) and by S0380.151 (lighting now correctly billed). Cascade SH USEFUL was +196 kWh OVER worksheet pre-cluster-A. After Cluster A and now the secondary cascade fixes, the residual is the real spec gap. Need to probe MIT cascade for electric 5 specifically.

5. Lighting-only PE +48.66 cohort cluster — 5 variants

Variants where SAP / cost are EXACT but PE is +48.66 kWh/yr (and CO2 +11.94 kg/yr). Identical offset across electric 1, solid fuel 5/6/7/8. This is suspicious — same exact value. Probably a Table 12e PE factor mismatch on the added extract fan kWh.

Diagnostic: 48.66 / (10 m³/h × something) = ? — back-solve for the per-kWh PE factor diff. Then check _pumps_fans_pe_factor.
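
The back-solve is a single division once the added fan electricity is known. In the sketch below `added_fan_kwh` is a placeholder, not a value from the cohort, and `implied_factor_gap` is a hypothetical helper name; only the 48.66 kWh/yr and 11.94 kg/yr offsets come from the text above.

```python
def implied_factor_gap(offset: float, added_kwh: float) -> float:
    """Per-kWh factor difference that would explain a constant offset."""
    if added_kwh <= 0:
        raise ValueError("added_kwh must be positive")
    return offset / added_kwh


PE_OFFSET_KWH = 48.66   # from the handover text
CO2_OFFSET_KG = 11.94   # from the handover text
added_fan_kwh = 100.0   # PLACEHOLDER: substitute the real added extract-fan kWh

pe_gap = implied_factor_gap(PE_OFFSET_KWH, added_fan_kwh)
co2_gap = implied_factor_gap(CO2_OFFSET_KG, added_fan_kwh)

# The PE-to-CO2 ratio does not depend on the placeholder, so it can be
# compared against candidate factor pairs straight away.
ratio = PE_OFFSET_KWH / CO2_OFFSET_KG

print(f"PE gap per kWh  : {pe_gap:.4f}")
print(f"CO2 gap per kWh : {co2_gap:.4f}")
print(f"PE / CO2 ratio  : {ratio:.4f}")
```

Because the ratio is fixed regardless of the fan kWh, it gives a quick first filter: a candidate pair of PE and CO2 factor differences has to reproduce it before the absolute kWh is worth chasing.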

Slice history (this session)

| Slice | HEAD | Scope |
|---|---|---|
| S0380.150 | a658f736 | SAP 10.2 §12 (p.45) + Appendix F2 (p.63): 18-hour tariff non-heating uses bill at 18-hour high rate (13.67 not 13.19 p/kWh). New _other_fuel_cost_gbp_per_kwh branch for Tariff.EIGHTEEN_HOUR returning the Table 32 code 38 high rate. Closures: oil 1 -£9.31→-£6.69, all 25 variants shift £1.35-£2.62. |
| S0380.151 | fb173cdf | RdSAP 10 §4.1 Table 5 (PDF p.28): extract-fans default when lodged is unknown/zero. New _rdsap_extract_fans_default(age_band, habitable_rooms, *, is_park_home) helper + max(lodged, default) wiring in ventilation_from_cert. Cohort: 8 variants → EXACT, 11 → ±0.02-0.12. Golden cert 0240 PE +2.18→+5.80, cert 0390-2954 PE -28.27→-27.97. |
| S0380.152 | d4f6ff0f | SAP 10.2 Table 3 (PDF p.160): primary circuit loss applies to ANY heat generator + cylinder via primary pipework, not just Table 4b. _primary_loss_applies(...) gains optional water_heating_code parameter + new branch using _is_wet_boiler_main(main) + WHC ∈ {901, 902, 914}. Closures: solid fuel 3 +1.31→+0.30, solid fuel 2 +2.77→+2.06 (partial; needs separately-timed-DHW fix). |
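
The S0380.151 wiring reduces to a floor on the lodged count. A minimal sketch, assuming the Table 5 default has already been looked up: `effective_extract_fans` is a hypothetical name, the handling of a missing lodged value is my assumption, and the example default of 2 is a placeholder rather than a Table 5 value.

```python
from typing import Optional


def effective_extract_fans(lodged: Optional[int], table_5_default: int) -> int:
    """Apply the max(lodged, default) rule described for S0380.151.

    A lodged value of None (unknown) falls back to the default; a lodged 0
    is no longer trusted verbatim because the default acts as a floor.
    """
    if lodged is None:
        return table_5_default
    return max(lodged, table_5_default)


PLACEHOLDER_DEFAULT = 2  # NOT a Table 5 value, illustration only

print(effective_extract_fans(0, PLACEHOLDER_DEFAULT))     # lodged 0 is floored
print(effective_extract_fans(3, PLACEHOLDER_DEFAULT))     # higher lodged value wins
print(effective_extract_fans(None, PLACEHOLDER_DEFAULT))  # unknown uses default
```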

Standard slice workflow (unchanged)

  1. Read spec page + identify rule
  2. Probe one cluster variant; verify diagnosis via monkey-patch / direct walk
  3. Write failing AAA test (literal # Arrange / # Act / # Assert)
  4. Implement helper / dispatch entry / mapper extension
  5. Re-pin affected variants (DO NOT widen tolerance)
  6. Run extended handover suite (command below)
  7. Pyright net-zero check (git stash → pyright → git stash pop → pyright)
  8. Commit with spec citation + Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
  9. Update project-heating-systems-corpus + MEMORY.md index
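
Step 3 in practice: a sketch of an AAA-style test with the literal section comments, using the S0380.150 rates as the example. The `Tariff.STANDARD` member, the function body, and the test name are stand-ins written for illustration; the real `_other_fuel_cost_gbp_per_kwh` will differ. The assertion uses an explicit absolute difference, in line with the feedback-abs-diff-over-pytest-approx memory.

```python
from enum import Enum


class Tariff(Enum):
    STANDARD = "standard"          # stand-in member for illustration
    EIGHTEEN_HOUR = "18-hour"


# Rates quoted in this handover, converted from p/kWh to GBP/kWh.
STANDARD_RATE_GBP = 0.1319
EIGHTEEN_HOUR_HIGH_RATE_GBP = 0.1367


def other_fuel_cost_gbp_per_kwh(tariff: Tariff) -> float:
    """Illustrative stand-in: non-heating uses bill at the 18-hour high rate."""
    if tariff is Tariff.EIGHTEEN_HOUR:
        return EIGHTEEN_HOUR_HIGH_RATE_GBP
    return STANDARD_RATE_GBP


def test_eighteen_hour_tariff_bills_other_uses_at_high_rate() -> None:
    # Arrange
    tariff = Tariff.EIGHTEEN_HOUR

    # Act
    rate = other_fuel_cost_gbp_per_kwh(tariff)

    # Assert
    assert abs(rate - 0.1367) < 1e-9
```

Written this way the test fails against the pre-slice behaviour (standard rate returned) and passes once the 18-hour branch exists, which is the red-then-green order the workflow asks for.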

Bonus discipline from this session: when forming a spec hypothesis, dump the per-line worksheet values for the variant and walk them against the cascade output BEFORE writing the slice. My Cluster B narrative had the wrong spec section entirely — what looked like Table 9c was Table 3. The data caught it; the audit narrative didn't.

Test baseline at HEAD d4f6ff0f

PYTHONPATH=/workspaces/model python -m pytest \
    backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
    backend/documents_parser/tests/test_heating_systems_corpus.py \
    backend/documents_parser/tests/test_elmhurst_extractor.py \
    backend/documents_parser/tests/test_elmhurst_end_to_end.py \
    domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
    domain/sap10_calculator/worksheet/tests/test_heat_transmission.py \
    domain/sap10_calculator/worksheet/tests/test_internal_gains.py \
    domain/sap10_calculator/worksheet/tests/test_solar_gains.py \
    domain/sap10_calculator/worksheet/tests/test_dimensions.py \
    domain/sap10_calculator/worksheet/tests/test_rating.py \
    domain/sap10_calculator/worksheet/tests/test_ventilation.py \
    domain/sap10_calculator/worksheet/tests/test_appendix_h_solar.py \
    domain/sap10_calculator/worksheet/tests/test_mev.py \
    domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
    domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
    domain/sap10_calculator/tests/test_pcdb_table_322_lookup.py \
    domain/sap10_calculator/tests/test_pcdb_table_329_lookup.py \
    domain/sap10_calculator/tests/test_table_12a.py \
    --no-cov -q

Expected: 896 pass, 0 fail.

Memories to load (in order)

project-heating-systems-corpus            # HEAD d4f6ff0f
feedback-sap-10-2-only-never-10-3         # CRITICAL — never reference SAP 10.3
feedback-software-no-special-handling     # CRITICAL — apply spec uniformly, no empirical gates
feedback-worksheet-not-api-reference
feedback-spec-citation-in-commits
feedback-verify-handover-claims
feedback-zero-error-strict                # TARGET: ΔSAP_c < 1e-4 vs worksheet
feedback-commit-per-slice
feedback-aaa-test-convention
feedback-e2e-validation-philosophy
feedback-abs-diff-over-pytest-approx
feedback-spec-floor-skepticism            # CUTS BOTH WAYS — be skeptical of your own narrative
feedback-golden-residuals-near-zero
feedback-one-e-minus-4-across-the-board
reference-unmapped-sap-code
reference-unmapped-api-code
project-oil-price-spec-divergence

What NOT to do

  • Don't reference SAP 10.3 — track 10.2 deliberately
  • Don't widen pin tolerances — re-pin smaller or find the spec gap
  • Don't add empirical gates to keep cohort pins stable when a spec rule clearly applies
  • Don't re-investigate Slices .91..152 — all settled
  • Don't add new helpers to domain/sap10_ml/ — on deprecation path; domain/sap10_calculator/tables/ is the canonical home
  • Don't treat ΔSAP=0.07 as "closed" — target is <1e-4 vs worksheet
  • Don't form a spec hypothesis without per-line data — walk the worksheet line-by-line for the failing variant first, then look up the spec rule. Headline residuals tell you a gap exists; only the per-line walk tells you which section of the spec it lives in.

Spec source quick-reference

All under domain/sap10_calculator/docs/specs/:

  • SAP 10.2 full spec: sap-10-2-full-specification-2025-03-14.pdf
    • §4 (p.135-137) — water heating worksheet (45..65)
    • §9 (p.155+) — MIT calc, Tables 9/9a/9b/9c
    • §9.4.11 (p.30) — Boiler interlock: -5pp to BOTH SH and DHW
    • §12 (p.45) — Electricity tariff types (7/10/18/24-hour rules)
    • §A.2.2 (~p.189) — Forced-secondary set
    • Appendix D §D2.1 (2) (p.57) — Eq D1 monthly water eff cascade
    • Appendix F2 (p.63) — 18-hour CPSU: high rate for all other uses
    • Appendix N3 (p.107-109) — Heat pump DHW efficiency cascade
    • Table 3 (p.160) — Primary circuit loss; zero-loss list. Slice .152 extended this to all wet boilers + cylinder + WHC=901.
    • Table 4a (p.163-170) — heating systems incl. R column
    • Table 4b (p.168) — gas/liquid boilers seasonal efficiency
    • Table 4f (p.174) — pumps + fans
    • Table 9c (p.184) — MIT cascade (step 8 = Table 4e adj wired)
    • Table 11 (p.188) — secondary heating fraction
    • Table 12 (p.191) — SAP rating fuel prices + standing charges
    • Table 12a (p.191) — high/low-rate fraction by system × tariff
  • RdSAP 10 spec: RdSAP 10 Specification 10-06-2025.pdf
    • §4.1 Table 5 (p.28) — Ventilation parameters incl. extract fans age-band default (slice .151)
    • §5 (p.29) — Floor infiltration spec rule
    • §10.11 Table 29 (p.56) — Heating/HW parameters; inaccessible cylinder
    • §19 Table 32 (p.95) — RdSAP10 fuel prices / CO2 / PE

Good luck.