Model/domain/sap10_calculator/docs/HANDOVER_POST_S0380_130.md
Khalim Conn-Kowlessar 38e6d18a13 docs: handover + next-agent prompt post S0380.125..130
Captures the heating-systems corpus closure work, the new permanent
residual-pin regression test, and the queued S0380.131 candidate
(heating-oil unit price spec-vs-worksheet divergence).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-31 09:05:24 +00:00


Handover — post Slices S0380.125..130

Branch: feature/per-cert-mapper-validation. HEAD c8486077. Predecessor: HANDOVER_POST_S0380_124.md.

TL;DR

Six slices landed on top of 8904ec09. The user pivoted away from cert 0240's residual closure to a new controlled-variable heating-systems corpus (1 property × 41 heating variants). All 41 variants now cascade-execute, a permanent residual-pin regression test has landed, and the investigation surfaced a heating-oil unit-price discrepancy between the published RdSAP 10 spec PDF (7.64 p/kWh) and the operationally canonical Elmhurst worksheet + gov.uk register values (5.44 p/kWh).

| Slice | Commit | Scope |
|---|---|---|
| S0380.125 | d8cdee4e | meter_type "18 Hour" alias per RdSAP 10 §17 + §12 |
| S0380.126 | e25aa021 | bare "Underfloor Heating" → §10.11 Table 29 subtype derivation |
| S0380.127 | 11ecac94 | "No Access" cylinder → Table 28 derivation (oil HW + off-peak meter) |
| S0380.128 | 729ee29c | extractor §14.0 closure falls back to "14.1 Community Heating" |
| S0380.129 | 82b8a16b | permanent residual-pin regression guard (41 parametrised) |
| S0380.130 | c8486077 | Elmhurst oil-mains routed via §15.0 Water Heating Fuel Type fallback |

Extended handover suite at HEAD: 874 pass, 0 fail.

What changed

The corpus

The user provided sap worksheets/heating systems examples/: 47 folders, 41 populated (6 empty: community heating 5, electric 4, electric 10, gshp 2, pcdb 2, solid fuel 1). Every variant is the same dwelling (Reference 001431, semi-detached, TFA 90 m², age G 1983-1990, W6 9BF) under a different heating system. Each carries an Elmhurst Summary PDF + an Elmhurst P960 worksheet PDF. This is a controlled-variable test set: cascade-vs-worksheet residuals are fully attributable to the heating subsystem.

Permanent regression test

backend/documents_parser/tests/test_heating_systems_corpus.py (S0380.129) — single parametrised test test_heating_systems_corpus_residual_matches_pin driven by 41 _CorpusExpectation entries. Per variant:

  1. Block 11a (individual) or 11b (community) pins extracted from P960: continuous SAP (SAP value), total fuel cost (255)/(355), CO2 (272/372/382/383), PE (286/386/486/483).
  2. Summary PDF → extractor → mapper → cascade.
  3. For each cascade output, the residual (cascade minus worksheet) is compared against its pinned value at tight tolerance (SAP ±0.001, cost ±£0.01, CO2 ±0.1 kg/yr, PE ±0.1 kWh/yr).

Tolerances stay tight; expected residuals move toward 0 as heating-cascade gaps close. Per feedback-zero-error-strict + feedback-golden-residuals-near-zero — re-pin smaller, never widen the tolerance.
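
A minimal sketch of the residual-pin check described above. The names `ResidualPin` and `check_pin`, the metric keys, and every number in the example are assumptions made for illustration; the real `_CorpusExpectation` fields are not shown in this handover. Only the tolerances come from step 3.

```python
from dataclasses import dataclass

# Tolerances from step 3 above: SAP, cost (GBP), CO2 (kg/yr), PE (kWh/yr).
TOLERANCES = {"sap": 0.001, "cost": 0.01, "co2": 0.1, "pe": 0.1}


@dataclass(frozen=True)
class ResidualPin:
    """Hypothetical stand-in for one corpus expectation entry."""
    variant: str
    worksheet: dict          # values read from the P960 worksheet
    expected_residual: dict  # pinned cascade-minus-worksheet residuals


def check_pin(pin: ResidualPin, cascade: dict) -> list:
    """Return the metrics whose residual drifted outside tolerance."""
    drifted = []
    for metric, tol in TOLERANCES.items():
        residual = cascade[metric] - pin.worksheet[metric]
        # Explicit abs-diff comparison, as the project conventions ask for.
        if abs(residual - pin.expected_residual[metric]) > tol:
            drifted.append(metric)
    return drifted


# Illustrative numbers only; not taken from any worksheet.
pin = ResidualPin(
    variant="example variant",
    worksheet={"sap": 60.0, "cost": 900.0, "co2": 3000.0, "pe": 15000.0},
    expected_residual={"sap": 0.87, "cost": -12.5, "co2": -40.0, "pe": -200.0},
)
cascade = {"sap": 60.87, "cost": 887.5, "co2": 2960.0, "pe": 14800.0}
print(check_pin(pin, cascade))
```

Because the comparison is against the pinned residual rather than against zero, drift in either direction fails. A slice that improves the cascade therefore forces a deliberate re-pin instead of passing silently.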

Current residual cluster (post-S0380.130)

Cascade SAP_c minus worksheet SAP_c per variant, sorted by absolute value (smallest first):

| Variant | ΔSAP_c | Notes |
|---|---|---|
| solid fuel 8 | +0.87 | closest to closure |
| community heating 2/4 | +1.16 | gas-fired heat network (envelope-identical pairs) |
| solid fuel 5 | +3.79 | |
| community heating 1/3 | +4.18 | gas-fired heat network (1↔3 + 2↔4 pairs) |
| solid fuel 4 | +5.07 | |
| gshp | +5.16 | |
| ashp | +5.67 | |
| community heating 6 | -6.87 | only negative ΔSAP; heat-pump heat network |
| pcdb 1 | 9.41 | after S0380.130 |
| oil 1 | 9.70 | after S0380.130; over-counts at 7.64 p/kWh |
| oil pcdb 3 | 10.87 | after S0380.130 |
| oil pcdb 1/2 | 11.63 | after S0380.130 |
| no system | +21.94 | SAP code 699 |
| oil 3 | +30.95 | bio-FAME boiler (worksheet uses 7.64, spec says 5.44) |
| oil 5 (pathological) | +120.75 | bioethanol; worksheet clamps SAP int to 1 |

The S0380.131 candidate — heating-oil unit price

Status: queued, decision pending. Two slices were agreed; S0380.130 landed the mapper half. S0380.131 is the cascade-price half.

Evidence

| Source | Heating oil p/kWh | Heating oil CO2 kg/kWh |
|---|---|---|
| SAP 10.2 spec PDF Table 12 p.191 | 4.94 | 0.298 |
| RdSAP 10 spec PDF Table 32 p.95 | 7.64 | 0.298 |
| domain/sap10_calculator/tables/table_32.py (verbatim from RdSAP 10) | 7.64 | 0.298 |
| Elmhurst P960 worksheet for oil 1 + oil pcdb 1/3 | 5.44 | 0.298 |
| Cert 0240 (gov.uk register lodged SAP 73) back-solved | ~5.48 | matches oil |

Two independent implementations (Elmhurst worksheet + gov.uk register's lodging software) agree on roughly 5.44 p/kWh for heating oil (5.44 on the worksheet, ~5.48 back-solved from cert 0240); the published RdSAP 10 spec PDF (7.64) is the outlier. Per feedback-worksheet-not-api-reference the worksheet is the source of truth.
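
The back-solve used as evidence here can be sketched as follows. This assumes annual fuel cost is simply energy use times unit price plus a standing charge; the cascade's actual cost breakdown is not shown in this handover. The function names and input figures are hypothetical. Only the three candidate prices come from the table above.

```python
# Candidate heating-oil unit prices (p/kWh) from the evidence table above.
CANDIDATES = {
    "SAP 10.2 Table 12": 4.94,
    "RdSAP 10 Table 32": 7.64,
    "Elmhurst P960 worksheet": 5.44,
}


def implied_unit_price(fuel_cost_gbp, fuel_kwh, standing_charge_gbp=0.0):
    """Back-solve the unit price in p/kWh from an annual cost and energy use."""
    if fuel_kwh <= 0:
        raise ValueError("fuel_kwh must be positive")
    return 100.0 * (fuel_cost_gbp - standing_charge_gbp) / fuel_kwh


def closest_source(price_p_per_kwh):
    """Name the candidate source whose price is nearest the implied one."""
    return min(CANDIDATES, key=lambda name: abs(CANDIDATES[name] - price_p_per_kwh))


# Illustrative inputs only: 10,000 kWh of oil costing GBP 548, no standing charge.
price = implied_unit_price(fuel_cost_gbp=548.0, fuel_kwh=10_000.0)
print(round(price, 2), closest_source(price))
```

A back-solved price near 5.44 points at the worksheet value rather than the 7.64 spec row, which is the reasoning applied to cert 0240.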

Two distinct gaps were investigated

The S0380.130 mapper fix and S0380.131 price fix are independent:

  • S0380.130 (landed) fixes the Elmhurst mapper for oil mains. It affects the heating-systems corpus (oil 1, oil pcdb 1/2/3, pcdb 1). It does NOT touch cert 0240 (which already uses the API mapper with correct fuel routing).
  • S0380.131 (queued) would switch the cascade's heating-oil tariff to 5.44. It affects ANY oil cert whose cost passes through the cascade — including the heating-systems corpus AND cert 0240 AND cert 0390 in the golden corpus.

Closing S0380.131 is what would move cert 0240's golden residual from 10 toward 0; S0380.130 alone leaves cert 0240 unchanged.

Projected impact of switching cascade to 5.44

| Cert | Current ΔSAP | After 7.64 → 5.44 |
|---|---|---|
| oil 1 corpus | 9.70 | ~+0.6 (closes) |
| oil pcdb 1/2 corpus | 11.63 | ~1 |
| oil pcdb 3 corpus | 10.87 | ~1 |
| pcdb 1 corpus | 9.41 | ~+1 |
| cert 0240 golden | 10 | ~0 (closes exactly to lodged 73) |
| cert 0390 golden | 6 | improves significantly |

Open questions before implementing

  1. Is there a more authoritative spec source for 5.44? Check the BRE technical papers in domain/sap10_calculator/docs/specs/sap10 technical papers/ for any RdSAP 10 errata or fuel-price update.
  2. Should bio-FAME price also flip (worksheet uses 7.64 for FAME but spec says 5.44 — possible spec PDF row swap)?
  3. Should standing charges, CO2, or PE factors change too? Per the evidence above only the unit-price column is divergent.

The user explicitly agreed to the two-slice split so any spec-target change in S0380.131 is isolated and reviewable on its own.

Test baseline at HEAD c8486077

PYTHONPATH=/workspaces/model python -m pytest \
    backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
    backend/documents_parser/tests/test_heating_systems_corpus.py \
    backend/documents_parser/tests/test_elmhurst_extractor.py \
    backend/documents_parser/tests/test_elmhurst_end_to_end.py \
    domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
    domain/sap10_calculator/worksheet/tests/test_heat_transmission.py \
    domain/sap10_calculator/worksheet/tests/test_internal_gains.py \
    domain/sap10_calculator/worksheet/tests/test_solar_gains.py \
    domain/sap10_calculator/worksheet/tests/test_dimensions.py \
    domain/sap10_calculator/worksheet/tests/test_rating.py \
    domain/sap10_calculator/worksheet/tests/test_ventilation.py \
    domain/sap10_calculator/worksheet/tests/test_appendix_h_solar.py \
    domain/sap10_calculator/worksheet/tests/test_mev.py \
    domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
    domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
    domain/sap10_calculator/tests/test_pcdb_table_322_lookup.py \
    domain/sap10_calculator/tests/test_pcdb_table_329_lookup.py \
    domain/sap10_calculator/tests/test_table_12a.py \
    --no-cov -q

Expected: 874 pass, 0 fail.

Memories to load (in order)

  1. project-heating-systems-corpus — full corpus state at HEAD c8486077
  2. project-oil-price-spec-divergence — S0380.131 plan + evidence
  3. project-cert-000565-recovery-state — per-slice history (legacy log)
  4. feedback-sap-10-2-only-never-10-3 — CRITICAL: never reference SAP 10.3
  5. feedback-worksheet-not-api-reference — worksheet PDF is source of truth
  6. feedback-spec-citation-in-commits — quote spec + page in commits
  7. feedback-verify-handover-claims — verify numeric claims against PDFs
  8. feedback-zero-error-strict — never widen tolerances; re-pin smaller
  9. feedback-commit-per-slice — one slice = one commit
  10. feedback-aaa-test-convention — literal # Arrange / # Act / # Assert
  11. feedback-e2e-validation-philosophy — abs=1e-4 pins
  12. feedback-abs-diff-over-pytest-approx — abs(x-y) <= tol
  13. feedback-spec-floor-skepticism — verify "precision floor" against PDFs
  14. feedback-golden-residuals-near-zero — pins shrink toward zero
  15. feedback-one-e-minus-4-across-the-board — 1e-4 bar for HP certs too
  16. reference-unmapped-sap-code — calculator strict-raise pattern
  17. reference-unmapped-api-code — mapper strict-raise pattern
  18. project-sap10-ml-deprecation — domain/sap10_ml/ is retiring
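
Two of the conventions above (literal `# Arrange` / `# Act` / `# Assert` comments, and an explicit abs-diff instead of `pytest.approx`) combine roughly as in the sketch below. The `ratio` helper is hypothetical and stands in for a cascade function; the 1e-4 tolerance is the bar named in the list.

```python
def ratio(numerator: float, denominator: float) -> float:
    """Hypothetical helper under test; stands in for a cascade function."""
    return numerator / denominator


def test_ratio_matches_pin():
    # Arrange
    expected = 0.25
    tol = 1e-4

    # Act
    actual = ratio(1.0, 4.0)

    # Assert
    assert abs(actual - expected) <= tol


# Called directly so the sketch runs without a test runner.
test_ratio_matches_pin()
```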

Spec source quick-reference

All under domain/sap10_calculator/docs/specs/:

  • SAP 10.2 full spec: sap-10-2-full-specification-2025-03-14.pdf
    • §13 + Table 12 (p.191) — fuel cost / ECF / SAP rating
    • Table 4a-d (p.163-170) — heating systems + responsiveness
    • Appendix N (p.101-107) — heat pumps
  • RdSAP 10 spec: RdSAP 10 Specification 10-06-2025.pdf
    • §5 (p.29) — fabric defaults
    • §10.11 Table 29 (p.56) — heating/HW parameters (closed in S0380.126)
    • Table 28 (p.55) — cylinder size (closed in S0380.127)
    • §12 (p.62) — electricity tariff dispatch
    • §17 (p.85) — data collection (meter_type lodging form)
    • §19 Table 32 (p.95) — RdSAP10 fuel prices / CO2 / PE factors
  • BRE technical papers at sap10 technical papers/ — check for any RdSAP 10 errata / fuel-price update relevant to S0380.131
  • SAP 10.3 at sap-10-3-full-specification-2026-01-13.pdf: DO NOT reference (feedback-sap-10-2-only-never-10-3)

Standard workflow per slice

  1. Read spec page + identify rule
  2. Probe cascade vs worksheet/PDF; back-solve hypothesis
  3. Write failing AAA test
  4. Implement helper / cascade change
  5. Verify test passes
  6. Run extended handover suite (above command)
  7. Check pyright on touched files — net-zero from baseline (git stash → pyright → git stash pop → pyright)
  8. Commit with spec citation + verbatim quote + Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
  9. Update project-heating-systems-corpus + MEMORY.md index
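
The net-zero pyright check in step 7 could be compared mechanically, as in the sketch below. It assumes both runs were captured with `pyright --outputjson` (one on the stashed baseline, one after `git stash pop`) and that only the error count matters; step 7 does not say how the comparison is actually made. The function names are hypothetical.

```python
import json


def error_count(report_json: str) -> int:
    """Read the error count from a `pyright --outputjson` report."""
    return json.loads(report_json)["summary"]["errorCount"]


def is_net_zero(baseline_json: str, after_json: str) -> bool:
    """True when the change introduces no additional pyright errors."""
    return error_count(after_json) <= error_count(baseline_json)


# Trimmed, illustrative reports; real ones carry more fields.
baseline = '{"summary": {"errorCount": 12}}'
after = '{"summary": {"errorCount": 12}}'
print(is_net_zero(baseline, after))
```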

What NOT to do

  • Don't reference SAP 10.3 — track 10.2 deliberately
  • Don't widen pin tolerances to make pins pass — re-pin smaller or find the spec gap
  • Don't re-investigate closed work (Slices .91..130) — all settled
  • Don't add new helpers to domain/sap10_ml/ — on the deprecation path
  • Don't conflate the mapper fix (S0380.130) with the price fix (S0380.131) — they're distinct. The mapper fix doesn't close cert 0240; only the price fix does
  • Don't accept "spec-precision floor" framing without spec-citation work — verify against worksheet PDF + cross-cert empirical evidence

Where new heating-systems-corpus fixtures live

  • Summary PDF: sap worksheets/heating systems examples/<variant>/Summary_001431.pdf
  • P960 worksheet PDF: sap worksheets/heating systems examples/<variant>/P960-0001-001431 - <timestamp>.pdf
  • Pin entries: backend/documents_parser/tests/test_heating_systems_corpus.py's _EXPECTATIONS tuple
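
A small sketch for checking which variant folders are populated, using the file-name patterns listed above. The function name and the "incomplete" label are assumptions for illustration; the corpus test may discover fixtures differently.

```python
from pathlib import Path

SUMMARY_NAME = "Summary_001431.pdf"
P960_GLOB = "P960-0001-001431 - *.pdf"


def classify_variants(root: Path):
    """Split variant folders into populated and incomplete ones."""
    populated, incomplete = [], []
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        has_summary = (folder / SUMMARY_NAME).is_file()
        has_p960 = any(folder.glob(P960_GLOB))
        # A variant counts as populated only when both PDFs are present.
        (populated if has_summary and has_p960 else incomplete).append(folder.name)
    return populated, incomplete
```

Calling `classify_variants(Path("sap worksheets/heating systems examples"))` from the repository root should list the populated variants first and the folders missing either PDF second.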

User direction

Two-slice plan (S0380.130 + S0380.131) was agreed in the conversation. S0380.130 landed first. The user explicitly noted that the mapper fix and the golden-bug fix are distinct — the next agent should preserve that distinction in any future communication.

Good luck.