Handover — post Slices S0380.125..130
Branch: feature/per-cert-mapper-validation. HEAD c8486077.
Predecessor: HANDOVER_POST_S0380_124.md.
TL;DR
Six slices landed on top of 8904ec09. The user pivoted away from
cert 0240's residual closure and into a new controlled-variable
heating-systems corpus (1 property × 41 heating variants). All 41
now cascade-execute; permanent residual-pin regression test landed;
investigation surfaced a heating-oil unit-price discrepancy between
the published RdSAP 10 spec PDF (7.64 p/kWh) and the
operationally-canonical Elmhurst worksheet + gov.uk register values
(5.44 p/kWh).
| Slice | Commit | Scope |
|---|---|---|
| S0380.125 | d8cdee4e | meter_type "18 Hour" alias per RdSAP 10 §17 + §12 |
| S0380.126 | e25aa021 | bare "Underfloor Heating" → §10.11 Table 29 subtype derivation |
| S0380.127 | 11ecac94 | "No Access" cylinder → Table 28 derivation (oil HW + off-peak meter) |
| S0380.128 | 729ee29c | extractor §14.0 closure falls back to "14.1 Community Heating" |
| S0380.129 | 82b8a16b | permanent residual-pin regression guard (41 parametrised) |
| S0380.130 | c8486077 | Elmhurst oil-mains routed via §15.0 Water Heating Fuel Type fallback |
Extended handover suite at HEAD: 874 pass, 0 fail.
What changed
The corpus
User provided sap worksheets/heating systems examples/ — 47 folders,
41 populated (6 empty: community heating 5, electric 4,
electric 10, gshp 2, pcdb 2, solid fuel 1). Every variant is
the same dwelling (Reference 001431, semi-detached, TFA 90 m², age G
1983-1990, W6 9BF) under a different heating system. Each carries an
Elmhurst Summary PDF + an Elmhurst P960 worksheet PDF. This is a
controlled-variable test set: cascade-vs-worksheet residuals are fully
attributable to the heating subsystem.
Permanent regression test
backend/documents_parser/tests/test_heating_systems_corpus.py
(S0380.129) — single parametrised test
test_heating_systems_corpus_residual_matches_pin driven by 41
_CorpusExpectation entries. Per variant:
- Block 11a (individual) or 11b (community) pins extracted from P960: continuous SAP (SAP value), total fuel cost (255)/(355), CO2 (272/372/382/383), PE (286/386/486/483).
- Summary PDF → extractor → mapper → cascade.
- Each cascade output pinned against the residual at tight tolerance (SAP ±0.001, cost ±£0.01, CO2 ±0.1 kg/yr, PE ±0.1 kWh/yr).
Tolerances stay tight; expected residuals move toward 0 as heating-cascade gaps close. Per feedback-zero-error-strict + feedback-golden-residuals-near-zero — re-pin smaller, never widen the tolerance.
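The pinning rule above can be sketched as follows. This is a minimal illustration rather than the repository's test: the fields on `CorpusExpectation` and the helper `residual_within_pin` are hypothetical, and only the four tolerances are taken from the text.

```python
from dataclasses import dataclass

# Tolerances quoted in this handover; everything else here is illustrative.
TOLERANCES = {"sap": 0.001, "cost": 0.01, "co2": 0.1, "pe": 0.1}


@dataclass(frozen=True)
class CorpusExpectation:
    """Hypothetical pin: worksheet values plus the expected cascade residual."""
    variant: str
    worksheet: dict[str, float]          # values read from the P960 worksheet
    expected_residual: dict[str, float]  # cascade minus worksheet, as pinned


def residual_within_pin(exp: CorpusExpectation, cascade: dict[str, float]) -> dict[str, bool]:
    """Compare each output's residual with its pin using a plain abs-diff."""
    result = {}
    for key, tol in TOLERANCES.items():
        residual = cascade[key] - exp.worksheet[key]
        result[key] = abs(residual - exp.expected_residual[key]) <= tol
    return result
```

When a heating-cascade gap closes, the residual moves and the check fails until `expected_residual` is re-pinned to the new, smaller value; the tolerance table itself never changes.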
Current residual cluster (post-S0380.130)
Cascade SAP_c minus worksheet SAP_c per variant, sorted by absolute value (smallest first):
| Variant | ΔSAP_c | Notes |
|---|---|---|
| solid fuel 8 | +0.87 | closest to closure |
| community heating 2/4 | +1.16 | gas-fired heat network (envelope-identical pairs) |
| solid fuel 5 | +3.79 | |
| community heating 1/3 | +4.18 | gas-fired heat network (1↔3 + 2↔4 pairs) |
| solid fuel 4 | +5.07 | |
| gshp | +5.16 | |
| ashp | +5.67 | |
| community heating 6 | −6.87 | only negative ΔSAP among the community-heating variants; heat-pump heat network |
| pcdb 1 | −9.41 | after S0380.130 |
| oil 1 | −9.70 | after S0380.130; over-counts at 7.64 p/kWh |
| oil pcdb 3 | −10.87 | after S0380.130 |
| oil pcdb 1/2 | −11.63 | after S0380.130 |
| no system | +21.94 | SAP code 699 |
| oil 3 | +30.95 | bio-FAME boiler (worksheet uses 7.64, spec says 5.44) |
| oil 5 (pathological) | +120.75 | bioethanol; worksheet clamps SAP int to 1 |
The S0380.131 candidate — heating-oil unit price
Status: queued, decision pending. Two slices were agreed; S0380.130 landed the mapper half. S0380.131 is the cascade-price half.
Evidence
| Source | Heating oil p/kWh | Heating oil CO2 kg/kWh |
|---|---|---|
| SAP 10.2 spec PDF Table 12 p.191 | 4.94 | 0.298 |
| RdSAP 10 spec PDF Table 32 p.95 | 7.64 | 0.298 |
| domain/sap10_calculator/tables/table_32.py (verbatim from RdSAP 10) | 7.64 | 0.298 |
| Elmhurst P960 worksheet for oil 1 + oil pcdb 1/3 | 5.44 | 0.298 |
| Cert 0240 (gov.uk register lodged SAP 73) back-solved | ~5.48 | matches oil |
Two independent implementations (the Elmhurst worksheet and the gov.uk register's lodging software) agree on roughly 5.44 p/kWh for heating oil (5.44 on the worksheet, ~5.48 back-solved from cert 0240); the published RdSAP 10 spec PDF (7.64) is the outlier. Per feedback-worksheet-not-api-reference the worksheet is the source of truth.
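Back-solving a unit price from a lodged or worksheet cost, as was done for cert 0240 in the evidence table, amounts to dividing the fuel's share of the annual cost by its delivered energy. A minimal sketch, assuming standing charges and other fuels have already been subtracted; the helper name and the inputs are made up for illustration, since this handover does not record cert 0240's kWh or cost figures.

```python
def implied_unit_price_p_per_kwh(fuel_cost_gbp: float, fuel_kwh: float) -> float:
    """Return the pence-per-kWh price implied by an annual fuel cost.

    Assumes fuel_cost_gbp excludes standing charges and other fuels.
    """
    if fuel_kwh <= 0:
        raise ValueError("fuel_kwh must be positive")
    return fuel_cost_gbp * 100.0 / fuel_kwh


# Illustrative inputs only: 12,000 kWh of heating oil costing £652.80.
price = implied_unit_price_p_per_kwh(652.80, 12_000)
print(round(price, 2))
```

A back-solved value that lands near 5.44 rather than 7.64 is the kind of evidence the table summarises; the real figures must come from the worksheet or register entry.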
Two distinct gaps were investigated
The S0380.130 mapper fix and S0380.131 price fix are independent:
- S0380.130 (landed) fixes the Elmhurst mapper for oil mains. It affects the heating-systems corpus (oil 1, oil pcdb 1/2/3, pcdb 1). It does NOT touch cert 0240 (which already uses the API mapper with correct fuel routing).
- S0380.131 (queued) would switch the cascade's heating-oil tariff to 5.44. It affects ANY oil cert whose cost passes through the cascade — including the heating-systems corpus AND cert 0240 AND cert 0390 in the golden corpus.
Closing S0380.131 is what would move cert 0240's golden residual from −10 toward 0; S0380.130 alone leaves cert 0240 unchanged.
Projected impact of switching cascade to 5.44
| Cert | Current ΔSAP | After 7.64 → 5.44 |
|---|---|---|
| oil 1 corpus | −9.70 | ~+0.6 (closes) |
| oil pcdb 1/2 corpus | −11.63 | ~−1 |
| oil pcdb 3 corpus | −10.87 | ~−1 |
| pcdb 1 corpus | −9.41 | ~+1 |
| cert 0240 golden | −10 | ~0 (closes exactly to lodged 73) |
| cert 0390 golden | −6 | improves significantly |
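The direction of the projected change follows from the oil portion of the annual cost scaling linearly with the unit price. A rough sketch under stated assumptions: the two prices are those quoted in the evidence table, while the kWh figure and the helper are invented for illustration. Turning the cost change into a ΔSAP still requires running the cascade.

```python
SPEC_PRICE = 7.64       # p/kWh, RdSAP 10 Table 32 as quoted in this handover
WORKSHEET_PRICE = 5.44  # p/kWh, Elmhurst P960 worksheet as quoted in this handover


def oil_cost_gbp(kwh: float, pence_per_kwh: float) -> float:
    """Annual heating-oil cost in pounds, excluding standing charges."""
    return kwh * pence_per_kwh / 100.0


kwh = 12_000.0  # hypothetical annual heating-oil demand
saving = oil_cost_gbp(kwh, SPEC_PRICE) - oil_cost_gbp(kwh, WORKSHEET_PRICE)
ratio = WORKSHEET_PRICE / SPEC_PRICE  # fraction of the oil cost that remains
```

With these inputs the oil cost falls to about 71% of its current value, which is consistent with every oil row in the table moving upward in SAP.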
Open questions before implementing
- Is there a more authoritative spec source for 5.44? Check the BRE technical papers in domain/sap10_calculator/docs/specs/sap10 technical papers/ for any RdSAP 10 errata or fuel-price update.
- Should bio-FAME price also flip (worksheet uses 7.64 for FAME but spec says 5.44; possible spec PDF row swap)?
- Should standing charges, CO2, or PE factors change too? Per the evidence above only the unit-price column is divergent.
The user explicitly agreed to the two-slice split so any spec-target change in S0380.131 is isolated and reviewable on its own.
Test baseline at HEAD c8486077
PYTHONPATH=/workspaces/model python -m pytest \
backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
backend/documents_parser/tests/test_heating_systems_corpus.py \
backend/documents_parser/tests/test_elmhurst_extractor.py \
backend/documents_parser/tests/test_elmhurst_end_to_end.py \
domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
domain/sap10_calculator/worksheet/tests/test_heat_transmission.py \
domain/sap10_calculator/worksheet/tests/test_internal_gains.py \
domain/sap10_calculator/worksheet/tests/test_solar_gains.py \
domain/sap10_calculator/worksheet/tests/test_dimensions.py \
domain/sap10_calculator/worksheet/tests/test_rating.py \
domain/sap10_calculator/worksheet/tests/test_ventilation.py \
domain/sap10_calculator/worksheet/tests/test_appendix_h_solar.py \
domain/sap10_calculator/worksheet/tests/test_mev.py \
domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
domain/sap10_calculator/tests/test_pcdb_table_322_lookup.py \
domain/sap10_calculator/tests/test_pcdb_table_329_lookup.py \
domain/sap10_calculator/tests/test_table_12a.py \
--no-cov -q
Expected: 874 pass, 0 fail.
Memories to load (in order)
- project-heating-systems-corpus: full corpus state at HEAD c8486077
- project-oil-price-spec-divergence: S0380.131 plan + evidence
- project-cert-000565-recovery-state: per-slice history (legacy log)
- feedback-sap-10-2-only-never-10-3: CRITICAL, never reference SAP 10.3
- feedback-worksheet-not-api-reference: worksheet PDF is source of truth
- feedback-spec-citation-in-commits: quote spec + page in commits
- feedback-verify-handover-claims: verify numeric claims against PDFs
- feedback-zero-error-strict: never widen tolerances; re-pin smaller
- feedback-commit-per-slice: one slice = one commit
- feedback-aaa-test-convention: literal # Arrange / # Act / # Assert
- feedback-e2e-validation-philosophy: abs=1e-4 pins
- feedback-abs-diff-over-pytest-approx: abs(x-y) <= tol
- feedback-spec-floor-skepticism: verify "precision floor" against PDFs
- feedback-golden-residuals-near-zero: pins shrink toward zero
- feedback-one-e-minus-4-across-the-board: 1e-4 bar for HP certs too
- reference-unmapped-sap-code: calculator strict-raise pattern
- reference-unmapped-api-code: mapper strict-raise pattern
- project-sap10-ml-deprecation: domain/sap10_ml/ is retiring
Spec source quick-reference
All under domain/sap10_calculator/docs/specs/:
- SAP 10.2 full spec: sap-10-2-full-specification-2025-03-14.pdf
  - §13 + Table 12 (p.191): fuel cost / ECF / SAP rating
  - Table 4a-d (p.163-170): heating systems + responsiveness
  - Appendix N (p.101-107): heat pumps
- RdSAP 10 spec: RdSAP 10 Specification 10-06-2025.pdf
  - §5 (p.29): fabric defaults
  - §10.11 Table 29 (p.56): heating/HW parameters (closed in S0380.126)
  - Table 28 (p.55): cylinder size (closed in S0380.127)
  - §12 (p.62): electricity tariff dispatch
  - §17 (p.85): data collection (meter_type lodging form)
  - §19 Table 32 (p.95): RdSAP10 fuel prices / CO2 / PE factors
- BRE technical papers at sap10 technical papers/: check for any RdSAP 10 errata / fuel-price update relevant to S0380.131
- SAP 10.3 at sap-10-3-full-specification-2026-01-13.pdf: DO NOT reference (feedback-sap-10-2-only-never-10-3)
Standard workflow per slice
- Read spec page + identify rule
- Probe cascade vs worksheet/PDF; back-solve hypothesis
- Write failing AAA test
- Implement helper / cascade change
- Verify test passes
- Run extended handover suite (above command)
- Check pyright on touched files: net-zero from baseline (git stash → pyright → git stash pop → pyright)
- Commit with spec citation + verbatim quote + Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Update project-heating-systems-corpus + MEMORY.md index
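The failing-test step of this workflow might look like the sketch below, which follows the AAA and abs-diff conventions named in the memory list. The function under test, `heating_oil_cost_gbp`, and its numbers are hypothetical stand-ins for whichever helper a slice touches.

```python
def heating_oil_cost_gbp(kwh: float, pence_per_kwh: float) -> float:
    """Hypothetical helper under test: annual fuel cost in pounds."""
    return kwh * pence_per_kwh / 100.0


def test_heating_oil_cost_matches_worksheet_pin() -> None:
    # Arrange
    kwh = 10_000.0          # illustrative demand, not taken from any worksheet
    pence_per_kwh = 5.44    # worksheet price quoted in this handover
    expected_cost = 544.00  # stand-in for a pin read from the worksheet PDF
    tol = 1e-4

    # Act
    cost = heating_oil_cost_gbp(kwh, pence_per_kwh)

    # Assert
    assert abs(cost - expected_cost) <= tol
```

The explicit `abs(x - y) <= tol` comparison keeps the tolerance visible in the assertion, so any later attempt to widen it shows up directly in the diff.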
What NOT to do
- Don't reference SAP 10.3 — track 10.2 deliberately
- Don't widen pin tolerances to make pins pass — re-pin smaller or find the spec gap
- Don't re-investigate closed work (Slices .91..130) — all settled
- Don't add new helpers to domain/sap10_ml/; it is on the deprecation path
- Don't conflate the mapper fix (S0380.130) with the price fix (S0380.131). They are distinct: the mapper fix doesn't close cert 0240, only the price fix does
- Don't accept "spec-precision floor" framing without spec-citation work — verify against worksheet PDF + cross-cert empirical evidence
Where new heating-systems-corpus fixtures live
- Summary PDF: sap worksheets/heating systems examples/<variant>/Summary_001431.pdf
- P960 worksheet PDF: sap worksheets/heating systems examples/<variant>/P960-0001-001431 - <timestamp>.pdf
- Pin entries: backend/documents_parser/tests/test_heating_systems_corpus.py's _EXPECTATIONS tuple
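Because the P960 filename ends in a timestamp, locating a variant's fixtures needs a glob rather than a fixed path. A minimal sketch, assuming the layout listed above; `variant_fixtures` and `CORPUS_ROOT` are hypothetical names, not helpers from the repository.

```python
from pathlib import Path

# Corpus root as given in this handover, relative to the repository root.
CORPUS_ROOT = Path("sap worksheets") / "heating systems examples"


def variant_fixtures(variant: str, root: Path = CORPUS_ROOT) -> tuple[Path, list[Path]]:
    """Return the Summary PDF path and any P960 worksheet PDFs for a variant."""
    folder = root / variant
    summary = folder / "Summary_001431.pdf"
    # The worksheet name carries a timestamp suffix, so match it with a glob.
    worksheets = sorted(folder.glob("P960-0001-001431 - *.pdf"))
    return summary, worksheets
```

An empty `worksheets` list, or a `summary` path that does not exist, would correspond to one of the six unpopulated folders.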
User direction
Two-slice plan (S0380.130 + S0380.131) was agreed in the conversation. S0380.130 landed first. The user explicitly noted that the mapper fix and the golden-bug fix are distinct — the next agent should preserve that distinction in any future communication.
Good luck.