docs: handover + next-agent prompt post S0380.125..130

Captures the heating-systems corpus closure work, the new permanent
residual-pin regression test, and the queued S0380.131 candidate
(heating-oil unit price spec-vs-worksheet divergence).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Khalim Conn-Kowlessar 2026-05-31 09:05:24 +00:00
parent c848607718
commit 38e6d18a13
2 changed files with 450 additions and 0 deletions


@@ -0,0 +1,253 @@
# Handover — post Slices S0380.125..130
Branch: `feature/per-cert-mapper-validation`. **HEAD `c8486077`**.
Predecessor: [`HANDOVER_POST_S0380_124.md`](HANDOVER_POST_S0380_124.md).
## TL;DR
Six slices landed on top of `8904ec09`. The user pivoted away from
cert 0240's residual closure and into a new controlled-variable
heating-systems corpus (1 property × 41 heating variants). All 41
now cascade-execute; permanent residual-pin regression test landed;
investigation surfaced a heating-oil unit-price discrepancy between
the published RdSAP 10 spec PDF (7.64 p/kWh) and the
operationally-canonical Elmhurst worksheet + gov.uk register values
(5.44 p/kWh).
| Slice | Commit | Scope |
|---|---|---|
| **S0380.125** | `d8cdee4e` | meter_type "18 Hour" alias per RdSAP 10 §17 + §12 |
| **S0380.126** | `e25aa021` | bare "Underfloor Heating" → §10.11 Table 29 subtype derivation |
| **S0380.127** | `11ecac94` | "No Access" cylinder → Table 28 derivation (oil HW + off-peak meter) |
| **S0380.128** | `729ee29c` | extractor §14.0 closure falls back to "14.1 Community Heating" |
| **S0380.129** | `82b8a16b` | permanent residual-pin regression guard (41 parametrised) |
| **S0380.130** | `c8486077` | Elmhurst oil-mains routed via §15.0 Water Heating Fuel Type fallback |
Extended handover suite at HEAD: **874 pass, 0 fail**.
## What changed
### The corpus
User provided `sap worksheets/heating systems examples/` — 47 folders,
**41 populated** (6 empty: `community heating 5`, `electric 4`,
`electric 10`, `gshp 2`, `pcdb 2`, `solid fuel 1`). Every variant is
the same dwelling (Reference 001431, semi-detached, TFA 90 m², age G
1983-1990, W6 9BF) under a different heating system. Each carries an
Elmhurst Summary PDF and an Elmhurst P960 worksheet PDF. Because only
the heating system varies, this is a controlled-variable test set:
cascade-vs-worksheet residuals are fully attributable to the heating
subsystem.
### Permanent regression test
[`backend/documents_parser/tests/test_heating_systems_corpus.py`](backend/documents_parser/tests/test_heating_systems_corpus.py)
(S0380.129) — single parametrised test
`test_heating_systems_corpus_residual_matches_pin` driven by 41
`_CorpusExpectation` entries. Per variant:
1. Block 11a (individual) or 11b (community) pins extracted from P960:
continuous SAP (`SAP value`), total fuel cost (255)/(355), CO2
(272/372/382/383), PE (286/386/486/483).
2. Summary PDF → extractor → mapper → cascade.
3. Each cascade-minus-worksheet residual is pinned to its expected value
at tight tolerance (SAP ±0.001, cost ±£0.01, CO2 ±0.1 kg/yr, PE ±0.1 kWh/yr).
Tolerances stay tight; **expected residuals move toward 0** as
heating-cascade gaps close. Per [[feedback-zero-error-strict]] +
[[feedback-golden-residuals-near-zero]] — re-pin smaller, never
widen the tolerance.
### Current residual cluster (post-S0380.130)
Cascade SAP_c minus worksheet SAP_c per variant, sorted by absolute
value (smallest first):
| Variant | ΔSAP_c | Notes |
|---|---:|---|
| solid fuel 8 | +0.87 | closest to closure |
| community heating 2/4 | +1.16 | gas-fired heat network (envelope-identical pairs) |
| solid fuel 5 | +3.79 | |
| community heating 1/3 | +4.18 | gas-fired heat network (1↔3 + 2↔4 pairs) |
| solid fuel 4 | +5.07 | |
| gshp | +5.16 | |
| ashp | +5.67 | |
| **community heating 6** | **-6.87** | **only negative ΔSAP — heat-pump heat network** |
| pcdb 1 | 9.41 | **after S0380.130** |
| oil 1 | **9.70** | **after S0380.130 — over-counts at 7.64 p/kWh** |
| oil pcdb 3 | 10.87 | **after S0380.130** |
| oil pcdb 1/2 | 11.63 | **after S0380.130** |
| no system | +21.94 | SAP code 699 |
| oil 3 | +30.95 | bio-FAME boiler (worksheet uses 7.64, spec says 5.44) |
| oil 5 (pathological) | +120.75 | bioethanol; worksheet clamps SAP int to 1 |
## The S0380.131 candidate — heating-oil unit price
**Status: queued, decision pending.** Two slices were agreed; S0380.130
landed the mapper half. S0380.131 is the cascade-price half.
### Evidence
| Source | Heating oil p/kWh | Heating oil CO2 kg/kWh |
|---|---:|---:|
| SAP 10.2 spec PDF Table 12 p.191 | 4.94 | 0.298 |
| **RdSAP 10 spec PDF** Table 32 p.95 | **7.64** | 0.298 |
| `domain/sap10_calculator/tables/table_32.py` (verbatim from RdSAP 10) | 7.64 | 0.298 |
| **Elmhurst P960 worksheet** for oil 1 + oil pcdb 1/3 | **5.44** | 0.298 |
| **Cert 0240** (gov.uk register lodged SAP 73) back-solved | **~5.48** | matches oil |
Two independent implementations (Elmhurst worksheet + gov.uk register's
lodging software) agree on **5.44** for heating oil; the published
RdSAP 10 spec PDF (7.64) is the outlier. Per
[[feedback-worksheet-not-api-reference]] the worksheet is the source
of truth.
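
To make the size of the divergence concrete, the sketch below compares the unit-price cost of the same oil demand under both prices. The 10,000 kWh/yr demand is a made-up illustrative figure, not a value from any worksheet, and standing charges are ignored.

```python
SPEC_PRICE_P_PER_KWH = 7.64       # RdSAP 10 spec PDF, Table 32
WORKSHEET_PRICE_P_PER_KWH = 5.44  # Elmhurst P960 worksheet


def annual_cost_gbp(kwh_per_year: float, pence_per_kwh: float) -> float:
    """Unit-price cost only, converted from pence to pounds."""
    return kwh_per_year * pence_per_kwh / 100.0


ILLUSTRATIVE_KWH = 10_000.0  # hypothetical demand, for illustration only

spec_cost = annual_cost_gbp(ILLUSTRATIVE_KWH, SPEC_PRICE_P_PER_KWH)
worksheet_cost = annual_cost_gbp(ILLUSTRATIVE_KWH, WORKSHEET_PRICE_P_PER_KWH)
ratio = SPEC_PRICE_P_PER_KWH / WORKSHEET_PRICE_P_PER_KWH

print(f"spec: GBP {spec_cost:.2f}, worksheet: GBP {worksheet_cost:.2f}, ratio: {ratio:.3f}")
```

The ratio is independent of the assumed demand: at 7.64 p/kWh the unit-price component of oil cost is about 40% higher than at 5.44 p/kWh.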
### Two distinct gaps were investigated
The S0380.130 mapper fix and S0380.131 price fix are **independent**:
- **S0380.130** (landed) fixes the Elmhurst mapper for oil mains. It
affects the heating-systems corpus (oil 1, oil pcdb 1/2/3, pcdb 1).
It does NOT touch cert 0240 (which already uses the API mapper with
correct fuel routing).
- **S0380.131** (queued) would switch the cascade's heating-oil tariff
to 5.44. It affects ANY oil cert whose cost passes through the
cascade — including the heating-systems corpus AND cert 0240 AND
cert 0390 in the golden corpus.
Closing S0380.131 is what would move cert 0240's golden residual from
10 toward 0; S0380.130 alone leaves cert 0240 unchanged.
### Projected impact of switching cascade to 5.44
| Cert | Current ΔSAP | After 7.64 → 5.44 |
|---|---:|---:|
| oil 1 corpus | 9.70 | ~+0.6 (closes) |
| oil pcdb 1/2 corpus | 11.63 | ~1 |
| oil pcdb 3 corpus | 10.87 | ~1 |
| pcdb 1 corpus | 9.41 | ~+1 |
| **cert 0240 golden** | **10** | **~0 (closes exactly to lodged 73)** |
| cert 0390 golden | 6 | improves significantly |
### Open questions before implementing
1. Is there a more authoritative spec source for 5.44? Check the BRE
technical papers in `domain/sap10_calculator/docs/specs/sap10
technical papers/` for any RdSAP 10 errata or fuel-price update.
2. Should bio-FAME price also flip (worksheet uses 7.64 for FAME but
spec says 5.44 — possible spec PDF row swap)?
3. Should standing charges, CO2, or PE factors change too? Per the
evidence above only the unit-price column is divergent.
The user explicitly agreed to the two-slice split so any spec-target
change in S0380.131 is isolated and reviewable on its own.
## Test baseline at HEAD `c8486077`
```bash
PYTHONPATH=/workspaces/model python -m pytest \
backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
backend/documents_parser/tests/test_heating_systems_corpus.py \
backend/documents_parser/tests/test_elmhurst_extractor.py \
backend/documents_parser/tests/test_elmhurst_end_to_end.py \
domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
domain/sap10_calculator/worksheet/tests/test_heat_transmission.py \
domain/sap10_calculator/worksheet/tests/test_internal_gains.py \
domain/sap10_calculator/worksheet/tests/test_solar_gains.py \
domain/sap10_calculator/worksheet/tests/test_dimensions.py \
domain/sap10_calculator/worksheet/tests/test_rating.py \
domain/sap10_calculator/worksheet/tests/test_ventilation.py \
domain/sap10_calculator/worksheet/tests/test_appendix_h_solar.py \
domain/sap10_calculator/worksheet/tests/test_mev.py \
domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
domain/sap10_calculator/tests/test_pcdb_table_322_lookup.py \
domain/sap10_calculator/tests/test_pcdb_table_329_lookup.py \
domain/sap10_calculator/tests/test_table_12a.py \
--no-cov -q
```
Expected: **874 pass, 0 fail**.
## Memories to load (in order)
1. `project-heating-systems-corpus` — full corpus state at HEAD `c8486077`
2. `project-oil-price-spec-divergence` — S0380.131 plan + evidence
3. `project-cert-000565-recovery-state` — per-slice history (legacy log)
4. `feedback-sap-10-2-only-never-10-3` — **CRITICAL** — never reference SAP 10.3
5. `feedback-worksheet-not-api-reference` — worksheet PDF is source of truth
6. `feedback-spec-citation-in-commits` — quote spec + page in commits
7. `feedback-verify-handover-claims` — verify numeric claims against PDFs
8. `feedback-zero-error-strict` — never widen tolerances; re-pin smaller
9. `feedback-commit-per-slice` — one slice = one commit
10. `feedback-aaa-test-convention` — literal `# Arrange / # Act / # Assert`
11. `feedback-e2e-validation-philosophy` — abs=1e-4 pins
12. `feedback-abs-diff-over-pytest-approx` — `abs(x-y) <= tol`
13. `feedback-spec-floor-skepticism` — verify "precision floor" against PDFs
14. `feedback-golden-residuals-near-zero` — pins shrink toward zero
15. `feedback-one-e-minus-4-across-the-board` — 1e-4 bar for HP certs too
16. `reference-unmapped-sap-code` — calculator strict-raise pattern
17. `reference-unmapped-api-code` — mapper strict-raise pattern
18. `project-sap10-ml-deprecation` — `domain/sap10_ml/` is retiring
## Spec source quick-reference
All under `domain/sap10_calculator/docs/specs/`:
- **SAP 10.2 full spec**: `sap-10-2-full-specification-2025-03-14.pdf`
- §13 + Table 12 (p.191) — fuel cost / ECF / SAP rating
- Table 4a-d (p.163-170) — heating systems + responsiveness
- Appendix N (p.101-107) — heat pumps
- **RdSAP 10 spec**: `RdSAP 10 Specification 10-06-2025.pdf`
- §5 (p.29) — fabric defaults
- §10.11 Table 29 (p.56) — heating/HW parameters (closed in S0380.126)
- Table 28 (p.55) — cylinder size (closed in S0380.127)
- §12 (p.62) — electricity tariff dispatch
- §17 (p.85) — data collection (meter_type lodging form)
- §19 Table 32 (p.95) — RdSAP10 fuel prices / CO2 / PE factors
- **BRE technical papers** at `sap10 technical papers/` — check for any
RdSAP 10 errata / fuel-price update relevant to S0380.131
- **SAP 10.3** at `sap-10-3-full-specification-2026-01-13.pdf`:
**DO NOT reference** ([[feedback-sap-10-2-only-never-10-3]])
## Standard workflow per slice
1. Read spec page + identify rule
2. Probe cascade vs worksheet/PDF; back-solve hypothesis
3. Write failing AAA test
4. Implement helper / cascade change
5. Verify test passes
6. Run extended handover suite (above command)
7. Check pyright on touched files — net-zero from baseline
(`git stash` → pyright → `git stash pop` → pyright)
8. Commit with spec citation + verbatim quote +
`Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>`
9. Update `project-heating-systems-corpus` + `MEMORY.md` index
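
Step 3, combined with the AAA and `abs(x-y) <= tol` conventions listed under the memories, might look like the sketch below. `lookup_unit_price` and its table are hypothetical stand-ins rather than the project's API; the value 7.64 is the heating-oil entry at HEAD as cited in this handover, and the strict raise on unknown codes is only assumed to mirror the strict-raise pattern named in the memories.

```python
# Hypothetical stand-in for a tariff table; the real module layout may differ.
_UNIT_PRICE_P_PER_KWH = {4: 7.64}  # code 4 = heating oil (value at HEAD)


def lookup_unit_price(fuel_code: int) -> float:
    """Strict lookup: raise on unmapped codes instead of guessing a price."""
    try:
        return _UNIT_PRICE_P_PER_KWH[fuel_code]
    except KeyError:
        raise ValueError(f"unmapped fuel code: {fuel_code}") from None


def test_heating_oil_unit_price_matches_table():
    # Arrange
    fuel_code = 4
    expected = 7.64
    tol = 1e-4

    # Act
    actual = lookup_unit_price(fuel_code)

    # Assert
    assert abs(actual - expected) <= tol
```

For a failing-first slice, set `expected` to the target value before touching the implementation, so the test fails for the intended reason and passes once step 4 lands.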
## What NOT to do
- **Don't reference SAP 10.3** — track 10.2 deliberately
- **Don't widen pin tolerances** to make pins pass — re-pin smaller or
find the spec gap
- **Don't re-investigate closed work** (Slices .91..130) — all settled
- **Don't add new helpers to `domain/sap10_ml/`** — on the deprecation path
- **Don't conflate the mapper fix (S0380.130) with the price fix
(S0380.131)** — they're distinct. The mapper fix doesn't close cert
0240; only the price fix does
- **Don't accept "spec-precision floor" framing** without spec-citation
work — verify against worksheet PDF + cross-cert empirical evidence
## Where new heating-systems-corpus fixtures live
- Summary PDF: `sap worksheets/heating systems examples/<variant>/Summary_001431.pdf`
- P960 worksheet PDF: `sap worksheets/heating systems examples/<variant>/P960-0001-001431 - <timestamp>.pdf`
- Pin entries: `backend/documents_parser/tests/test_heating_systems_corpus.py`'s
`_EXPECTATIONS` tuple
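
A minimal sketch for enumerating the populated variants from the layout above, assuming the corpus root is passed in by the caller. `find_variant_fixtures` is a hypothetical helper, not an existing function in the repository, and picking the last worksheet in sort order when several timestamps exist is an arbitrary choice made for illustration.

```python
from pathlib import Path


def find_variant_fixtures(corpus_root: Path) -> dict[str, tuple[Path, Path]]:
    """Map variant folder name -> (Summary PDF, P960 worksheet PDF).

    Folders missing either PDF (the empty variants) are skipped.
    """
    fixtures = {}
    for variant_dir in sorted(p for p in corpus_root.iterdir() if p.is_dir()):
        summary = variant_dir / "Summary_001431.pdf"
        worksheets = sorted(variant_dir.glob("P960-0001-001431 - *.pdf"))
        if summary.is_file() and worksheets:
            # Several timestamps may exist; take the last name in sort order.
            fixtures[variant_dir.name] = (summary, worksheets[-1])
    return fixtures
```

Called on `Path("sap worksheets/heating systems examples")`, this should yield one entry per populated folder, which can be compared against the number of pin entries.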
## User direction
Two-slice plan (S0380.130 + S0380.131) was agreed in the conversation.
S0380.130 landed first. The user explicitly noted that the mapper fix
and the golden-bug fix are distinct — the next agent should preserve
that distinction in any future communication.
Good luck.


@@ -0,0 +1,197 @@
# Next-agent prompt — post S0380.130
You are picking up on branch `feature/per-cert-mapper-validation` at
**HEAD `c8486077`**. The previous session built a controlled-variable
heating-systems corpus (1 property × 41 heating variants), unblocked
all 41 to cascade-execute through 4 spec-cited closures, landed a
permanent residual-pin regression test, and routed the Elmhurst
mapper for oil mains via §15.0 Water Heating Fuel Type. Extended
handover suite: **874 pass, 0 fail**.
## Read these first
In order, before any tool call:
1. [`HANDOVER_POST_S0380_130.md`](HANDOVER_POST_S0380_130.md) — full
state at HEAD `c8486077`, S0380.131 plan + evidence, all open
residuals.
2. [`HANDOVER_POST_S0380_124.md`](HANDOVER_POST_S0380_124.md) — prior
state at HEAD `1e69bd39` (cert 0240 deferred + handover hypotheses
ranking — note the prior hypothesis ranking was disproved during
the S0380.130 investigation).
## Load these memories before starting
```
project-heating-systems-corpus # full corpus state + 41 residual pins
project-oil-price-spec-divergence # S0380.131 plan + evidence
project-cert-000565-recovery-state # per-slice history (legacy log)
feedback-sap-10-2-only-never-10-3 # CRITICAL — never reference SAP 10.3
feedback-worksheet-not-api-reference # worksheet PDF is source of truth
feedback-spec-citation-in-commits # quote spec + page in commits
feedback-verify-handover-claims # verify numeric claims against PDFs
feedback-zero-error-strict # never widen tolerances; re-pin smaller
feedback-commit-per-slice # one slice = one commit
feedback-aaa-test-convention # literal # Arrange / # Act / # Assert
feedback-e2e-validation-philosophy # abs=1e-4 pins
feedback-abs-diff-over-pytest-approx # abs(x-y) <= tol
feedback-spec-floor-skepticism # verify "precision floor" against PDFs
feedback-golden-residuals-near-zero # pins shrink toward zero
feedback-one-e-minus-4-across-the-board # 1e-4 bar for HP certs too
reference-unmapped-sap-code # calculator strict-raise pattern
reference-unmapped-api-code # mapper strict-raise pattern
project-sap10-ml-deprecation # domain/sap10_ml/ is retiring
```
## Verify baseline first
```bash
PYTHONPATH=/workspaces/model python -m pytest \
backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
backend/documents_parser/tests/test_heating_systems_corpus.py \
backend/documents_parser/tests/test_elmhurst_extractor.py \
backend/documents_parser/tests/test_elmhurst_end_to_end.py \
domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
domain/sap10_calculator/worksheet/tests/test_heat_transmission.py \
domain/sap10_calculator/worksheet/tests/test_internal_gains.py \
domain/sap10_calculator/worksheet/tests/test_solar_gains.py \
domain/sap10_calculator/worksheet/tests/test_dimensions.py \
domain/sap10_calculator/worksheet/tests/test_rating.py \
domain/sap10_calculator/worksheet/tests/test_ventilation.py \
domain/sap10_calculator/worksheet/tests/test_appendix_h_solar.py \
domain/sap10_calculator/worksheet/tests/test_mev.py \
domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
domain/sap10_calculator/tests/test_pcdb_table_322_lookup.py \
domain/sap10_calculator/tests/test_pcdb_table_329_lookup.py \
domain/sap10_calculator/tests/test_table_12a.py \
--no-cov -q
```
Expected: **874 pass, 0 fail**.
## The queued task — S0380.131 (heating-oil unit price)
**The user agreed to a two-slice plan to investigate oil 1's residual.
S0380.130 (mapper) landed first. S0380.131 (cascade price) is up
next, but the user wants it presented as a DISTINCT task — not a
follow-on to S0380.130.**
### Evidence (verbatim from S0380.130 investigation)
| Source | Heating oil p/kWh | Heating oil CO2 |
|---|---:|---:|
| SAP 10.2 spec PDF Table 12 p.191 | 4.94 | 0.298 |
| **RdSAP 10 spec PDF** Table 32 p.95 | **7.64** | 0.298 |
| `domain/sap10_calculator/tables/table_32.py` | 7.64 | 0.298 |
| **Elmhurst P960 worksheet** for oil 1 / oil pcdb 1/3 | **5.44** | 0.298 |
| **Cert 0240** gov.uk register, back-solved from SAP 73 | **~5.48** | matches |
Two independent implementations (Elmhurst worksheet + the gov.uk
register's lodging software) agree on **5.44 p/kWh** for heating
oil. The published RdSAP 10 spec PDF (7.64) is the outlier.
Per [[feedback-worksheet-not-api-reference]] the worksheet PDF is
the source of truth. Per [[feedback-spec-floor-skepticism]] don't
accept the spec-vs-worksheet gap without verification.
### Before implementing — investigate further
1. Read the BRE technical papers at
`domain/sap10_calculator/docs/specs/sap10 technical papers/`
for any RdSAP 10 errata or fuel-price update relevant to the 5.44
vs 7.64 discrepancy. Specifically look for STPs touching Table 32
or fuel prices.
2. Check if RdSAP 10 has a newer spec revision than `10-06-2025` in
`domain/sap10_calculator/docs/specs/`.
3. Verify the Elmhurst worksheet's heating-oil price across more
variants: oil 2 (HVO) uses 7.64; oil 3/4 (FAME) use 7.64; only
oil 1 + oil pcdb 1/3 use 5.44. So Elmhurst clearly distinguishes
them — it's the heating-oil row specifically that uses 5.44.
### Implementation plan (after investigation)
If the worksheet value 5.44 is empirically canonical:
1. **Failing test**: pin an oil-cert cascade SAP_c at the worksheet
value — e.g. oil 1 to ~+0.6 ΔSAP_c (instead of 9.70).
2. **Implement**: change
`domain/sap10_calculator/tables/table_32.py` `UNIT_PRICE_P_PER_KWH`
entry for code 4 (heating oil): 7.64 → 5.44.
3. **Consider**: should bio-FAME (code 73) also flip from 5.44 → 7.64
(matching worksheet's FAME treatment for oil 3/4)? Empirically
yes; if so add as part of the same slice.
4. **Re-pin** the affected corpus oil variants (oil 1, oil pcdb 1/2/3,
pcdb 1) in `test_heating_systems_corpus.py` to the new
(smaller-magnitude) residuals.
5. **Re-pin** cert 0240 + cert 0390 in
`test_golden_fixtures.py` to the new residuals.
6. **Verify** cohort fixtures (000474..000516, 000565, ASHP cohort)
are all gas/HP — none oil-fired, so unaffected. Run extended
handover suite to confirm.
7. **Commit** S0380.131 with verbatim worksheet PDF evidence + cert
0240 back-solve as the citation. The spec PDF doesn't support
the value, so the empirical citation is what carries the slice.
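
Steps 4 and 5 re-pin residuals, and the policy in this handover is that pins only shrink. A small guard along these lines could make that explicit while updating pins. The function name is hypothetical, and it compares magnitudes only, so it works regardless of the sign of the residual.

```python
def assert_repin_shrinks(variant: str, old_pin: float, new_pin: float) -> None:
    """Raise if a re-pinned residual is not strictly closer to zero."""
    if abs(new_pin) >= abs(old_pin):
        raise AssertionError(
            f"{variant}: re-pin {new_pin:+.2f} is not smaller in magnitude "
            f"than the existing pin {old_pin:+.2f}"
        )


# Magnitudes taken from the projected-impact table; the new value is a
# projection, not a measured result.
assert_repin_shrinks("oil 1", old_pin=9.70, new_pin=0.6)
```

If a re-pin would grow a residual, the guard fails, which is the signal to look for a spec gap instead of accepting the new value.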
### Projected impact
| Cert | Current ΔSAP_c | After 7.64 → 5.44 |
|---|---:|---:|
| oil 1 corpus | 9.70 | ~+0.6 (closes) |
| oil pcdb 1/2 corpus | 11.63 | ~1 |
| oil pcdb 3 corpus | 10.87 | ~1 |
| pcdb 1 corpus | 9.41 | ~+1 |
| **cert 0240 golden** | **10 SAP int** | **~0 (closes exactly to lodged 73)** |
| cert 0390 golden | 6 | improves significantly |
### Important: don't conflate S0380.130 and S0380.131
The user noted explicitly: **the mapper fix (S0380.130) and the
price fix (S0380.131) are distinct**. S0380.130 closed an Elmhurst
mapper coverage gap; it doesn't affect cert 0240 (which uses the
API mapper). S0380.131 changes the cascade tariff; it affects every
oil-heated cert whose cost passes through the cascade.
Don't present them as a chain ("we fixed the mapper, now let's fix
the price"). They're independent bugs that happen to both involve
oil.
## After S0380.131 — what's next
The corpus residual cluster still has work after the oil price
closes:
| ΔSAP_c | Variant | Likely cause |
|---:|---|---|
| +0.87 | solid fuel 8 | smallest residual — diagnose first |
| +1.16 | community heating 2/4 | gas-fired heat network |
| +3.79 | solid fuel 5 | solid-fuel cluster |
| -6.87 | community heating 6 | only negative — heat-pump heat network |
| +21.94 | no system | SAP code 699 |
| +120.75 | oil 5 (pathological) | bioethanol; worksheet clamps SAP int to 1 |
User direction at end of last session: investigate the smallest
residual first (`solid fuel 8` +0.87), the community-heating cluster
(envelope-identical pairs 1↔3 and 2↔4 — clean comparison), or the
lone negative outlier (`community heating 6`).
## What NOT to do
- **Don't reference SAP 10.3** ([[feedback-sap-10-2-only-never-10-3]])
- **Don't widen pin tolerances** to make pins pass — re-pin smaller
- **Don't re-investigate closed work** — Slices .91..130 all settled
- **Don't add new helpers to `domain/sap10_ml/`** — on the deprecation path
- **Don't conflate the mapper fix with the price fix** — they're distinct
- **Don't accept "spec-precision floor" framing** without verification
## Memory hygiene
After each slice:
1. Update `project-heating-systems-corpus` (per-variant residual table).
2. Update `MEMORY.md` — keep the HEAD pointer current.
3. If S0380.131 lands and cert 0240 closes, update
`project-cert-000565-recovery-state` to reflect the new golden
residuals.
Good luck.