Closes §4 LINE_43 + LINE_44/45/46/61/62/64 for 000487 (7 of 8 fails).
LINE_65 still fails — needs Appendix J step 8 (electric-shower kWh
derivation from cert) to land before LINE_65 heat gains close.
Spec citation: SAP10.2 Appendix J (p.81) step 2a: `Nbath = 0.13N + 0.19
if shower also present; = 0.35N + 0.50 if no shower present`. The
"shower also present" branch fires when ANY shower is lodged — mixer OR
electric — per the implicit reading that step 1a's Noutlets includes
electric showers in the count.
Changes:
- SapHeating gains `electric_shower_count` + `mixer_shower_count`.
- `water_heating_from_cert` gains `has_electric_shower: bool = False`;
combined with mixer-flow-rate presence to drive `has_shower`.
- `_mixer_shower_flow_rates_from_cert` honors `mixer_shower_count`
(default 1 vented when unlodged — preserves legacy behaviour).
- `_has_electric_shower_from_cert` new helper.
- `water_heating_section_from_cert` plumbs `has_electric_shower`
through bootstrap + final call (and the internal cert_to_inputs path).
- 000487 fixture: `electric_shower_count=1, mixer_shower_count=0`.
§4 per-fixture:
fixture | LINE_42 | LINE_43 | LINE_44-46 | LINE_61-65
000474 | ✓ | ✓ | ✓ | ✓ (9/9)
000477 | ✓ | ✓ | ✓ | ✗ LINE_61/62/64/65 (slice 25c)
000480 | ✓ | ✓ | ✓ | ✓ (9/9)
000487 | ✓ | ✓ | ✓ | ✓ except LINE_65 (8/9)
000490 | ✓ | ✓ | ✓ | ✓ (9/9)
000516 | ✓ | ✓ | ✓ | ✓ (9/9)
Scoreboard:
section_cascade_pins: 279 → 286 PASS (+7)
e2e SapResult: 32 → 32 PASS (unchanged — LINE_65 cascade still
open, blocks downstream §5 LINE_72/73 + §6 LINE_84 + §7 + downstream)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Handover — strict zero-error cascade pin closure for the 6 Elmhurst fixtures
For the agent picking up the next chunk of work. Read this BEFORE any tool call. Read it in full. The previous agents' errors are catalogued here so you don't repeat them.
Owner: khalim@domna.homes. Branch: ara-backend-design-prd.
Spec PDFs in docs/sap-spec/: SAP 10.2 (14-03-2025), RdSAP 10 (10-06-2025), PCDF.
§A — Hard rules. Internalise these BEFORE anything else.
A.1 What this project IS
This repo replicates the rdSAP calculation engine to bit-level fidelity against 6 known test vectors (the U985 Elmhurst worksheets):
- Inputs: Summary_NNNNNN.pdf (cert lodgement) for each of 6 fixtures (000474, 000477, 000480, 000487, 000490, 000516).
- Intermediate values: U985-0001-NNNNNN.{pdf,txt} lodges every worksheet line ref (1) through (258+) to 4 decimal places.
- Final outputs: SAP rating (continuous + integer), ECF, total fuel cost, CO2, primary energy, per-end-use kWh.
It is a deterministic numerical function with fully-known test vectors.
A.2 The bar: abs=1e-4 on EVERY pin, every fixture, every line ref
Every SAP-result field AND every section line ref must pin to PDF at abs=1e-4.
- The PDF lodges 4 d.p. display precision. abs=1e-4 is the floor of "match what the PDF says".
- No `rel=...` tolerances. Slice 19b removed the `rel=0.15` (fuel cost) and `rel=0.05` (fuel cost) precedents. Never re-add these.
- No `<= 0.5` continuous SAP ceilings. Slice 19a removed these. Never re-add.
- No `xfail` markers on cascade pins. A failing pin is a calculator bug or fixture defect to fix.
- No "documented widening". There is no such thing for this project.
If a pin can't be closed in the current slice, leave it failing. The failing pin is the next slice's work. Tolerances are NEVER widened to make the suite green. CI red is fine while bugs are being fixed.
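A minimal sketch of why the relative tolerances were removed. It uses stdlib `math.isclose` as a stand-in for the suite's `pytest.approx(..., abs=1e-4)`; the helper names and the numbers are illustrative assumptions, not repo code or PDF values.

```python
import math

ABS_TOL = 1e-4  # project bar: the PDF lodges 4 d.p.


def strict_pin(actual: float, expected: float) -> bool:
    """Absolute-only comparison at abs=1e-4, no relative slack."""
    return math.isclose(actual, expected, rel_tol=0.0, abs_tol=ABS_TOL)


def loose_pin(actual: float, expected: float, rel: float = 0.15) -> bool:
    """The removed style of check: relative tolerance only."""
    return math.isclose(actual, expected, rel_tol=rel, abs_tol=0.0)


# Hypothetical fuel cost: calc says 1100.0, PDF says 1000.0.
print(loose_pin(1100.0, 1000.0))   # True: a 100-unit residual is hidden
print(strict_pin(1100.0, 1000.0))  # False: the residual is visible
```

Note `math.isclose` scales `rel_tol` by the larger of the two values, which differs slightly from `pytest.approx`; the masking effect is the same.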
A.3 Past-agent mistakes — DO NOT REPEAT
The user is frustrated with previous agents because:
- Treated SAP integer Δ=0 as "closed" — that's a weak gate (hides ±0.5 continuous drift). The real gate is per-line-ref abs=1e-4.
- Widened tolerances to make tests green (`rel=0.15`, `<= 0.5`). Every such widening masked a real residual.
- Tested sections in isolation using `fixture.LINE_X` PDF values AS INPUTS. That doesn't test the cascade; it tests the section formula given correct inputs. The cascade can still drift.
- Missed fixture defects: multiple fixtures had missing or wrong lodgement (bulbs, windows, sap_heating, detailed RR, exposed_floor, door_count, per-window U). When a cascade pin fails, ALWAYS audit the fixture against the PDF first.
- Labelled code "SAP 10.3" when implementing SAP 10.2 (mostly cleaned in slice 21a; `tables/table_12.py` retains intentional 10.2-vs-10.3 comparison).
- Diagnosed downstream first. The cascade is upstream→downstream (§1 → §2 → §3 → §4 → §5 → §6 → §7 → §8 → §9a → §10a → §11a → §12). A downstream pin failure (e.g. `total_fuel_cost_gbp`) is meaningless to diagnose until upstream pins close.
If you find yourself about to widen a tolerance, add an xfail, or skip a fixture — stop and ask the user. Those are anti-patterns for this project.
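The upstream-first rule can be sketched as a triage helper: walk the cascade in worksheet order and stop at the first section with a failing pin. `first_failing_section` and its input shape are hypothetical, not an existing repo function.

```python
from typing import Dict, Optional

ABS_TOL = 1e-4

# Worksheet order as given in this handover.
CASCADE_ORDER = ["§1", "§2", "§3", "§4", "§5", "§6", "§7",
                 "§8", "§9a", "§10a", "§11a", "§12"]


def first_failing_section(worst_residual: Dict[str, float]) -> Optional[str]:
    """Return the most-upstream section whose worst |diff| exceeds abs=1e-4.

    Sections missing from the dict are treated as not yet pinned and skipped.
    """
    for section in CASCADE_ORDER:
        if worst_residual.get(section, 0.0) > ABS_TOL:
            return section
    return None


# Illustrative residuals: §10a is loudest, but §3 is where to start.
print(first_failing_section({"§3": 0.0296, "§4": 0.5, "§10a": 12.0}))
```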
A.4 Reporting format — use the matrix
The user prefers the per-(fixture × line-ref) matrix for cohort scoreboard updates. Example shape (use this exactly when reporting cascade-pin status):
field | 000474 | 000477 | 000480 | 000487 | 000490 | 000516
sap_score (int) | ✓ | ✓ | ✓ | ✗ | ✓ | ✓
sap_score_continuous | ✗ | ✗ | ✗ | ✗ | ✗ | ✗
ecf | ✗ | ✗ | ✓ | ✗ | ✗ | ✗
...
Or with numeric residuals when finer granularity helps:
fixture | LINE_31 Δ | LINE_33 Δ | LINE_36 Δ | LINE_37 Δ
000474 | 0.0014 | 0.0296 | 0.0002 | 0.0294
000477 | 0.0004 | 0.1246 | ✓ | 0.1244
...
✓ = within abs=1e-4. Numeric value = the actual diff. This format lets the
user scan visually and spot per-fixture vs per-line patterns. Use it instead
of prose summaries when reporting scoreboard state.
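A small sketch of rendering the residual matrix from raw diffs, assuming residuals arrive as a nested dict (fixture → line ref → diff). `render_matrix` is a hypothetical helper; the 000474 row reuses the example values above, and the sub-tolerance value for 000477 LINE_36 is made up.

```python
from typing import Dict, List

ABS_TOL = 1e-4


def render_matrix(line_refs: List[str],
                  residuals: Dict[str, Dict[str, float]]) -> str:
    """Render one row per fixture: tick if within abs=1e-4, else the diff."""
    rows = ["fixture | " + " | ".join(f"{ref} Δ" for ref in line_refs)]
    for fixture, by_ref in residuals.items():
        cells = []
        for ref in line_refs:
            diff = abs(by_ref[ref])
            cells.append("✓" if diff <= ABS_TOL else f"{diff:.4f}")
        rows.append(f"{fixture} | " + " | ".join(cells))
    return "\n".join(rows)


refs = ["LINE_31", "LINE_33", "LINE_36", "LINE_37"]
example = {
    "000474": dict(zip(refs, [0.0014, 0.0296, 0.0002, 0.0294])),
    "000477": dict(zip(refs, [0.0004, 0.1246, 0.00004, 0.1244])),
}
print(render_matrix(refs, example))
```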
A.5 Workflow rules
- Don't scan >50 lines of spec PDF without checking with the user for the specific page/table range. Spec PDFs are big and the user has the page anchors. (Table 11 = page 188, Table 12 = 189, Table 12a = 191, Table 3a/b/c = 160/161/162 already given.)
- One slice = one commit. AAA test convention (`# Arrange / # Act / # Assert`). Co-Authored-By trailer.
- Don't touch SAP rating constants in `worksheet/rating.py`: `ENERGY_COST_DEFLATOR=0.42`, `ECF_LOG_THRESHOLD=3.5`, `SAP_LOG_COEFF=113.7`, `SAP_LOG_CONSTANT=117.0`. SAP 10.2 (14-03-2025) per ADR-0010. Pinned by 8+ tests.
- Don't auto-update unrelated git status changes. See deletions/new files in `git status` that aren't from your work? Don't touch them without asking.
- Don't invoke `/ultrareview`. User-triggered only.
- Caveman mode for prose. Terse. Technical. No filler.
§B — Current state (as of 2026-05-23)
B.1 Cascade pin scoreboard
Two test files contain the strict pins:
- `test_e2e_elmhurst_sap_score.py::test_sap_result_pin[fixture-field]`: top-level SapResult fields. 66 cases (11 fields × 6 fixtures). Currently 18 PASS / 48 FAIL at abs=1e-4.
- `test_section_cascade_pins.py`: per-section line refs walking `<section>_from_cert(epc)` against PDF. Currently 151 PASS / 35 FAIL:
  - §1 (dimensions): 12 PASS / 0 FAIL ✓
  - §2 (ventilation): 96 PASS / 0 FAIL ✓
  - §3 (heat losses): 1 PASS / 23 FAIL
  - §4 (water heating): 42 PASS / 12 FAIL
  - §5-§12: not yet pinned
Total: 169 PASS / 83 FAIL across the strict pins. 4 of 6 fixtures fully close §1+§2+§4. 000487 is the worst (RR fixture defect propagates everywhere).
(Post-slice-25b: section_cascade_pins 286 PASS / 26 FAIL, e2e SapResult 32 PASS / 40 FAIL. §3 fully closes for all 6 fixtures (24/24). §4 closes 8 of 9 for 000487 — only LINE_65 (heat gains from WH) still fails because the §4 cascade doesn't yet derive (64a) electric-shower kWh from the cert (Appendix J step 8). Remaining cascade failures: §4 on 000477 (combi loss precision, slice 25c) + §4 LINE_65 on 000487 (electric shower derivation), §5/§6 LINE_72/73/84 on 000477+487 (cascade from §4), §7 LINE_92/93 marginal on 000474/477/480/490 (precision artefact), §7 on 000487 (cascade from §4 LINE_65).)
B.2 SapResult pin matrix (post-slice-22/23)
field | 474 | 477 | 480 | 487 | 490 | 516
-----------------------------------|-----|-----|-----|-----|-----|-----
sap_score (int) | ✓ | ✓ | ✓ | ✗ | ✓ | ✓
sap_score_continuous | ✗ | ✗ | ✗ | ✗ | ✗ | ✗
ecf | ✗ | ✗ | ✓ | ✗ | ✗ | ✗
total_fuel_cost_gbp | ✗ | ✗ | ✗ | ✗ | ✗ | ✗
co2_kg_per_yr | ✗ | ✗ | ✗ | ✗ | ✗ | ✗
space_heating_kwh_per_yr | ✗ | ✗ | ✗ | ✗ | ✗ | ✗
main_heating_fuel_kwh_per_yr | ✗ | ✗ | ✗ | ✗ | ✗ | ✗
secondary_heating_fuel_kwh_per_yr | ✓ | ✗ | ✗ | ✗ | ✗ | ✗
hot_water_kwh_per_yr | ✗ | ✗ | ✗ | ✗ | ✗ | ✗
lighting_kwh_per_yr | ✓ | ✓ | ✓ | ✓ | ✓ | ✗
pumps_fans_kwh_per_yr | ✓ | ✓ | ✓ | ✓ | ✓ | ✓
5 of 6 fixtures hit SAP integer Δ=0 (000487 is the holdout). But continuous SAP is still off by sub-SAP-point amounts on every fixture: no `sap_score_continuous` pin closes at abs=1e-4.
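A toy illustration of why integer Δ=0 is a weak gate. It assumes the integer score is the continuous score rounded to the nearest whole number; the two scores are invented, not taken from any fixture.

```python
ABS_TOL = 1e-4


def integer_gate(actual: float, expected: float) -> bool:
    """Weak gate: only the rounded SAP integers are compared."""
    return round(actual) == round(expected)


def continuous_gate(actual: float, expected: float) -> bool:
    """Strict gate: continuous scores must agree at abs=1e-4."""
    return abs(actual - expected) <= ABS_TOL


calc, pdf = 68.21, 68.4873  # hypothetical continuous scores
print(integer_gate(calc, pdf))     # True: both round to 68
print(continuous_gate(calc, pdf))  # False: 0.2773 of drift is still there
```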
B.3 §3 residuals after slice 27b (RdSAP10 §15 element-area rounding)
fixture | LINE_31 Δ | LINE_33 Δ | LINE_36 Δ | LINE_37 Δ
000474 | ✓ | ✓ | ✓ | ✓
000477 | ✓ | ✓ | ✓ | ✓
000480 | ✓ | ✓ | ✓ | ✓
000487 | 8.83 | 37.79 | 1.32 | 39.11
000490 | ✓ | ✓ | ✓ | ✓
000516 | ✓ | ✓ | ✓ | ✓
§3 now closes for 5 of 6 fixtures at abs=1e-4. Slice 27b applied the RdSAP10 §15 (p.66) rounding policy: "All element areas (gross) including window areas: 2 d.p." Per-element gross wall / party / roof / floor / window / door / alt-wall / RR-sub-area inputs to the §3 cascade are now rounded to 2 d.p. before A × U.
The remaining work is on 000487 — the worst fixture — driven by an
RR detailed-surface lodgement defect + a U=0.86 external-gable variant
our gable_wall enum doesn't handle. That's slice 25.
B.4 §4 residuals
fixture | section §4 pin status
000474 | 9/9 ✓
000477 | 5/9 (combi loss LINE_61m diverges → cascades to 62/64/65)
000480 | 9/9 ✓
000487 | 1/9 (LINE_43 + every monthly fails — HW lodgement defect)
000490 | 9/9 ✓
000516 | 9/9 ✓
B.5 Recent slices (in reverse order — newest first)
Slice 25b: 000487 §4 closure (7/8) — has_electric_shower + mixer/electric counts on SapHeating, Appendix J step 2a fix
Slice 25a: 000487 §3 closure — detailed RR + gable_wall_external + Ext1 alt U=1.9 + §3.8 max-floor roof + half-up rounding
Slice 26c: §7 mean internal temp cascade pin (60 cases, 44 PASS) — LINE_85..94
Slice 26b: §6 solar gains cascade pin (12 cases, 10 PASS) + SapRoofWindow solar attrs + plumb to §6 cascade
Slice 26: §5 internal gains cascade pin (54 cases, 50 PASS / 4 FAIL) + rooflight plumb to daylight factor
Slice 27b: §3 element-area + door-area rounding to 2 d.p. per RdSAP10 §15 (p.66)
Slice 27: BS EN ISO 13370 floor U rounded to 2 d.p. per RdSAP10 §5.12
Slice 24: rooflight (line 27a) — SapRoofWindow datatype + 000516 cascade closure
ac68cf88 Slice 23: 000516 detailed RR + exposed_floor + door_count fixture lodgement
6be8fdb7 Slice 22: per-window curtain resistance fix (mixed glazing)
024244ec Slice 21d: §3 cascade pins + heat_transmission_section_from_cert helper
778b150c Slice 21e: §4 water heating cascade pins (42/54 PASS)
5b7dbe2c Slice 21c: §2 cascade pins + ventilation_from_cert helper (96 PASS)
c1472330 Slice 21b: §1 cascade pins (12/12 PASS)
20424a2d Slice 21a: relabel SAP 10.3 → SAP 10.2 in calculator docstrings
4c2f37f6 Slice 19b: drop loose-tolerance fuel cost tests (rel=0.15, rel=0.05)
6bfb0614 Slice 19a: strict cascade-pin scoreboard for SapResult vs U985 PDFs
e2d9f77d Slice 20: lodge per-window u_value on mixed-glazing fixtures
5e34594d Slice 18a: sap_heating lodgement on 000480 / 487 / 516
8786b907 Slice 17: wire Appendix L inputs into 000480 / 487 / 516
§C — Work queue (in priority order)
C.1 Slice 24 — Rooflight (line 27a) heat transmission, for 000516 DONE
Done. 000516 PDF lodged 1.18 m² rooflight on line (27a) at U_eff=2.9930 →
3.5317 W/K. Wired by adding SapRoofWindow datatype to EpcPropertyData
and iterating epc.sap_roof_windows alongside vertical windows in
heat_transmission_from_cert — same SAP10.2 §3.2 curtain transform R=0.04
applied; rooflight area subtracted from main part's roof gross. Raw U=3.40
sourced from RdSAP10 Table 24 (p.50/113) "Roof window" column.
§3 LINE_33 residual for 000516: 0.8215 W/K → 0.0038 W/K. Remaining 0.0038 is the same pre-existing wall-perimeter + per-window curtain precision drift biting 000474/477/480/490 — closes in slice 27.
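The rooflight numbers above can be checked by hand with the curtain transform, assuming it takes the usual form U_eff = 1 / (1/U + R). The function names are illustrative, not the repo's API.

```python
CURTAIN_RESISTANCE = 0.04  # m²K/W, the R=0.04 quoted above


def effective_u(raw_u: float, resistance: float = CURTAIN_RESISTANCE) -> float:
    """Add the curtain resistance in series with the raw glazing U-value."""
    return 1.0 / (1.0 / raw_u + resistance)


def heat_loss_w_per_k(area_m2: float, raw_u: float) -> float:
    """A × U_eff for a single glazed element."""
    return area_m2 * effective_u(raw_u)


# 000516 rooflight: raw U=3.40 (Table 24), area 1.18 m².
print(f"{effective_u(3.40):.4f}")              # compare with lodged U_eff 2.9930
print(f"{heat_loss_w_per_k(1.18, 3.40):.4f}")  # compare with lodged 3.5317 W/K
```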
C.2 Slice 25 — 000487 §3 RR + external gable variant DONE (slice 25a)
§3 now fully closes for 000487. Remaining work: §4 HW lodgement (slice 25b — 000487 cert has 1 bath + 1 electric shower, no mixer outlet; calc treats "no mixer outlets" as "no shower", bumping Nbath from 0.13N+0.19 to 0.35N+0.50 and over-counting bath volume 2.5×).
Spec source: SAP 10.2 Appendix J step 2a (p.81) — Nbath = 0.13N + 0.19 if shower also present (including electric); = 0.35N + 0.50 if no shower present. Fix needs: lodge electric-shower presence on cert, plumb
has_electric_shower through water_heating_section_from_cert, OR the
fixture-shower-count refactor that closes 000477 LINE_61 simultaneously.
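A sketch of the step 2a branch as quoted above, with `has_shower` driven by electric-shower presence OR any mixer flow rate. Function names and argument shapes are assumptions for illustration; the real signatures live in `water_heating.py`.

```python
from typing import Sequence


def has_shower(has_electric_shower: bool,
               mixer_flow_rates: Sequence[float]) -> bool:
    """Any lodged shower counts: electric or mixer."""
    return has_electric_shower or len(mixer_flow_rates) > 0


def n_bath(occupancy: float, shower_present: bool) -> float:
    """Appendix J step 2a: baths per day as a function of occupancy N."""
    if shower_present:
        return 0.13 * occupancy + 0.19
    return 0.35 * occupancy + 0.50


# 000487 shape: one electric shower, no mixer outlets. N=2.0 is illustrative.
shower = has_shower(has_electric_shower=True, mixer_flow_rates=[])
print(n_bath(2.0, shower))  # correct branch
print(n_bath(2.0, False))   # the pre-fix branch, which over-counts baths
```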
C.3 Slice 26+ — §5 / §6 / §7 / §8 / §9a / §10a / §11a / §12 cascade pins
The cascade pin work continues in worksheet order. For each section:
- Identify the cert→inputs cascade entry point. May need to extract a `<section>_from_cert(epc)` helper from `cert_to_inputs` (mirroring slice 21c's `ventilation_from_cert`, 21d's `heat_transmission_section_from_cert`, 21e's `water_heating_section_from_cert`, 26's `internal_gains_section_from_cert`).
- Map fixture `LINE_X_<NAME>` constants to result struct attributes.
- Add scalar + monthly pin tests at abs=1e-4 to `test_section_cascade_pins.py`.
- Run, see failures, diagnose. Fixture defect or calculator bug: fix in place, no widening.
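The mapping step might look like the sketch below. Everything here is hypothetical: `FakeSectionResult`, the constant names and the attribute names are stand-ins invented for illustration, and the real tests use pytest rather than a plain helper.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

ABS_TOL = 1e-4


@dataclass
class FakeSectionResult:
    """Hypothetical stand-in for a section result struct."""
    total_gains_w: float
    lighting_kwh: float


# Hypothetical map: fixture constant name -> result attribute.
LINE_TO_ATTR = {
    "LINE_73_TOTAL_GAINS": "total_gains_w",
    "LINE_232_LIGHTING_KWH": "lighting_kwh",
}


def failing_pins(result: FakeSectionResult,
                 constants: Dict[str, float]) -> List[Tuple[str, float]]:
    """List (constant, |diff|) for every pin outside abs=1e-4."""
    failures = []
    for name, attr in LINE_TO_ATTR.items():
        diff = abs(getattr(result, attr) - constants[name])
        if diff > ABS_TOL:
            failures.append((name, diff))
    return failures


result = FakeSectionResult(total_gains_w=412.3456, lighting_kwh=310.0)
pdf = {"LINE_73_TOTAL_GAINS": 412.3456, "LINE_232_LIGHTING_KWH": 310.5}
print(failing_pins(result, pdf))
```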
Sections still to pin:
- §5 internal gains (lines 66-73 + 232 lighting kWh): DONE (slice 26).
- §6 solar gains (lines 83-84): DONE (slice 26b; 5/6 fixtures close, 000477/487 cascade from §4).
- §7 mean internal temperature (lines 85-94): MOSTLY DONE (slice 26c; 44/60 PASS; LINE_92/93 marginal ~0.0001 K residual on 000474/477/480/490 needs investigation; 000487 cascades from §3/§4 defects).
- §8 space heating (lines 95-99). 4 monthly + 2 annual.
- §9a energy requirements (lines 201, 206-208, 211-215, 219). 5 scalar + 2 monthly. Currently only the annual aggregates show on `SapResult`; may need monthly exposure.
- §10a fuel costs (lines 240-255). 17+ line refs.
- §11a SAP rating (lines 256-258). 3 line refs.
- §12 environmental (lines 261-282). CO2 + primary energy + EI rating.
Some fixtures' constants for these sections may be missing — check first. PDF extraction commands (sample for §9a):
awk '/^9a\. Energy requirements/,/^10a\./' "sap worksheets/U985-0001-NNNNNN.txt"
C.4 Slice 27 — Floor-U precision DONE (mostly)
Done. The §5.12 spec mandates "rounded to two decimal places" for BS EN ISO
13370 floor U-values, which my calc was skipping. Applied round(U, 2) to
both suspended-timber and solid-floor branches in u_floor — closed
000474/477/490 from ~0.03–0.13 W/K residual to under 0.002 W/K on each.
Remaining 0.0013–0.0075 W/K residual is wall + party-wall area precision —
PDF stores 2-d.p.-rounded element areas (e.g. 36.4500 m² for a wall I
compute as 36.4492 m²). Closing these needs the §3 area-rounding spec
rule — see slice 27b below.
C.4b Slice 27b — §3 element-area rounding DONE
Done. RdSAP10 §15 (p.66) lodges the rounding policy: "All element areas
(gross) including window areas: 2 d.p." Applied to gross wall + party
wall + roof + floor + window + door + alt-wall + RR-sub-area inputs in
heat_transmission_from_cert. §3 cascade pins (LINE_31/33/36/37) now
close at abs=1e-4 for 5 of 6 fixtures; 000487 alone remains failing on
the RR defect (slice 25).
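A sketch of rounding an element area to 2 d.p. before A × U. Half-up rounding via `decimal` is an assumption here (the source only says "2 d.p."); built-in `round` on binary floats can land the other way on ties. Helper names and the U-value are illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_area_2dp(area_m2: float) -> float:
    """Round to 2 d.p., half-up, working on the decimal repr of the float."""
    quantised = Decimal(str(area_m2)).quantize(Decimal("0.01"),
                                               rounding=ROUND_HALF_UP)
    return float(quantised)


def element_heat_loss(area_m2: float, u_value: float) -> float:
    """A × U with the area rounded first, as the §15 policy requires."""
    return round_area_2dp(area_m2) * u_value


print(round_area_2dp(36.4492))           # wall area example from slice 27
print(round_area_2dp(1.005))             # tie case where round(1.005, 2) gives 1.0
print(element_heat_loss(36.4492, 0.30))  # U=0.30 is a made-up value
```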
C.5 Slice 28 — Continuous SAP / fuel cost / CO2 closure
Once §1-§9a all close at abs=1e-4, the downstream pins
(total_fuel_cost_gbp, ecf, sap_score_continuous, co2_kg_per_yr) tighten
mechanically. Re-run the SapResult pin matrix; whatever still fails has a
section-specific residual to chase.
§D — How to work (toolbox)
D.1 Cascade pin diagnostic loop
When a pin fails:
- Add a TEMP diagnostic test in `packages/domain/src/domain/sap/worksheet/tests/test_<thing>_diag_TEMP.py` that dumps the cascade output alongside the PDF expected.
- Compare element-by-element against the PDF block (use `awk` to extract the relevant §X PDF block).
- Identify the drift source: fixture defect or calc bug.
- Fix. Re-run the pin test.
- Delete the TEMP file before committing. Never commit `_TEMP.py` files.
D.2 Spec lookups
User has given these page anchors:
- Table 11 (secondary heating fraction): p 188
- Table 12 (fuel prices/CO2/PEF): p 189
- Table 12a (standing charges, off-peak): p 191
- Table 3a (water heating single-system): p 160
- Table 3b (water heating combi PCDB): p 161
- Table 3c (water heating two-profile): p 162
For other pages, ask the user. Don't scan more than ~50 lines of spec PDF without permission.
D.3 PDF extraction
Worksheet PDFs are in sap worksheets/ (note the space — quote in shell).
Each fixture has U985-0001-NNNNNN.{pdf,txt} (intermediate values) and
Summary_NNNNNN.pdf (cert lodgement).
PDF blocks for sections (sample for §3):
awk '/^3\. Heat losses/,/Thermal mass parameter/' "sap worksheets/U985-0001-000474.txt"
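If the block is needed inside a TEMP diagnostic test rather than the shell, the awk range pattern can be approximated in Python. This is a sketch under that assumption; `extract_block` is a hypothetical helper and the sample lines are invented, not real worksheet text.

```python
import re
from typing import Iterable, List


def extract_block(lines: Iterable[str], start: str, end: str) -> List[str]:
    """Mimic awk '/start/,/end/': inclusive ranges, may match more than once."""
    start_re, end_re = re.compile(start), re.compile(end)
    out: List[str] = []
    inside = False
    for line in lines:
        if not inside:
            if not start_re.search(line):
                continue
            inside = True  # range opens on this line
        out.append(line)
        if end_re.search(line):
            inside = False  # range closes on this line, inclusive
    return out


sample = [
    "2. Ventilation rate",
    "3. Heat losses and heat loss parameter",
    "Doors 1.85",
    "Thermal mass parameter 250",
    "4. Water heating energy requirements",
]
for row in extract_block(sample, r"^3\. Heat losses", r"Thermal mass parameter"):
    print(row)
```

For a real fixture, pass the lines of the `.txt` worksheet read with `pathlib.Path(...).read_text().splitlines()`.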
D.4 Section helpers (cascade-pin enablers)
Already extracted in domain.sap.rdsap.cert_to_inputs:
- `dimensions_from_cert(epc) -> Dimensions` (§1)
- `ventilation_from_cert(epc) -> VentilationResult` (§2, slice 21c)
- `heat_transmission_section_from_cert(epc) -> HeatTransmission` (§3, slice 21d)
- `water_heating_section_from_cert(epc) -> WaterHeatingResult` (§4, slice 21e)
For §5/§6/§7/§8/§9a/§10a/§11a/§12 you may need to extract similar helpers.
The existing internal_gains_from_cert, solar_gains_from_cert, etc. mostly
exist already — check whether they're already public on the worksheet/* module.
D.5 Hard rules summary card
| do | don't |
|---|---|
| `pytest.approx(..., abs=1e-4)` | `rel=…` |
| Audit fixture against PDF first | Diagnose downstream first |
| Leave failing pins, fix one at a time | Widen tolerance / add xfail |
| Quote PDF page when asking for spec | Scan >50 lines of PDF without asking |
| `[[reference-style]]` cross-links in memory | Bare prose references |
| Delete `_TEMP.py` before commit | Commit diagnostic scripts |
§E — Key files
docs/sap-spec/sap-10-2-full-specification-2025-03-14.pdf Spec PDF
docs/sap-spec/HANDOVER_NEXT.md This file
docs/sap-spec/PARITY_FINDINGS.md Older findings
sap worksheets/ U985 + Summary PDFs
packages/domain/src/domain/sap/calculator.py Top-level SAP10.2 orchestrator
packages/domain/src/domain/sap/rdsap/cert_to_inputs.py Cert→CalculatorInputs
+ section_from_cert helpers
packages/domain/src/domain/sap/tables/table_12.py SAP 10.2 Table 12 (price/CO2/PEF)
packages/domain/src/domain/sap/tables/table_12a.py Off-peak high-rate fraction
packages/domain/src/domain/sap/tables/table_32.py RdSAP 10 Table 32 (cost prices)
packages/domain/src/domain/sap/worksheet/
dimensions.py §1
ventilation.py §2 + VentilationResult
heat_transmission.py §3 + HeatTransmission
water_heating.py §4 + WaterHeatingResult + water_heating_from_cert
internal_gains.py §5 + InternalGainsResult + internal_gains_from_cert
solar_gains.py §6 + solar_gains_from_cert
mean_internal_temperature.py §7
space_heating.py §8 + SpaceHeatingResult
fabric_energy_efficiency.py §8f
space_cooling.py §8c
fuel_cost.py §10a + FuelCostResult
rating.py §11/§13 SAP rating equations (10.2 constants — DO NOT TOUCH)
packages/domain/src/domain/sap/worksheet/tests/
test_section_cascade_pins.py Strict per-section line-ref pins (THE work)
test_e2e_elmhurst_sap_score.py SapResult-field pins + monthly_infiltration_ach pin
_elmhurst_worksheet_NNNNNN.py The 6 fixture modules (1 per fixture)
_elmhurst_fixtures.py ALL_FIXTURES registry
test_dimensions.py / _ventilation.py / _heat_transmission.py / ...
← LEGACY per-section isolation tests; use PDF values as INPUTS.
Keep them but understand they don't test the cascade.
§F — Definitely do NOT
- Do not widen any tolerance, ever.
- Do not add xfail to cascade pins.
- Do not "investigate later" by widening — fix it or leave it failing.
- Do not assume the calculator is wrong before auditing the fixture.
- Do not touch `rating.py` constants.
- Do not scan unread spec PDF pages without asking the user.
- Do not invoke `/ultrareview`.
- Do not auto-update unrelated `git status` items (deletions / new files that aren't from your work).
§G — Quick orient
# Run full cohort pin matrix
python -m pytest \
packages/domain/src/domain/sap/worksheet/tests/test_section_cascade_pins.py \
packages/domain/src/domain/sap/worksheet/tests/test_e2e_elmhurst_sap_score.py \
--no-header --no-cov --tb=no -q
# Run §3 pins only
python -m pytest packages/domain/src/domain/sap/worksheet/tests/test_section_cascade_pins.py::test_section_3_line_refs_match_pdf -v --no-header --no-cov --tb=no
# Run a single SapResult pin to see numeric diff
python -m pytest \
"packages/domain/src/domain/sap/worksheet/tests/test_e2e_elmhurst_sap_score.py::test_sap_result_pin[000477-space_heating_kwh_per_yr]" \
--no-cov 2>&1 | grep AssertionError
# PDF §X block
awk '/^X\. Section/,/^Y\./' "sap worksheets/U985-0001-NNNNNN.txt"
End of handover. Read §A again before starting.