Documents the 5-slice session that closed the prior handover's
"precision floor" cluster end-to-end:
S0380.26 RdSAP10 §5.8 dry-lining adjustment (cert 7700)
S0380.27 floor_construction_type → _main_floor_u_value (cert 9796)
S0380.28 SAP 10.2 Appendix N fn 43 reciprocal η interpolation
(closes the +0.03..+0.06 ASHP cluster cohort-wide)
S0380.29 _ASHP_COHORT_CHAIN_TOLERANCE 0.07 → 0.04
S0380.30 glazing codes 8-15 (RdSAP 21 schema) — closes API path
cohort-1 +0.014..+0.031 cluster
Final state:
Cohort-2 Summary path (38): 33 exact + 5 ≤0.07
Cohort-1 ASHP cohort (7): 6/7 <1e-4 both Summary + API paths
cert 2636 -0.015 (cantilever, path-symmetric) — only open thread
The prior `HANDOVER_CERT_0380_MIT_CASCADE.md` had concluded the
+0.04 ASHP cluster was unfixable without Elmhurst access; the
spec citation (SAP 10.2 Appendix N fn 43) was sitting in the same
PDF that handover referenced. Be skeptical of "spec-precision
floor" framing — see [[feedback-spec-floor-skepticism]].
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Handover — precision floors closed, only cantilever residual + cohort-2 tail remain
Branch feature/per-cert-mapper-validation. This session shipped
5 slices (S0380.26 → S0380.30) that closed the entire "spec-precision
floor" cluster the prior handover
(HANDOVER_COHORT_2_PRECISION_FLOOR.md)
described. Two of those — the η interpolation bug and the glazing
code table — were real spec-citation cascade bugs, not vendor
precision drift. The user's feedback-one-e-minus-4-across-the-board
posture (skeptical of "precision floor" framing) was correct on both.
HEAD at handover start: faf116bd (Slice S0380.30).
User's stated goal (carried forward verbatim)
I've added some more test cases, in the same format, in here:
sap worksheets/additional with api 2
We should check that the Elmhurst mapping works and then the api
Target: 1e-4 across the board for every cert per feedback-one-e-minus-4-across-the-board — HPs included.
Slices shipped this session
| Slice | Commit | What |
|---|---|---|
| S0380.26 | c144d444 | RdSAP10 §5.8 + Table 14 dry-lining R=0.17 adjustment on alt walls. Closes cert 7700 -0.44 → +5e-5. New AlternativeWall.dry_lined: bool, Elmhurst extractor reads "Alternative Wall N Dry-lining: Yes/No", mapper threads wall_dry_lined="Y", u_wall(dry_lined=True) applies §5.8 R=0.17 at as-built bucket only. |
| S0380.27 | 012cbd18 | Thread floor_construction_type into _main_floor_u_value per heat_transmission's effective_floor_description rule. Closes cert 9796 +0.55 → +0.00174. Cert 8135 golden PE -4.96 → -0.07 kWh/m² (same broken-helper mechanism). |
| S0380.28 | 081bb8fd | SAP 10.2 Appendix N footnote 43 (PDF p.101 line 7053) reciprocal-linear PSR η interpolation: 1/η = (1−t)/η_low + t/η_high. Cascade was using linear-on-η directly. Closes the +0.03..+0.06 ASHP cluster across cohort-1 + cohort-2. |
| S0380.29 | e27b923b | Tighten _ASHP_COHORT_CHAIN_TOLERANCE 0.07 → 0.04 (~30% headroom over worst residual). |
| S0380.30 | faf116bd | Extend _G_LIGHT_BY_GLAZING_CODE + _G_PERPENDICULAR_BY_GLAZING_TYPE to cover RdSAP 21 codes 8-15 (per datatypes/epc/domain/epc_codes.csv). Closes the cohort-1 API path +0.014..+0.031 cluster (5 of 6 certs to <1e-4); the cohort uses code 14 (triple 2022+), which pre-slice fell to the DG default. |
All on branch feature/per-cert-mapper-validation. Each includes unit
tests, pyright net-zero on touched files.
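The dry-lining adjustment in S0380.26 can be illustrated with a minimal sketch. It assumes the R=0.17 term is applied as a plain series resistance on top of the as-built U-value; the function name and the sample U-value are hypothetical and do not reproduce the real u_wall signature.

```python
DRY_LINING_R = 0.17  # m²K/W, the resistance quoted for S0380.26


def dry_lined_u_value(u_as_built: float, r_extra: float = DRY_LINING_R) -> float:
    """Add a series resistance to a wall U-value: U' = 1 / (1/U + R).

    Assumption: the adjustment is a simple series resistance. The real
    u_wall(dry_lined=True) helper may route this differently.
    """
    if u_as_built <= 0:
        raise ValueError("U-value must be positive")
    return 1.0 / (1.0 / u_as_built + r_extra)


u_before = 2.1  # hypothetical as-built alt-wall U-value, W/m²K
u_after = dry_lined_u_value(u_before)
print(f"{u_before:.2f} -> {u_after:.4f} W/m²K")
```

Because the extra resistance always lowers U, a missed dry-lining flag shows up as too much fabric heat loss and a SAP score that is too low, which matches the sign of cert 7700's pre-slice -0.44.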
Cohort distributions at HEAD
Cohort-2 (38-cert dataset, Summary path)
| Bucket (\|Δ\|) | Session start | Now | Δ |
|---|---|---|---|
| exact (<1e-4) | 22 | 33 | +11 |
| 1e-4..0.07 | 14 | 5 | -9 |
| 0.07..0.5 | 1 | 0 | -1 |
| 0.5..1 | 1 | 0 | -1 |
| 1..5 | 0 | 0 | = |
| >5 | 0 | 0 | = |
| RAISES | 0 | 0 | = |
Cohort-2 ≤0.07 residuals remaining:
| Cert | Δ SAP | Pattern |
|---|---|---|
| 2536-2525-0600-0788-2292 | +0.00072 | Shared 3-cert +0.0007 pattern |
| 2800-7999-0322-4594-3563 | +0.00068 | (same) |
| 4800-3992-0422-0599-3563 | +0.00068 | (same) |
| 6835-3920-2509-0933-5226 | +0.01453 | PV cert (slices S0380.23+S0380.25 closed bulk; tail remains) |
| 9380-2957-7490-2595-3141 | +0.02732 | Gas cert; unrelated to ASHP cluster |
Cohort-1 ASHP cohort (7-cert dataset, Summary + API paths)
| Cert | Summary delta | API delta | Notes |
|---|---|---|---|
| 0380 | +1e-6 | +9e-7 | EXACT both paths |
| 0350 | +2.2e-5 | +2.2e-5 | EXACT both paths |
| 2225 | -4.8e-5 | -4.8e-5 | EXACT both paths |
| 2636 | -0.01495 | -0.01495 | Cantilever fixture — same residual on both paths |
| 3800 | -2e-5 | -2e-5 | EXACT both paths |
| 9285 | -3.4e-5 | -3.4e-5 | EXACT both paths |
| 9418 | -4e-7 | -4e-7 | EXACT both paths |
Summary EPC ≡ API EPC for the cascade outputs on 6 of 7 ASHP cohort certs (cross-mapper parity validated end-to-end). Cert 2636 is the same residual both ways — the bug is path-agnostic, in the cantilever cascade.
★ Open threads with diagnoses (priority order)
1. Cert 2636 cantilever residual (-0.01495 SAP, both paths)
Setup: Mid-Terrace house age D, alt-wall + cantilever (3.74 m² / 9.5% of ground floor, first-floor-over-passageway). PCDB 104568 ASHP. Mid-terrace bungalow cantilever is the most complex geometry in the ASHP cohort. Worksheet "SAP value" 86.2641.
Diagnosis (NOT done this session — fresh investigation needed):
Cohort-1 ASHP cohort closes to <1e-4 on 6 of 7 certs after S0380.28 (reciprocal η) + S0380.30 (glazing codes). Cert 2636 stays at -0.015 on both paths identically — the cascade outputs are the same on Summary EPC and API EPC. So:
- This is NOT a mapper bug (path-symmetric).
- This is NOT η interpolation (PSR matches worksheet).
- This is NOT a glazing-code bug (S0380.30 already closed that cluster, and this residual persists after it).
Likely candidates (worth probing in order):
- Cantilever exposed-floor U-value — Table 20 lookup at cert 2636's geometry (3.74 m² cantilever / age D ground floor). Slice 102f-prep.9 added RdSAP cantilever exposed-floor detection; verify Table 20 row + insulation thickness routing.
- Cantilever in (31) total external area: used for thermal bridging. The 3.74 m² should add to (31) once (heat_transmission.py:828-837 includes cantilever_area in part_external_area).
- Alt-wall window allocation: cert 2636's §11 has the 1.19 m² alt-wall window (S0380.12 closed the window-location parser). Verify the area deduction lands on the alt wall, not the main wall.
Probe recipe (analogous to the cert 9796 / cert 3336 probes earlier this session):
# Compare cascade line-by-line vs worksheet for cert 2636
# heat_transmission components (33)/(31)/(36)/(37), monthly (38)/(39)/(40),
# (94) η_whole, (98)m space heating, and trace where the -0.015 enters.
# If a non-zero delta appears between cascade and worksheet for any single
# section line ref, that's the gap. If every component matches at 1e-4,
# the residual must come from the η_main_heating step (post-N3.6 in-use
# factor or similar).
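The recipe above amounts to walking the worksheet line refs in order and reporting every ref where the cascade differs by more than 1e-4. A minimal sketch, assuming both sides are already available as plain dicts keyed by line ref; the helper name and all numeric values are made up for illustration and are not cert 2636 data.

```python
from typing import Optional

TOL = 1e-4


def divergent_lines(
    cascade: dict[str, float],
    worksheet: dict[str, float],
    tol: float = TOL,
) -> list[tuple[str, Optional[float]]]:
    """Return (line_ref, cascade - worksheet) for each ref outside tol.

    Refs are visited in worksheet order, so the first entry is where the
    gap enters. A ref missing from the cascade is reported with None.
    """
    out: list[tuple[str, Optional[float]]] = []
    for ref, ws_value in worksheet.items():
        cascade_value = cascade.get(ref)
        if cascade_value is None:
            out.append((ref, None))
        elif abs(cascade_value - ws_value) > tol:
            out.append((ref, cascade_value - ws_value))
    return out


# Hypothetical values only: (31) and (33) agree, (36) diverges and
# carries the same gap into (37).
worksheet = {"(31)": 182.40, "(33)": 95.12, "(36)": 27.36, "(37)": 122.48}
cascade = {"(31)": 182.40, "(33)": 95.12, "(36)": 27.92, "(37)": 123.04}

for ref, delta in divergent_lines(cascade, worksheet):
    print(ref, delta)
```

An empty result at 1e-4 on every fabric and monthly line is the signal to move on to the η_main_heating step.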
2. Cohort-2 cert 9380 (+0.027) and cert 6835 (+0.015)
Both gas certs (no ASHP precision-floor mechanism). Likely cohort-2-specific mapper details surfaced after the ASHP cluster closed.
- Cert 6835 had two prior slices (S0380.23 PV %-of-roof, S0380.25 SAP code 2111/2113 control type). Remaining +0.015 may be a small lighting/HW detail.
- Cert 9380 hasn't had a dedicated slice yet — first place to look: Summary §11 windows lodgement, §14 heating controls, §15 thermal mass.
Standard probe: compare cascade end-state (SAP, ECF, total_fuel_cost, main_heating_fuel_kwh, hot_water_kwh, lighting_kwh) vs worksheet section 1 readouts → isolate which line ref diverges.
3. Cohort-2 certs 2536 / 2800 / 4800 (+0.0007 shared pattern)
Three certs at +0.00068..+0.00072 SAP — suspiciously consistent. Likely a shared small artifact (rounding step, fuel-cost decimal precision, internal gains rounding, etc.). Could close as one slice if the shared cause is found.
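One cheap way to confirm that the three certs really form a single group is to sort the remaining residuals and split wherever the gap between neighbours exceeds a small threshold. The sketch below uses the five cohort-2 deltas listed earlier; the helper name and the 1e-4 gap threshold are assumptions made for illustration.

```python
def cluster_by_gap(
    residuals: dict[str, float], gap: float = 1e-4
) -> list[list[tuple[str, float]]]:
    """Group certs whose sorted residuals sit within `gap` of their neighbour."""
    groups: list[list[tuple[str, float]]] = []
    for cert, delta in sorted(residuals.items(), key=lambda kv: kv[1]):
        if groups and delta - groups[-1][-1][1] <= gap:
            groups[-1].append((cert, delta))
        else:
            groups.append([(cert, delta)])
    return groups


# Cohort-2 residuals from the table above (short cert prefixes).
residuals = {
    "2536": 0.00072,
    "2800": 0.00068,
    "4800": 0.00068,
    "6835": 0.01453,
    "9380": 0.02732,
}

for group in cluster_by_gap(residuals):
    print([cert for cert, _ in group])
```

With these inputs the three small residuals land in one group and the two larger certs each stand alone, which supports treating 2536 / 2800 / 4800 as one slice.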
4. API path closure for cohort-2 (all 38 certs)
Longstanding goal from the prior handover, NOT addressed this session.
Process:
- Fetch + persist JSON via EpcClientService._fetch_certificate (token in backend/.env as OPEN_EPC_API_TOKEN).
- Mirror Summary chain tests on the API path. Pattern: see the test_api_* family in backend/documents_parser/tests/test_summary_pdf_mapper_chain.py.
- Cross-mapper EPC parity (Summary EPC ≡ API EPC for load-bearing fields): the user's longstanding north star. After S0380.30, the cohort-1 ASHP cohort already passes this parity at <1e-4 cascade output on 6 of 7 certs. Cohort-2 should be similar but needs verification.
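A cross-mapper parity probe can be sketched as a field-by-field comparison of the two mapped EPCs, with numeric fields compared by absolute difference and everything else by equality. The sketch assumes each EPC has been flattened to a dict; the helper name, the field list, and the sample values are hypothetical, not the real EpcPropertyData shape.

```python
from numbers import Real


def parity_mismatches(
    summary_epc: dict, api_epc: dict, fields: list[str], tol: float = 1e-4
) -> dict[str, tuple]:
    """Return {field: (summary_value, api_value)} for fields that disagree."""
    mismatches: dict[str, tuple] = {}
    for name in fields:
        a = summary_epc.get(name)
        b = api_epc.get(name)
        # bool is a Real subclass in Python; treat it as categorical.
        numeric = (
            isinstance(a, Real) and isinstance(b, Real)
            and not isinstance(a, bool) and not isinstance(b, bool)
        )
        same = abs(a - b) <= tol if numeric else a == b
        if not same:
            mismatches[name] = (a, b)
    return mismatches


# Hypothetical flattened EPCs for one cert.
summary_epc = {"glazing_code": 14, "total_floor_area": 84.20, "wall_dry_lined": "Y"}
api_epc = {"glazing_code": 14, "total_floor_area": 84.20004, "wall_dry_lined": "N"}

print(parity_mismatches(summary_epc, api_epc, list(summary_epc)))
```

Running the probe per cert before comparing SAP scores separates mapper disagreements from cascade disagreements, which is the distinction that identified cert 2636 as path-agnostic.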
5. Tighten _ASHP_COHORT_CHAIN_TOLERANCE 0.04 → smaller
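Once cert 2636 closes (thread 1) the tolerance can drop to ~0.001 or similar. The current 0.04 is about 2.7× the magnitude of cert 2636's -0.015; the ~30% headroom quoted for S0380.29 was measured against the worst residual at that point, before S0380.30 closed the glazing cluster.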
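The headroom arithmetic can be checked directly from the cohort-1 Summary deltas in the table above. This is a minimal sketch; the helper names are hypothetical and the 0.001 candidate is the figure suggested in this thread, not a committed value.

```python
# Summary-path deltas for the 7-cert ASHP cohort, copied from the table.
summary_deltas = {
    "0380": 1e-6,
    "0350": 2.2e-5,
    "2225": -4.8e-5,
    "2636": -0.01495,
    "3800": -2e-5,
    "9285": -3.4e-5,
    "9418": -4e-7,
}
CURRENT_TOLERANCE = 0.04
CANDIDATE_TOLERANCE = 0.001  # suggested target once cert 2636 closes


def worst_abs(deltas: dict[str, float]) -> float:
    """Largest absolute residual in the cohort."""
    return max(abs(d) for d in deltas.values())


def headroom(tolerance: float, deltas: dict[str, float]) -> float:
    """Fractional headroom of the tolerance over the worst residual."""
    return tolerance / worst_abs(deltas) - 1.0


without_2636 = {c: d for c, d in summary_deltas.items() if c != "2636"}

print(f"headroom at 0.04, all certs:      {headroom(CURRENT_TOLERANCE, summary_deltas):.0%}")
print(f"worst residual excluding 2636:    {worst_abs(without_2636):.1e}")
print(f"headroom at 0.001 excluding 2636: {headroom(CANDIDATE_TOLERANCE, without_2636):.0%}")
```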
Test baseline at HEAD
PYTHONPATH=/workspaces/model python -m pytest \
backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
backend/documents_parser/tests/test_elmhurst_extractor.py \
backend/documents_parser/tests/test_elmhurst_end_to_end.py \
domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
domain/sap10_calculator/worksheet/tests/test_water_heating.py \
domain/sap10_calculator/worksheet/tests/test_mean_internal_temperature.py \
domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
domain/sap10_calculator/tests/test_pcdb_table_362_lookup.py \
domain/sap10_ml/tests/test_rdsap_uvalues.py \
datatypes/epc/schema/tests/test_schema_loading.py \
--no-cov -q
Expected: 711 pass + 10 pre-existing fails (9 × cert 001479 Layer 1 hand-built skeleton + 1 × pre-existing FEE round-trip).
Diagnostic probe script
Cohort-2 Summary path sweep (full distribution):
PYTHONPATH=/workspaces/model python <<'PY'
import re, subprocess
from collections import defaultdict
from pathlib import Path

from backend.documents_parser.tests.test_summary_pdf_mapper_chain import _summary_pdf_to_textract_style_pages
from backend.documents_parser.elmhurst_extractor import ElmhurstSiteNotesExtractor
from datatypes.epc.domain.mapper import EpcPropertyDataMapper, UnmappedElmhurstLabel
from domain.sap10_calculator.rdsap.cert_to_inputs import (
    cert_to_inputs, SAP_10_2_SPEC_PRICES, UnresolvedPcdbCombiLoss,
)
from domain.sap10_calculator.calculator import calculate_sap_from_inputs

src_root = Path('/workspaces/model/sap worksheets/additional with api 2')
buckets = defaultdict(list)

def bucket(d):
    # Map a signed SAP delta to its |Δ| bucket label.
    a = abs(d)
    if a < 1e-4: return "exact"
    if a < 0.07: return "<=0.07"
    if a < 0.5: return "0.07..0.5"
    if a < 1: return "0.5..1"
    if a < 5: return "1..5"
    return "5+"

for cd in sorted(src_root.iterdir()):
    if not cd.is_dir() or cd.name.startswith('.'): continue
    sp = next(cd.glob("Summary_*.pdf"), None)
    ws_pdf = next(cd.glob("dr87-*.pdf"), None)
    if not (sp and ws_pdf): continue
    # The worksheet "SAP value" is the comparison target.
    out = subprocess.run(["pdftotext", str(ws_pdf), "-"], capture_output=True, text=True).stdout
    m = re.search(r"SAP value\s*\n?\s*([\d.]+)", out)
    if m is None:
        # Without a worksheet value there is nothing to diff against.
        print(f"skip {cd.name}: no 'SAP value' found in worksheet text")
        continue
    ws_sap = float(m.group(1))
    try:
        sn = ElmhurstSiteNotesExtractor(_summary_pdf_to_textract_style_pages(sp)).extract()
        epc = EpcPropertyDataMapper.from_elmhurst_site_notes(sn)
        r = calculate_sap_from_inputs(cert_to_inputs(epc, prices=SAP_10_2_SPEC_PRICES))
        d = r.sap_score_continuous - ws_sap
        buckets[bucket(d)].append((cd.name, d))
    except UnresolvedPcdbCombiLoss as e:
        buckets["RAISES (Pcdb)"].append((cd.name, e.pcdf_index))
    except UnmappedElmhurstLabel as e:
        buckets["RAISES (Elm)"].append((cd.name, str(e)))

# Print the distribution in fixed bucket order.
for b in ("exact", "<=0.07", "0.07..0.5", "0.5..1", "1..5", "5+", "RAISES (Pcdb)", "RAISES (Elm)"):
    if b in buckets:
        print(f"\n[{b}] {len(buckets[b])}:")
        for c, d in buckets[b]:
            print(f"  {c} {d}")
PY
Methodology — preserved conventions
Carried forward unchanged from prior sessions:
- 1e-4 across the board (feedback-one-e-minus-4-across-the-board)
- Worksheet, not API, is the target (feedback-worksheet-not-api-reference)
- One slice = one commit; stage by name (feedback-commit-per-slice)
- AAA test convention with literal # Arrange / # Act / # Assert (feedback-aaa-test-convention)
- abs(diff) <= tol, not pytest.approx (feedback-abs-diff-over-pytest-approx)
- Spec citation in commit messages (feedback-spec-citation-in-commits)
- Strict-enum raises on unmapped labels / unresolved cascade dispatch
- Pyright net-zero per file
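A test following the AAA and abs(diff) conventions above might look like the sketch below. The helper sap_delta and the cascade value are hypothetical stand-ins; only the worksheet figure 86.2641 comes from this handover (cert 2636's "SAP value").

```python
TOL = 1e-4


def sap_delta(cascade_sap: float, worksheet_sap: float) -> float:
    """Signed residual: cascade minus worksheet (hypothetical helper)."""
    return cascade_sap - worksheet_sap


def test_cascade_sap_matches_worksheet_within_tolerance() -> None:
    # Arrange
    worksheet_sap = 86.2641   # worksheet "SAP value" for cert 2636
    cascade_sap = 86.26412    # hypothetical cascade output

    # Act
    diff = sap_delta(cascade_sap, worksheet_sap)

    # Assert
    assert abs(diff) <= TOL
```

Writing the comparison as abs(diff) <= TOL keeps the tolerance an explicit absolute bound, whereas pytest.approx defaults to a relative tolerance that would scale with the SAP score.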
Method that worked this session — verbatim
The "spec-precision floor" framing from the prior handover was wrong on both bugs found this session. The pattern that worked:
- Pick the worst-residual cert in the open thread.
- Probe cascade vs worksheet line-by-line for every numbered line ref in the path (section 2 ventilation, section 3 fabric, section 7 MIT/η, section 8 space heating, section 9 fuel, section 10 cost). When every line matches except one, that line's input is the gap.
- Back-solve the worksheet to identify the implied parameter (cert 3336: cascade η_space=237.31 vs ws-implied 236.74 → linear vs reciprocal interpolation; cert 9796: cascade (12)=0.1 vs ws (12)=0.2 → sealed vs unsealed verdict).
- Verify against spec before claiming a fix. Both S0380.26 (RdSAP10 §5.8 + Table 14) and S0380.28 (SAP 10.2 Appendix N fn 43) found explicit spec citations matching the worksheet behavior; neither was reverse-engineering vendor implementation.
The prior handover claimed "no public spec or BRE data field would distinguish [the +0.04 cluster]" — that was wrong. SAP 10.2 footnote 43 is explicit about reciprocal interpolation. Be skeptical of "spec precision floor" framing.
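The difference between the two interpolation schemes is easy to see numerically. The sketch below implements the reciprocal form quoted for S0380.28, 1/η = (1−t)/η_low + t/η_high, next to plain linear interpolation; the function names, bracket efficiencies, and t are hypothetical, not PCDB data.

```python
def eta_linear(eta_low: float, eta_high: float, t: float) -> float:
    """Linear-on-η interpolation (what the cascade used before S0380.28)."""
    return (1.0 - t) * eta_low + t * eta_high


def eta_reciprocal(eta_low: float, eta_high: float, t: float) -> float:
    """Reciprocal-linear interpolation: 1/η = (1 - t)/η_low + t/η_high."""
    return 1.0 / ((1.0 - t) / eta_low + t / eta_high)


# Hypothetical bracket efficiencies (%) and interpolation weight.
eta_low, eta_high, t = 220.0, 260.0, 0.4

linear = eta_linear(eta_low, eta_high, t)
reciprocal = eta_reciprocal(eta_low, eta_high, t)
print(f"linear={linear:.2f} reciprocal={reciprocal:.2f} gap={linear - reciprocal:.2f}")
```

The reciprocal form is a weighted harmonic mean, so between the bracket points it never exceeds the linear result. That matches the direction seen on cert 3336, where the linear cascade η_space (237.31) sat above the worksheet-implied value (236.74).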
Pyright baselines (post-S0380.30; net-zero per slice)
- datatypes/epc/domain/mapper.py: 32
- datatypes/epc/surveys/elmhurst_site_notes.py: 0
- backend/documents_parser/elmhurst_extractor.py: 0
- backend/documents_parser/tests/test_summary_pdf_mapper_chain.py: 0
- domain/sap10_calculator/rdsap/cert_to_inputs.py: 35
- domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py: 12
- domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py: 1
- domain/sap10_calculator/tables/pcdb/parser.py: 0
- domain/sap10_calculator/tests/test_pcdb_table_362_lookup.py: 0
- domain/sap10_calculator/worksheet/heat_transmission.py: 13
- domain/sap10_calculator/worksheet/internal_gains.py: 0
- domain/sap10_calculator/worksheet/solar_gains.py: 0
- domain/sap10_calculator/worksheet/tests/test_heat_transmission.py: 71
- domain/sap10_calculator/worksheet/tests/test_solar_gains.py: 22
- domain/sap10_calculator/worksheet/tests/test_water_heating.py: 94
- domain/sap10_ml/rdsap_uvalues.py: 0
- domain/sap10_ml/tests/test_rdsap_uvalues.py: 66
Memory references
Cross-session memories load automatically. Key ones for this work:
- feedback-one-e-minus-4-across-the-board — user target is 1e-4 for HPs too.
- feedback-worksheet-not-api-reference — Summary path pins to worksheet, not API.
- feedback-cascade-pin-methodology — test the actual cascade against PDF line refs.
- reference-sap10-spec-docs: full BRE technical paper set at domain/sap10_calculator/docs/specs/.
- feedback-commit-per-slice / feedback-aaa-test-convention / feedback-abs-diff-over-pytest-approx / feedback-spec-citation-in-commits / feedback-worksheet-shape-fidelity / feedback-zero-error-strict: slicing + test conventions.
- project-cohort-2-summary-path-closure — pre-S0380.26 cohort-2 state (now superseded by this handover).
- project-summary-path-cohort-closure — cohort-1 ASHP closure context.
First concrete actions for next agent
- Re-run the diagnostic probe to confirm baseline reproduces (33 exact + 5 ≤0.07 + 0 elsewhere + 0 RAISES on cohort-2; 6/7 ASHP cohort at <1e-4 both paths; cert 2636 -0.015 both paths).
- Investigate cert 2636 cantilever residual (thread 1):
  - Probe line-by-line cascade vs worksheet for cert 2636. The fact that Summary EPC and API EPC produce the same cascade output means this is in the cascade itself, not the mapper.
  - First section to check: (28b)/(31) cantilever floor area contribution → thermal bridging factor y × (31) → (36) → (37).
  - Second: alt-wall window allocation (cert 2636's §11 lodges one alt-wall window per S0380.12).
- Cohort-2 tail closure (threads 2-3):
  - Cert 9380 +0.027: fresh cert, hasn't had a dedicated slice.
  - Cert 6835 +0.015: partially closed by S0380.23/S0380.25; tail remains.
  - Certs 2536/2800/4800 +0.0007 shared pattern: likely single shared cause.
- API path for cohort-2 (thread 4): fetch + persist 38 cert JSON, mirror Summary chain tests, add cross-mapper parity probes.
Good luck. The Summary-path cohort is in excellent shape (33/38 exact at 1e-4). The ASHP cohort is essentially closed at the cascade level (6/7 both paths at <1e-4). The remaining work is small cohort-2 residuals + cert 2636 cantilever + API-path closure for cohort-2.