Model/domain/sap10_calculator/docs/HANDOVER_FRESH_API_DEBUG.md
Khalim Conn-Kowlessar d3def1e254 docs: handover — S0380.218 closed the "with api 3" pair (both clean)
Record S0380.218 shipped, bump HEAD/next-slice, note both certs are
0-residual cross-validated golden fixtures and flag the optional
Summary-path regression guard as the cheap follow-up.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 12:36:23 +00:00


Handover — fresh-API cross-comparison + flagged-cert debugging

Point-in-time note. Start from AGENT_GUIDE.md for methodology, the 1e-4 bar, the per-line debugging loop, the section helpers, and the suite command.

  • Branch: feature/per-cert-mapper-validation
  • HEAD: 6d9ef114 (S0380.218). Confirm with git rev-parse HEAD.
  • Baseline (AGENT_GUIDE §4 suite): tests/domain/sap10_calculator/ backend/documents_parser/tests/ → green (2392 passed, 1 skipped at HEAD; the golden + worksheet pins all pass).
  • Next slice number: S0380.219.

S0380.218 (DONE): Part 1 closed for the "with api 3" pair. The two certs the user dropped under sap worksheets/with api 3/0340-2467-9260-2006-6521 (Summary_000922 / dr87 000922) and 5500-5070-0822-0201-3663 (Summary_000920 / dr87 000920) are clean. Both were fetched fresh and run through BOTH front-ends; the two paths agree to <1e-4 on SAP/cost/CO2/PE AND reproduce the worksheet (255)/(272)/(286)/(33)/(37) exactly. SAP integer = lodged (resid +0) on both. No mapper/calculator bug surfaced. Dropped-field audit clean (only created_at + _normalize_shower_outlets-handled shower keys). Locked in as golden fixtures: 2 JSONs under fixtures/golden/ + entries in _EXPECTATIONS and _WORKSHEET_PE_CO2 (test_golden_fixtures.py). The Summary path was validated manually but is NOT pinned in a committed test; that would need the Summary PDFs copied into backend/documents_parser/tests/fixtures/ + a textract-preprocessed chain test. It is a cheap follow-up if cross-mapper parity wants a standing regression guard beyond the API-path golden pin.

  • Pre-existing failures (NOT yours, out of scope):
    • domain/sap10_ml/tests/test_rdsap_uvalues.py — 2 stone-§5.6 thin-wall failures (granite + sandstone band A, 3.7408 vs Table-6 1.7 cap). Run this suite when you touch rdsap_uvalues.py.
    • datatypes/epc/domain/tests/test_from_rdsap_schema.py::TestFromRdSapSchema21_0_1::test_total_floor_area (145.82 vs 45.82) — fails at original HEAD ec64c39d too. This file is NOT in the §4 suite command.

★ THE TASK — fetch fresh from the EPC API and debug, with worksheet cross-comparison

The previous session drove the golden-fixtures cascade (cert_to_inputs → calculate_sap_from_inputs) and concluded that the three then-flagged certs (7536, 2130, 0240) are "0240-like": API-only residuals not reproducible from the register JSON. The user pushed back ("going around in circles"), and the right next move is fresh raw-API data + worksheet triples, not more simulated worksheets.

Part 1 — two NEW certs with API + Summary + worksheet (cross-comparison)

The user has two certs that have all three artifacts: the GOV.UK API JSON, the Elmhurst Summary PDF (site notes / input), and the Elmhurst worksheet PDF (the (1)..(286) ground truth). These are gold — they let you run BOTH front-ends (from_api_response and from_elmhurst_site_notes) through the same cascade and pin both against the worksheet at 1e-4. The user will provide the cert numbers + drop the PDFs. For each:

  1. Fetch the API JSON (see Fetching below).
  2. Run API path → cascade; run Summary path → cascade; pin both vs the worksheet line refs (pdftotext -layout the worksheet; compare (27)/(28a)/(29a)/(30)/(33)/(36)/(45)m/ (62)/(233a)/(233b)/(258)…). Cross-mapper parity: the two paths must agree to 1e-4 AND match the worksheet (memory feedback_cross_mapper_parity_via_cascade).
  3. The first diverging line ref localises the bug (AGENT_GUIDE §3): value present in worksheet but cascade 0/wrong → calculator; input field absent in epc → mapper or extractor. Fix one cause = one slice.
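
The comparison in steps 2 and 3 can be sketched as below. This is a minimal illustration, assuming the worksheet line refs and the cascade outputs have already been collected into plain dicts keyed by line ref; the function name `first_divergence`, the dict shape, and every number are hypothetical and not taken from the repo or from any real worksheet.

```python
from typing import Optional

TOL = 1e-4  # the 1e-4 bar referenced in AGENT_GUIDE.md


def first_divergence(
    worksheet: dict[str, float],
    cascade: dict[str, float],
    order: list[str],
    tol: float = TOL,
) -> Optional[tuple[str, float, Optional[float]]]:
    """Return (line_ref, worksheet_value, cascade_value) for the first line
    ref in `order` where the cascade misses the worksheet, else None."""
    for ref in order:
        expected = worksheet[ref]
        actual = cascade.get(ref)  # None means the cascade produced no value
        if actual is None or abs(actual - expected) > tol:
            return ref, expected, actual
    return None


# Illustrative values only (hypothetical, not from a real worksheet).
worksheet = {"(27)": 12.5, "(33)": 140.25, "(37)": 180.0}
cascade = {"(27)": 12.50004, "(33)": 139.9, "(37)": 179.6}
hit = first_divergence(worksheet, cascade, order=["(27)", "(33)", "(37)"])
print(hit)
```

Walking the refs in worksheet order matters: a later line such as (37) usually inherits an earlier error, so only the first miss points at the cause.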

Part 2 — (secondary) re-check the previously-flagged certs on THIS branch

A dashboard once flagged six certs (0240, 0390-2954-3640, 2130, 6035, 7536, 9390). Those numbers are STALE — they came from a branch WITHOUT this branch's fixes (the user confirmed this). Do not chase them. On THIS branch the picture is different and mostly settled:

  • 7536 (68.924, +1), 2130 (83.78, +2), 0240 (1) — concluded 0240-like (API-only residuals; see per-cert notes below). 0390-2954-3640 pins at +0 (exact).
  • 6035 (+2.19) and 9390 (community, 2) carry documented open residuals (see notes) but are lower-priority and not worksheet-backed.

So Part 2 is only worth touching if a fresh fetch differs from the committed fixture (curated/hand-corrected fixtures can mask raw-API mapper behaviour) — diff fresh vs fixture and debug the delta. Otherwise these are done; the real new work is Part 1.


Fetching from the EPC API

Token lives in backend/.env as OPEN_EPC_API_TOKEN (also EPC_AUTH_TOKEN). The exact mechanism (from scripts/fetch_cohort2_api_jsons.py):

import httpx, os
from dotenv import load_dotenv
from infrastructure.epc_client.epc_client_service import EpcClientService
load_dotenv("backend/.env")
token = os.environ["OPEN_EPC_API_TOKEN"]
resp = httpx.get(
    f"{EpcClientService.BASE_URL}/api/certificate",
    params={"certificate_number": "<CERT>"},
    headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    timeout=EpcClientService.REQUEST_TIMEOUT,
)
payload = resp.json()["data"]   # <- this is the schema-21 JSON the mapper consumes

EpcPropertyDataMapper.from_api_response(payload) only supports schema_type RdSAP-Schema-21.0.0 / 21.0.1; it raises for others. The persisted golden fixture IS this data payload. So diff <(fresh) vs the committed fixture is apples-to-apples.
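
A fresh-vs-fixture diff could look like the sketch below. It is an assumption-laden illustration, not a repo helper: `diff_payloads`, the `MISSING` sentinel, and the keys and values in the sample payloads are all hypothetical. It compares two already-loaded JSON objects leaf by leaf and reports the path of each difference.

```python
from typing import Any, Iterator


class _Missing:
    """Sentinel for a key present on one side only."""

    def __repr__(self) -> str:
        return "<missing>"


MISSING = _Missing()


def diff_payloads(fresh: Any, fixture: Any, path: str = "") -> Iterator[tuple[str, Any, Any]]:
    """Yield (path, fresh_value, fixture_value) for every leaf that differs."""
    if isinstance(fresh, dict) and isinstance(fixture, dict):
        for key in sorted(fresh.keys() | fixture.keys()):
            yield from diff_payloads(
                fresh.get(key, MISSING), fixture.get(key, MISSING), f"{path}/{key}"
            )
    elif isinstance(fresh, list) and isinstance(fixture, list) and len(fresh) == len(fixture):
        for i, (a, b) in enumerate(zip(fresh, fixture)):
            yield from diff_payloads(a, b, f"{path}[{i}]")
    elif fresh != fixture:
        # Leaves, type mismatches, and lists of different length land here.
        yield path or "/", fresh, fixture


# Hypothetical payloads; keys and values are for illustration only.
fresh = {"schema_type": "RdSAP-Schema-21.0.1", "created_at": "2026-01-01",
         "walls": [{"u_value": 0.32}]}
fixture = {"schema_type": "RdSAP-Schema-21.0.1", "walls": [{"u_value": 0.55}]}
deltas = list(diff_payloads(fresh, fixture))
for delta in deltas:
    print(delta)
```

An empty result means the committed fixture still matches the raw API payload; any entry is a candidate for a hand-corrected field masking mapper behaviour.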


Per-cert notes carried from the previous session (verify against FRESH data)

  • 7536 (+1) — roof bug fixed (S0380.214: as-built sloping ceiling → Table 18 col 3). Every per-element U matches Elmhurst (cases 15-17 worksheets). Concluded 0240-like; cont 68.924.
  • 2130 (+2) — dropped measured wall insulation captured (S0380.215 → Table 8 U=0.32), which exposed the true residual (the +1 was two offsetting bugs). PV β-split proven exact vs simulated case 18 worksheet (onsite 970.77 / export 1713.40 to the decimal). Gas PE factor exact (1.13). Concluded 0240-like; cont 83.78.
  • 0240 (1) — export-dropped 2013+ circulation-pump age (115 vs 41 kWh); WWHRS confirmed inert (shower_wwhrs=1 is the universal default across all 47 certs). User previously decided NOT to re-pin. Concluded 0240-like.
  • 0390-2954-3640 — pinned +0 (oil combi, Table 3a row 1). The user's 6.85 flag is the reconciliation mystery above — START HERE; it's the clearest signal of a fresh-vs-fixture or different-engine gap.
  • 6035 — see memory project_golden_coverage_state: a user-simulated 6035 worksheet closed to 1e-4, but "6035 remaining +19 PE needs its own worksheet"; flagged +2.19 SAP.
  • 9390 — community heat-network (S0380.212/.213 fixed the fuel-code collision + standing charge); left at SAP 2 with a documented ~7% demand over-count (heat-source-eff default?). Unpinned/retired. The user's 4.24 may be the same demand over-count on fresh data.

What this session shipped (commits ec64c39d..f895dd3a)

| slice | what |
|---|---|
| S0380.214 | As-built "Pitched, sloping ceiling" (code 8) roof → RdSAP 10 Table 18 col (3) (band F 0.40→0.68, L 0.16→0.18) per §5.11 item 5-5 + note (b). Code-5 vaulted stays col (1) (cohort). Worksheet-validated (sim case 15). Re-pinned 7536. |
| S0380.215 | Captured dropped wall_insulation_thickness_measured (schema 21 didn't declare it → from_dict dropped it). 2130 Ext1 "measured"/100 mm → RdSAP Table 8 U=0.32 (was 0.55 default). Exposed 2130's true +2 residual. |
| S0380.216 | Extractor: handle pdftotext wrapping the §11 glazing-GAP column onto the glazing-TYPE token ("…16 mm or [1st]"). Fallback strip AFTER the direct lookup (preserves explicit interleaved keys). Unblocked running the cascade on hand-entered worksheet Summaries. |
| S0380.217 | Captured dropped wall_insulation_thermal_conductivity (schema → domain → mapper) and wired it into u_wall's §5.8 λ resolver. Code 1 = default 0.04; unmapped codes raise. Zero cascade effect today (2130's §5.8 path doesn't fire). |
| 3× docs | finalised 7536 / 2130 as 0240-like; corrected diagnoses. |

Audit method that found the dropped fields (reuse it on the fresh certs): recursively compare raw JSON keys against the parsed schema dataclass fields — anything in the JSON but not a declared field is silently dropped by from_dict. The two real drops (2130's measured wall insulation + thermal conductivity) came from this. Re-run it on the fresh fetches; new certs may surface new dropped fields.
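
A minimal sketch of that audit follows, assuming the parsed schema is a tree of plain dataclasses. The `Wall` and `Cert` classes and the helper names `dropped_keys` / `_dataclass_in` are hypothetical stand-ins; the real schema classes and `from_dict` live in the repo and are not reproduced here.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, get_args, get_type_hints


def _dataclass_in(tp: Any) -> Any:
    """Return the dataclass type inside tp (tp itself, or an argument of
    list[...] / Optional[...]), or None if there is none."""
    if is_dataclass(tp):
        return tp
    for arg in get_args(tp):
        found = _dataclass_in(arg)
        if found is not None:
            return found
    return None


def dropped_keys(raw: Any, schema: Any, path: str = "") -> list[str]:
    """List key paths present in raw JSON but not declared on the dataclass."""
    if isinstance(raw, list):
        out: list[str] = []
        for i, item in enumerate(raw):
            out += dropped_keys(item, schema, f"{path}[{i}]")
        return out
    if not isinstance(raw, dict) or schema is None:
        return []  # scalar leaf, or a field with no dataclass to audit against
    hints = get_type_hints(schema)
    declared = {f.name for f in fields(schema)}
    out = []
    for key, value in raw.items():
        here = f"{path}.{key}" if path else key
        if key not in declared:
            out.append(here)  # a from_dict-style loader would silently drop this
        else:
            out += dropped_keys(value, _dataclass_in(hints[key]), here)
    return out


@dataclass
class Wall:  # hypothetical stand-in for a schema dataclass
    construction: int


@dataclass
class Cert:  # hypothetical stand-in for the top-level schema dataclass
    schema_type: str
    walls: list[Wall]


raw = {
    "schema_type": "RdSAP-Schema-21.0.1",
    "created_at": "2026-01-01",
    "walls": [{"construction": 3, "wall_insulation_thickness_measured": 100}],
}
dropped = dropped_keys(raw, Cert)
print(dropped)
```

Reporting full paths rather than bare key names makes it easy to tell a harmless top-level drop from one buried in a building-part list.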


Conventions (unchanged)

  • One cause = one slice = one commit; spec citation (page + line) in the message.
  • AAA tests (# Arrange / # Act / # Assert); assert with abs(x - y) <= tol (not pytest.approx).
  • SAP 10.2 only; no tolerance widening / xfail / rel-tol.
  • New code passes pyright strict with ZERO NEW errors: baseline-compare with git stash + PYRIGHT_PYTHON_FORCE_VERSION=latest (mapper.py / cert_to_inputs.py / heat_transmission.py / rdsap_uvalues.py carry pre-existing errors; compare counts).
  • Stage files by name: the working tree has pre-existing unrelated changes to pytest.ini / scripts/ that must NOT be staged.
  • Commit trailer: Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>.

When you re-pin a golden cert, update expected_sap_resid (±0), expected_pe_resid_kwh_per_m2 (±0.01) and expected_co2_resid_tonnes_per_yr (±0.001) to the exact post-fix values and append a slice note to the cert's notes: explaining the cause + spec/worksheet citation. Run the full §4 suite as the blast-radius check after any fabric/factor change.