Per user suggestion: the iteration history in this chat has likely
accreted blind spots that a long context window can't shed (e.g. I
spent slices comparing our delivered kWh to the cert's primary kWh
without noticing the apples-to-oranges error). A fresh agent reading
the SAP 10.2 + RdSAP 10 PDFs cold against the current calculator may
spot gaps faster.
HANDOVER_FRESH_REVIEW.md gives the fresh agent:
- Current state (MAE 5.34, primary-energy bias +51 kWh/m²)
- Repo layout pointer
- Priority-ordered dig list (PEUI mystery first)
- Validated truths
- Dead-end list (don't repeat S-B5 NI thickness switch etc.)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Per user suggestion (switch from probe-driven to worksheet-driven
iteration), enumerates the calculator's coverage of worksheet §§1-15
and Appendices A-U, with a status grade and a prioritised gap list.
Becomes the roadmap for the remaining Session B slices.
Next slice from this list: Table 11 secondary heating allocation, a
10% fraction on most boiler-main certs that we currently model as 0.
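A rough sketch of what the allocation amounts to, assuming the 10%
fraction quoted above. split_space_heating is a hypothetical helper, not
a function in the calculator, and the real Table 11 fraction depends on
the main heating system.

```python
def split_space_heating(demand_kwh: float, secondary_fraction: float = 0.10):
    """Split space-heating demand between main and secondary systems.

    Hypothetical helper: the 0.10 default is the fraction quoted in this
    commit message, not a value looked up from Table 11.
    """
    if not 0.0 <= secondary_fraction <= 1.0:
        raise ValueError("secondary_fraction must be between 0 and 1")
    secondary_kwh = demand_kwh * secondary_fraction
    main_kwh = demand_kwh - secondary_kwh
    return main_kwh, secondary_kwh
```

Modelling the fraction as 0, as the calculator currently does, is the
same as calling this with secondary_fraction=0.0: all demand goes to the
main system.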
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Verified against the SAP 10.2 spec (14-03-2025): Table 12 unit prices
are IDENTICAL to SAP 10.3 Table 12. Both specs mandate (§12.2): "Fuel
costs are calculated using the fuel prices given in Table 12. Other
prices must not be used for calculation of SAP ratings." The legacy
ML-pipeline prices in domain.ml.sap_efficiencies (3.48 gas, 13.19 elec,
5.50 E7-low) do NOT match either SAP 10.2 or 10.3 and appear to be a
pre-2022 holdover.
New module domain.sap.tables.table_12 carries the spec-correct
values (legacy value in parentheses):
  mains gas:              3.64  (was 3.48)
  standard electricity:  16.49  (was 13.19)
  7h-low / Economy-7:     9.40  (was 5.50)
  24h-heating:           14.04  (was 6.61)
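For illustration only, one way such a table could be laid out, using
just the values listed above. TABLE_12_PRICES, LEGACY_PRICES and
price_ratio are hypothetical names; the actual contents of
domain.sap.tables.table_12 may be structured differently.

```python
# Hypothetical layout; keys and names are illustrative, not the module's API.
# Values are the unit prices quoted in this commit message.
TABLE_12_PRICES = {
    "mains_gas": 3.64,
    "standard_electricity": 16.49,
    "seven_hour_low": 9.40,
    "twenty_four_hour_heating": 14.04,
}

LEGACY_PRICES = {
    "mains_gas": 3.48,
    "standard_electricity": 13.19,
    "seven_hour_low": 5.50,
    "twenty_four_hour_heating": 6.61,
}


def price_ratio(fuel: str) -> float:
    """Return spec price divided by legacy price for one fuel."""
    return TABLE_12_PRICES[fuel] / LEGACY_PRICES[fuel]
```

Every ratio is above 1, which matches the direction of the bias shift
reported for the parity probe (higher prices, higher modelled cost).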
Also corrects an S-B4 bug: SAP 10.2 Table 12a shows that direct-acting
electric heating (codes 191-196) runs at 90% high-rate on 7h tariffs,
not 0%; only true storage heaters (401-409, 421-425) bill at the
low rate. _E7_SPACE_HEATING_CODES narrowed accordingly.
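A minimal sketch of the corrected split, assuming only the code ranges
and the 90% figure quoted above. The names below are hypothetical and do
not mirror the real _E7_SPACE_HEATING_CODES implementation; the high and
low prices are passed in rather than assumed.

```python
# Assumption: code ranges are exactly those quoted in this commit message.
DIRECT_ACTING_CODES = frozenset(range(191, 197))  # 191-196
STORAGE_HEATER_CODES = frozenset(range(401, 410)) | frozenset(range(421, 426))


def high_rate_fraction_7h(heating_code: int) -> float:
    """Fraction of space-heating kWh billed at the high rate on a 7h tariff."""
    if heating_code in STORAGE_HEATER_CODES:
        return 0.0  # true storage heaters bill at the low rate
    if heating_code in DIRECT_ACTING_CODES:
        return 0.90  # direct-acting electric: 90% high-rate
    # Other codes are not covered by this sketch.
    raise KeyError(f"heating code {heating_code} not covered by this sketch")


def space_heating_cost(kwh: float, heating_code: int,
                       high_price: float, low_price: float) -> float:
    """Blend high-rate and low-rate prices by the high-rate fraction."""
    f = high_rate_fraction_7h(heating_code)
    return kwh * (f * high_price + (1.0 - f) * low_price)
```

Under the old behaviour a code-191 dwelling would have been costed
entirely at low_price, which understates cost whenever high_price
exceeds low_price.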
100-cert parity probe with spec-correct prices:
  MAE:   4.66 → 6.66   (regression vs legacy prices)
  bias: -0.70 → -4.66  (over-counting cost)
  spec-correctness: SAP 10.2 verbatim
The MAE regression confirms the corpus's lodged ratings were NOT
calculated against the published SAP 10.2 Table 12 prices. The cert
ratings appear to use the legacy lower prices despite reporting
sap_version=10.2. Three paths forward are documented in the next
commit's discussion thread.
Also adds the SAP 10.2 spec PDF to docs/sap-spec/.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds services/ml_training_data/src/ml_training_data/sap_parity_probe.py,
which samples N certs from the v18a corpus, streams them via
BulkZipReader, runs Sap10Calculator on each, and prints MAE/RMSE/bias
plus the worst-N residuals. Baseline across 100 certs: MAE 8.41,
RMSE 13.98, bias -2.65, 0 errors.
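For reference, a sketch of how the three summary statistics and the
worst-N list can be computed. parity_metrics is a hypothetical
standalone helper, not the probe's actual code, and it assumes the
residual is defined as calculated minus lodged rating, so a negative
bias means under-prediction.

```python
import math


def parity_metrics(predicted, lodged, worst_n=15):
    """Summarise calculator-vs-lodged residuals.

    Assumption: residual = predicted - lodged, so negative bias means the
    calculator under-predicts the lodged rating on average.
    """
    if len(predicted) != len(lodged):
        raise ValueError("predicted and lodged must have the same length")
    if not predicted:
        raise ValueError("need at least one cert")
    residuals = [p - l for p, l in zip(predicted, lodged)]
    n = len(residuals)
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    bias = sum(residuals) / n
    # Indices of the certs with the largest absolute residuals, worst first.
    worst = sorted(range(n), key=lambda i: abs(residuals[i]), reverse=True)
    return {"mae": mae, "rmse": rmse, "bias": bias, "worst": worst[:worst_n]}
```

RMSE well above MAE, as in the baseline here, points to a small number
of large residuals, which is why the worst-N list is worth printing.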
docs/sap-spec/PARITY_FINDINGS.md captures the dominant failure pattern
(flats and bungalows under-predicted; 10 of the worst 15 are flats whose
floor or roof is a party surface shared with a neighbouring dwelling)
and the priority-ordered Session B iteration backlog
(S-B-flat-surfaces first).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Promotes ADR-0009 from Proposed to Accepted after the grill-with-docs
session resolved all seven open questions. Bundles the SAP 10.3 and
RdSAP 10 specifications under docs/sap-spec/ plus a calculator design
sketch (module layout, monthly-loop pseudo-code, status table).
CONTEXT.md adds three new domain terms parallel to existing performance
language:
- Calculated SAP10 Performance (parallel to Effective / Lodged)
- SAP10 Calculation (process; implemented by Sap10Calculator)
- Measure Application (process; implemented by MeasureApplicator)
The ML pipeline is NOT retired; it stays as the residual head once the
calculator reaches parity in Session B. ADR-0009 §"Grill outcomes" carries
the seven binding scope decisions plus three Session-A-scope changes
discovered during the grill (RdSAP §19 EER formula, SAP 10.2 Appendix A
cross-reference, RdSAP Table 29 cascade defaults).