Pin the bills design from a /grill-with-docs session:
- ADR-0014: whole-home annual bill from SAP10 Calculation's delivered kWh per end use, re-priced at real Fuel Rates (NOT the calculator's SAP-notional total_fuel_cost_gbp, which is RdSAP Table 32 standardised prices, ~half real electricity). Fuel enum + FuelRates + FuelRatesRepository static snapshot; per-section + total flat columns; raise on unpriced fuel (house coal / heat network are the named gaps).
- ADR-0013 amendment: the shadow stepping-stone is collapsed; the calculator is load-bearing now. effective=calculated for sap_version<10.2 (StubRebaseliner floor 10.0->10.2); >=10.2 keeps lodged + logs divergence; a strict-raise aborts the batch (load-bearing for bills regardless of version).
- CONTEXT: EPC Energy Derivation -> Bill Derivation (no "service" suffix); Baseline Performance energy block = per-end-use kWh + per-section bill + total; Fuel Rates = committed static snapshot; Rebaselining trigger threshold 10.2.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
| Status |
|---|
| accepted |
The Sap10Calculator produces Effective Performance (it is the Rebaseliner); Calculated SAP10 Performance is not a persisted third value-set, and is wired in shadow first
Refines ADR-0004 (the Lodged/Effective
pair), ADR-0009/ADR-0010
(the calculator + the Calculated SAP10 Performance term), ADR-0011
(the Rebaseliner seam) and ADR-0012
(all-or-nothing per batch). Decided in a /grill-with-docs session (2026-06-01) before wiring
Sap10Calculator into PropertyBaselineOrchestrator.
Context
The old model_engine (backend/engine/engine.py) called out to an ML API
(model_api.predict_all over BASELINE_MODEL_PREFIXES) to rebaseline the properties that needed
it. The rebuild replaces that round-trip with the deterministic Sap10Calculator, run live.
The handover and CONTEXT (line 100) framed Calculated SAP10 Performance as a third value-set
persisted alongside Lodged and Effective (calculated_* columns). Walking the baselining
scenarios shows that framing reifies a distinction that does not exist in the domain:
- real lodged SAP10 EPC, no overrides ⇒ Calculated = Lodged = Effective;
- real EPC + property/landlord overrides ⇒ Calculated = Lodged-plus-overrides = Effective;
- estimated EPC (± overrides), or a pre-SAP10 EPC ⇒ Calculated = Effective (no lodged SAP10 to compare against — Lodged Performance exists only for a real lodged EPC).
In every scenario Effective = Calculated. There is no third quantity.
Decision
The calculator is the mechanism that produces Effective Performance — i.e. the deterministic
Rebaseliner (ADR-0011's seam), superseding the old ML-API rebaseliner. "Calculated SAP10
Performance" is the name of that output during validation, not a separately-persisted third
value-set. No calculated_* columns are added; property_baseline_performance keeps its
Lodged/Effective shape (ADR-0004). The ADR-0009 ML model is repositioned as a future residual head
over the calculator, not the baseline producer.
Shadow-first, then promotion. The calculator still strict-raises (UnmappedSapCode,
MissingMainFuelType, UnresolvedPcdbCombiLoss) on cert mappings it has not yet hardened, and the
strict-typing of EpcPropertyData that will close most of those gaps is still pending. A ~40,000
property test cohort is about to flow through baselining. So this lands in two steps:
- This slice: shadow. Performance is still defined by the input data: StubRebaseliner keeps producing Effective (= Lodged for the only live scenario, real SAP10 + no overrides). The calculator runs beside it, on every Property's Effective EPC, purely to be battle-tested in the wild. It is not load-bearing, therefore:
  - a calculator raise is caught and logged at error, and never aborts the batch; otherwise one unmappable cert would lose the load-bearing Lodged/Effective write for the whole batch, and over a 40k run most batches would never baseline;
  - on success, its output is compared to Lodged and logged, not persisted: warning when |sap_continuous − lodged_sap| > 0.5, or PEUI / CO2 diverge beyond tolerance (CO2 after the kg→tonnes conversion). Each log is tagged with the cert's sap_version so SAP-10.2 divergence (a real calculator signal) is separable from older-spec drift (expected; see ADR-0010 Validation Cohort).
- Next slice or two: load-bearing. When overrides + EPC estimation land (days away), StubRebaseliner is replaced by a calculator-backed Rebaseliner: the calculator's output becomes Effective Performance. The failure posture flips to abort per ADR-0012: now that the calculator is the baseline, a silent wrong answer is the expensive outcome, so a raise must fail the batch noisily. Same exception, opposite handling, because the calculator went from shadow to load-bearing. The shadow logging is then retired.
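The shadow posture described above (catch and log a raise, compare on success, persist nothing) can be sketched roughly as follows. This is a minimal illustration, not the project's runner: LodgedCert, CalculatedResult, run_shadow and the logger name are hypothetical stand-ins, and only the SAP check is shown because the text does not state the PEUI / CO2 tolerances.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Optional

logger = logging.getLogger("baseline.shadow")  # hypothetical logger name

SAP_TOLERANCE = 0.5  # from the text: warn when |sap_continuous - lodged_sap| > 0.5


@dataclass(frozen=True)
class LodgedCert:
    """Hypothetical stand-in for the lodged figures of one Property."""
    property_id: str
    sap_version: str
    lodged_sap: float


@dataclass(frozen=True)
class CalculatedResult:
    """Hypothetical stand-in for the calculator's output."""
    sap_continuous: float


def run_shadow(
    cert: LodgedCert,
    calculate: Callable[[LodgedCert], CalculatedResult],
) -> Optional[float]:
    """Run the calculator beside the load-bearing path.

    Returns the SAP delta for inspection, or None if the calculator raised.
    Nothing is persisted and nothing propagates to the caller.
    """
    try:
        result = calculate(cert)
    except Exception as exc:  # deliberately broad: the aim is to discover what breaks
        logger.error(
            "shadow calculator raised %s property_id=%s sap_version=%s",
            type(exc).__name__, cert.property_id, cert.sap_version,
        )
        return None

    delta = result.sap_continuous - cert.lodged_sap
    if abs(delta) > SAP_TOLERANCE:
        # Tagged with sap_version so SAP-10.2 divergence can be separated
        # from older-spec drift when the logs are aggregated.
        logger.warning(
            "shadow SAP divergence %.2f property_id=%s sap_version=%s",
            delta, cert.property_id, cert.sap_version,
        )
    return delta
```

Returning the delta rather than writing it anywhere keeps the sketch in line with the log-only choice; a caller in the load-bearing path would ignore the return value.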
Considered options
- A third persisted calculated_* value-set on PropertyBaselinePerformance (the handover's recommendation): rejected. Effective = Calculated in every scenario, so the columns would store a distinction with no domain reality, and the future "supersede effective" promotion would be a data move instead of nothing.
- Promote the calculator to drive Effective immediately: rejected for this one slice. It still strict-raises on un-hardened mappings, so over the imminent 40k run it would gate the load-bearing baseline write. Shadow-first surfaces every gap as an aggregatable error log without blocking baselining.
- A separate calculator_shadow validation table: held in reserve. Log-only is enough while the calculator is moving and the shadow step is a 1–2 day stepping stone; we add a queryable table only if log aggregation proves too weak.
Consequences
- property_baseline_performance is unchanged this slice; no migration.
- CONTEXT Calculated SAP10 Performance, Effective Performance, and Rebaselining are updated: the calculator (not ML) is the rebaseliner mechanism in the rebuilt engine; Calculated is not a stored third set.
- The shadow runner's broad except is deliberate (the point is to discover what breaks in the wild); each caught exception is logged with its type and property_id.
- This decision is short-lived in its shadow form by design; the durable half, "the calculator produces Effective Performance; there is no third value-set", outlives it.
Amendment (2026-06-02): shadow collapsed — the calculator is load-bearing now
The shadow stepping-stone was right in shape but wrong in duration: the calculator was ready, and wiring Bill Derivation onto its delivered-kWh breakdown makes it load-bearing for bills on every property — so the "shadow until overrides / estimation land" timeline collapses to now. The durable decision stands (calculator produces Effective Performance; no third value-set); only the timing changes:
- sap_version < 10.2 → effective performance is the calculator's output (the StubRebaseliner floor moves 10.0 → 10.2; mechanism is the calculator, not ML).
- sap_version ≥ 10.2 → effective = the API's lodged figures; the calculator still runs alongside, logging divergence (the surviving half of the shadow runner) as a validation signal.
- Failure posture flips to abort: the calculator is load-bearing for Bill Derivation regardless of version, so a strict-raise aborts the batch (ADR-0012), and the un-mapped cert is fixed immediately rather than skipped. The shadow's catch-and-log of raises is retired; divergence warnings on ≥ 10.2 certs remain.
The ≥1000-cert parity gate from ADR-0009/0010 still governs whether the calculator's figures are
trusted as definitive for the SAP-10.2 cohort, but it no longer gates wiring — pre-10.2 certs
have no current-spec lodged figure to fall back to, so the calculator is the only source there.
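The amended rule can be sketched as a small selection function. This is an illustrative sketch under stated assumptions, not the project's Rebaseliner: Cert, effective_sap and baseline_batch are hypothetical names, sap_version is modelled as a float for brevity, only the SAP figure is shown, and the 0.5 divergence threshold is carried over from the shadow slice.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Iterable, List

logger = logging.getLogger("baseline.rebaseliner")  # hypothetical logger name

CALCULATOR_FLOOR = 10.2  # below this sap_version, Effective is the calculator's output
SAP_TOLERANCE = 0.5      # assumed: same threshold as the shadow slice


@dataclass(frozen=True)
class Cert:
    """Hypothetical stand-in for one Property's Effective EPC inputs."""
    property_id: str
    sap_version: float
    lodged_sap: float


def effective_sap(cert: Cert, calculate: Callable[[Cert], float]) -> float:
    """Pick the Effective SAP figure for one cert under the amended rule."""
    # The calculator runs for every cert. There is no try/except here:
    # a strict-raise propagates and fails the whole batch.
    calculated = calculate(cert)

    if cert.sap_version < CALCULATOR_FLOOR:
        return calculated

    # >= 10.2: keep the lodged figure, log divergence as a validation signal.
    if abs(calculated - cert.lodged_sap) > SAP_TOLERANCE:
        logger.warning(
            "SAP divergence %.2f property_id=%s sap_version=%s",
            calculated - cert.lodged_sap, cert.property_id, cert.sap_version,
        )
    return cert.lodged_sap


def baseline_batch(
    certs: Iterable[Cert], calculate: Callable[[Cert], float]
) -> List[float]:
    """All-or-nothing: any calculator raise leaves no partial result."""
    return [effective_sap(cert, calculate) for cert in certs]
```

The absence of exception handling is the point of the sketch: the same raise that the shadow slice would have logged now reaches the batch boundary.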