mirror of https://github.com/Hestia-Homes/Model.git, synced 2026-06-08 11:17:27 +00:00
docs(baseline): Bill Derivation design — fuel as calculator output + rebaselining is assemble-and-score
Captures a /grill-with-docs session resolving how BillDerivation gets the fuel each end use burns, and what Rebaselining actually is.

- ADR-0014 amendment: per-end-use fuel is a calculator OUTPUT (resolved Table-32 codes on SapResult: main-1/main-2/secondary/HW + pv_exported_kwh); the adapter is a pure SapResult->EnergyBreakdown map. Corrects stale §3 (is_gas_code... -> sap_fuel.sap_code_to_fuel). Adds COOLING section. Interim, pending ADR-0015.
- ADR-0013 amendment: the calculator is the SCORING ENGINE within Rebaselining (assemble the Effective EPC picture, then score), not the whole of it; the Rebaseliner exposes its SapResult so the orchestrator composes Effective Performance AND the Bill from one scoring.
- ADR-0015 (new): mappers own cert normalization; EpcPropertyData becomes a strict type. Explains why fuel resolution sits in the calculator today.
- CONTEXT.md: Effective EPC = the assembled picture; Rebaselining = assemble (overrides / neighbour-estimation / old-schema remap) then score.
- EpcPropertyData docstring points at ADR-0015.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
parent c431453d75
commit 19a56461ba
5 changed files with 133 additions and 3 deletions
@@ -78,15 +78,15 @@ _Avoid_: patches (deprecated), corrections, manual EPC, edits
 ### Modelling
 
 **Effective EPC**:
-The EpcPropertyData scored by the modelling pipeline for a single Property, derived from either Site Notes alone or the public EPC with Landlord Overrides applied; carries source-derived physical fields and originally recorded performance values, with model-rebaselined performance held separately in Baseline Performance.
+The assembled `EpcPropertyData` picture the modelling pipeline scores for a single Property. Assembled from whichever source applies: Site Notes alone; or the public EPC with **Landlord Overrides** applied; or — when the EPC is **old** — its schema re-mapped to current and gaps filled from neighbour predictions; or — when there is **no EPC** — components **estimated from surrounding properties**. Carries source-derived physical fields and originally recorded performance values; the performance scored from this picture is held separately in **Baseline Performance**.
 _Avoid_: modelling EPC, working EPC, resolved EPC, derived EPC
 
 **Rebaselining**:
-Re-predicting a Property's SAP score, CO2 emissions, Primary Energy Intensity, space heating kWh, and hot water kWh via **SAP10 Calculation** (the deterministic `Sap10Calculator`, which superseded the old ML-API rebaseliner; an ML residual head over the calculator is future — ADR-0009/0013) so the modelling pipeline scores it against the current SAP10 methodology. Triggered when either (a) the Effective EPC was lodged under a methodology the calculator supersedes (`sap_version < 10.2`, the calculator's target spec), so the recorded scores reflect a superseded methodology, or (b) Site Notes / Landlord Overrides changed the physical state of the Property (walls / heating / windows / etc.) so the lodged scores no longer reflect what's installed. Both triggers may fire together. Produces Effective Performance; Lodged Performance is preserved unchanged. kWh is included as ML targets per ADR-0007 — see [[epc-ml-transform]].
+Establishing a Property's **Effective Performance** (SAP score, EPC Band, CO2, Primary Energy Intensity, space-heating & hot-water kWh) by **assembling the Effective EPC picture and scoring it** through **SAP10 Calculation** (the deterministic `Sap10Calculator`, which superseded the old ML-API rebaseliner; an ML residual head over the calculator is future — ADR-0009/0013). The *assembly* is the substance: apply **Landlord Overrides** (e.g. boiler → ASHP, wall insulated) as a simulation on the `EpcPropertyData`; estimate components from surrounding properties when there is no EPC; re-map an old-schema EPC to current and gap-fill from neighbour predictions. The calculator is the **scoring engine at the tail**, not the whole of Rebaselining — so its call lives inside the Rebaseliner, after assembly. Triggered whenever the assembled picture differs from the lodged record: (a) the EPC was lodged under a methodology the calculator supersedes (`sap_version < 10.2`), (b) Overrides / Site Notes changed the physical state (walls / heating / windows / etc.), or (c) the picture is estimated or remapped rather than a real current EPC. Produces Effective Performance; Lodged Performance is preserved unchanged. The same single scoring also yields the per-end-use kWh that **Bill Derivation** prices — one scoring, two products. kWh is an ML target per ADR-0007 — see [[epc-ml-transform]].
 _Avoid_: re-scoring, re-prediction, performance recomputation, refresh (for cache-freshness)
 
 **Baseline Performance**:
-A Property's current performance aggregate, holding both Lodged Performance and Effective Performance plus the energy block: delivered kWh **per end use** (heating, hot water, lighting, appliances, cooking, pumps/fans, …) and the **annual bill** composed into per-section costs plus a total, produced by **Bill Derivation** from SAP10 Calculation's per-end-use kWh × current Fuel Rates. Persisted as one row (flat typed columns, per-section kWh + cost + total); surfaced as one block in the UI.
+A Property's current performance aggregate, holding both Lodged Performance and Effective Performance plus the energy block: delivered kWh **per end use** (heating, hot water, lighting, appliances, cooking, pumps/fans, cooling) and the **annual bill** composed into per-section costs plus a total, produced by **Bill Derivation** from SAP10 Calculation's per-end-use kWh × current Fuel Rates. Persisted as one row (flat typed columns, per-section kWh + cost + total); surfaced as one block in the UI.
 _Avoid_: baseline predictions, predicted baseline, rebaselined values
 
 **Lodged Performance**:
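The three Rebaselining triggers (a), (b), (c) in the revised glossary entry can be read as independent checks on the assembled picture. The sketch below is illustrative only: `AssembledPicture`, its fields, and `rebaselining_triggers` are hypothetical names, and comparing `sap_version` as a float is an assumption, not something the diff confirms.

```python
from dataclasses import dataclass
from typing import List, Optional

# From the glossary text: the calculator supersedes `sap_version < 10.2`.
CALCULATOR_TARGET_SAP_VERSION = 10.2


@dataclass(frozen=True)
class AssembledPicture:
    """Hypothetical summary of how an Effective EPC picture was assembled."""
    sap_version: Optional[float]   # None when there is no lodged EPC
    physical_state_changed: bool   # Overrides / Site Notes changed walls, heating, ...
    estimated_or_remapped: bool    # neighbour-estimated or old-schema remap


def rebaselining_triggers(picture: AssembledPicture) -> List[str]:
    """Return every trigger that fires; several may fire together."""
    fired = []
    if (picture.sap_version is not None
            and picture.sap_version < CALCULATOR_TARGET_SAP_VERSION):
        fired.append("a: superseded methodology")
    if picture.physical_state_changed:
        fired.append("b: physical state changed")
    if picture.estimated_or_remapped:
        fired.append("c: estimated or remapped picture")
    return fired
```

An empty list corresponds to the "real, current, un-overridden EPC" case, where the lodged figures can stand.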
@@ -566,6 +566,16 @@ class RenewableHeatIncentive:
 
 @dataclass
 class EpcPropertyData:
+    """The cert aggregate every downstream stage reads.
+
+    Currently **loosely typed** (`Union[int, str]` fuel/emitter fields, raw
+    `Optional[int]` codes, `str` fallbacks) and filled by three mappers — EPC
+    API, Elmhurst site notes, pashub — with different conventions, so
+    normalization happens *downstream* (e.g. fuel resolution in the calculator's
+    `cert_to_inputs`). The direction is to push normalization to the mappers and
+    make this a strict type — see docs/adr/0015-mappers-own-cert-normalization.md.
+    """
+
     # General
     dwelling_type: str  # TODO: make enum?
     inspection_date: date
@@ -107,3 +107,26 @@ Effective Performance; no third value-set); only the timing changes:
 The `≥1000-cert parity` gate from ADR-0009/0010 still governs whether the calculator's figures are
 *trusted as definitive* for the SAP-10.2 cohort, but it no longer gates *wiring* — pre-10.2 certs
 have no current-spec lodged figure to fall back to, so the calculator is the only source there.
+
+## Amendment (2026-06-02): the calculator is the *scoring engine* within Rebaselining, which also feeds Bill Derivation
+
+This ADR's shorthand — "the calculator *is* the Rebaseliner" — is sharpened by the fuller picture of
+Rebaselining. **Rebaselining is _assemble the Effective EPC picture, then score it_**: apply
+**Landlord Overrides** (boiler → ASHP, wall insulated) as a simulation on `EpcPropertyData`; estimate
+components from surrounding properties when there is no EPC; re-map an old-schema EPC and gap-fill from
+neighbour predictions (the override/estimation work lands shortly). The `Sap10Calculator` is the
+**scoring engine at the tail of that assembly**, not the whole of Rebaselining — so the calculator
+call lives **inside** the Rebaseliner (after assembly), never hoisted up into the orchestrator.
+
+Because [Bill Derivation](0014-bill-derivation-from-real-fuel-rates.md) prices the **same scored
+picture**, the Rebaseliner **exposes its `SapResult` as a first-class part of its result** — not just
+`(Performance, reason)`. The orchestrator runs the calculator **once** (via the Rebaseliner) and
+composes two products from that one `SapResult`: Effective Performance, and the Bill
+(`EnergyBreakdown.from_sap_result` → `BillDerivation`). Running the calculator a second time for bills
+is rejected — it is the expensive step over the ~40k cohort and a second call could drift from the
+first.
+
+Corollary: once Overrides/estimation land, Effective Performance is the calculator's output **even for
+`sap_version ≥ 10.2`** — a user-modified or estimated dwelling has no valid lodged figure to keep. The
+"keep lodged ≥ 10.2" rule holds only for a real, current, un-overridden EPC; the **Bill always derives
+from the `SapResult` regardless** (lodged figures carry no per-end-use kWh).
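The "one scoring, two products" shape described in the amendment above can be sketched as follows. This is a minimal illustration under stated assumptions: every class and function here (`SapResult` reduced to two fields, `RebaselineResult`, `CountingCalculator`, `rebaseline`, `orchestrate`) is a hypothetical stand-in, not the repository's real API, and the scores and kWh values are invented.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class SapResult:
    """Hypothetical, heavily reduced stand-in for the calculator's result."""
    sap_score: int
    kwh_by_end_use: Dict[str, float]


@dataclass(frozen=True)
class RebaselineResult:
    """Exposes the SapResult alongside (performance, reason)."""
    performance: Dict[str, float]
    reason: str
    sap_result: SapResult


class CountingCalculator:
    """Stand-in scoring engine that records how many times it is called."""

    def __init__(self) -> None:
        self.calls = 0

    def score(self, picture: dict) -> SapResult:
        self.calls += 1
        # Invented figures, for illustration only.
        return SapResult(68, {"heating": 9000.0, "hot_water": 2500.0})


def rebaseline(picture: dict, calculator: CountingCalculator) -> RebaselineResult:
    # Assembly (overrides, estimation, remap) would happen here, before scoring.
    sap_result = calculator.score(picture)
    performance = {"sap_score": float(sap_result.sap_score)}
    return RebaselineResult(performance, "overrides applied", sap_result)


def orchestrate(
    picture: dict, calculator: CountingCalculator
) -> Tuple[Dict[str, float], Dict[str, float]]:
    result = rebaseline(picture, calculator)  # the only calculator call
    bill_input = dict(result.sap_result.kwh_by_end_use)  # same scoring feeds the bill
    return result.performance, bill_input
```

Because the orchestrator only reads from the returned `RebaselineResult`, the bill cannot be priced from a second, possibly drifting, calculator run.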
@@ -101,3 +101,34 @@ production migration is FE-owned (Drizzle); `docs/migrations/` updated.
 - **Bill at SAP Table 32 prices** — rejected: standardised rating prices, ~half real electricity.
 - **JSON `bill_breakdown` block** — rejected: end-uses are fixed-cardinality, so flat columns are
   clean and stay queryable (ADR-0004).
+
+## Amendment (2026-06-02): fuel is a calculator *output*; §3's mapping helpers corrected
+
+Wiring the `SapResult → EnergyBreakdown` adapter forced the question §3 left implicit: *where does
+the fuel each end use burns come from?* Resolved in a `/grill-with-docs` session.
+
+- **Decision: per-end-use fuel is calculator output.** The calculator resolves the fuel for each
+  billable end use (it already uses it to derive the delivered kWh and the rating cost), so it emits
+  the **resolved Table-32 fuel codes** on `SapResult` (main-1 / main-2 / secondary / hot water — the
+  electric end uses are electricity by construction), alongside `pv_exported_kwh` for the SEG credit.
+  `BillDerivation`'s adapter is then a **pure `SapResult → EnergyBreakdown` map** and can never price
+  the calculator's kWh at a fuel the calculator never used. Rejected: an adapter that re-reads raw
+  `EpcPropertyData` fuel fields and re-normalizes them — that duplicates `cert_to_inputs`
+  (`_main_fuel_code`, `_water_heating_fuel_code`, HW→main default, CHP blend, the `MissingMainFuelType`
+  strict-raise) and reopens divergence between the bill and the rating.
+
+- **§3 correction.** §3 says the per-end-use fuel codes map to `Fuel` "via the existing
+  `is_gas_code` / `is_electric_fuel_code` / `is_liquid_fuel_code` helpers." That is not what shipped:
+  mapping is `domain/property_baseline/sap_fuel.py::sap_code_to_fuel`, a bounded **Table-32 fuel-code
+  → `Fuel`** dispatch that strict-raises `UnmappedSapCode` on an unmapped code. The "meet at one
+  vocabulary, not raw SAP codes" intent stands; the named helpers do not.
+
+- **Interim, pending [ADR-0015](0015-mappers-own-cert-normalization.md).** Fuel resolution sits in
+  the calculator *because* `EpcPropertyData` is not yet a strict normalized type. Once ADR-0015 lands
+  (mappers normalize at the boundary), attribution can move upstream and the `SapResult` fuel-code
+  fields may be retired.
+
+- **`COOLING` section added.** §1 listed cooling as an end use but §6's flat columns omitted it.
+  `BillSection` gains `COOLING` (kWh from `SapResult.space_cooling_fuel_kwh_per_yr`, electricity by
+  construction), so §6's layout gains a `cooling_kwh` + `cooling_cost_gbp` column pair (FE-owned
+  Drizzle migration).
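To make the amendment above concrete, here is a rough sketch of a bounded code-to-`Fuel` dispatch that strict-raises, feeding a pure pricing map. It reuses the names `sap_code_to_fuel`, `Fuel`, and `UnmappedSapCode` from the ADR text, but the bodies are assumptions: the real module's signature and members are not shown in the diff, the integer codes are placeholders not checked against SAP Table 32, and `price_end_uses` and the rates are invented for illustration.

```python
from enum import Enum
from typing import Dict


class Fuel(Enum):
    MAINS_GAS = "mains_gas"
    ELECTRICITY = "electricity"
    OIL = "oil"


class UnmappedSapCode(ValueError):
    """Raised when a fuel code has no entry in the bounded mapping."""


# Placeholder codes for illustration only; not checked against SAP Table 32.
_CODE_TO_FUEL: Dict[int, Fuel] = {
    1: Fuel.MAINS_GAS,
    4: Fuel.OIL,
    30: Fuel.ELECTRICITY,
}


def sap_code_to_fuel(code: int) -> Fuel:
    """Bounded dispatch: an unknown code raises instead of defaulting."""
    try:
        return _CODE_TO_FUEL[code]
    except KeyError:
        raise UnmappedSapCode(f"no Fuel mapping for SAP fuel code {code}") from None


def price_end_uses(
    kwh: Dict[str, float],
    fuel_codes: Dict[str, int],
    pence_per_kwh: Dict[Fuel, float],
) -> Dict[str, float]:
    """Pure map: price each end use at the fuel the calculator resolved for it."""
    return {
        end_use: round(
            used * pence_per_kwh[sap_code_to_fuel(fuel_codes[end_use])] / 100, 2
        )
        for end_use, used in kwh.items()
    }
```

The pricing function never touches raw cert fields; it only sees the codes the calculator emitted, which is the property the decision is protecting.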
66
docs/adr/0015-mappers-own-cert-normalization.md
Normal file
@@ -0,0 +1,66 @@
+---
+Status: accepted
+---
+
+# Mappers own cert normalization; `EpcPropertyData` becomes a strict normalized type
+
+Names a direction that [ADR-0013](0013-calculator-produces-effective-performance-shadow-first.md)
+already gestured at ("the strict-typing of `EpcPropertyData` that will close most of those gaps is
+still pending") and that [ADR-0014](0014-bill-derivation-from-real-fuel-rates.md) ran into head-on.
+Relates to [ADR-0001](0001-two-source-paths.md) (the two source paths). Decided in a
+`/grill-with-docs` session (2026-06-02). This ADR records a **direction + a tracked piece of work**,
+not a slice that has landed.
+
+## Context
+
+`EpcPropertyData` is the one cert aggregate every downstream stage reads, but it is **loosely
+typed** — `main_fuel_type: Union[int, str]`, `heat_emitter_type: Union[int, str]`, bare
+`Optional[int]` codes (`water_heating_fuel`, `secondary_fuel_type`), `str` fallbacks like
+`'Unknown'` / `'Pre 2013'`. It is filled by **three mappers with different conventions**:
+
+- the **EPC API** mapper (int codes),
+- the **Elmhurst** site-notes mapper (string labels, e.g. `'Bulk LPG'`),
+- **pashub**.
+
+Because the cert arrives un-normalized, **normalization happens downstream in the calculator**
+(`domain/sap10_calculator/rdsap/cert_to_inputs.py`): `_main_fuel_code` resolves the union and
+**strict-raises `MissingMainFuelType`** on a non-int rather than defaulting; `_water_heating_fuel_code`
+applies the "HW fuel defaults to the main system" rule; CHP/community blends are reassembled. This
+logic is correct, but it lives in the wrong layer — it is *cert-shape* knowledge, not *physics*.
+
+The trigger: [ADR-0014](0014-bill-derivation-from-real-fuel-rates.md)'s `BillDerivation` needs the
+fuel each end use burns. The fuel fields *are* on `EpcPropertyData`, but reading them raw would mean
+**re-implementing the calculator's normalization** (union resolution, HW→main default, strict-raise,
+CHP blend) in a second place — and risk the bill pricing the calculator's delivered kWh at a fuel
+the calculator never used. ADR-0014 therefore resolves fuel **inside the calculator** and emits it as
+output. That is the right call *given today's loose cert*, but it is a **symptom**: the consumer is
+paying for normalization that should have happened at the mapper boundary.
+
+## Decision (direction)
+
+1. **Normalization is a mapper responsibility.** Each mapper (API / Elmhurst / pashub) transforms its
+   source into a **single normalized shape**, resolving fuel labels→codes, applying defaults, and
+   raising on genuinely-missing required fields — at the boundary, once.
+2. **`EpcPropertyData` becomes strict.** Replace `Union[int, str]` and raw `Optional[int]` code
+   fields with precise types (enums over SAP code ints; no string fallbacks in the domain object).
+3. **Downstream consumers stop re-normalizing.** The calculator's `cert_to_inputs` normalization
+   shrinks to physics; a consumer like the bill adapter could then read fuel off a strict
+   `EpcPropertyData` safely (the "read it off the cert" option ADR-0014 rejected becomes sound).
+
+## Consequences / affected areas
+
+- **Calculator** — `cert_to_inputs` sheds its fuel/string normalization helpers; strict-raises move
+  to the mappers (the right place to fix a data gap).
+- **Bill Derivation (ADR-0014)** — calculator-side fuel resolution on `SapResult` is an **interim
+  measure**, explicitly *because* the cert is not yet normalized. When this ADR lands, fuel attribution
+  can move upstream and the `SapResult` fuel-code fields may be retired.
+- **The three mappers** — each gains normalization responsibility and its own conformance tests
+  (the strict-typing also makes mapper bugs fail loudly at the boundary, not deep in the cascade).
+- **Reduced divergence risk** — one normalized vocabulary means the bill, the rating, and any future
+  consumer cannot silently disagree about a cert's fuels.
+
+## Status of the work
+
+Direction accepted; **not yet implemented**. To be broken into slices and tracked as an issue
+parented to the Ara backend PRD (`#1128`). Until then, downstream normalization (and ADR-0014's
+calculator-side fuel resolution) stands as the documented interim.
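The direction ADR-0015 describes (resolve labels and codes once at the mapper boundary, apply the HW-to-main default there, and raise on a genuine gap) might look roughly like the sketch below. Since the ADR states the work is not yet implemented, everything here is hypothetical: `MainFuel`, `StrictFuelFields`, `normalize_fuels`, and the label table are invented names, `MissingMainFuelType` is a local stand-in for the exception the ADR mentions, and the integer codes are placeholders not checked against the SAP tables.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional, Union


class MainFuel(IntEnum):
    # Placeholder code values for illustration; not checked against SAP tables.
    MAINS_GAS = 1
    BULK_LPG = 2
    ELECTRICITY = 30


class MissingMainFuelType(ValueError):
    """Stand-in: raised at the mapper boundary when the main fuel is unresolvable."""


# Hypothetical label table for a string-label source such as site notes.
_LABEL_TO_FUEL = {
    "mains gas": MainFuel.MAINS_GAS,
    "bulk lpg": MainFuel.BULK_LPG,
    "electricity": MainFuel.ELECTRICITY,
}


@dataclass(frozen=True)
class StrictFuelFields:
    """Strictly typed fuel fields: enums only, no str fallbacks."""
    main_fuel: MainFuel
    water_heating_fuel: MainFuel


def _from_code(code: int) -> Optional[MainFuel]:
    try:
        return MainFuel(code)
    except ValueError:
        return None


def normalize_fuels(
    raw_main: Union[int, str, None], raw_water: Optional[int]
) -> StrictFuelFields:
    """Resolve labels/codes once, apply the HW->main default, raise on a gap."""
    if isinstance(raw_main, str):
        main = _LABEL_TO_FUEL.get(raw_main.strip().lower())
    elif isinstance(raw_main, int):
        main = _from_code(raw_main)
    else:
        main = None
    if main is None:
        raise MissingMainFuelType(f"cannot resolve main fuel from {raw_main!r}")
    # An unknown water-heating code raises ValueError here rather than defaulting.
    water = main if raw_water is None else MainFuel(raw_water)
    return StrictFuelFields(main_fuel=main, water_heating_fuel=water)
```

With fields typed this way, a downstream consumer could read `main_fuel` directly without repeating the union resolution that currently lives in `cert_to_inputs`.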