Model/docs/adr/0015-mappers-own-cert-normalization.md
Khalim Conn-Kowlessar 19a56461ba docs(baseline): Bill Derivation design — fuel as calculator output + rebaselining is assemble-and-score
Captures a /grill-with-docs session resolving how BillDerivation gets the
fuel each end use burns, and what Rebaselining actually is.

- ADR-0014 amendment: per-end-use fuel is a calculator OUTPUT (resolved
  Table-32 codes on SapResult: main-1/main-2/secondary/HW + pv_exported_kwh);
  the adapter is a pure SapResult->EnergyBreakdown map. Corrects stale §3
  (is_gas_code... -> sap_fuel.sap_code_to_fuel). Adds COOLING section.
  Interim, pending ADR-0015.
- ADR-0013 amendment: the calculator is the SCORING ENGINE within
  Rebaselining (assemble the Effective EPC picture, then score), not the
  whole of it; the Rebaseliner exposes its SapResult so the orchestrator
  composes Effective Performance AND the Bill from one scoring.
- ADR-0015 (new): mappers own cert normalization; EpcPropertyData becomes a
  strict type. Explains why fuel resolution sits in the calculator today.
- CONTEXT.md: Effective EPC = the assembled picture; Rebaselining = assemble
  (overrides / neighbour-estimation / old-schema remap) then score.
- EpcPropertyData docstring points at ADR-0015.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 18:04:55 +00:00

---
Status: accepted
---
# Mappers own cert normalization; `EpcPropertyData` becomes a strict normalized type
Names a direction that [ADR-0013](0013-calculator-produces-effective-performance-shadow-first.md)
already gestured at ("the strict-typing of `EpcPropertyData` that will close most of those gaps is
still pending") and that [ADR-0014](0014-bill-derivation-from-real-fuel-rates.md) ran into head-on.
Relates to [ADR-0001](0001-two-source-paths.md) (the two source paths). Decided in a
`/grill-with-docs` session (2026-06-02). This ADR records a **direction and a tracked piece of work**,
not a slice that has landed.
## Context
`EpcPropertyData` is the one cert aggregate every downstream stage reads, but it is **loosely
typed** — `main_fuel_type: Union[int, str]`, `heat_emitter_type: Union[int, str]`, bare
`Optional[int]` codes (`water_heating_fuel`, `secondary_fuel_type`), `str` fallbacks like
`'Unknown'` / `'Pre 2013'`. It is filled by **three mappers with different conventions**:
- the **EPC API** mapper (int codes),
- the **Elmhurst** site-notes mapper (string labels, e.g. `'Bulk LPG'`),
- **pashub**.
Because the cert arrives un-normalized, **normalization happens downstream in the calculator**
(`domain/sap10_calculator/rdsap/cert_to_inputs.py`): `_main_fuel_code` resolves the union and
**strict-raises `MissingMainFuelType`** on a non-int rather than defaulting; `_water_heating_fuel_code`
applies the "HW fuel defaults to the main system" rule; CHP/community blends are reassembled. This
logic is correct, but it lives in the wrong layer — it is *cert-shape* knowledge, not *physics*.
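
To make the current shape concrete, the two fuel helpers described above can be sketched roughly as follows. This is a minimal illustration, not the code in `cert_to_inputs.py`: `LooseCert` is a hypothetical stand-in for the relevant `EpcPropertyData` fields, the helper bodies are assumed from the behaviour described in this section, and the CHP/community blend handling is omitted.

```python
from dataclasses import dataclass
from typing import Optional, Union


class MissingMainFuelType(ValueError):
    """Raised when the cert carries no usable integer main-fuel code."""


@dataclass
class LooseCert:
    # Hypothetical stand-in for the loosely typed fields named above.
    main_fuel_type: Union[int, str, None] = None
    water_heating_fuel: Optional[int] = None


def main_fuel_code(cert: LooseCert) -> int:
    """Resolve the union; raise rather than default on a non-int."""
    value = cert.main_fuel_type
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    raise MissingMainFuelType(f"main_fuel_type is not an int code: {value!r}")


def water_heating_fuel_code(cert: LooseCert) -> int:
    """Hot-water fuel falls back to the main system's fuel when absent."""
    if cert.water_heating_fuel is not None:
        return cert.water_heating_fuel
    return main_fuel_code(cert)
```

Nothing in this sketch depends on heat-loss or efficiency calculations; it only inspects the shape of the cert, which is why it reads as boundary logic rather than physics.
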
The trigger: [ADR-0014](0014-bill-derivation-from-real-fuel-rates.md)'s `BillDerivation` needs the
fuel each end use burns. The fuel fields *are* on `EpcPropertyData`, but reading them raw would mean
**re-implementing the calculator's normalization** (union resolution, HW→main default, strict-raise,
CHP blend) in a second place — and risk the bill pricing the calculator's delivered kWh at a fuel
the calculator never used. ADR-0014 therefore resolves fuel **inside the calculator** and emits it as
output. That is the right call *given today's loose cert*, but it is a **symptom**: the consumer is
paying for normalization that should have happened at the mapper boundary.
## Decision (direction)
1. **Normalization is a mapper responsibility.** Each mapper (API / Elmhurst / pashub) transforms its
source into a **single normalized shape**, resolving fuel labels→codes, applying defaults, and
raising on genuinely-missing required fields — at the boundary, once.
2. **`EpcPropertyData` becomes strict.** Replace `Union[int, str]` and raw `Optional[int]` code
fields with precise types (enums over SAP code ints; no string fallbacks in the domain object).
3. **Downstream consumers stop re-normalizing.** The calculator's `cert_to_inputs` normalization
shrinks to physics; a consumer like the bill adapter could then read fuel off a strict
`EpcPropertyData` safely (the "read it off the cert" option ADR-0014 rejected becomes sound).
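
The three points above could look something like the sketch below, under stated assumptions. `FuelCode`, `StrictCert`, `MissingRequiredField`, both mapper functions, and the raw field names are hypothetical; the enum values are placeholders rather than verified SAP codes; and the real strict `EpcPropertyData` would carry many more fields.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class FuelCode(IntEnum):
    # Placeholder members and values for illustration; not verified SAP codes.
    MAINS_GAS = 1
    BULK_LPG = 2


class MissingRequiredField(ValueError):
    """Raised at the mapper boundary when a required field cannot be resolved."""


@dataclass(frozen=True)
class StrictCert:
    # No unions, no Optional codes, no string fallbacks.
    main_fuel: FuelCode
    water_heating_fuel: FuelCode


# Hypothetical label table for a string-label source.
_LABEL_TO_FUEL = {"Mains gas": FuelCode.MAINS_GAS, "Bulk LPG": FuelCode.BULK_LPG}


def _from_label(label: Optional[str], field: str) -> FuelCode:
    if label not in _LABEL_TO_FUEL:
        raise MissingRequiredField(f"{field}: unrecognised fuel label {label!r}")
    return _LABEL_TO_FUEL[label]


def map_label_source(raw: dict) -> StrictCert:
    """Mapper for a source that carries string labels."""
    main = _from_label(raw.get("main_fuel"), "main_fuel")
    hw_label = raw.get("water_heating_fuel")
    # The hot-water default is applied here, once, instead of downstream.
    hw = main if hw_label is None else _from_label(hw_label, "water_heating_fuel")
    return StrictCert(main_fuel=main, water_heating_fuel=hw)


def map_int_source(raw: dict) -> StrictCert:
    """Mapper for a source that carries integer codes."""
    code = raw.get("main_fuel_type")
    if code is None:
        raise MissingRequiredField("main_fuel_type is missing")
    main = FuelCode(code)  # ValueError if the code is not a known member
    hw_code = raw.get("water_heating_fuel")
    hw = main if hw_code is None else FuelCode(hw_code)
    return StrictCert(main_fuel=main, water_heating_fuel=hw)
```

With this shape, `map_label_source({"main_fuel": "Bulk LPG"})` and `map_int_source({"main_fuel_type": 2})` produce equal `StrictCert` values, so a downstream consumer has nothing left to resolve.
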
## Consequences / affected areas
- **Calculator** — `cert_to_inputs` sheds its fuel/string normalization helpers; strict-raises move
to the mappers (the right place to fix a data gap).
- **Bill Derivation (ADR-0014)** — calculator-side fuel resolution on `SapResult` is an **interim
measure**, explicitly *because* the cert is not yet normalized. When this ADR lands, fuel attribution
can move upstream and the `SapResult` fuel-code fields may be retired.
- **The three mappers** — each gains normalization responsibility and its own conformance tests
(the strict-typing also makes mapper bugs fail loudly at the boundary, not deep in the cascade).
- **Reduced divergence risk** — one normalized vocabulary means the bill, the rating, and any future
consumer cannot silently disagree about a cert's fuels.
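
One possible shape for the mapper conformance tests is a cross-mapper agreement check: fixtures from different sources that describe the same dwelling should normalize to the same value. The helper below is a hypothetical sketch, not an existing test utility, and the toy mappers and their field names exist only to exercise it.

```python
from typing import Any, Callable, Iterable, Tuple

Mapper = Callable[[dict], Any]


def assert_mappers_agree(cases: Iterable[Tuple[Mapper, dict]]) -> Any:
    """Run each (mapper, raw fixture) pair and require identical normalized output.

    All fixtures are assumed to describe the same dwelling in different source formats.
    """
    results = [(getattr(m, "__name__", repr(m)), m(raw)) for m, raw in cases]
    if not results:
        raise ValueError("no cases given")
    first_name, expected = results[0]
    for name, got in results[1:]:
        if got != expected:
            raise AssertionError(
                f"{name} produced {got!r}, but {first_name} produced {expected!r}"
            )
    return expected


# Toy mappers with placeholder codes, standing in for the real ones.
def toy_int_mapper(raw: dict) -> dict:
    return {"main_fuel": raw["code"]}


def toy_label_mapper(raw: dict) -> dict:
    return {"main_fuel": {"Bulk LPG": 2}[raw["label"]]}


normalized = assert_mappers_agree(
    [(toy_int_mapper, {"code": 2}), (toy_label_mapper, {"label": "Bulk LPG"})]
)
```

A check of this kind fails at the mapper boundary when one source drifts from the shared vocabulary, rather than surfacing later as a disagreement between the bill and the rating.
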
## Status of the work
Direction accepted; **not yet implemented**. To be broken into slices and tracked as an issue
parented to the Ara backend PRD (`#1128`). Until then, downstream normalization (and ADR-0014's
calculator-side fuel resolution) stands as the documented interim.