Model/docs/adr/0015-mappers-own-cert-normalization.md
Khalim Conn-Kowlessar 19a56461ba docs(baseline): Bill Derivation design — fuel as calculator output + rebaselining is assemble-and-score
Captures a /grill-with-docs session resolving how BillDerivation gets the
fuel each end use burns, and what Rebaselining actually is.

- ADR-0014 amendment: per-end-use fuel is a calculator OUTPUT (resolved
  Table-32 codes on SapResult: main-1/main-2/secondary/HW + pv_exported_kwh);
  the adapter is a pure SapResult->EnergyBreakdown map. Corrects stale §3
  (is_gas_code... -> sap_fuel.sap_code_to_fuel). Adds COOLING section.
  Interim, pending ADR-0015.
- ADR-0013 amendment: the calculator is the SCORING ENGINE within
  Rebaselining (assemble the Effective EPC picture, then score), not the
  whole of it; the Rebaseliner exposes its SapResult so the orchestrator
  composes Effective Performance AND the Bill from one scoring.
- ADR-0015 (new): mappers own cert normalization; EpcPropertyData becomes a
  strict type. Explains why fuel resolution sits in the calculator today.
- CONTEXT.md: Effective EPC = the assembled picture; Rebaselining = assemble
  (overrides / neighbour-estimation / old-schema remap) then score.
- EpcPropertyData docstring points at ADR-0015.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 18:04:55 +00:00

Status
accepted

Mappers own cert normalization; EpcPropertyData becomes a strict normalized type

Names a direction that ADR-0013 already gestured at ("the strict-typing of EpcPropertyData that will close most of those gaps is still pending") and that ADR-0014 ran into head-on. Relates to ADR-0001 (the two source paths). Decided in a /grill-with-docs session (2026-06-02). This ADR records a direction and a tracked piece of work, not a slice that has landed.

Context

EpcPropertyData is the one cert aggregate every downstream stage reads, but it is loosely typed, with main_fuel_type: Union[int, str], heat_emitter_type: Union[int, str], bare Optional[int] codes (water_heating_fuel, secondary_fuel_type), and str fallbacks like 'Unknown' / 'Pre 2013'. It is filled by three mappers with different conventions:

  • the EPC API mapper (int codes),
  • the Elmhurst site-notes mapper (string labels, e.g. 'Bulk LPG'),
  • pashub.

Because the cert arrives un-normalized, normalization happens downstream in the calculator (domain/sap10_calculator/rdsap/cert_to_inputs.py): _main_fuel_code resolves the union and strict-raises MissingMainFuelType on a non-int rather than defaulting; _water_heating_fuel_code applies the "HW fuel defaults to the main system" rule; CHP/community blends are reassembled. This logic is correct, but it lives in the wrong layer — it is cert-shape knowledge, not physics.
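To make the two rules above concrete, here is a minimal sketch of downstream normalization over a loose cert. It is not the code in cert_to_inputs.py: LooseCert, the function signatures, and the bool check are assumptions for illustration. Only the exception name and the two rules (strict-raise on a non-int, HW defaults to main) come from this ADR.

```python
from dataclasses import dataclass
from typing import Optional, Union


class MissingMainFuelType(ValueError):
    """Raised when the cert carries no usable main fuel code."""


@dataclass
class LooseCert:
    # Hypothetical stand-in for today's loosely typed EpcPropertyData.
    main_fuel_type: Union[int, str, None]
    water_heating_fuel: Optional[int] = None


def main_fuel_code(cert: LooseCert) -> int:
    """Resolve the union; raise rather than default on a non-int."""
    value = cert.main_fuel_type
    # bool is a subclass of int, so exclude it explicitly (an assumption).
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    raise MissingMainFuelType(f"main_fuel_type is not an int code: {value!r}")


def water_heating_fuel_code(cert: LooseCert) -> int:
    """Apply the 'HW fuel defaults to the main system' rule."""
    if cert.water_heating_fuel is not None:
        return cert.water_heating_fuel
    return main_fuel_code(cert)
```

A string label such as 'Bulk LPG' reaches main_fuel_code unresolved and raises there, deep in the calculator, which is the layering problem this ADR names.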

The trigger: ADR-0014's BillDerivation needs the fuel each end use burns. The fuel fields are on EpcPropertyData, but reading them raw would mean re-implementing the calculator's normalization (union resolution, HW→main default, strict-raise, CHP blend) in a second place — and risk the bill pricing the calculator's delivered kWh at a fuel the calculator never used. ADR-0014 therefore resolves fuel inside the calculator and emits it as output. That is the right call given today's loose cert, but it is a symptom: the consumer is paying for normalization that should have happened at the mapper boundary.

Decision (direction)

  1. Normalization is a mapper responsibility. Each mapper (API / Elmhurst / pashub) transforms its source into a single normalized shape, resolving fuel labels→codes, applying defaults, and raising on genuinely-missing required fields — at the boundary, once.
  2. EpcPropertyData becomes strict. Replace Union[int, str] and raw Optional[int] code fields with precise types (enums over SAP code ints; no string fallbacks in the domain object).
  3. Downstream consumers stop re-normalizing. The calculator's cert_to_inputs normalization shrinks to physics; a consumer like the bill adapter could then read fuel off a strict EpcPropertyData safely (the "read it off the cert" option ADR-0014 rejected becomes sound).
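The three points above could look roughly like the following sketch. Everything here is hypothetical: StrictCert, Fuel, CertNormalizationError, the mapper function names, the raw dict keys, and the label table are invented for illustration, and the enum values are placeholders rather than real SAP fuel codes.

```python
from dataclasses import dataclass
from enum import IntEnum


class Fuel(IntEnum):
    # Placeholder values for illustration only; NOT real SAP fuel codes.
    MAINS_GAS = 1
    BULK_LPG = 2
    ELECTRICITY = 3


class CertNormalizationError(ValueError):
    """Hypothetical error raised at the mapper boundary."""


@dataclass(frozen=True)
class StrictCert:
    # Hypothetical strict shape: enum fields only, no Union, no str fallbacks.
    main_fuel: Fuel
    water_heating_fuel: Fuel


# Hypothetical label vocabulary owned by the site-notes mapper.
_LABEL_TO_FUEL = {
    "Mains gas": Fuel.MAINS_GAS,
    "Bulk LPG": Fuel.BULK_LPG,
    "Electricity": Fuel.ELECTRICITY,
}


def _fuel_from_code(code: object) -> Fuel:
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(code, bool) or not isinstance(code, int):
        raise CertNormalizationError(f"not an int fuel code: {code!r}")
    try:
        return Fuel(code)
    except ValueError as exc:
        raise CertNormalizationError(f"unknown fuel code: {code!r}") from exc


def _fuel_from_label(label: object) -> Fuel:
    if not isinstance(label, str) or label not in _LABEL_TO_FUEL:
        raise CertNormalizationError(f"unknown fuel label: {label!r}")
    return _LABEL_TO_FUEL[label]


def map_api_cert(raw: dict) -> StrictCert:
    """Int-code source: resolve codes and apply the HW default, once."""
    main = _fuel_from_code(raw.get("main_fuel_type"))
    hw_raw = raw.get("water_heating_fuel")
    hw = main if hw_raw is None else _fuel_from_code(hw_raw)
    return StrictCert(main_fuel=main, water_heating_fuel=hw)


def map_site_notes(raw: dict) -> StrictCert:
    """String-label source: same output shape, different input vocabulary."""
    main = _fuel_from_label(raw.get("main_fuel_type"))
    hw_raw = raw.get("water_heating_fuel")
    hw = main if hw_raw is None else _fuel_from_label(hw_raw)
    return StrictCert(main_fuel=main, water_heating_fuel=hw)
```

In this sketch both mappers return the same StrictCert for equivalent inputs, so a consumer such as the bill adapter could read main_fuel directly without knowing which source produced the cert.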

Consequences / affected areas

  • Calculator: cert_to_inputs sheds its fuel/string normalization helpers; strict-raises move to the mappers (the right place to fix a data gap).
  • Bill Derivation (ADR-0014): calculator-side fuel resolution on SapResult is an interim measure, explicitly because the cert is not yet normalized. When this ADR lands, fuel attribution can move upstream and the SapResult fuel-code fields may be retired.
  • The three mappers: each gains normalization responsibility and its own conformance tests (strict typing also makes mapper bugs fail loudly at the boundary, not deep in the cascade).
  • Reduced divergence risk: one normalized vocabulary means the bill, the rating, and any future consumer cannot silently disagree about a cert's fuels.
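One possible shape for the per-mapper conformance tests mentioned above is a shared check that runs each mapper against its own table of raw inputs and expected outcomes. This is only a sketch: the mapper fragments, the raw dict keys, the case tables, and the Fuel values (placeholders, not real SAP codes) are all hypothetical.

```python
from enum import IntEnum
from typing import Callable, List, Tuple, Type, Union


class Fuel(IntEnum):
    # Placeholder values for illustration only; NOT real SAP fuel codes.
    MAINS_GAS = 1
    BULK_LPG = 2


# An expected outcome is either a Fuel member or ValueError ("must raise").
Expected = Union[Fuel, Type[ValueError]]
Case = Tuple[dict, Expected]


def api_main_fuel(raw: dict) -> Fuel:
    """Hypothetical API-mapper fragment: int code in, Fuel out."""
    code = raw.get("main_fuel_type")
    if isinstance(code, bool) or not isinstance(code, int):
        raise ValueError(f"not an int fuel code: {code!r}")
    return Fuel(code)  # Fuel(...) raises ValueError on an unknown code


def site_notes_main_fuel(raw: dict) -> Fuel:
    """Hypothetical site-notes fragment: string label in, Fuel out."""
    labels = {"Mains gas": Fuel.MAINS_GAS, "Bulk LPG": Fuel.BULK_LPG}
    label = raw.get("main_fuel_type")
    if not isinstance(label, str) or label not in labels:
        raise ValueError(f"unknown fuel label: {label!r}")
    return labels[label]


def conformance_failures(
    mapper: Callable[[dict], Fuel], cases: List[Case]
) -> List[str]:
    """Return one message per case where the mapper broke the contract."""
    failures = []
    for raw, expected in cases:
        try:
            outcome: Expected = mapper(raw)
        except ValueError:
            outcome = ValueError
        # Identity check: a bare int equal to a Fuel value must not pass.
        if outcome is not expected:
            failures.append(
                f"{mapper.__name__}({raw!r}): expected {expected!r}, got {outcome!r}"
            )
    return failures


API_CASES: List[Case] = [
    ({"main_fuel_type": 2}, Fuel.BULK_LPG),
    ({"main_fuel_type": "Bulk LPG"}, ValueError),  # labels are not API input
    ({}, ValueError),  # a missing required field must raise, not default
]
SITE_NOTES_CASES: List[Case] = [
    ({"main_fuel_type": "Bulk LPG"}, Fuel.BULK_LPG),
    ({"main_fuel_type": "Unknown"}, ValueError),  # no string fallbacks
    ({}, ValueError),
]
```

Calling conformance_failures(api_main_fuel, API_CASES) returns an empty list when the mapper honours the contract; a real suite would more likely express the same table through the project's test framework.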

Status of the work

Direction accepted; not yet implemented. To be broken into slices and tracked as an issue parented to the Ara backend PRD (#1128). Until then, downstream normalization (and ADR-0014's calculator-side fuel resolution) stands as the documented interim.