Model/docs/adr/0006-deterministic-kwh-no-baseline-ml.md
2026-05-16 14:15:56 +00:00

# Baseline kWh and bills are deterministic — no ML on the kWh side
**Status: Superseded by [ADR-0007](0007-kwh-as-ml-target.md).** The premise here, that baseline kWh can be derived from SAP physics alone, held when the gov EPC API did not expose per-end-use kWh. The new EPC API exposes `renewable_heat_incentive.space_heating_existing_dwelling` and `.water_heating` directly, which removes the need for ML on the *baseline* side. Meanwhile, *post-measure* kWh prediction is reintroduced as an ML target to avoid per-band UCL discontinuities at measure-application time. See ADR-0007 for the replacement design.

---

Annual kWh, fuel split, and bills are produced by `EpcEnergyDerivationService` via SAP physics + UCL per-band correction (Few et al. 2023) + per-fuel rates from `FuelRatesRepo`. There is no ML lambda on the kWh path, neither for baseline derivation nor for per-recommendation kWh impact. We considered keeping a kWh ML lambda (the current `model_engine` has two: one pre-recommendation, one post-optimisation) and rejected both.

The forcing facts:

1. The new gov EPC API exposes `energy_consumption_current` (kWh/m², primary) and per-end-use cost fields for the regulated portion of energy use. The decomposition into heating / hot water / lighting that the gov website displays is computed downstream from SAP — SAP itself defines the proportional split deterministically given heating + hot water fuel codes and floor area.
2. The EPC's recorded cost fields use fuel rates pinned to the inspection date, so we discard them and recompute bills from delivered kWh × current `FuelRatesRepo` rate + standing charges + SEG credits.
3. The UCL correction (Few et al.) is an empirical correction on **total annual PEUI**, not on the heating-vs-hot-water split, but we apply it per band, after decomposition. The existing `AnnualBillSavings.adjust_energy_to_metered` already ports the per-band gradients/intercepts from Table 3 of the paper.
4. Per-recommendation kWh delta is derivable from the SAP delta predicted by `ImpactPredictionService` + heating-system fuel + COP — no separate ML call needed.
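
The bill recomputation in fact 2 can be sketched as follows. This is a minimal illustration, not the code in `EpcEnergyDerivationService`: the `FuelRate` shape, the `annual_bill_gbp` name, the pence-based units, and the reading of "SEG credits" as a deduction for exported kWh are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class FuelRate:
    """Hypothetical shape of one FuelRatesRepo entry (not the real schema)."""
    pence_per_kwh: float
    standing_charge_pence_per_day: float


def annual_bill_gbp(
    delivered_kwh: Mapping[str, float],
    rates: Mapping[str, FuelRate],
    seg_export_kwh: float = 0.0,
    seg_pence_per_kwh: float = 0.0,
    days: int = 365,
) -> float:
    """Recompute a bill from delivered kWh and current rates.

    The EPC's recorded cost fields are ignored entirely; only kWh per fuel
    and the current rate table feed the result.
    """
    total_pence = 0.0
    for fuel, kwh in delivered_kwh.items():
        rate = rates[fuel]
        total_pence += kwh * rate.pence_per_kwh              # unit cost
        total_pence += rate.standing_charge_pence_per_day * days  # standing charge
    # Assumption: SEG credits reduce the bill in proportion to exported kWh.
    total_pence -= seg_export_kwh * seg_pence_per_kwh
    return round(total_pence / 100, 2)
```

Because the function takes the rate table as an argument, swapping in a newer set of rates changes the bill without touching the kWh inputs.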

ML is reserved for SAP, carbon, and heat demand: the quantities where the physical model is partial and the ML lambda earns its keep. The kWh pipeline is fully deterministic and reproducible, so it can be unit-tested against fakes without an ML lambda, and bills can be refreshed without re-running ML (a fuel-rate update or a new Defra carbon factor publishes new bill figures without touching the modelling lambdas).
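
A sketch of what "unit-testable against fakes" could look like in practice. The `RatesRepo` protocol, `FakeRatesRepo`, and `energy_cost_gbp` are hypothetical names invented for this example; the real `FuelRatesRepo` interface may differ.

```python
from typing import Mapping, Protocol


class RatesRepo(Protocol):
    """Assumed minimal interface; the real FuelRatesRepo is not shown here."""

    def pence_per_kwh(self, fuel: str) -> float: ...


class FakeRatesRepo:
    """In-memory stand-in, so the test needs no database and no ML lambda."""

    def __init__(self, rates: Mapping[str, float]) -> None:
        self._rates = dict(rates)

    def pence_per_kwh(self, fuel: str) -> float:
        return self._rates[fuel]


def energy_cost_gbp(delivered_kwh: Mapping[str, float], repo: RatesRepo) -> float:
    """Unit-cost part of the bill: sum of kWh x rate over fuels, in pounds."""
    pence = sum(kwh * repo.pence_per_kwh(fuel) for fuel, kwh in delivered_kwh.items())
    return round(pence / 100, 2)


def test_rate_update_changes_bill_only() -> None:
    kwh = {"gas": 12000.0}
    before = energy_cost_gbp(kwh, FakeRatesRepo({"gas": 6.0}))
    after = energy_cost_gbp(kwh, FakeRatesRepo({"gas": 7.0}))
    assert before == 720.0
    assert after == 840.0
    # The kWh input is untouched: only the rate table changed.
    assert kwh == {"gas": 12000.0}
```

The same inputs always give the same output, so a test like this pins behaviour exactly rather than within a tolerance.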
## Consequences
- The pre-recommendation kWh ML lambda (`KWH_MODEL_PREFIXES` in [model_api.py](../../backend/ml_models/api.py)) is retired — no consumer in the new pipeline.
- `EpcEnergyDerivationService` becomes a fat deterministic service: SAP physics + UCL + FuelRates lookup + primary-to-delivered conversion. Long but readable.
- Site Notes have no `energy_consumption_current` field (PasHub does not produce one). The deterministic SAP-physics path handles this case naturally — same code, different source of regulated PEUI.
- UCL paper scope (gas-heated, no PV, England + Wales, SAP 2012+) is silently extrapolated to all properties by the current code. Whether to keep silent extrapolation or stratify (no correction for non-gas / PV) is flagged for the per-service grill.
- Adding back a kWh ML lambda later is a real change, not a config tweak — flag it as an ADR if proposed.
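
The extrapolate-versus-stratify question raised above for the UCL correction can be sketched like this. The sketch assumes the correction is linear per band (the source mentions per-band gradients and intercepts); the `Dwelling` type and function names are hypothetical, and the coefficients are placeholders chosen for easy arithmetic, not the values from Table 3 of Few et al. Only the fuel and PV parts of the paper's scope are modelled.

```python
from dataclasses import dataclass

# PLACEHOLDER (gradient, intercept) pairs per EPC band.
# These are NOT the Table 3 values; substitute the real ones.
PLACEHOLDER_COEFFS = {
    "C": (1.0, 0.0),
    "D": (0.5, 20.0),
}


@dataclass(frozen=True)
class Dwelling:
    """Hypothetical input record for the example."""
    epc_band: str
    main_fuel: str
    has_pv: bool
    modelled_peui: float  # kWh/m2/yr, as modelled by SAP


def in_ucl_scope(d: Dwelling) -> bool:
    """Gas-heated with no PV, per the paper's scope as summarised above."""
    return d.main_fuel == "gas" and not d.has_pv


def corrected_peui(d: Dwelling, stratify: bool) -> float:
    """Apply the per-band linear correction.

    stratify=False mirrors the current silent extrapolation to all dwellings.
    stratify=True leaves out-of-scope dwellings uncorrected.
    """
    if stratify and not in_ucl_scope(d):
        return d.modelled_peui
    gradient, intercept = PLACEHOLDER_COEFFS[d.epc_band]
    return gradient * d.modelled_peui + intercept
```

With a single flag the two policies can be compared on the same dwellings before one is chosen.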