Model/docs/adr/0006-deterministic-kwh-no-baseline-ml.md
2026-05-16 14:15:56 +00:00


Baseline kWh and bills are deterministic — no ML on the kWh side

Status: Superseded by ADR-0007. The premise here, that baseline kWh can be derived from SAP physics alone, held when the gov EPC API did not expose per-end-use kWh. The new gov EPC API exposes renewable_heat_incentive.space_heating_existing_dwelling and .water_heating directly, removing the need for ML on the baseline side. Post-measure kWh prediction, meanwhile, is reintroduced as an ML target to avoid per-band UCL discontinuities at measure-application time. See ADR-0007 for the replacement design.


Annual kWh, fuel split, and bills are produced by EpcEnergyDerivationService via SAP physics + UCL per-band correction (Few et al. 2023) + per-fuel rates from FuelRatesRepo. There is no ML lambda on the kWh path — neither for baseline derivation nor for per-recommendation kWh impact. We considered keeping a kWh ML lambda (the current model_engine has two — one pre-recommendation, one post-optimisation) and rejected both.

The forcing facts:

  1. The new gov EPC API exposes energy_consumption_current (kWh/m² per year, primary) and per-end-use cost fields for the regulated portion of energy use. The decomposition into heating / hot water / lighting that the gov website displays is computed downstream from SAP: SAP itself defines the proportional split deterministically, given the heating and hot-water fuel codes and the floor area.
  2. The EPC's recorded cost fields use fuel rates pinned to the inspection date, so we discard them and recompute bills from delivered kWh × current FuelRatesRepo rate + standing charges + SEG credits.
  3. The UCL correction (Few et al.) is an empirical correction on total annual PEUI, not on the heating-vs-hot-water split. We nevertheless apply it per-band, after decomposition. The existing AnnualBillSavings.adjust_energy_to_metered already ports the per-band gradients/intercepts from Table 3 of the paper.
  4. Per-recommendation kWh delta is derivable from the SAP delta predicted by ImpactPredictionService + heating-system fuel + COP — no separate ML call needed.
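
Facts 2 and 3 can be sketched as two small pure functions. This is an illustration under stated assumptions, not the code in EpcEnergyDerivationService or AnnualBillSavings. The names BandCoefficients, FuelRate, correct_to_metered and annual_bill_gbp are hypothetical. The linear form gradient × PEUI + intercept is inferred from the "gradients/intercepts" wording above. SEG credits are assumed to reduce the bill. No coefficient or tariff values are taken from the paper or from FuelRatesRepo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BandCoefficients:
    """Linear correction for one EPC band (hypothetical shape)."""
    gradient: float
    intercept: float


@dataclass(frozen=True)
class FuelRate:
    """Current tariff for one fuel (hypothetical shape, not FuelRatesRepo's)."""
    pence_per_kwh: float
    standing_charge_pence_per_day: float


def correct_to_metered(
    peui: float, band: str, coeffs: dict[str, BandCoefficients]
) -> float:
    """Apply the per-band linear correction to a modelled PEUI (kWh/m2 per year)."""
    c = coeffs[band]
    return c.gradient * peui + c.intercept


def annual_bill_gbp(
    delivered_kwh: dict[str, float],
    rates: dict[str, FuelRate],
    seg_credit_gbp: float = 0.0,
) -> float:
    """Recompute a bill from delivered kWh and current rates.

    The EPC's recorded cost fields are never read. SEG credits are
    assumed to be subtracted from the total.
    """
    pence = 0.0
    for fuel, kwh in delivered_kwh.items():
        rate = rates[fuel]
        pence += kwh * rate.pence_per_kwh
        pence += 365 * rate.standing_charge_pence_per_day
    return pence / 100.0 - seg_credit_gbp
```

Both functions take their coefficients and rates as arguments, so a caller can pass the real Table 3 values and repository rates in production and arbitrary fixtures in tests.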

ML is reserved for SAP / carbon / heat demand — the quantities where the physical model is partial and the ML lambda earns its keep. The kWh pipeline is fully deterministic and reproducible, which makes it unit-testable against fakes without an ML lambda, and lets us refresh bills without re-running ML (a fuel-rate update or a new Defra carbon factor publishes new bill figures without touching the modelling lambdas).
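
A sketch of what "unit-testable against fakes" could look like. The RatesRepo protocol, FakeRatesRepo and energy_cost_gbp are hypothetical names for illustration; the real FuelRatesRepo interface is not described in this ADR, and the rates below are made-up fixture values.

```python
from typing import Protocol


class RatesRepo(Protocol):
    """Minimal interface assumed for this sketch only."""

    def pence_per_kwh(self, fuel: str) -> float: ...


class FakeRatesRepo:
    """In-memory stand-in for a rates repository."""

    def __init__(self, rates: dict[str, float]) -> None:
        self._rates = rates

    def pence_per_kwh(self, fuel: str) -> float:
        return self._rates[fuel]


def energy_cost_gbp(delivered_kwh: dict[str, float], repo: RatesRepo) -> float:
    """Pure function of kWh and rates: no model call is involved."""
    pence = sum(kwh * repo.pence_per_kwh(fuel) for fuel, kwh in delivered_kwh.items())
    return pence / 100.0


def test_rate_update_changes_bill_without_recomputing_kwh() -> None:
    # The kWh figures are derived once and reused across both rate sets.
    kwh = {"gas": 12000.0, "electricity": 3000.0}
    old = energy_cost_gbp(kwh, FakeRatesRepo({"gas": 6.0, "electricity": 25.0}))
    new = energy_cost_gbp(kwh, FakeRatesRepo({"gas": 7.0, "electricity": 25.0}))
    assert old == 1470.0
    assert new == 1590.0
```

The test needs no ML lambda and no network access: only the rates fixture changes between the two calls, which mirrors a fuel-rate refresh.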

Consequences

  • The pre-recommendation kWh ML lambda (KWH_MODEL_PREFIXES in model_api.py) is retired — no consumer in the new pipeline.
  • EpcEnergyDerivationService becomes a fat deterministic service: SAP physics + UCL + FuelRates lookup + primary-to-delivered conversion. Long but readable.
  • Site Notes have no energy_consumption_current field (PasHub does not produce one). The deterministic SAP-physics path handles this case naturally — same code, different source of regulated PEUI.
  • UCL paper scope (gas-heated, no PV, England + Wales, SAP 2012+) is silently extrapolated to all properties by the current code. Whether to keep silent extrapolation or stratify (no correction for non-gas / PV) is flagged for the per-service grill.
  • Adding back a kWh ML lambda later is a real change, not a config tweak — flag it as an ADR if proposed.
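
The open question on UCL scope can be made concrete with a sketch of the two options. PropertyContext, in_ucl_scope and corrected_peui are hypothetical names, and the fuel encoding is an assumption. Only the fuel and PV conditions are modelled; the region and SAP-version conditions are left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class PropertyContext:
    """Inputs needed to decide whether a property is in the paper's scope."""
    main_fuel: str  # e.g. "gas", "electricity", "oil" (hypothetical encoding)
    has_pv: bool


def in_ucl_scope(ctx: PropertyContext) -> bool:
    """True only for gas-heated properties without PV."""
    return ctx.main_fuel == "gas" and not ctx.has_pv


def corrected_peui(
    peui: float,
    ctx: PropertyContext,
    correction: Callable[[float], float],
    stratify: bool,
) -> float:
    """Apply the correction to everyone, or only to in-scope properties.

    stratify=False reproduces silent extrapolation to all properties.
    stratify=True leaves out-of-scope properties uncorrected.
    """
    if stratify and not in_ucl_scope(ctx):
        return peui
    return correction(peui)
```

Passing the choice as an explicit flag keeps either behaviour a visible decision at the call site rather than an implicit property of the correction code.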