Model/domain/sap10_ml/sap_efficiencies.py
Khalim Conn-Kowlessar 68401c517a refactor: lift-and-shift packages/domain/src/domain/ml → domain/sap10_ml
Sibling migration to the sap10_calculator move — `domain.ml` now lives
at the root-level layout (`domain/sap10_ml/`) matching the pattern
already used by `domain.addresses`, `domain.tasks`, `domain.postcode`,
and `domain.sap10_calculator`.

Changes:

- `git mv packages/domain/src/domain/ml → domain/sap10_ml` (19 files;
  history preserved).
- Subpackage rename: `domain.ml` → `domain.sap10_ml`. 32 references
  rewritten across .py and .md files: 11 internal + 21 external
  (datatypes/epc/domain/mapper.py, 14 files in domain/sap10_calculator,
  2 backend tests, 2 ADRs, 1 README, 1 design doc).
- Path-string updates: `pytest.ini` testpath
  `packages/domain/src/domain/ml/tests` → `domain/sap10_ml/tests` so
  ML tests stay in the default auto-discovered sweep. `CONTEXT.md`
  also updated.
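
The commit does not record how the 32 references were rewritten. A minimal sketch of one way to script such a rename, assuming a plain regex pass over file text (the `rewrite_refs` helper and its pattern are hypothetical, not taken from the repository):

```python
import re

# Hypothetical pattern: match `domain.ml` only as a complete dotted prefix.
# The lookbehind rejects `mydomain.ml` and `packages.domain.ml`; the
# lookahead rejects longer names such as `domain.mlops`.
_OLD_REF = re.compile(r"(?<![\w.])domain\.ml(?!\w)")


def rewrite_refs(text: str) -> str:
    """Return `text` with `domain.ml` references renamed to `domain.sap10_ml`."""
    return _OLD_REF.sub("domain.sap10_ml", text)


sample = "from domain.ml.sap_efficiencies import seasonal_efficiency"
print(rewrite_refs(sample))
```

The pattern cannot match the new name, so a second pass over already-rewritten text is a no-op. Path strings such as the `pytest.ini` testpath use slashes and would need their own handling.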

`packages/domain/src/domain/` is now empty — the workspace `domain/`
tree has been fully migrated. Together with the `domain/__init__.py`
deletions from the sap10_calculator commit (29ac35cc), `domain` is
now a single root-level namespace package with subpackages
{addresses, sap10_calculator, sap10_ml, tasks} + the standalone
`postcode.py` module.
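
For context on why the `__init__.py` deletions matter: a directory without `__init__.py` is imported as an implicit namespace package (PEP 420). A minimal sketch of that behaviour, using a throwaway package named `demo_ns` (a hypothetical stand-in, chosen so it cannot collide with the real `domain`):

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build <tmp>/demo_ns/{addresses,tasks}/__init__.py with no demo_ns/__init__.py.
# `demo_ns` and the NAME constants are illustrative only.
root = Path(tempfile.mkdtemp())
for sub in ("addresses", "tasks"):
    pkg = root / "demo_ns" / sub
    pkg.mkdir(parents=True)
    (pkg / "__init__.py").write_text(f"NAME = {sub!r}\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

addresses = importlib.import_module("demo_ns.addresses")
tasks = importlib.import_module("demo_ns.tasks")
demo_ns = sys.modules["demo_ns"]

# A namespace package has a __path__ but no backing __init__.py file.
print(getattr(demo_ns, "__file__", None))
print(addresses.NAME, tasks.NAME)
```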

Verified:

- Focused sweep (backend mapper-chain + sap10_calculator worksheet
  e2e + golden fixtures): 99 passed / 19 failed — identical baseline.
- Wider sweep (all sap10_calculator + sap10_ml): 1654 passed / 20
  failed (same pre-existing failures).
- domain/sap10_ml/tests: 210/210 PASSED at new path.
- Pyright net-zero: heat_transmission.py 13, cert_to_inputs.py 35,
  mapper.py 33, rdsap_uvalues.py 1 (all unchanged from baseline).

Note: `packages/domain/pyproject.toml` still declares
`packages = ["src/domain"]` for the hatchling wheel — that target
directory is now empty and the wheel build is effectively a no-op.
Retiring the workspace package or repointing the wheel is a follow-up.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-26 13:01:35 +00:00

260 lines
11 KiB
Python

"""SAP10.2 seasonal-efficiency + fuel-price lookups for ML feature engineering.

Source: BRE, *SAP 10.2* (14-03-2025), Tables 4a, 4b, and Table 32 (the
RdSAP10 fuel-price replica of SAP10.2 Table 12).

Helpers return:
  - seasonal_efficiency(code) -> decimal (0.84 not 84)
  - water_heating_efficiency(water_code, main_code) -> decimal
  - fuel_unit_price_p_per_kwh(fuel_code) -> pence per kWh

All helpers are total: unknown codes cascade to typical-fuel defaults so the
predicted_total_fuel_cost feature is never null.
"""
from __future__ import annotations

from typing import Final, Optional

# ---------------------------------------------------------------------------
# Table 4a + Table 4b — space-heating seasonal efficiency by code
# Decimal, not percent. Codes 101-141 use Table 4b winter eff. Codes 151+
# use Table 4a "Efficiency %" column. Heat pumps: column "space".
# ---------------------------------------------------------------------------
_SPACE_EFF_BY_CODE: Final[dict[int, float]] = {
    # Table 4b gas/oil boilers: winter efficiency.
    101: 0.74, 102: 0.84, 103: 0.74, 104: 0.84, 105: 0.70, 106: 0.80,
    107: 0.70, 108: 0.80, 109: 0.66,
    110: 0.73, 111: 0.69, 112: 0.71, 113: 0.84, 114: 0.84,
    115: 0.66, 116: 0.56, 117: 0.66, 118: 0.66, 119: 0.66,
    120: 0.74, 121: 0.83, 122: 0.70, 123: 0.79,
    124: 0.66, 125: 0.71, 126: 0.80, 127: 0.84,
    128: 0.71, 129: 0.77, 130: 0.82, 131: 0.66, 132: 0.71,
    133: 0.47, 134: 0.51, 135: 0.61, 136: 0.66, 137: 0.66, 138: 0.71,
    139: 0.61, 140: 0.71, 141: 0.76,
    # Table 4a solid-fuel boilers.
    151: 0.60, 153: 0.65, 155: 0.70, 156: 0.55, 158: 0.65, 159: 0.70,
    160: 0.45, 161: 0.55,
    # Electric boilers (Table 4a).
    191: 1.00, 192: 1.00, 193: 1.00, 194: 0.85, 195: 1.00, 196: 0.85,
    # Heat pumps (Table 4a, space column).
    211: 2.30, 213: 2.30, 214: 1.70, 215: 1.20, 216: 1.20, 217: 1.10,
    221: 1.70, 223: 1.70, 224: 1.70, 225: 0.84, 226: 0.84, 227: 0.77,
    # Heat networks (Table 4a).
    301: 0.80, 302: 0.75, 304: 3.00,
    # Storage / electric.
    401: 1.00, 402: 1.00, 403: 1.00, 404: 1.00, 405: 1.00, 406: 1.00,
    407: 1.00, 408: 1.00, 409: 1.00,
    # Electric underfloor.
    421: 1.00, 422: 1.00, 423: 1.00, 424: 1.00, 425: 1.00,
    # Warm air.
    501: 0.70, 502: 0.76, 503: 0.72, 504: 0.78, 505: 0.69,
    506: 0.70, 507: 0.76, 508: 0.72, 509: 0.78, 510: 0.85, 511: 0.81,
    512: 0.70, 513: 0.72, 514: 0.70, 515: 1.00, 520: 0.81,
    # Warm-air heat pumps.
    521: 2.30, 523: 2.30, 524: 1.70, 525: 1.20, 526: 1.20, 527: 1.10,
    # Room heaters: gas mains/biogas (column A).
    601: 0.50, 602: 0.50, 603: 0.63, 604: 0.63, 605: 0.40, 606: 0.40,
    607: 0.45, 609: 0.58, 610: 0.72, 611: 0.85, 612: 0.20, 613: 0.90,
    # Room heaters: liquid.
    621: 0.55, 622: 0.65, 623: 0.60, 624: 0.70, 625: 0.94,
    # Room heaters: solid (column B non-HETAS).
    631: 0.32, 632: 0.50, 633: 0.60, 634: 0.65, 635: 0.65, 636: 0.70,
    # Room heaters: electric.
    691: 1.00, 692: 1.00, 693: 1.00, 694: 1.00,
    # Other.
    699: 1.00, 701: 1.00,
}

# Table 4a hot-water section — DHW seasonal efficiency for DHW-only codes.
_WATER_EFF_BY_CODE: Final[dict[int, float]] = {
    999: 1.00,  # No HW system present, electric immersion assumed
    901: 0.0,   # From main heating (sentinel: use main code)
    902: 0.0,   # From secondary (sentinel)
    903: 1.00,  # Electric immersion
    907: 0.70,  # Single-point gas at point of use
    908: 0.65,  # Multi-point gas
    909: 1.00,  # Electric instantaneous
    911: 0.65,  # Gas boiler/circulator for water only
    912: 0.70,  # Liquid fuel boiler/circulator
    913: 0.55,  # Solid fuel boiler for water only
    914: 0.0,   # From second main system (sentinel)
    921: 0.46, 922: 0.50, 923: 0.60, 924: 0.65, 925: 0.65, 926: 0.70,
    927: 0.60, 928: 0.70, 929: 0.75, 930: 0.45, 931: 0.55,
    941: 1.70,  # Electric heat pump for water only
    950: 0.80, 951: 0.75, 952: 3.00,  # Hot-water heat networks
}

# Gov EPC API main_heating_category -> typical SAP10.2 Table 4a seasonal-eff
# fallback when `sap_main_heating_code` is null. Real certs frequently omit
# the Table 4a code but still report a category, and the silent fallback to
# 0.80 (gas boiler) catastrophically misrates heat pumps and storage.
_CATEGORY_FALLBACK_EFF: Final[dict[int, float]] = {
    1: 0.80,  # central heating without separate HW (boiler typical)
    2: 0.80,  # central heating with separate HW
    3: 0.80,  # community heat network (Table 4a 301 typical)
    4: 2.30,  # heat pump (Table 4a 211 typical, mid GSHP/ASHP)
    5: 0.76,  # warm air (Table 4a 502 typical)
    6: 0.80,  # community heat network
    7: 1.00,  # high-heat-retention electric storage
}

# Gov EPC API main_fuel_type -> Table 4a room-heater eff column when
# category==10 ("Room heaters") and the SAP code is null.
_ROOM_HEATER_FUEL_EFF: Final[dict[int, float]] = {
    1: 0.55,   # mains gas (legacy)
    2: 0.55,   # LPG (legacy)
    3: 0.55,   # bottled LPG
    4: 0.65,   # oil (legacy)
    10: 1.00,  # electricity (legacy)
    26: 0.55,  # mains gas (not community)
    27: 0.55,  # LPG (not community)
    28: 0.65,  # oil (not community)
    29: 1.00,  # electricity (not community)
}


def seasonal_efficiency(
    sap_main_heating_code: Optional[int],
    main_heating_category: Optional[int] = None,
    main_fuel_type: Optional[int] = None,
) -> float:
    """Space-heating seasonal efficiency as a decimal (0.84 = 84%).

    Resolution order:
      1. `sap_main_heating_code` -> Table 4a/4b lookup (most authoritative).
      2. `main_heating_category` (gov API enum: 4=heat pump, 7=storage, ...)
         with optional `main_fuel_type` discriminator for `category==10`
         room heaters (0.55 when the fuel type is missing or unmapped).
      3. 0.80 typical-gas-boiler default.
    """
    if sap_main_heating_code is not None:
        eff = _SPACE_EFF_BY_CODE.get(sap_main_heating_code)
        if eff is not None:
            return eff
    if main_heating_category == 10:
        if main_fuel_type is not None:
            eff = _ROOM_HEATER_FUEL_EFF.get(main_fuel_type)
            if eff is not None:
                return eff
        return 0.55
    if main_heating_category is not None:
        eff = _CATEGORY_FALLBACK_EFF.get(main_heating_category)
        if eff is not None:
            return eff
    return 0.80


def water_heating_efficiency(
    water_heating_code: Optional[int],
    main_heating_code: Optional[int],
) -> float:
    """Water-heating seasonal efficiency as a decimal.

    Codes 901, 902 and 914 carry the 0.0 sentinel in `_WATER_EFF_BY_CODE` and
    inherit `seasonal_efficiency(main_heating_code)`. No secondary-system code
    is passed in, so 902 ("from secondary") is resolved from the main code as
    well; a missing or unmapped main code yields the 0.80 default.
    Unknown -> 0.78 (gas-combi typical).
    """
    if water_heating_code is None:
        return 0.78
    eff = _WATER_EFF_BY_CODE.get(water_heating_code)
    if eff is None:
        return 0.78
    if eff == 0.0:  # sentinel for "inherit"
        return seasonal_efficiency(main_heating_code)
    return eff


# ---------------------------------------------------------------------------
# Table 32 — fuel prices in pence per kWh
# ---------------------------------------------------------------------------
_FUEL_UNIT_PRICE: Final[dict[int, float]] = {
    # Gas fuels
    1: 3.48,   # mains gas
    2: 7.60,   # bulk LPG
    3: 10.30,  # bottled LPG (main heating)
    5: 3.48,   # bottled LPG (secondary): RdSAP10 ascribes mains-gas price; LPG bottle code
    9: 7.60,   # LPG SC11F
    7: 0.0,    # biogas (note: SAP10.2 cost not given for some biofuel codes)
    # Liquid fuels
    4: 5.44,   # heating oil
    71: 7.64, 73: 7.64, 75: 6.10, 76: 47.0,
    # Solid fuels
    11: 3.67, 15: 3.64, 12: 4.61, 20: 4.23, 22: 5.81, 23: 5.26, 21: 3.07, 10: 3.99,
    # Electricity
    30: 13.19,  # standard
    32: 15.29,  # 7h high
    31: 5.50,   # 7h low
    34: 14.68,  # 10h high
    33: 7.50,   # 10h low
    38: 13.67,  # 18h high
    40: 7.41,   # 18h low
    35: 6.61,   # 24h heating
    39: 13.19,  # any tariff (default to standard)
    60: 13.19,  # PV export (cost-neutral here)
    36: 13.19,  # other export
    # Heat networks (cost per unit of heat)
    51: 4.24, 52: 4.24, 53: 4.24, 54: 4.24, 55: 4.24, 56: 4.24, 57: 4.24, 58: 4.24,
    41: 4.24, 42: 4.24, 43: 4.24, 44: 4.24, 45: 2.97, 46: 2.97, 48: 2.97, 50: 0.0,
}

# Gov EPC API fuel enum -> SAP10.2 Table 32 fuel-code mapping. The cert
# stores the API code in primary_main_fuel_type / water_heating_fuel; our
# price dict above is keyed by Table 32. Without this translation, codes
# 26-29 (the modern "not community" main_fuel codes) hit the default and
# silently pretend to be mains gas.
_API_TO_TABLE32: Final[dict[int, int]] = {
    0: 30,   # No system -> use standard electricity
    1: 1,    # mains gas (legacy) -> mains gas
    2: 2,    # LPG (legacy) -> bulk LPG
    3: 3,    # bottled LPG
    4: 4,    # oil (legacy) -> heating oil
    5: 15,   # anthracite
    6: 20,   # wood logs
    7: 23,   # bulk wood pellets
    8: 21,   # wood chips
    9: 10,   # dual fuel (mineral + wood)
    10: 30,  # electricity (legacy) -> standard electricity
    11: 42,  # waste combustion -> heat recovered from waste
    12: 43,  # biomass -> HN biomass equivalent
    13: 44,  # biogas - landfill -> HN biogas
    14: 11,  # house coal
    15: 12,  # smokeless coal -> manufactured smokeless fuel
    16: 22,  # wood pellets (secondary)
    17: 9,   # LPG special condition
    18: 75,  # B30K
    19: 76,  # bioethanol
    20: 51,  # mains gas (community) -> HN boilers mains gas
    21: 52,  # LPG (community) -> HN boilers LPG
    22: 53,  # oil (community) -> HN boilers oil
    23: 55,  # B30D (community)
    24: 54,  # coal (community)
    25: 41,  # electricity (community) -> HN electric heat pump
    26: 1,   # mains gas (not community) -> mains gas
    27: 2,   # LPG (not community) -> bulk LPG
    28: 4,   # oil (not community) -> heating oil
    29: 30,  # electricity (not community) -> standard electricity
}


def fuel_unit_price_p_per_kwh(fuel_code: Optional[int]) -> float:
    """Table 32 unit price (p/kWh).

    Accepts either a SAP10.2 Table 32 code or a gov EPC API
    main_fuel/water_heating_fuel code (the cert's native enum) and
    translates the latter via `_API_TO_TABLE32` before lookup. The Table 32
    dict is tried first, so a code present in both enums (e.g. 10) is
    priced as a Table 32 code.

    Unknown -> mains gas (3.48 p/kWh), the dominant UK heating fuel.
    """
    if fuel_code is None:
        return 3.48
    if fuel_code in _FUEL_UNIT_PRICE:
        return _FUEL_UNIT_PRICE[fuel_code]
    table32_code = _API_TO_TABLE32.get(fuel_code)
    if table32_code is not None:
        return _FUEL_UNIT_PRICE.get(table32_code, 3.48)
    return 3.48
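
`fuel_unit_price_p_per_kwh` tries the Table 32 dict before `_API_TO_TABLE32`, so a code that exists in both numbering schemes is always priced as a Table 32 code. A minimal, self-contained sketch of that lookup order, using a small subset of the tables above (the `price` function and the trimmed dicts are illustrative copies, not the module itself):

```python
from typing import Optional

# Subset of _FUEL_UNIT_PRICE and _API_TO_TABLE32, copied for a standalone demo.
FUEL_UNIT_PRICE = {1: 3.48, 4: 5.44, 10: 3.99, 30: 13.19}
API_TO_TABLE32 = {10: 30, 26: 1, 28: 4, 29: 30}


def price(fuel_code: Optional[int]) -> float:
    """Same lookup order as fuel_unit_price_p_per_kwh, on the trimmed tables."""
    if fuel_code is None:
        return 3.48
    if fuel_code in FUEL_UNIT_PRICE:  # Table 32 codes win on overlap
        return FUEL_UNIT_PRICE[fuel_code]
    table32_code = API_TO_TABLE32.get(fuel_code)
    if table32_code is not None:
        return FUEL_UNIT_PRICE.get(table32_code, 3.48)
    return 3.48


print(price(29))  # API-only code: translated to Table 32 code 30
print(price(10))  # present in both enums: read as Table 32 code 10
```

API code 29 is absent from Table 32 and is translated to standard electricity (13.19). API code 10 ("electricity (legacy)") collides with Table 32 code 10 and returns 3.99. A caller that knows which enum it holds may prefer to translate explicitly before calling.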