Merge pull request #1164 from Hestia-Homes/main

Pashub fetcher: Optionally fetch all files. Also make ECS certificate a core file
This commit is contained in:
Daniel Roth 2026-06-04 09:33:46 +01:00 committed by GitHub
commit d1c777c2ca
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
262 changed files with 9187 additions and 481 deletions

View file

@ -82,11 +82,11 @@ The EpcPropertyData scored by the modelling pipeline for a single Property, deri
_Avoid_: modelling EPC, working EPC, resolved EPC, derived EPC
**Rebaselining**:
Re-predicting a Property's SAP score, CO2 emissions, Primary Energy Intensity, space heating kWh, and hot water kWh via ML so the modelling pipeline scores it against the current SAP10 methodology. Triggered when either (a) the Effective EPC was lodged under a pre-SAP10 schema (`sap_version < 10.0`), so the recorded scores reflect a superseded methodology, or (b) Site Notes / Landlord Overrides changed the physical state of the Property (walls / heating / windows / etc.) so the lodged scores no longer reflect what's installed. Both triggers may fire together. Produces Effective Performance; Lodged Performance is preserved unchanged. kWh is included as ML targets per ADR-0007 — see [[epc-ml-transform]].
Re-predicting a Property's SAP score, CO2 emissions, Primary Energy Intensity, space heating kWh, and hot water kWh via **SAP10 Calculation** (the deterministic `Sap10Calculator`, which superseded the old ML-API rebaseliner; an ML residual head over the calculator is future — ADR-0009/0013) so the modelling pipeline scores it against the current SAP10 methodology. Triggered when either (a) the Effective EPC was lodged under a methodology the calculator supersedes (`sap_version < 10.2`, the calculator's target spec), so the recorded scores reflect a superseded methodology, or (b) Site Notes / Landlord Overrides changed the physical state of the Property (walls / heating / windows / etc.) so the lodged scores no longer reflect what's installed. Both triggers may fire together. Produces Effective Performance; Lodged Performance is preserved unchanged. kWh is included as ML targets per ADR-0007 — see [[epc-ml-transform]].
_Avoid_: re-scoring, re-prediction, performance recomputation, refresh (for cache-freshness)
**Baseline Performance**:
A Property's current performance aggregate, holding both Lodged Performance and Effective Performance plus annual space heating kWh, hot water kWh, fuel split, and bills derived from the Effective EPC — kWh values come from the EPC's recorded fields for SAP10 baselines or from ML when Rebaselining fires; bills are derived deterministically from kWh × current Fuel Rates. Persisted as one row; surfaced as one block in the UI.
A Property's current performance aggregate, holding both Lodged Performance and Effective Performance plus the energy block: delivered kWh **per end use** (heating, hot water, lighting, appliances, cooking, pumps/fans, …) and the **annual bill** composed into per-section costs plus a total, produced by **Bill Derivation** from SAP10 Calculation's per-end-use kWh × current Fuel Rates. Persisted as one row (flat typed columns, per-section kWh + cost + total); surfaced as one block in the UI.
_Avoid_: baseline predictions, predicted baseline, rebaselined values
**Lodged Performance**:
@ -94,15 +94,15 @@ The SAP / EPC Band / carbon emissions / Primary Energy Intensity recorded on the
_Avoid_: original performance, raw EPC values, recorded baseline
**Effective Performance**:
The SAP / EPC Band / carbon emissions / Primary Energy Intensity the modelling pipeline actually scored against — equal to Lodged Performance when no Rebaselining trigger fires, replaced by ML output when triggered. The half of Baseline Performance that says "what we modelled".
The SAP / EPC Band / carbon emissions / Primary Energy Intensity the modelling pipeline actually scored against — equal to Lodged Performance when no Rebaselining trigger fires, replaced by **SAP10 Calculation** output (the deterministic `Sap10Calculator`, which superseded the old ML-API rebaseliner; an ML residual head over the calculator is future — ADR-0009/0013) when triggered. The half of Baseline Performance that says "what we modelled".
_Avoid_: modelled performance, rebaselined performance (only correct when rebaselining ran), scored values
**Calculated SAP10 Performance**:
The SAP score, EPC Band, CO2 emissions, Primary Energy Intensity, space heating kWh, and hot water kWh produced by **SAP10 Calculation** from a Property's EpcPropertyData. Distinct from Effective Performance (ML output) and Lodged Performance (gov register) during the validation phase. Surfaced alongside Effective Performance in the UI; may supersede Effective Performance in a later ADR once parity is confirmed against the cert-reported SAP across ≥1000 sample certs lodged on the calculator's target spec version (see [[sap-spec-version]]). ADR-0009 (as amended by ADR-0010).
_Avoid_: calculator output, computed performance, worksheet performance, SAP10 output
The SAP score, EPC Band, CO2 emissions, Primary Energy Intensity, space heating kWh, and hot water kWh produced by **SAP10 Calculation** from a Property's EpcPropertyData. It is **not** a separately-persisted third value-set beside Lodged and Effective: in every baselining scenario the calculator's output *is* the **Effective Performance** (real lodged SAP10 EPC with no overrides ⇒ Calculated = Lodged = Effective; overrides or an estimated / pre-SAP10 EPC ⇒ Calculated = Effective, there being no lodged SAP10 figure to compare against). The calculator is therefore the mechanism that produces Effective Performance, having superseded the old ML-API rebaseliner. The calculator is **load-bearing**: for `sap_version < 10.2` (lodged under a superseded methodology) its output *is* the Effective Performance; for `≥ 10.2` the API's lodged figures are kept and the calculator runs **alongside, logging any divergence** (SAP > 0.5, PEUI/CO2 beyond tolerance) as a validation signal (see [[sap-spec-version]]). It is load-bearing for **Bill Derivation regardless of version** (the EPC lodges no per-end-use kWh), so a calculator strict-raise **aborts the batch** and the un-mapped cert is fixed immediately. ADR-0009 introduced the term, amended by ADR-0010, realized by ADR-0013 (whose shadow stepping-stone is superseded) and ADR-0014.
_Avoid_: calculator output, computed performance, worksheet performance, SAP10 output, calculated value-set (it is not a stored third set)
**SAP10 Calculation**:
The process that runs the deterministic SAP 10.2 (14-03-2025 amendment) worksheet over a Property's EpcPropertyData and emits **Calculated SAP10 Performance**. Implemented by the `Sap10Calculator` service class in `domain/sap/`. Reads cert fabric/heating/geometry fields, applies the RdSAP 10 (10-06-2025) cert→input mapping, executes the 12-month heat balance per SAP 10.2 §§1-14, looks up boiler/heat-pump performance in the **PCDB** when the cert lodges a product index, and returns a `SapResult` carrying the five Calculated SAP10 Performance quantities plus a monthly breakdown and worksheet-line audit trail. Distinct from **Rebaselining**, which is ML-based. ADR-0009 originally targeted SAP 10.3 (13-01-2026); ADR-0010 retargets to SAP 10.2 (14-03-2025) until the cert corpus migrates.
The process that runs the deterministic SAP 10.2 (14-03-2025 amendment) worksheet over a Property's EpcPropertyData and emits **Calculated SAP10 Performance**. Implemented by the `Sap10Calculator` service class in `domain/sap10_calculator/` (`calculator.py`). Reads cert fabric/heating/geometry fields, applies the RdSAP 10 (10-06-2025) cert→input mapping, executes the 12-month heat balance per SAP 10.2 §§1-14, looks up boiler/heat-pump performance in the **PCDB** when the cert lodges a product index, and returns a `SapResult` carrying the five Calculated SAP10 Performance quantities plus a monthly breakdown and worksheet-line audit trail. Distinct from **Rebaselining**, which is ML-based. ADR-0009 originally targeted SAP 10.3 (13-01-2026); ADR-0010 retargets to SAP 10.2 (14-03-2025) until the cert corpus migrates.
_Avoid_: SAP calculation (ambiguous with the gov calculator), SAP scoring, calculator run, SAP 10.3 calculation (active target is 10.2 — see [[sap-spec-version]])
**SAP Spec Version**:
@ -117,9 +117,9 @@ _Avoid_: parity cohort, validation set, corpus sample
The process that translates an Optimised Package into cert-field changes and produces the "ending state snapshot" EpcPropertyData that Plan Phase persists. Implemented by the `MeasureApplicator` service class in `domain/sap/` (or a sibling package). Each Measure Type's translation rules (e.g. `loft_insulation``roof_insulation_thickness_mm = 270mm`, `ashp``main_heating_details[0]` replacement) live here. Pure function — does not run SAP10 Calculation itself; the caller chains `MeasureApplicator.apply(epc, package) → Sap10Calculator.calculate(post_epc)`. ADR-0009.
_Avoid_: measure overrides (rejected during ADR-0009 grill — phantom mid-layer), package applier, retrofit simulator
**EPC Energy Derivation**:
The process that derives a Property's fuel split and annual bills from its space heating kWh and hot water kWh values plus the heating fuel deduced from SAP fields. kWh values themselves come from the EPC's recorded fields (`renewable_heat_incentive.space_heating_kwh` and `.water_heating_kwh`) for SAP10 baselines, or from ML prediction when Rebaselining fires or when scoring a post-measure state. Bills are computed deterministically from delivered kWh × current Fuel Rates + standing charges + SEG credits. The UCL Correction is no longer applied at runtime — it is folded into ML training labels (see [[epc-ml-transform]] and ADR-0007).
_Avoid_: kWh prediction (kWh is now an ML target — see Rebaselining), baseline kWh, energy estimation
**Bill Derivation**:
The deterministic process that derives a Property's annual energy **bill**, composed into per-end-use sections (heating, hot water, lighting, appliances, cooking, pumps/fans, …) plus a **total**, by pricing **SAP10 Calculation**'s delivered kWh per end use at **current Fuel Rates** — each end use billed at its fuel's rate, rolled up per fuel for **standing charges** (metered fuels only — gas/electricity; oil/LPG/solid have none) minus **SEG** export credit on PV. Implemented by `BillDerivation` in `domain/property_baseline/` (deterministic, ADR-0006). Reads Fuel Rates from a committed static snapshot via `FuelRatesRepository` (no live ETL yet). **Distinct from the calculator's `total_fuel_cost_gbp`**, which is the SAP-rating notional cost at RdSAP Table 32 standardised prices (~half the real electricity price) — not what the household pays. Raises on a fuel it has no rate for (e.g. house coal, heat network). ADR-0014.
_Avoid_: EPC Energy Derivation (renamed), EpcEnergyDerivationService (no "service" suffix), kWh prediction, baseline kWh, energy estimation
**UCL Correction**:
The per-band linear correction (Few et al. 2023, _Energy & Buildings_ 288 113024) that aligns EPC-modelled Primary Energy Intensity with metered consumption. Folded into ML training labels at fit time (per ADR-0007) rather than applied at runtime — the trained model emits metered-equivalent PEUI directly, avoiding the discontinuities at EPC band boundaries that arose when the per-band linear correction was applied post-prediction. Calibrated against gas-heated, non-PV homes in England and Wales rated under SAP 2012; the current implementation extrapolates it to all properties (open question §15.14).
@ -174,11 +174,11 @@ _Avoid_: code list, code dictionary, vocab
### Reference data
**Fuel Rates**:
The current per-fuel rate (pence/kWh) and standing charge used to compute a Property's bills; time-versioned and regional, refreshed from Ofgem's published caps via an ETL. The Smart Export Guarantee rate sits in the same set as `electricity_export`. Consumed by EPC Energy Derivation.
The current per-fuel rate (pence/kWh) and standing charge used to compute a Property's bills; time-versioned and regional. Sourced for now from a **committed static snapshot** (national, Ofgem-cap period for gas/electricity + DESNZ/NEP for off-gas fuels), read via `FuelRatesRepository`; an Ofgem-cap ETL automating the refresh is future, not a prerequisite. The Smart Export Guarantee rate sits in the same set as `electricity_export`. Consumed by Bill Derivation.
_Avoid_: fuel prices (commodity prices, different concept), tariff, energy cost
**Carbon Factors**:
The per-fuel CO2 emission factor (kgCO2e/kWh) used to compute a Property's carbon emissions; time-versioned, refreshed from Defra's annual publication. Consumed by EPC Energy Derivation.
The per-fuel CO2 emission factor (kgCO2e/kWh) used to compute a Property's carbon emissions; time-versioned, refreshed from Defra's annual publication. Consumed by Bill Derivation.
_Avoid_: emission factors (ambiguous), CO2 rates
### Outputs
@ -277,7 +277,7 @@ _Avoid_: API key, auth token, secret
- When a **Property** has both **Site Notes** and a public **EPC**, the newer of the two derives the **Effective EPC**. **Landlord Overrides** apply only when the **EPC** is the source — never when **Site Notes** are.
- A Property's **Baseline Performance** holds two halves: **Lodged Performance** (the gov register's SAP / band / carbon / heat) and **Effective Performance** (what the modelling pipeline scored against). The two are equal unless **Rebaselining** fires.
- **Rebaselining** produces **Effective Performance** by ML re-prediction across SAP score, CO2 emissions, Primary Energy Intensity, space heating kWh, and hot water kWh, when either (a) the Effective EPC was lodged under a pre-SAP10 schema, or (b) the Effective EPC's physical state diverges from the lodged EPC. **Lodged Performance** is never overwritten.
- **EPC Energy Derivation** derives **fuel split** and **bills** from kWh values (sourced from the EPC's `renewable_heat_incentive` fields for baseline SAP10 properties, or from ML when Rebaselining fires), reading current **Fuel Rates** and **Carbon Factors** from their respective repos.
- **Bill Derivation** derives **fuel split** and **bills** from kWh values (sourced from the EPC's `renewable_heat_incentive` fields for baseline SAP10 properties, or from ML when Rebaselining fires), reading current **Fuel Rates** and **Carbon Factors** from their respective repos.
- The **EPC Prediction Service** uses **Comparable Properties** for both gap-filling and producing **EPC Anomaly Flags**.
- A **Scenario** carries one or more ordered **Scenario Phases**. Triggering the model against N Scenarios produces N **Plans** per Property; each Plan carries an ordered list of **Plan Phases** matching the Scenario's shape.
- Each **Plan Phase** holds its **Optimised Package**, the ending state snapshot, and any **Rolled-over Options** that flow as candidates into the next Plan Phase. A single-phase Scenario is one Scenario Phase with all measure types allowed; the same machinery handles it.
@ -289,7 +289,7 @@ _Avoid_: API key, auth token, secret
> **Dev:** "A landlord uploads a corrected boiler for one of their properties. What happens?"
>
> **Domain expert:** "That's a **Landlord Override** on the heating fields. Save it against the **Property**. The **Effective EPC** has changed, so **Rebaselining** runs to re-predict SAP / carbon / PEUI / space heating kWh / hot water kWh, and **EPC Energy Derivation** re-runs to update the fuel split and bills based on the new kWh values and fuel deduction. With fresh **Baseline Performance** we regenerate **Recommendations**."
> **Domain expert:** "That's a **Landlord Override** on the heating fields. Save it against the **Property**. The **Effective EPC** has changed, so **Rebaselining** runs to re-predict SAP / carbon / PEUI / space heating kWh / hot water kWh, and **Bill Derivation** re-runs to update the fuel split and bills based on the new kWh values and fuel deduction. With fresh **Baseline Performance** we regenerate **Recommendations**."
> **Dev:** "What if the same Property also has Site Notes?"
>

View file

@ -10,7 +10,8 @@ from sqlmodel import Session
from applications.ara_first_run.ara_first_run_trigger_body import (
AraFirstRunTriggerBody,
)
from domain.property_baseline.rebaseliner import StubRebaseliner
from domain.property_baseline.calculator_rebaseliner import CalculatorRebaseliner
from domain.sap10_calculator.calculator import Sap10Calculator
from infrastructure.postgres.config import PostgresConfig
from infrastructure.postgres.engine import make_engine
from orchestration.property_baseline_orchestrator import PropertyBaselineOrchestrator
@ -80,7 +81,10 @@ def build_first_run_pipeline(
),
baseline=PropertyBaselineOrchestrator(
unit_of_work=unit_of_work,
rebaseliner=StubRebaseliner(),
# The calculator is load-bearing: effective=calculated for pre-10.2
# certs, lodged + divergence-logged at/above 10.2; a raise aborts the
# batch (ADR-0013 amendment).
rebaseliner=CalculatorRebaseliner(Sap10Calculator()),
),
modelling=ModellingOrchestrator(
scenario_repo=ScenarioRepository(),

View file

@ -1,5 +1,11 @@
# ARA Backend Redesign — Design PRD
> ⚠️ **SUPERSEDED (architecture sections).** This is an early draft PRD. The actual
> architecture as built differs — see the ADRs in `docs/adr/` (especially 0011
> composable stage orchestrators, 0012 Unit-of-Work per-stage batch) and
> `docs/HANDOVER_ARA_NEXT.md` for current state. Treat this doc as historical context,
> not the source of truth for layout/contracts.
**Status**: Draft for team review
**Author**: Khalim Conn-Kowlessar (with Claude grill session)
**Branch**: `ara-backend-design-prd`

View file

@ -21,6 +21,8 @@ class FileTypeEnum(enum.Enum):
IMPROVEMENT_OPTION_EVALUATION = "improvement_option_evaluation"
MEDIUM_TERM_IMPROVEMENT_PLAN = "medium_term_improvement_plan"
RETROFIT_DESIGN_DOC = "retrofit_design_doc"
MCS_COMPLIANCE_CERTIFICATE = "mcs_compliance_certificate"
OTHER = "other"
class FileSourceEnum(enum.Enum):

View file

@ -6,6 +6,7 @@ from datatypes.epc.surveys.elmhurst_site_notes import (
AlternativeWall,
BathsAndShowers,
BuildingPartDimensions,
CommunityHeating,
ElmhurstSiteNotes,
ExtensionPart,
FloorDetails,
@ -1089,7 +1090,28 @@ class ElmhurstSiteNotesExtractor:
if inline_glazing_type is not None:
glazing_type = inline_glazing_type
else:
glazing_type = " ".join([*prefix, *suffix]).strip()
# The glazing-type phrase always starts with a glazing-start
# word (Single/Double/Triple/Secondary). The FIRST window in
# a building part has `before_start = 0`, so its prefix block
# reaches back into the wrapped windows-table header; the
# third header line's tail tokenises to "value value Proofed
# Shutters" (the "U value / g value / Draught Proofed /
# Permanent Shutters" column titles) and is neither an
# orientation nor a bp fragment, so it survives the pops.
# Drop any prefix fragments preceding the glazing-start word
# so they don't leak into the glazing type.
glazing_start = next(
(
idx
for idx, frag in enumerate(prefix)
if frag.split(" ", 1)[0] in self._GLAZING_TYPE_PREFIX_WORDS
),
None,
)
glazing_prefix = (
prefix[glazing_start:] if glazing_start is not None else prefix
)
glazing_type = " ".join([*glazing_prefix, *suffix]).strip()
# Building part: inline token wins; otherwise join prefix + suffix.
if bp_inline is not None:
@ -1239,6 +1261,7 @@ class ElmhurstSiteNotesExtractor:
else None
)
main_heating_2 = self._extract_main_heating_2()
community_heating = self._extract_community_heating()
return MainHeating(
heat_emitter=self._local_str(lines, "Heat Emitter"),
fuel_type=self._local_str(lines, "Fuel Type"),
@ -1254,6 +1277,7 @@ class ElmhurstSiteNotesExtractor:
main_heating_ees=self._local_str(lines, "Main Heating EES Code"),
secondary_heating_sap_code=secondary_code,
main_heating_2=main_heating_2,
community_heating=community_heating,
)
def _extract_main_heating_2(self) -> Optional[MainHeating2]:
@ -1304,6 +1328,38 @@ class ElmhurstSiteNotesExtractor:
main_heating_sap_code=main_heating_sap_code,
)
def _extract_community_heating(self) -> Optional[CommunityHeating]:
"""§14.1 Community Heating/Heat Network block. Lodged in place of
§14.1 Main Heating2 when the §14.0 Main Heating SAP code names a
heat-network row (Table 4a 301/302/304). Returns None when no
§14.1 Community Heating block is present on the cert.
The block carries the Community Heat Source (Boilers / CHP /
Heat pump) + Community Fuel Type (Mains Gas / Electricity /
Mineral oil or biodiesel / Coal) together these resolve the
Table 12 heat-network fuel code that bills the cascade. See
`_resolve_community_heating_fuel_code` in the mapper.
"""
lines = self._section_lines(
"14.1 Community Heating/Heat Network", "14.2 Meters",
)
# Absence of the §14.1 Community Heating block: no marker found
# → `_section_lines` returns []. Lodgement convention also
# leaves Community Heat Source empty on individually-heated
# dwellings; treat both as "no community heating present".
heat_source = self._local_str(lines, "Community Heat Source")
if not lines or not heat_source:
return None
return CommunityHeating(
heating_type=self._local_str(lines, "Heating Type"),
pcdf_boiler_reference=self._local_val(lines, "PCDF Boiler Reference"),
community_heat_source=heat_source,
community_fuel_type=self._local_str(lines, "Community Fuel Type"),
heating_controls_ees=self._local_str(lines, "Heating Controls EES"),
heating_controls_sap=self._local_str(lines, "Heating Controls SAP"),
chp_fuel_factor=self._local_val(lines, "CHP Fuel Factor"),
)
def _extract_meters(self) -> Meters:
return Meters(
electricity_meter_type=self._str_val("Electricity meter type"),

View file

@ -513,3 +513,112 @@ class TestLightingLedCflUnknown:
def test_cfl_count_zero_when_unknown(self, result2: ElmhurstSiteNotes) -> None:
assert result2.lighting.cfl_count == 0
class TestWindowsLayoutHeaderRemnant:
"""Regression for the first-window glazing-type header leak.
Summary PDFs preprocessed from `pdftotext -layout` wrap the windows
table header across several lines. The third header line's tail
("U value / g value / Draught Proofed / Permanent Shutters") tokenises
to "value value Proofed Shutters" and sits directly above the FIRST
window's data row. Because the first window in a building part has
`before_start = 0`, its prefix block reaches back into that header
remnant, which is neither an orientation nor a building-part fragment
and so survived into `glazing_type` as
"value value Proofed Shutters Double between 2002 and 2021".
Reproduced from `sap worksheets/Recommendations Elmhurst Files/
cavity_wall_insulation - main wall/before/Summary_001431.pdf` (3
Manufacturer-data-source windows; only window 0 was corrupted).
"""
# Faithful reproduction of the tokenised windows section (one page),
# captured verbatim from the Summary PDF above. The header remnant
# "value value Proofed Shutters" precedes window 0's wrapped glazing
# cell ("Double between 2002" / "and 2021").
_WINDOWS_PAGE = "\n".join([
"11.0 Windows:",
"Frame Frame Glazing",
"Building",
"U",
"g Draught Permanent",
"W",
"H",
"Area Glazing Type",
"Location",
"Orient. Data-Source",
"Type Factor Gap",
"Part",
"value value Proofed Shutters",
"Double between 2002",
"North",
"0.97 1.00 0.97",
"PVC",
"0.70",
"Main",
"External wall",
"Manufacturer 2.00",
"0.72",
"Yes",
"None",
"and 2021",
"West",
"Double between 2002",
"South",
"2.66 1.00 2.66",
"PVC",
"0.70",
"Main",
"External wall",
"Manufacturer 2.00",
"0.72",
"Yes",
"None",
"and 2021",
"East",
"Double between 2002",
"South",
"2.66 1.00 2.66",
"PVC",
"0.70",
"Main",
"External wall",
"Manufacturer 2.00",
"0.72",
"Yes",
"None",
"and 2021",
"East",
"12.0 Ventilation",
])
@pytest.fixture(scope="class")
def windows(self) -> list[Window]:
return ElmhurstSiteNotesExtractor([self._WINDOWS_PAGE])._extract_windows()
def test_window_count(self, windows: list[Window]) -> None:
# Arrange / Act / Assert
assert len(windows) == 3
def test_first_window_glazing_type_excludes_header_remnant(
self, windows: list[Window]
) -> None:
# Arrange / Act / Assert — no "value value Proofed Shutters" leak.
assert windows[0].glazing_type == "Double between 2002 and 2021"
def test_all_windows_share_clean_glazing_type(
self, windows: list[Window]
) -> None:
# Arrange / Act / Assert — windows 1 and 2 were already clean;
# all three must agree after the fix.
assert [w.glazing_type for w in windows] == [
"Double between 2002 and 2021"
] * 3
def test_first_window_orientation_unaffected(
self, windows: list[Window]
) -> None:
# Arrange / Act / Assert — trimming the glazing prefix must not
# disturb orientation extraction (North + West fragments).
assert windows[0].orientation == "North-West"

View file

@ -46,6 +46,7 @@ import re
import subprocess
from dataclasses import dataclass
from pathlib import Path
from typing import Optional
import pytest
@ -67,13 +68,18 @@ _CORPUS_ROOT = (
# Per-pin absolute tolerances. Worksheet `SAP value` lodges 4 d.p.,
# (255) total fuel cost 4 d.p., (272) total CO2 4 d.p., (286) Total
# Primary energy kWh/year 4 d.p. — pin at 1e-4 relative to lodged
# precision so any drift outside cascade float noise fires.
_SAP_RESID_ABS_TOLERANCE = 0.001
_COST_RESID_ABS_TOLERANCE_GBP = 0.01
_CO2_RESID_ABS_TOLERANCE_KG = 0.1
_PE_RESID_ABS_TOLERANCE_KWH = 0.1
# (255)/(355) total fuel cost 4 d.p., (272)/(383) total CO2 4 d.p.,
# (286)/(483) Total Primary energy kWh/year 4 d.p. — so the hard floor
# on any residual is ~5e-5 (half a unit in the last printed digit),
# independent of cascade precision. Pin at 1e-4 on EVERY metric (per
# [[feedback-zero-error-strict]] / [[feedback-continuous-sap-tolerance]]
# — basically zero error across continuous SAP, cost, CO2 and PE) so
# any drift beyond PDF print-rounding fires loudly. All 41 variants hold
# at this tolerance; closures re-pin the smaller residual, never widen.
_SAP_RESID_ABS_TOLERANCE = 0.0001
_COST_RESID_ABS_TOLERANCE_GBP = 0.0001
_CO2_RESID_ABS_TOLERANCE_KG = 0.0001
_PE_RESID_ABS_TOLERANCE_KWH = 0.0001
@dataclass(frozen=True)
@ -218,22 +224,264 @@ class _CorpusExpectation:
# per affected variant, SAP residuals shift ±0.15 across 16 variants;
# the SH+Sec demand mismatch for electric 3/6/7 (Table 11 fraction
# for codes 401/402) remains the open driver of those SAP residuals.
#
# Slice S0380.156 added the universal SAP 10.2 Table 3 (PDF p.160)
# zero-loss guard for WHC=903 (electric immersion HW) at the top of
# `_primary_loss_applies`. Pre-slice the Cat 4 HP branch returned
# True unconditionally when no PCDB record was lodged — so for
# electric 2 (sap_main_heating_code=524 Cat 5 warm-air ASHP, mapped
# to main_heating_category=4, WHC=903 + cylinder), the cascade
# falsely added ~510 kWh/yr primary loss to a system whose cylinder
# is heated directly by an immersion element with no primary
# pipework. Per Table 3 verbatim: "Primary loss is set to zero for
# the following: Electric immersion heater ...". Electric 2 SAP
# residual 0.4584 → +0.8118 (cascade swung past the worksheet — the
# pre-slice 'near-correct' value was masking an offsetting upstream
# gap that the spec-correct fix has exposed); cost +£10.56 →
# £18.71; CO2 +47.89 → 7.21 kg; PE +443.13 → 161.68. No
# regressions on the other 24 variants — the new guard is gated on
# WHC=903 and only electric 2 has the (Cat 4 HP, no PCDB, WHC=903)
# combination in the corpus.
#
# Slice S0380.157 added the companion SAP 10.2 Table 2b note b)
# WHC=903 guard at the top of `_separately_timed_dhw`. Pre-slice the
# Cat 4 HP branch (line 3872 `if main.main_heating_category == 4:
# return True`) returned True before consulting WHC, so for electric
# 2 (Cat 4 HP + WHC=903 immersion + cylinder) the cascade applied
# the Table 2b note b ×0.9 Temperature Factor multiplier to a
# cylinder fed by an electric immersion (not by the HP). Per the
# spec's verbatim system-type list "boiler systems, warm air systems
# and heat pump systems", electric immersion is not in scope.
# Worksheet electric 2 lodges (53) = 0.6000 / (55) = 1.2294 (=
# 0.0181 × 1.0294 × 0.6 × 110 — no ×0.9). Cascade cylinder storage
# loss annual 403.87 → 448.73 (matches worksheet). HW kWh demand
# 2339.24 → 2384.12 (EXACT match to worksheet (62)/(64)). SAP
# +0.8118 → +0.7002; cost £18.71 → £16.14; CO2 7.21 → 2.37 kg;
# PE 161.68 → 108.58 kWh. Same WHC=903 principle as .156 (HW
# independent of main heating → main-heating-specific DHW rules do
# not apply). No regressions on other variants — only electric 2 has
# the (Cat 4 HP + WHC=903 + cylinder) combination in the corpus.
#
# Slice S0380.159 promoted the Table 4a Cat 7 (Electric storage
# heaters) responsiveness dispatch from sap_code-only to
# (sap_code, tariff)-aware. Spec text: Table 4a p.166 lists code 402
# "Slimline storage heaters" with R=0.2 under the Off-peak section
# AND R=0.4 under the 24-hour heating tariff section. Per SAP 10.2
# §12.4.3 (PDF p.36) the 18-hour tariff has electricity at low rate
# for 18h/day with ≤6h interruption (max 2h windows) — operationally
# equivalent to 24-hour for storage-heater charging. Pre-slice the
# cascade used R=0.20 unconditionally for code 402, producing T_living
# (87)[Jan]=20.12 and (93)[Jan]=19.10 (cascade +0.49 K vs worksheet
# (93)[Jan]=18.6063). Per-line walk + back-solve from worksheet
# T_living=19.6519 confirmed R=0.4 (Tsc = 0.6×19 + 0.4×(4.3+0.9933×
# 705.4/210.23) = 14.4528 → u_sum = 0.5×6.547×113/274.32 = 1.3481 →
# T_living = 21 1.3481 = 19.6519 EXACT). New
# `_CONTINUOUS_CHARGING_TARIFFS = {EIGHTEEN_HOUR, TWENTY_FOUR_HOUR}` +
# `_RESPONSIVENESS_24_HOUR_OVERRIDE_BY_SAP_CODE` (codes 402/403/405/
# 406) consulted at the top of `_responsiveness` before the off-peak
# default lookup. Tariff threaded through both call sites of MIT
# cascade (rating + demand paths). Closures electric 5: ΔSAP 1.1759
# → +0.1081 (91% reduction), Δcost +£27.09 → £2.49, ΔCO2 +62.72 →
# +7.30 kg, ΔPE +438.03 → +0.07 kWh (PE essentially EXACT). Electric
# 5 now joins the same residual-shape cluster as electric 3/6/7/8/9
# (+0.09..+0.12 SAP, £2..£3 cost, +£7 CO2). No regressions on the
# other 24 variants — only code 402 (electric 5) has a tariff
# override that applies in the corpus.
#
# Slice S0380.158 wired the SAP 10.2 Table 4f (PDF p.174) row "Warm
# air heating system fans" = SFP × 0.4 × V (footnote e default SFP =
# 1.5 W/(l/s) when no PCDB warm-air-unit record). Pre-slice the
# cascade's `_table_4f_additive_components` docstring listed warm-air
# fans as "Not yet wired" — every Cat 5 / Cat 9 warm-air main
# resolved `pumps_fans_kwh_per_yr` to 0 even though the spec rule has
# been in place since SAP 2012. For electric 2 (code 524 Cat 5
# air-source warm-air HP, no MV, V = 227.25 m³): 1.5 × 0.4 × 227.25 =
# 136.35 kWh — matches worksheet block 11a (249) "Pumps, fans and
# electric keep-hot" line exactly. Footnote-e balanced-MV omission
# applies when `mechanical_ventilation_kind` is MVHR or MV (electric
# 2 lodges no MV → fans included). New `_TABLE_4A_WARM_AIR_SAP_CODES`
# frozenset (22 codes: 501-515, 520-521, 523-527). Cascade closures
# electric 2: SAP +0.7002 → 0.1087, cost £16.14 → +£2.50, CO2
# 2.37 → +16.54 kg, PE 108.58 → +97.69 kWh. The cascade now
# overshoots cost / CO2 / PE because the +136 kWh of warm-air fan
# electricity is being charged at the full 18-hour high rate; SAP
# under-shoots by 0.11 because the cost residual is still slightly
# off. Remaining gap likely a small upstream SH-demand divergence
# (cascade SH demand +57 kWh vs worksheet — Cat 5 specific). No
# regressions on the other 24 variants — gate keyed on the new
# warm-air-code frozenset and only electric 2 has a code in that set.
#
# Slice S0380.160 closed the 10-variant cluster (electric 3/5/6/7/8/9
# + solid fuel 4/9/10/11) by gating SAP 10.2 Table 5a row "Central
# heating pump in heated space" (PDF p.177) on whether the cert lodges
# a wet, non-HP main. Pre-slice the cascade added 7 W of pump gain
# (UNKNOWN-date default) to (70)m for every non-HP main; per the per-
# line walk on electric 3, worksheet (70)m = 0 across all 12 months
# because storage heaters / dry room heaters have no primary water
# loop. The +7 W winter gain was lowering cascade SH demand by ~38
# kWh/yr (cascade 11050 vs worksheet 11088 for electric 3), in turn
# under-charging cost by ~£2.50 and pushing continuous SAP up ~+0.10.
# Cluster SAP / cost / CO2 (rating block) now EXACT for all 10
# variants; only the lighting-PE +48.66 / +11.95 CO2 deferred quirk
# remains (same offset as electric 1 + solid fuel 5/6/7/8). Cluster
# Σ|ΔSAP_c| 1.06 → 0.00 in one slice.
#
# Slice S0380.161 closed electric 2 (Cat 5 warm-air HP, code 524) by
# wiring the SAP 10.2 Table 5a row "Warm air heating system fans
# a) c)" (PDF p.177) GAIN side. Pre-slice S0380.158 wired the kWh
# side (136.35 kWh/yr via Table 4f) but the parallel GAIN row was
# never wired, so cascade (70) m = 0 every month vs worksheet 13.6350
# W in heating months (= 1.5 × 0.04 × 227.25 with SFP default 1.5
# W/(l/s) per footnote c). The -13.6 W winter gain shortfall over-
# stated cascade SH demand by ~57 kWh/yr (cascade 9483 vs worksheet
# 9426), under-charging cost by ~£2.50 with opposite sign. New
# `_any_main_system_has_warm_air_distribution` + `_has_balanced_
# mechanical_ventilation` predicates + leaf wiring in the orchestrator.
# Electric 2 SAP -0.1087 → -0.0000 EXACT; joins the lighting-PE
# deferred cohort (CO2 +11.95 / PE +48.66). Cohort Σ|ΔSAP_c|
# 0.18 → 0.07 in one slice.
#
# Slice S0380.162 closed ashp + gshp by restoring the SAP 10.2
# Appendix N3.1 (PDF p.105) "default heat gain from Table 5a is
# included via worksheet (70)" rule for electric heat pumps that DON'T
# have a PCDB Table 362 record lodged. S0380.160 had over-stripped
# the gain (zeroed for all HPs); ashp/gshp use Table 4a Cat 4 default
# cascade and worksheet (70) = 3.0000 W in heating months. Refined
# `_any_main_system_has_central_heating_pump` HP gate: PCDB-lodged
# HPs (e.g. cert 0380 cohort with Table 362 record) keep 0 W (pump
# embedded in COP per N1.2.1); Cat 5 warm-air HPs keep 0 W (no water
# circulation pump; warm-air fan handled by .161); Cat 4 HPs without
# PCDB and not warm-air → apply pump gain per N3.1. ashp/gshp ΔSAP
# -0.024/-0.018 → -0.0000 EXACT; ΔPE +36/+34 → +25.51 (residual
# narrowed to the Elmhurst-vs-spec HW PE annual-vs-monthly quirk
# only). Cohort Σ|ΔSAP_c| 0.07 → 0.03 in one slice. All 25 cascade-OK
# variants now SAP+cost EXACT.
#
# Slice S0380.163 closed the 18-variant lighting-PE deferred cohort
# (electric 1/2/3/5/6/7/8/9 + solid fuel 4/5/6/7/8/9/10/11 + ashp +
# gshp). Cascade `_hot_water_primary_factor` + `_hot_water_co2_factor_
# kg_per_kwh` now take a `tariff` parameter and apply Table 12 annual
# factors (1.501 PE / 0.136 CO2) on dual-rate tariffs (7-hour / 10-
# hour / 18-hour / 24-hour). STANDARD tariff still uses Table 12d/12e
# monthly. Worksheet evidence: the 41-variant corpus consistently
# shows worksheet (278) "Water heating (low-rate cost)" using factor
# 1.5010 for electricity HW on 18-hour. SAP 10.2 Table 12 footnote
# (t) read literally would mandate monthly factors for all electric
# end-uses, but the BRE-approved Elmhurst engine applies the annual
# Table 12 figure for the dual-rate low-rate-cost lines. We mirror
# the engine per [[feedback-software-no-special-handling]]; the
# divergence is documented at
# `domain/sap10_calculator/docs/SAP_CALCULATOR.md §8`. CO2 +11.95 /
# PE +48.66 (immersion HW: 2384 kWh × 0.020 PE delta) and CO2 +6.31
# / PE +25.51 (HP HW: 1138 kWh × 0.022 PE delta) → all close to 0 in
# one slice. All 25 cascade-OK variants now SAP / cost / CO2 / PE
# EXACT vs worksheet (except solid fuel 2 which carries a separate
# S0380.154 summer-immersion-blend CO2/PE artifact).
#
# Slice S0380.164 closed the last open variant in the cascade-OK tier:
# `solid fuel 2`. The §12.4.4 back-boiler HW blend (S0380.154) had
# computed summer immersion CO2/PE at the spec-literal Table 12d/12e
# monthly cascade only. Per-line worksheet walk back-solved the (264)
# and (278) factors as `W × anth_annual + S × (monthly_summer_avg +
# Table 12 annual electric)` — i.e. the same Elmhurst-mirror that
# S0380.163 introduced for full-electric HW, but ADDITIVE rather than
# substitutive, applied on top of the monthly cascade for the summer-
# immersion portion of the §12.4.4 blend. The new gate fires on dual-
# rate tariffs (7-hour / 10-hour / 18-hour / 24-hour). Closure: SF2
# ΔCO2 93.10 → ±0.0000 EXACT, ΔPE 1027.51 → ±0.0000 EXACT. All 25
# cascade-OK variants now SAP / cost / CO2 / PE EXACT on every metric.
# Documented at `SAP_CALCULATOR.md §8.2` with the explicit single-cert
# caveat (heating-systems corpus has only one §12.4.4 fixture).
#
# Slice S0380.165 closed the LAST sub-tolerance gap: `pcdb 1` (Δ0.0108
# SAP / +£0.24 / +1.33 CO2 / +5.70 PE → all ±0.0000 within 1e-4). SAP
# 10.2 §9.4.11 (PDF p.30) "boiler interlock": "The efficiency of gas and
# liquid fuel boilers for both space and water heating is reduced by 5%
# if the boiler is not interlocked." S0380.141 had subtracted the 5pp
# from BOTH `Pwinter` and `Psummer` BEFORE running the SAP 10.2
# Appendix D §D2.1 Equation D1 monthly cascade. The Elmhurst worksheet
# for pcdb 1 (PCDB 716 oil boiler, Pwinter 65 / Psummer 53, Cylinder
# Stat=No → no interlock) shows the 5pp is applied to the η_water,
# monthly OUTPUT of Eq D1, NOT to its inputs — Eq D1's reciprocal
# weighting (1/η_winter and 1/η_summer) is non-linear in η, so the two
# interpretations diverge subtly. Worked example for pcdb 1 Jan
# (Q_space=1409.77, Q_water=387.86):
# Old cascade: Eq D1(60, 48, …) → 56.93% (off 0.04 pp vs worksheet)
# Worksheet: Eq D1(65, 53, …) → 61.97%, 5pp → 56.97% ✓
# Across all 12 months the post-Eq-D1 form matches worksheet (217)m at
# 1e-4. Cascade HW kWh 7068.41 → 7063.96 (= worksheet (219) total) Δ
# 4.45 kWh propagates the closure.
_EXPECTATIONS: tuple[_CorpusExpectation, ...] = (
_CorpusExpectation(variant='ashp', block='11a', expected_sap_resid=-0.0240, expected_cost_resid_gbp=+0.5536, expected_co2_resid_kg=+7.3267, expected_pe_resid_kwh=+36.3435),
_CorpusExpectation(variant='electric 1', block='11a', expected_sap_resid=-0.0000, expected_cost_resid_gbp=-0.0000, expected_co2_resid_kg=+11.9451, expected_pe_resid_kwh=+48.6605),
_CorpusExpectation(variant='electric 2', block='11a', expected_sap_resid=-0.4584, expected_cost_resid_gbp=+10.5613, expected_co2_resid_kg=+47.8864, expected_pe_resid_kwh=+443.1346),
_CorpusExpectation(variant='electric 3', block='11a', expected_sap_resid=+0.1215, expected_cost_resid_gbp=-2.8003, expected_co2_resid_kg=+6.7227, expected_pe_resid_kwh=-5.9859),
_CorpusExpectation(variant='electric 5', block='11a', expected_sap_resid=-1.1759, expected_cost_resid_gbp=+27.0929, expected_co2_resid_kg=+62.7232, expected_pe_resid_kwh=+438.0333),
_CorpusExpectation(variant='electric 6', block='11a', expected_sap_resid=+0.1081, expected_cost_resid_gbp=-2.4918, expected_co2_resid_kg=+7.3225, expected_pe_resid_kwh=+0.1603),
_CorpusExpectation(variant='electric 7', block='11a', expected_sap_resid=+0.1017, expected_cost_resid_gbp=-2.3444, expected_co2_resid_kg=+7.6424, expected_pe_resid_kwh=+3.0976),
_CorpusExpectation(variant='electric 8', block='11a', expected_sap_resid=+0.0941, expected_cost_resid_gbp=-2.1679, expected_co2_resid_kg=+7.9230, expected_pe_resid_kwh=+6.5824),
_CorpusExpectation(variant='electric 9', block='11a', expected_sap_resid=+0.1199, expected_cost_resid_gbp=-2.7611, expected_co2_resid_kg=+6.8225, expected_pe_resid_kwh=-4.5085),
_CorpusExpectation(variant='gshp', block='11a', expected_sap_resid=-0.0178, expected_cost_resid_gbp=+0.4092, expected_co2_resid_kg=+7.0616, expected_pe_resid_kwh=+33.5171),
_CorpusExpectation(variant='ashp', block='11a', expected_sap_resid=+0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=-0.0000),
_CorpusExpectation(variant='electric 1', block='11a', expected_sap_resid=-0.0000, expected_cost_resid_gbp=-0.0000, expected_co2_resid_kg=-0.0000, expected_pe_resid_kwh=+0.0000),
_CorpusExpectation(variant='electric 2', block='11a', expected_sap_resid=-0.0000, expected_cost_resid_gbp=-0.0000, expected_co2_resid_kg=-0.0000, expected_pe_resid_kwh=-0.0000),
_CorpusExpectation(variant='electric 3', block='11a', expected_sap_resid=-0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=-0.0000),
_CorpusExpectation(variant='electric 5', block='11a', expected_sap_resid=-0.0000, expected_cost_resid_gbp=-0.0000, expected_co2_resid_kg=-0.0000, expected_pe_resid_kwh=+0.0000),
_CorpusExpectation(variant='electric 6', block='11a', expected_sap_resid=-0.0000, expected_cost_resid_gbp=-0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=+0.0000),
_CorpusExpectation(variant='electric 7', block='11a', expected_sap_resid=-0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=+0.0000),
_CorpusExpectation(variant='electric 8', block='11a', expected_sap_resid=-0.0000, expected_cost_resid_gbp=-0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=-0.0000),
_CorpusExpectation(variant='electric 9', block='11a', expected_sap_resid=+0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=+0.0000),
# Slice S0380.167 unblocked electric storage 11-14 via EES codes
# WEA / REA / OEA → fuel code 30 (standard electricity). All 4 EXACT
# on first try — the cascade was already wired for electric storage
# paths.
_CorpusExpectation(variant='electric 11', block='11a', expected_sap_resid=+0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=-0.0000, expected_pe_resid_kwh=-0.0000),
_CorpusExpectation(variant='electric 12', block='11a', expected_sap_resid=+0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=-0.0000, expected_pe_resid_kwh=-0.0000),
_CorpusExpectation(variant='electric 13', block='11a', expected_sap_resid=+0.0000, expected_cost_resid_gbp=-0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=-0.0000),
_CorpusExpectation(variant='electric 14', block='11a', expected_sap_resid=+0.0000, expected_cost_resid_gbp=-0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=-0.0000),
_CorpusExpectation(variant='gshp', block='11a', expected_sap_resid=-0.0000, expected_cost_resid_gbp=-0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=+0.0000),
_CorpusExpectation(variant='oil 1', block='11a', expected_sap_resid=-0.0000, expected_cost_resid_gbp=-0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=+0.0000),
# Slice S0380.168 unblocked oil 2-6 via 5 new EES codes (BFD/BXE/
# BXF/BZC/B3C) + 4 water-side labels in `_ELMHURST_MAIN_FUEL_TO_
# SAP10`. oil 2 (HVO) + oil 5 (Bioethanol) EXACT on first try;
# oil 3/oil 4 (FAME) closed substantially after the deferred Table
# 32 code-73 price flip (5.44 → 7.64) per S0380.131's TODO. oil 6
# (B30K) carries a cascade-side residual (HW kWh / SH demand /
# CO2/PE blend) — see open fronts in the post-S0380.168 handover.
#
# Slice S0380.176 closed oil 3 + oil 4 fully via Table 4b combi sub-
# row dispatch in `_table_3a_combi_loss_default_applies`. Pre-slice
# the helper gated only on `main_heating_category` ∈ {1, 2, 3, 6};
# the Elmhurst mapper leaves `main_heating_category=None` on Table
# 4b liquid-fuel boilers, so the cascade fell through to (61)m=0
# despite codes 128/129 being explicit combi sub-rows per SAP 10.2
# Table 4b row names ("Combi oil boiler, pre-/post-1998"). Adding
# the `_TABLE_4B_COMBI_OR_CPSU_CODES` fall-through lands (61)m at
# the spec Table 3a row 1 keep-hot 600 kWh/yr default. Both oil 3
# and oil 4 now EXACT on SAP / cost / CO2 / PE.
_CorpusExpectation(variant='oil 2', block='11a', expected_sap_resid=+0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=+0.0000),
_CorpusExpectation(variant='oil 3', block='11a', expected_sap_resid=-0.0000, expected_cost_resid_gbp=-0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=+0.0000),
_CorpusExpectation(variant='oil 4', block='11a', expected_sap_resid=+0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=+0.0000),
_CorpusExpectation(variant='oil 5', block='11a', expected_sap_resid=+0.0000, expected_cost_resid_gbp=-0.0000, expected_co2_resid_kg=-0.0000, expected_pe_resid_kwh=+0.0000),
# Slice S0380.177 closed oil 6 (B30K, Table 4b regular boiler code
# 126) main heating + HW efficiency via the SAP 10.2 Table 4c(2)
# (PDF p.169) "No thermostatic control of room temperature regular
# boiler" -5pp adjustment. The cert lodges control code 2101 (no
# room thermostat) WITH a cylinder thermostat; per RdSAP 10 §3 (PDF
# p.57) boiler interlock needs BOTH a room thermostat AND a cylinder
# thermostat, so the 2101 control means NO interlock despite the
# cylinderstat (P960 header "Boiler Interlock: No"). Pre-slice the
# `no_interlock` gate only checked the cylinder thermostat, so oil 6
# kept raw efficiency: space 0.80 vs ws (210) 0.75, HW (217)m summer
# 68 vs ws 63. Post-slice space fuel (211) = 13446.3457 EXACT and HW
# fuel (219) = 4099.5872 EXACT. ΔSAP +3.0518 → +0.0782; Δcost
# -£69.79 → -£1.68; ΔCO2 -240.66 → -1.71; ΔPE -1112.66 → -18.61.
#
# Slice S0380.178 then closed the residual S0380.177 exposed — the
# central heating pump (230c). SAP 10.2 Table 4f (PDF p.175) footnote
# a) on the "Circulation pump" rows: "Multiply by a factor of 1.3 if
# room thermostat is absent." Control 2101 has no room thermostat, so
# the cert's "2013 or later" pump (Table 4f 41 kWh) scales to 41 x
# 1.3 = 53.3 kWh = ws (230c); pumps/fans (231) = 53.3 + 100 (oil aux)
# = 153.3 EXACT. Same root cause (no room thermostat) as the .177
# interlock fix. oil 6 now FULLY EXACT on all four metrics. The
# sibling oil 5 (same pump age but control 2106 WITH a room
# thermostat) keeps the bare 41 kWh and is unaffected.
_CorpusExpectation(variant='oil 6', block='11a', expected_sap_resid=-0.0000, expected_cost_resid_gbp=-0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=+0.0000),
_CorpusExpectation(variant='oil pcdb 1', block='11a', expected_sap_resid=+0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=-0.0000, expected_pe_resid_kwh=+0.0000),
_CorpusExpectation(variant='oil pcdb 2', block='11a', expected_sap_resid=+0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=-0.0000, expected_pe_resid_kwh=+0.0000),
_CorpusExpectation(variant='oil pcdb 3', block='11a', expected_sap_resid=+0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=-0.0000),
_CorpusExpectation(variant='pcdb 1', block='11a', expected_sap_resid=-0.0108, expected_cost_resid_gbp=+0.2420, expected_co2_resid_kg=+1.3254, expected_pe_resid_kwh=+5.6974),
_CorpusExpectation(variant='pcdb 1', block='11a', expected_sap_resid=+0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=-0.0000),
# Slice S0380.133 unblocked 10 solid-fuel variants by routing the
# Elmhurst §14.0 "Main Heating EES Code" through the new
# `_ELMHURST_MAIN_HEATING_EES_TO_FUEL_CODE` dict. Pre-slice the
@ -241,16 +489,280 @@ _EXPECTATIONS: tuple[_CorpusExpectation, ...] = (
# cost / CO2 / PE all route via the correct Table 32 fuel code.
# Remaining residuals are likely heating-system efficiency or
# control-type gaps — separate slices.
_CorpusExpectation(variant='solid fuel 2', block='11a', expected_sap_resid=-0.0000, expected_cost_resid_gbp=-0.0000, expected_co2_resid_kg=-93.0988, expected_pe_resid_kwh=-1027.5099),
_CorpusExpectation(variant='solid fuel 2', block='11a', expected_sap_resid=-0.0000, expected_cost_resid_gbp=-0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=+0.0000),
_CorpusExpectation(variant='solid fuel 3', block='11a', expected_sap_resid=-0.0000, expected_cost_resid_gbp=-0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=-0.0000),
_CorpusExpectation(variant='solid fuel 4', block='11a', expected_sap_resid=+0.0850, expected_cost_resid_gbp=-1.9582, expected_co2_resid_kg=-9.3050, expected_pe_resid_kwh=-5.7762),
_CorpusExpectation(variant='solid fuel 5', block='11a', expected_sap_resid=+0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=+11.9451, expected_pe_resid_kwh=+48.6604),
_CorpusExpectation(variant='solid fuel 6', block='11a', expected_sap_resid=+0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=+11.9452, expected_pe_resid_kwh=+48.6604),
_CorpusExpectation(variant='solid fuel 7', block='11a', expected_sap_resid=-0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=+11.9451, expected_pe_resid_kwh=+48.6604),
_CorpusExpectation(variant='solid fuel 8', block='11a', expected_sap_resid=-0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=+11.9451, expected_pe_resid_kwh=+48.6604),
_CorpusExpectation(variant='solid fuel 9', block='11a', expected_sap_resid=+0.1072, expected_cost_resid_gbp=-2.4702, expected_co2_resid_kg=+9.6917, expected_pe_resid_kwh=-5.0715),
_CorpusExpectation(variant='solid fuel 10', block='11a', expected_sap_resid=+0.1134, expected_cost_resid_gbp=-2.6121, expected_co2_resid_kg=+9.3131, expected_pe_resid_kwh=-13.9149),
_CorpusExpectation(variant='solid fuel 11', block='11a', expected_sap_resid=+0.0912, expected_cost_resid_gbp=-2.1006, expected_co2_resid_kg=+10.5547, expected_pe_resid_kwh=-0.7387),
_CorpusExpectation(variant='solid fuel 4', block='11a', expected_sap_resid=-0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=-0.0000),
_CorpusExpectation(variant='solid fuel 5', block='11a', expected_sap_resid=+0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=-0.0000, expected_pe_resid_kwh=-0.0000),
_CorpusExpectation(variant='solid fuel 6', block='11a', expected_sap_resid=+0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=+0.0000),
_CorpusExpectation(variant='solid fuel 7', block='11a', expected_sap_resid=-0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=-0.0000),
_CorpusExpectation(variant='solid fuel 8', block='11a', expected_sap_resid=-0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=-0.0000, expected_pe_resid_kwh=+0.0000),
_CorpusExpectation(variant='solid fuel 9', block='11a', expected_sap_resid=-0.0000, expected_cost_resid_gbp=-0.0000, expected_co2_resid_kg=-0.0000, expected_pe_resid_kwh=+0.0000),
_CorpusExpectation(variant='solid fuel 10', block='11a', expected_sap_resid=+0.0000, expected_cost_resid_gbp=-0.0000, expected_co2_resid_kg=-0.0000, expected_pe_resid_kwh=-0.0000),
_CorpusExpectation(variant='solid fuel 11', block='11a', expected_sap_resid=+0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=-0.0000, expected_pe_resid_kwh=-0.0000),
# Slice S0380.166 unblocked `pcdb 3` (PCDB 8262 Vokera Linea LPG combi
# 83.10 %, Bulk LPG fuel, no cylinder, 18-hour tariff) by adding
# `"Bulk LPG": 27` to `_ELMHURST_MAIN_FUEL_TO_SAP10` (API code 27
# = "LPG (not community)" → Table 32 / Table 12 code 2 = bulk LPG).
# Pre-slice the cascade raised `MissingMainFuelType` because the
# mapper produced `main_fuel_type=''`. Post-slice all 4 metrics
# EXACT on first try — the cascade was fully wired for the gas/oil/
# LPG path; only the Elmhurst label mapping was missing.
_CorpusExpectation(variant='pcdb 3', block='11a', expected_sap_resid=+0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=+0.0000),
# Slice S0380.169 unblocked `no system` (Elmhurst §14.0 Main Heating
# EES = NON + SAP code 699). Per SAP 10.2 §A.2.2 the spec assumes
# portable electric heaters when no main heating is identified;
# cascade routes via `"NON": 30` in the EES → fuel dict (standard
# electricity). Cascade closes most of the way but carries a small
# residual (SAP +1.18, cost £27 / CO2 50 / PE 562) — likely a
# cascade-side §A.2.2 efficiency or tariff-routing gap; pinned as
# forcing function for follow-up.
# Slice S0380.179 closed `no system` via RdSAP 10 §10.7 (PDF p.55)
# "No water heating system": the cert lodges §15.0 water code 999
# (NON) + §15.1 "Cylinder Present: No", but per spec the calculation
# is done for an electric immersion heater on a Table 28 row-1 110 L
# cylinder with Table 29 row-1 age-band insulation (25 mm foam at age
# G). The P960 worksheet header confirms the engine's substitution
# (WHS 903 Single immersion, 110 L). Pre-slice the cascade trusted
# the lodged "no cylinder" → no storage loss (56) + a spurious Table
# 3a combi loss, and the wrong HW heat-gains propagated through §5/§7
# to over-state the base MIT (+0.25 K), over-stating space fuel by
# +228 kWh. `_apply_rdsap_no_water_heating_system_default` injects
# the default cylinder before the section cascades, closing HW fuel
# (219) 1935.37 → 2529.69 EXACT AND the space residual in one move.
# ΔSAP +1.18 → <1e-4, all four metrics EXACT.
_CorpusExpectation(variant='no system', block='11a', expected_sap_resid=+0.0000, expected_cost_resid_gbp=+0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=-0.0000),
# Slice S0380.170 unblocked the 5 community-heating variants. Per
# SAP 10.2 Table 12 (PDF p.189) the heat-network fuel code comes
# from the §14.1 Community Heat Source × Community Fuel Type pair:
# `Boilers × Mains Gas` → 51, `CHP × *` → 48 (fuel-agnostic),
# `Heat pump × Electricity` → 41. New CommunityHeating dataclass
# on `ElmhurstSiteNotes.main_heating` + extractor `_extract_
# community_heating()` + mapper `_resolve_community_heating_fuel_
# code(heat_source, fuel)` + dispatch wired before the strict-raise.
#
# CH1 (301 / Boilers / Mains Gas / code 51): cascade lands ~£14
# under-cost; the gap is the missing electricity-for-heat-
# distribution kWh stream not propagating to (340)/(342) at the
# heat-network rate. CO2/PE residuals reflect the heat-network
# overall CO2 / PE factor calc not yet matching Elmhurst's (386)/
# (486) blended-factor cascade.
#
# CH2/CH4 (302 / CHP / fuel / code 48): cascade overshoots SAP by
# +4.5 because it treats CHP+Boilers as 100% CHP at 2.97 p/kWh,
# missing the SAP 10.2 Appendix C 35% CHP / 65% boiler heat-
# fraction split for "Existing CHP (2015+), flexible operation".
# The boiler-side fuel-code dispatch + CHP-credit emissions for
# exported electricity (worksheet rows (464)/(466)) are the next
# cascade-side work.
#
# CH3 (304 / Heat pump / Electricity / code 41): cascade SAP +0.59
# (same as CH1 — both worksheet SAP=64.2427 with identical Block
# 10b shapes). CO2/PE residuals are large because the cascade
# doesn't yet divide by the community-HP COP — Table 12 code 41
# carries electricity factors but the worksheet divides delivered
# heat by COP first.
#
# CH6 (302 / CHP / Coal / code 48): same CHP split gap as CH2/CH4
# but with upstream coal — cascade under-CO2 by ~2935 kg and
# over-PE by ~7865 kWh because the boiler-side code-54 coal CO2/PE
# factors are not applied.
#
# All 5 pinned as forcing functions for follow-up cascade work
# (CHP heat-fraction split, community-HP COP cascade, heat-network
# overall factor calc). Mapper-side closure complete.
#
# Slice S0380.171 closed the CHP heat-fraction split for CH2 / CH4
# via RdSAP 10 §C / SAP 10.2 Appendix C (PDF p.58 default 35% CHP /
# 65% boilers when no PCDB record). New MainHeatingDetail fields
# `community_heating_chp_fraction` + `community_heating_boiler_
# fuel_type` populated by the Elmhurst mapper from §14.1 Community
# Heat Source + Community Fuel Type; cascade `_fuel_cost_gbp_per_
# kwh` blends 0.35 × CHP_price + 0.65 × boiler_price when the
# fields are set. CH2 / CH4 cost gap £104 → +£0.17 (~1e-3 of
# worksheet); SAP +4.50 → 0.008.
#
# CH6 regression (-3.52 SAP / +£81 → -8.03 / +£185) is exposed by
# the spec-correct split. Pre-slice the CHP-only pricing (2.97 p/
# kWh) cancelled with cascade DLF=1.45 (Table 12c age G default)
# vs the CH6 worksheet's lodged DLF=1.0 — the offset-bugs
# cancellation hid the gap. Post-slice the blended price (3.79
# p/kWh) shows the true magnitude of the DLF mismatch. CH6
# Summary §14.1 is otherwise IDENTICAL to CH4 (only the Community
# Fuel Type "Coal" vs "Mineral oil or biodiesel" differs), but
# CH6's worksheet (306) = 1.0000 while CH4's = 1.4500 — a cert-
# side quirk not currently surfaced through the Summary PDF. Per
# [[feedback-software-no-special-handling]] apply spec-correct
# fix uniformly; CH6 closure needs a separate slice for the
# assessor-lodged DLF override.
#
# CO2 / PE residuals on the 5 CH variants are unchanged (CHP-split
# touches cost only; CO2 / PE need (1) CHP electricity-credit line
# (worksheet (464)/(466)/(364)/(366) per SAP 10.2 §13b spec) +
# (2) community-HP COP cascade for CH3 + (3) heat-network overall
# factor (486)/(386) calc — separate follow-up slices).
#
# Slice S0380.172 closed the CH1 (boiler) + CH3 (HP) CO2 / PE
# residuals via SAP 10.2 Table 4a (PDF p.164) heat-network heat-
# source efficiency scaling: code 301 (boilers) eff = 80%, code
# 304 (HP) eff = 300%. Spec block 13a (467) = (307+310) × 100 /
# heat_source_eff × Table 12 PE factor; cascade meters network_
# input directly so PE/CO2 factors are scaled by 1/heat_source_eff
# at lookup time. CH1 ΔCO2 787 → 126 (~84% closed) and ΔPE
# 3827 → 967 (~75% closed); CH3 ΔCO2 +1614 → +473 (~71%
# closed) and ΔPE +11879 → +1749 (~85% closed). Code 302 (CHP+
# boilers) is omitted from the scaling table — the 35%/65% split
# requires the displaced-electricity credit line per spec block
# 13b (464)/(466); follow-up slice scope. Residual CH1/CH3 gap is
# the WHC=901 HW path (cascade reads cert-lodged "Mains gas" as
# HW fuel; should fall through to main fuel for community heating)
# + the Elmhurst 0.8523 multiplier on heat-network energy column.
#
# Slice S0380.173 routed CH1 + CH3 HW cost / CO2 / PE through the
# main heat-network fuel + Table 4a heat-source-eff scaling via a
# new `_is_community_heating_hw_from_main(epc)` predicate (WHC ∈
# {901, 902, 914} + heat-network main + SAP code in
# `_HEAT_NETWORK_HEAT_SOURCE_EFFICIENCY` table from S0380.172).
# Pre-slice the cascade honoured Elmhurst's §15.0 placeholder
# `water_heating_fuel_type = "Mains gas"` for community-heated
# certs, mis-routing HW through the Mains-gas Table 12 code
# (3.48 p/kWh / 0.21 CO2 / 1.13 PE) instead of the heat-network
# code (4.24 p/kWh + scaled factors). Closures:
#
# CH1 (Boilers/Gas) ΔPE 967 → 9 (essentially closed)
# CH1 ΔCO2 126 → +52 (shift)
# CH3 (HP/Elec) ΔPE +1749 → 387 (~78% closed)
# CH3 ΔCO2 +473 → 86 (~82% closed)
#
# Cost / SAP signs flip on CH1 / CH3 (was £14 / +0.59 SAP, now
# +£12 / 0.53 SAP) — HW cost now matches the worksheet exactly,
# exposing a +£12 lighting / standing overage that was previously
# masked by the HW under-charge. The exposed lighting / standing
# gap is the next closure front (likely the £120 heat-network
# standing charge being applied to lighting kWh instead).
#
# SAP 302 (CHP+boilers) gated out per `_is_community_heating_hw_
# from_main`'s `_HEAT_NETWORK_HEAT_SOURCE_EFFICIENCY` check — the
# 35%/65% split + displaced-electricity credit must converge on
# both SH and HW in a single follow-up slice. CH2 / CH4 / CH6
# residuals unchanged from S0380.172 / S0380.171 pins.
#
# Slice S0380.174 closed the (62)m HW useful-kWh path on all 5 CH
# variants by adding the spec-required storage (57)m + primary (59)m
# loss components that the §4 cascade omitted for heat-network mains.
# Per SAP 10.2 §4 "Heat networks" (PDF p.17 line 1482): "Primary
# circuit loss for insulated pipework and cylinderstat should be
# included (see Table 3)." And per SAP 10.2 Table 2b note b (PDF
# p.159) verbatim — the ×0.9 Temperature Factor reduction applies
# only to "boiler systems, warm air systems and heat pump systems",
# excluding community heating. CH1's HW path closes EXACTLY (cascade
# 3854.12 = worksheet 3854.12 at 4.24 p/kWh = £163.41), but the spec-
# correct fix exposes a separate +0.46 K MIT (92) over-count in §7
# that drives a residual SH demand over-count of ~396 kWh/yr per CH
# variant. Pre-S0380.174 the §4 (65)m heat-gains under-count
# offset the §7 MIT over-count, masking the bug. Per
# [[feedback-software-no-special-handling]] apply spec-correct fix
# uniformly; the exposed §7 MIT residual is the next closure front.
#
# Slice S0380.175 wired the §14.1 Community Heating "Heating Controls
# SAP" lodging (bare 4-digit form like "2306") into the
# `main_heating_control` field on the mapper-produced Main 1. Pre-
# slice the mapper only read §14.0 Main Heating "Main Heating
# Controls Sap" which is empty for community heating certs; the
# cascade defaulted to control_type=2, mis-routing the §7 elsewhere-
# zone off-hours to (7, 8) when SAP code 2306 ("Charging system
# linked to use of heating, programmer and TRVs") dispatches via
# Table 4e Group 3 to control_type=3 / off-hours (9, 8). The fix
# closes CH1 and CH3 SAP / cost EXACTLY; CH2/CH4 cost flip from
# +£9.65 to -£12.16 (CHP-split blend now sees lower SH kWh × CHP
# rate); CH6 SAP narrows -8.44 → -7.49. Remaining CH1/CH3 CO2/PE
# residuals are the §13a (372) "Electrical energy for heat
# distribution" line — 118.38 kWh billed at electricity factors
# (CO2 0.1993, PE 1.760), not heat-network factors — the cascade
# doesn't currently meter this. Next follow-up slice.
# Slice S0380.180 wired the SAP 10.2 Appendix C §C3.2 (PDF p.51)
# heat-network distribution pumping electricity (worksheet (313) =
# 0.01 × [(307)+(310)]; CO2 (372) / PE (472) on Table 12d/12e fuel-
# code-50 monthly factors weighted by the monthly heat profile).
# CH1 (Boilers/Gas) closes FULLY — the (372)/(472) line was its
# entire remaining residual (un-defers the front the predecessor
# handover flagged "don't guess"; the factor source is §C3.2 +
# Table 12f, not an empirical constant). CH3 (HP/Elec) closes its
# distribution component (CO2 98.92→75.32, PE 457.54→249.32);
# the remainder is the code-304 community-HP COP cascade (separate
# follow-up). CH2/CH4/CH6 gain their (372)/(472) component (CO2
# +23.6, PE +208.2/+208.2/+208.2); their dominant CHP displaced-
# electricity credit residual (Table 12f + block 12b/13b) remains
# for the next slice. Elmhurst DISPLAYS the (372) energy column as
# 0.01 × (307) (space only) but computes emissions on 0.01 ×
# (307+310) per the §C3.2 text — verified EXACT line-by-line.
#
# Slice S0380.182 wired the SAP 10.2 §12b/13b community-heating
# "CHP and boilers" (SAP code 302) CO2/PE cascade: per unit of
# network heat fuel H = (307)+(310), the effective generation factor
# = chp_frac × 100/(362) × f_fuel chp_frac × (361)/(362) × f_disp
# + (1chp_frac) × 100/(367) × f_fuel, where f_fuel is the Table 12
# heat-network fuel factor (CHP + back-up boilers burn the same
# community fuel) and f_disp is the Table 12f credit factor for the
# CHP-generated electricity (Elmhurst uses "flexible operation"
# 0.420 CO2 / 2.369 PE). RdSAP 10 §C (p.58) defaults: heat eff 50% /
# electrical eff 25% / boiler eff 80%; CHP frac 0.35 per-cert. Also
# fixed Table 12 heat-network-oil CO2 (codes 53/56 0.298→0.335 per
# Table 12 p.189 — the code-302 oil cascade was the first to use it).
# CH2 (gas) + CH4 (oil) CO2 + PE now EXACT (<1e-4). CH6 (coal) CO2/PE
# shift sign: its worksheet lodges a manual DLF=1.0 (two adjoining
# dwellings) the Summary doesn't carry, so the cascade's DLF=1.45
# over-scales H — pin + the CH6 SAP 7.49 / cost +£172 are the same
# DLF quirk (separate front, likely pin-forever). CH2/CH4 SAP +0.5277
# / cost £12.16 is the heat-network cost/standing residual exposed
# by S0380.175 (cost-side, untouched by this CO2/PE slice). CH3
# unchanged (code 304 community-HP COP front).
#
# Slice S0380.183 closed the CH2/CH4 HW cost residual: per SAP 10.2
# §10b the community-heating HW bills at the heat-network rate, not
# the Elmhurst §15.0 "Mains gas" placeholder. Worksheet (342) =
# (310) × the S0380.171 CHP heat-fraction blend (= the same rate as
# space heating (340)), not (310) × 3.48 p/kWh gas. Extended
# `_is_community_heating_hw_from_main` to include code 302 — the
# S0380.182 CO2/PE interception sits above this predicate's branch,
# so it now affects only the cost path. CH2 + CH4 are FULLY EXACT
# on all four metrics. CH6 SAP 7.49→8.02 / cost +£172.68→+£184.84
# (its HW now also bills the blend, compounding the DLF=1.0 quirk —
# same root, still the separate CH6 DLF front).
#
# Slice S0380.184 closed CH3 (HP/Elec, code 304) CO2 + PE: an
# electric-HP heat network meters grid electricity, so per SAP 10.2
# Table 12 note (s)/(t) + block 12b/13b footnote (a) its (367)/(467)
# factor is the MONTHLY Table 12d/12e (fuel code 41) weighted by the
# network heat profile, then × 1/COP — not the annual 0.136/1.501.
# New `_is_heat_network_electric_main` routes the four factor helpers
# through the monthly cascade for code 304 (fuel 41). CH3 was
# SAP/cost EXACT; CO2 75.32→+0.0000 (= (307+310)/3 × (0.15040.136))
# and PE 249.32→0.0000 (× (1.55691.501)) now EXACT. Non-electric
# heat networks (CH1 gas 51, CH6 coal 54) have no monthly factor set
# → unchanged.
#
# CH6 — PROVEN PIN-FOREVER (Summary-export gap, not a mapper miss).
# CH6's P960 *worksheet input* lodges Distribution Loss = "Two
# adjoining dwellings sharing a single heating system" → Value 0.0 →
# (306) DLF = 1.0000, whereas CH4 lodges "Calculated" → 1.5 → (306) =
# 1.4500. That DLF choice swings SAP / cost / CO2 / PE materially.
# But it is NOT in the Summary PDF: a controlled pair differing ONLY
# by the adjoining-dwellings setting (`CH adjoined dwellings/Summary_
# 001431 (1) vs (2).pdf`) is byte-identical across every RdSAP INPUT
# field — the two Summaries differ solely in the derived header
# (SAP 80 vs 75, bill £954 vs £1237, emissions 5.407 vs 7.394 t). A
# case-insensitive scan of the CH6 Summary for "distribution"/"adjoin"
# returns 0 hits. Since CH4 and CH6 Summaries are themselves identical
# bar fuel type, no Summary-derivable rule can yield CH4=1.45 AND
# CH6=1.0. Closing CH6 would require the P960 worksheet as a mapper
# input or an Elmhurst Summary-export change — neither is available.
# Pin held; do not re-litigate (verified 2026-06-02 with the
# user-supplied adjoining-dwellings pair).
_CorpusExpectation(variant='community heating 1', block='11b', expected_sap_resid=+0.0000, expected_cost_resid_gbp=-0.0000, expected_co2_resid_kg=-0.0000, expected_pe_resid_kwh=+0.0000),
_CorpusExpectation(variant='community heating 2', block='11b', expected_sap_resid=-0.0000, expected_cost_resid_gbp=-0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=+0.0000),
_CorpusExpectation(variant='community heating 3', block='11b', expected_sap_resid=+0.0000, expected_cost_resid_gbp=-0.0000, expected_co2_resid_kg=+0.0000, expected_pe_resid_kwh=-0.0000),
_CorpusExpectation(variant='community heating 4', block='11b', expected_sap_resid=-0.0000, expected_cost_resid_gbp=-0.0000, expected_co2_resid_kg=-0.0000, expected_pe_resid_kwh=-0.0000),
_CorpusExpectation(variant='community heating 6', block='11b', expected_sap_resid=-8.0219, expected_cost_resid_gbp=+184.8376, expected_co2_resid_kg=+2411.5399, expected_pe_resid_kwh=+5023.4766),
)
@ -270,25 +782,15 @@ _EXPECTATIONS: tuple[_CorpusExpectation, ...] = (
# - Solid-fuel boilers (Table 4a 150-160, 600-636) ×10
# - PCDB-lodged "Bulk LPG" mapper-dict gap ×1
_BLOCKED_BY_MISSING_MAIN_FUEL_TYPE: tuple[str, ...] = (
'community heating 1',
'community heating 2',
'community heating 3',
'community heating 4',
'community heating 6',
'electric 11',
'electric 12',
'electric 13',
'electric 14',
'no system',
'oil 2',
'oil 3',
'oil 4',
'oil 5',
'oil 6',
'pcdb 3',
# Slice S0380.133 unblocked all 10 solid-fuel variants via the
# §14.0 EES-code-driven fuel derivation; they now appear in
# `_EXPECTATIONS` above with their post-derivation residual pins.
# Slice S0380.166 unblocked `pcdb 3` via `"Bulk LPG": 27` in the
# Elmhurst label dict; it now lives in `_EXPECTATIONS` at ±0.0000.
# Slice S0380.170 unblocked all 5 community-heating variants via
# the new CommunityHeating extractor field + the §14.1 Heat
# Source × Fuel Type → Table 12 fuel-code dispatch. They now
# appear in `_EXPECTATIONS` with pinned cascade-side residuals.
)
@ -458,9 +960,100 @@ def test_heating_systems_corpus_residual_matches_pin(
)
def test_oil_6_no_room_thermostat_applies_table_4c2_minus_5pp_space_efficiency() -> None:
# Arrange — oil 6 (B30K standard liquid-fuel boiler, Table 4b code
# 126 winter 80 / summer 68) lodges "Main Heating Controls Sap: SAP
# code 2101, No time or thermostatic control of room temperature"
# WITH a cylinder thermostat present. Per RdSAP 10 §3 (PDF p.57)
# boiler interlock is "assumed present if there is a room thermostat
# and (for stored hot water systems heated by the boiler) a cylinder
# thermostat. Otherwise not interlocked." Control 2101 provides no
# room thermostat, so the boiler is NOT interlocked despite the
# cylinder thermostat. SAP 10.2 Table 4c(2) (PDF p.169) "No
# thermostatic control of room temperature regular boiler" deducts
# 5pp from BOTH the space and DHW seasonal efficiency. The worksheet
# confirms it: P960 header "Boiler Interlock: No"; (210) space
# efficiency = 75.0000 = 80 - 5; (217)m summer = 63.0000 = 68 - 5.
summary_pdf, _ = _variant_paths('oil 6')
pages = _summary_pdf_to_textract_style_pages(summary_pdf)
site_notes = ElmhurstSiteNotesExtractor(pages).extract()
epc = EpcPropertyDataMapper.from_elmhurst_site_notes(site_notes)
# Act — run the rating cascade and read the resolved space efficiency.
inputs = cert_to_inputs(epc, prices=SAP_10_2_SPEC_PRICES)
# Assert — Table 4b 80% winter less the Table 4c(2) -5pp interlock
# penalty = 75% (matches worksheet (210)).
assert abs(inputs.main_heating_efficiency - 0.75) <= 1e-9, (
f"oil 6 space efficiency {inputs.main_heating_efficiency:.4f} "
f"!= 0.75 (Table 4b 0.80 - Table 4c(2) 0.05 interlock penalty)"
)
def test_oil_6_absent_room_thermostat_applies_table_4f_pump_1_3_multiplier() -> None:
# Arrange — oil 6 lodges Main Heating Controls Sap code 2101 ("No
# time or thermostatic control of room temperature") = no room
# thermostat. SAP 10.2 Table 4f (PDF p.175) footnote a) on the
# "Circulation pump" rows reads verbatim: "Multiply by a factor of
# 1.3 if room thermostat is absent." The cert's central heating
# pump is "2013 or later" -> Table 4f 41 kWh; with the absent-room-
# thermostat x1.3 it becomes 41 x 1.3 = 53.3 kWh, matching worksheet
# (230c) = 53.3000. With the liquid-fuel boiler flue-fan/pump 100
# kWh (230d), total pumps/fans (231) = 153.3000. The sibling oil 5
# (same "2013 or later" pump age but control 2106 WITH a room
# thermostat) keeps the bare 41 kWh — worksheet (230c) = 41.0000.
summary_pdf, _ = _variant_paths('oil 6')
pages = _summary_pdf_to_textract_style_pages(summary_pdf)
site_notes = ElmhurstSiteNotesExtractor(pages).extract()
epc = EpcPropertyDataMapper.from_elmhurst_site_notes(site_notes)
# Act — run the rating cascade and read the resolved pumps/fans kWh.
inputs = cert_to_inputs(epc, prices=SAP_10_2_SPEC_PRICES)
# Assert — 41 x 1.3 (circulation pump) + 100 (oil flue fan/pump) =
# 153.3 kWh (matches worksheet (231)).
assert abs(inputs.pumps_fans_kwh_per_yr - 153.3) <= 1e-9, (
f"oil 6 pumps/fans {inputs.pumps_fans_kwh_per_yr:.4f} kWh "
f"!= 153.3 (41 x 1.3 absent-room-thermostat pump + 100 oil aux)"
)
def test_no_system_assumes_rdsap_10_7_electric_immersion_default_cylinder() -> None:
# Arrange — the "no system" cert lodges §15.0 "Water Heating Code:
# NON / SapCode 999" and §15.1 "Hot Water Cylinder Present: No". Per
# RdSAP 10 §10.7 (PDF p.55) "No water heating system" verbatim: "the
# calculation is done for an electric immersion heater... for a
# cylinder defined by the first row of Table 28 (110 litres) and the
# first row of Table 29." The BRE-approved Elmhurst engine confirms
# it — the P960 worksheet header lodges "WHS: 903 Electric immersion,
# Single", a 110 L cylinder, and Table 29 age-band insulation (the
# corpus property is age G -> 25 mm foam), giving storage loss (56) =
# 594.32 kWh/yr. Worksheet HW (64) = (45) 1935.37 + (56) 594.32 =
# 2529.6927. Pre-slice the cascade trusted the lodged "no cylinder"
# so it added no storage loss (and a spurious Table 3a combi loss).
summary_pdf, _ = _variant_paths('no system')
pages = _summary_pdf_to_textract_style_pages(summary_pdf)
site_notes = ElmhurstSiteNotesExtractor(pages).extract()
epc = EpcPropertyDataMapper.from_elmhurst_site_notes(site_notes)
# Act — run the rating cascade and read the resolved HW fuel kWh.
inputs = cert_to_inputs(epc, prices=SAP_10_2_SPEC_PRICES)
# Assert — HW fuel = (45) + Table 29 110 L / 25 mm-foam storage loss
# = 2529.6927 (matches worksheet (64)/(219)).
assert abs(inputs.hot_water_kwh_per_yr - 2529.6927) <= 1e-3, (
f"no system HW {inputs.hot_water_kwh_per_yr:.4f} kWh != 2529.6927 "
f"(RdSAP 10 §10.7 electric-immersion default 110 L cylinder)"
)
@pytest.mark.skipif(
not _BLOCKED_BY_MISSING_MAIN_FUEL_TYPE,
reason="all blocked variants have been unblocked (latest: S0380.170)",
)
@pytest.mark.parametrize(
"variant",
_BLOCKED_BY_MISSING_MAIN_FUEL_TYPE,
_BLOCKED_BY_MISSING_MAIN_FUEL_TYPE or ("__placeholder__",),
ids=lambda v: v,
)
def test_heating_systems_corpus_blocked_variant_raises_missing_main_fuel_type(
@ -487,3 +1080,162 @@ def test_heating_systems_corpus_blocked_variant_raises_missing_main_fuel_type(
# Act / Assert
with pytest.raises(MissingMainFuelType):
cert_to_inputs(epc, prices=SAP_10_2_SPEC_PRICES)
# S0380.170 — Community heating mapper dispatch coverage tests.
#
# These focused tests document the per-variant resolution path
# independently of the cascade. The parametrized `_EXPECTATIONS` test
# above is the load-bearing assertion that the cascade lands at the
# pinned residual; these unit tests assert the mapper's `main_fuel_type`
# resolves to the correct Table 12 heat-network code per
# `_resolve_community_heating_fuel_code(heat_source, fuel)`.
_COMMUNITY_HEATING_EXPECTED_FUEL_CODES: tuple[tuple[str, int], ...] = (
# (variant, SAP 10.2 Table 12 fuel code)
('community heating 1', 51), # Boilers + Mains Gas
('community heating 2', 48), # CHP + Mains Gas
('community heating 3', 41), # Heat pump + Electricity
('community heating 4', 48), # CHP + Mineral oil or biodiesel
('community heating 6', 48), # CHP + Coal
)
@pytest.mark.parametrize(
("variant", "expected_table_12_code"),
_COMMUNITY_HEATING_EXPECTED_FUEL_CODES,
ids=lambda v: v if isinstance(v, str) else str(v),
)
def test_community_heating_mapper_resolves_table_12_fuel_code(
variant: str, expected_table_12_code: int,
) -> None:
# Arrange — community-heating Summary lodges §14.0 EES='COM' + a
# Table 4a heat-network SAP code, with §14.0 Fuel Type empty. The
# §14.1 Community Heating/Heat Network block carries the upstream
# Heat Source + Fuel Type pair, which the mapper's
# `_resolve_community_heating_fuel_code` translates to a SAP 10.2
# Table 12 (PDF p.189) heat-network code per the dispatch:
# Boilers + Mains Gas → 51
# Combined Heat and Power → 48 (fuel-agnostic)
# Heat pump + Electricity → 41
summary_pdf, _ = _variant_paths(variant)
pages = _summary_pdf_to_textract_style_pages(summary_pdf)
site_notes = ElmhurstSiteNotesExtractor(pages).extract()
# Act
epc = EpcPropertyDataMapper.from_elmhurst_site_notes(site_notes)
# Assert — Main 1 picks up the Table 12 fuel code derived from the
# §14.1 Community Heat Source + Community Fuel Type pair.
main_heating_details = epc.sap_heating.main_heating_details
assert main_heating_details is not None and len(main_heating_details) >= 1
assert main_heating_details[0].main_fuel_type == expected_table_12_code
# S0380.171 — Community heating CHP-split mapper coverage tests.
#
# Per RdSAP 10 §C / SAP 10.2 Appendix C (PDF p.58): for CHP+boilers
# heat networks without a PCDB record, the heat split defaults to 35%
# CHP + 65% boilers. The mapper populates both fields on Main 1 so the
# cascade's `_fuel_cost_gbp_per_kwh` returns a blended price weighted
# by the heat fractions. Non-CHP heat networks leave both fields None
# (single-fuel-code path stays unchanged).
_COMMUNITY_HEATING_EXPECTED_CHP_SPLIT: tuple[
tuple[str, Optional[float], Optional[int]], ...
] = (
# (variant, chp_fraction, boiler_fuel_code)
('community heating 1', None, None), # Boilers only — no split
('community heating 2', 0.35, 51), # CHP + Mains Gas boilers
('community heating 3', None, None), # Heat pump only — no split
('community heating 4', 0.35, 53), # CHP + Oil boilers
('community heating 6', 0.35, 54), # CHP + Coal boilers
)
@pytest.mark.parametrize(
("variant", "expected_chp_fraction", "expected_boiler_fuel_code"),
_COMMUNITY_HEATING_EXPECTED_CHP_SPLIT,
ids=lambda v: v if isinstance(v, str) else str(v),
)
def test_community_heating_mapper_populates_chp_split_fields(
variant: str,
expected_chp_fraction: Optional[float],
expected_boiler_fuel_code: Optional[int],
) -> None:
# Arrange — CHP+boilers heat networks lodge "Combined Heat and
# Power" in §14.1 Community Heat Source. Per RdSAP 10 §C the
# mapper sets chp_fraction = 0.35 + resolves the boiler fuel code
# from the §14.1 Community Fuel Type (Mains Gas → 51, Mineral oil
# → 53, Coal → 54). Boilers-only and Heat-pump networks leave both
# fields None — the single main_fuel_type code handles them.
summary_pdf, _ = _variant_paths(variant)
pages = _summary_pdf_to_textract_style_pages(summary_pdf)
site_notes = ElmhurstSiteNotesExtractor(pages).extract()
# Act
epc = EpcPropertyDataMapper.from_elmhurst_site_notes(site_notes)
# Assert
main_heating_details = epc.sap_heating.main_heating_details
assert main_heating_details is not None and len(main_heating_details) >= 1
main_1 = main_heating_details[0]
assert main_1.community_heating_chp_fraction == expected_chp_fraction
assert main_1.community_heating_boiler_fuel_type == expected_boiler_fuel_code
# S0380.175 — Community heating main_heating_control extraction.
#
# Per SAP 10.2 Table 4e Group 3 (PDF p.173): heat-network control codes
# 2301-2314 dispatch to control_type 1, 2, or 3. The cert lodges the
# code in §14.1 Community Heating "Heating Controls SAP" rather than
# §14.0 Main Heating's "Main Heating Controls Sap". Pre-slice the mapper
# only read the §14.0 field, leaving `main_heating_control=''` and the
# cascade defaulting to type 2 (modal RdSAP default). The §14.1 lodging
# carries the actual control code, which feeds Table 9 elsewhere-zone
# off-hours selection (type 1/2 → (7,8); type 3 → (9,8)) and the §7
# T_h2 MIT cascade.
@pytest.mark.parametrize(
("variant", "expected_main_heating_control"),
(
# All 5 CH variants lodge "Heating Controls SAP: 2306" in §14.1
# Community Heating. SAP 10.2 Table 4e Group 3 row 2306 =
# "Charging system linked to use of heating, programmer and TRVs"
# → control_type 3, temperature_adjustment 0 °C.
('community heating 1', 2306),
('community heating 2', 2306),
('community heating 3', 2306),
('community heating 4', 2306),
('community heating 6', 2306),
),
ids=lambda v: v if isinstance(v, str) else str(v),
)
def test_community_heating_mapper_picks_up_section_14_1_heating_controls_sap(
variant: str, expected_main_heating_control: int,
) -> None:
# Arrange — community heating Summary lodges the SAP control code in
# §14.1 Community Heating "Heating Controls SAP", NOT in §14.0 Main
# Heating "Main Heating Controls Sap" (which is empty for community
# heating certs). Mapper must read from the community block when the
# main block is empty.
summary_pdf, _ = _variant_paths(variant)
pages = _summary_pdf_to_textract_style_pages(summary_pdf)
site_notes = ElmhurstSiteNotesExtractor(pages).extract()
# Act
epc = EpcPropertyDataMapper.from_elmhurst_site_notes(site_notes)
# Assert — Main 1 picks up the §14.1 community heating control code.
main_heating_details = epc.sap_heating.main_heating_details
assert main_heating_details is not None and len(main_heating_details) >= 1
assert main_heating_details[0].main_heating_control == expected_main_heating_control
# S0380.172 — Heat-network heat-source-eff scaling residual coverage.
#
# Per SAP 10.2 Table 4a (PDF p.164): "Boilers (RdSAP)" eff=80%, "Heat
# pump (RdSAP)" eff=300%. The cascade's CO2/PE factor functions scale
# Table 12 factors by 1/heat_source_eff so that network_input × scaled
# factor lands on the spec block 13a (467) / 12b (367) "(307+310) ×
# 100 / eff × Table 12 factor" formula. SAP code 302 (CHP+boilers) is
# excluded — 35%/65% split + displaced-electricity credit is follow-up.
# Coverage is asserted via the residual-pin test above (CH1 / CH3
# closure; CH2 / CH4 / CH6 unchanged).

View file

@ -1,6 +1,6 @@
"""End-to-end validation for the Elmhurst Summary→EpcPropertyData chain.
The 6 Elmhurst worksheet fixtures in `domain.sap10_calculator.worksheet.tests`
The 6 Elmhurst worksheet fixtures in `tests.domain.sap10_calculator.worksheet`
build their `EpcPropertyData` synthetically they validate the
calculator + cascade in isolation from the mapper. This file pins
the OTHER half of the chain: `from_elmhurst_site_notes` must produce
@ -46,7 +46,7 @@ from datatypes.epc.domain.mapper import (
from domain.sap10_calculator.calculator import calculate_sap_from_inputs
from domain.sap10_calculator.rdsap.cert_to_inputs import SAP_10_2_SPEC_PRICES, cert_to_inputs
from domain.sap10_ml.rdsap_uvalues import u_party_wall
from domain.sap10_calculator.worksheet.tests import (
from tests.domain.sap10_calculator.worksheet import (
_elmhurst_worksheet_000474 as _w000474,
_elmhurst_worksheet_000477 as _w000477,
_elmhurst_worksheet_000480 as _w000480,
@ -84,7 +84,7 @@ _SUMMARY_000565_PDF = _FIXTURES / "Summary_000565.pdf" # cert 000565 (5-bp Elmh
# matches worksheet continuous SAP at 1e-4".
_API_001479_JSON = (
Path(__file__).parents[3]
/ "domain/sap10_calculator/rdsap/tests/fixtures/golden"
/ "tests/domain/sap10_calculator/rdsap/fixtures/golden"
/ "0535-9020-6509-0821-6222.json"
)
@ -129,7 +129,7 @@ def _summary_pdf_to_textract_style_pages(pdf_path: Path) -> list[str]:
def test_summary_000474_mapper_produces_three_building_parts() -> None:
# Arrange — cert U985-0001-000474 is a mid-terrace with 3 building
# parts (Main + 2 extensions) per the hand-built worksheet fixture
# at domain/sap10_calculator/worksheet/tests/
# at tests/domain/sap10_calculator/worksheet/
# _elmhurst_worksheet_000474.py. Routing the Summary PDF through
# extractor + mapper must yield the same count.
pages = _summary_pdf_to_textract_style_pages(_SUMMARY_000474_PDF)
@ -2978,7 +2978,7 @@ def test_summary_mapper_raises_on_unmapped_party_wall_type_code() -> None:
_GOLDEN_FIXTURES_DIR = (
Path(__file__).parents[3]
/ "domain/sap10_calculator/rdsap/tests/fixtures/golden"
/ "tests/domain/sap10_calculator/rdsap/fixtures/golden"
)
@ -3292,13 +3292,13 @@ def test_summary_0380_full_chain_sap_within_spec_floor_of_worksheet() -> None:
_API_0330_JSON = (
Path(__file__).parents[3]
/ "domain/sap10_calculator/rdsap/tests/fixtures/golden"
/ "tests/domain/sap10_calculator/rdsap/fixtures/golden"
/ "0330-2249-8150-2326-4121.json"
)
_API_9501_JSON = (
Path(__file__).parents[3]
/ "domain/sap10_calculator/rdsap/tests/fixtures/golden"
/ "tests/domain/sap10_calculator/rdsap/fixtures/golden"
/ "9501-3059-8202-7356-0204.json"
)
@ -3358,7 +3358,7 @@ def test_api_9501_photovoltaic_array_surfaced() -> None:
_API_0380_JSON = (
Path(__file__).parents[3]
/ "domain/sap10_calculator/rdsap/tests/fixtures/golden"
/ "tests/domain/sap10_calculator/rdsap/fixtures/golden"
/ "0380-2471-3250-2596-8761.json"
)
@ -3479,20 +3479,20 @@ def test_api_0380_heat_pump_no_pumps_fans_kwh_per_table_4f() -> None:
_API_9418_JSON = (
Path(__file__).parents[3]
/ "domain/sap10_calculator/rdsap/tests/fixtures/golden"
/ "tests/domain/sap10_calculator/rdsap/fixtures/golden"
/ "9418-3062-8205-3566-7200.json"
)
_API_2225_JSON = (
Path(__file__).parents[3]
/ "domain/sap10_calculator/rdsap/tests/fixtures/golden"
/ "tests/domain/sap10_calculator/rdsap/fixtures/golden"
/ "2225-3062-8205-2856-7204.json"
)
_API_2636_JSON = (
Path(__file__).parents[3]
/ "domain/sap10_calculator/rdsap/tests/fixtures/golden"
/ "tests/domain/sap10_calculator/rdsap/fixtures/golden"
/ "2636-0525-2600-0401-2296.json"
)
@ -3765,17 +3765,17 @@ def test_api_001479_full_chain_sap_matches_worksheet_pdf_exactly() -> None:
_API_0350_JSON = (
Path(__file__).parents[3]
/ "domain/sap10_calculator/rdsap/tests/fixtures/golden"
/ "tests/domain/sap10_calculator/rdsap/fixtures/golden"
/ "0350-2968-2650-2796-5255.json"
)
_API_3800_JSON = (
Path(__file__).parents[3]
/ "domain/sap10_calculator/rdsap/tests/fixtures/golden"
/ "tests/domain/sap10_calculator/rdsap/fixtures/golden"
/ "3800-8515-0922-3398-3563.json"
)
_API_9285_JSON = (
Path(__file__).parents[3]
/ "domain/sap10_calculator/rdsap/tests/fixtures/golden"
/ "tests/domain/sap10_calculator/rdsap/fixtures/golden"
/ "9285-3062-0205-7766-7200.json"
)
@ -3878,7 +3878,7 @@ def test_api_9418_full_chain_sap_within_spec_floor_of_worksheet() -> None:
# SAP cascade is the load-bearing equivalence check. Each cert in this
# cohort has both a Summary PDF (under `sap worksheets/additional with
# api 2/<cert>/Summary_*.pdf`) and an API JSON fixture (fetched into
# `domain/sap10_calculator/rdsap/tests/fixtures/golden/<cert>.json` in
# `tests/domain/sap10_calculator/rdsap/fixtures/golden/<cert>.json` in
# Slice S0380.39). Worksheet SAP is the source of truth.
#
# Cohort-2 API-path closure history (each slice closed a distinct
@ -3893,7 +3893,7 @@ def test_api_9418_full_chain_sap_within_spec_floor_of_worksheet() -> None:
_COHORT_2_API_FIXTURE_DIR: Path = (
Path(__file__).parents[3]
/ "domain/sap10_calculator/rdsap/tests/fixtures/golden"
/ "tests/domain/sap10_calculator/rdsap/fixtures/golden"
)
# (cert_dir, worksheet_unrounded_sap) — 34 cohort-2 certs whose API-path

View file

@ -17,6 +17,7 @@ class CoreFiles(Enum):
IMPROVEMENT_OPTION_EVALUATION = "Improvement Option Evaluation"
MEDIUM_TERM_IMPROVEMENT_PLAN = "Medium Term Improvement Plan"
RETROFIT_DESIGN_DOC = "Retrofit Design Doc"
MCS_COMPLIANCE_CERTIFICATE = "MCS Compliance Certificate"
_CORE_FILE_TO_FILE_TYPE: dict[CoreFiles, str] = {
@ -32,14 +33,21 @@ _CORE_FILE_TO_FILE_TYPE: dict[CoreFiles, str] = {
CoreFiles.IMPROVEMENT_OPTION_EVALUATION: FileTypeEnum.IMPROVEMENT_OPTION_EVALUATION.value,
CoreFiles.MEDIUM_TERM_IMPROVEMENT_PLAN: FileTypeEnum.MEDIUM_TERM_IMPROVEMENT_PLAN.value,
CoreFiles.RETROFIT_DESIGN_DOC: FileTypeEnum.RETROFIT_DESIGN_DOC.value,
CoreFiles.MCS_COMPLIANCE_CERTIFICATE: FileTypeEnum.MCS_COMPLIANCE_CERTIFICATE.value,
}
def get_core_file_type(
filename: str, evidence_category: Optional[str] = None
) -> Optional[CoreFiles]:
# Identify retrofit design doc using evidence category as the name is possibly unreliable.
# Identify MCS certificate and design doc using evidence category as the names are possibly unreliable.
# We might change to always use evidence category, but needs more investigation
if (
evidence_category is not None
and evidence_category.lower() == "mcs compliance certificate"
):
return CoreFiles.MCS_COMPLIANCE_CERTIFICATE
if evidence_category is not None and evidence_category.lower() == "retrofit design":
return CoreFiles.RETROFIT_DESIGN_DOC
@ -56,6 +64,7 @@ def get_core_file_type(
CoreFiles.RETROFIT_DESIGN_DOC,
CoreFiles.IMPROVEMENT_OPTION_EVALUATION,
CoreFiles.MEDIUM_TERM_IMPROVEMENT_PLAN,
CoreFiles.MCS_COMPLIANCE_CERTIFICATE,
}
for core_file in CoreFiles:
@ -68,8 +77,10 @@ def get_core_file_type(
return None
def get_file_type_string(filename: str) -> Optional[str]:
core_file: Optional[CoreFiles] = get_core_file_type(filename)
def get_file_type_string(
filename: str, evidence_category: Optional[str] = None
) -> Optional[str]:
core_file: Optional[CoreFiles] = get_core_file_type(filename, evidence_category)
if core_file is None:
return None

View file

@ -0,0 +1,63 @@
EVIDENCE_CATEGORIES = [
"Advice report",
"Air Tests - BGV",
"Air Tightness Strategy",
"Assessment report",
"Blue Site Notes (PAS Assessment)",
"Building Assessment report",
"Building Condition report",
"Building Regulations Sign-off",
"Claim of compliance PAS2030",
"Claim of compliance PAS2035",
"Commissioning checklist",
"Condition report",
"Contract / Invoice",
"Electrical Certificate",
"Energy report",
"Evidence of submission to CPS",
"Floor Plan",
"Full Property Assessment",
"Gas Appliance Benchmarking Certificate",
"Gas Appliance Commissioning Checklist",
"Gas Inspection Certificate",
"Handover and Commissioning Documents",
"Handover Documents",
"Handover documents for client",
"Heat Demand Calculations",
"Heritage Impact Assessment",
"Improvement option evaluation",
"Installation Guides",
"Insurance guarantee",
"Intended outcomes",
"MCS Compliance Certificate",
"Medium term improvement plan",
"Medium term low carbon plan",
"Mid Photo",
"Mid-Install Inspection",
"Minor Works Electrical Certificate",
"Monitoring and evaluation outcomes",
"Occupancy assessment",
"Other",
"Other commissioning certificates",
"Photo",
"Post Energy Performance Report (EPR)",
"Post installation RdSAP",
"Post Photo",
"Pre Energy Performance Report (EPR)",
"Pre installation RdSAP",
"Pre Photo",
"Pre-Design Building Survey",
"Pre-Installation Building Inspection",
"Product Data sheets",
"Product warranty",
"Property Assessment",
"Qualifications",
"Retrofit design",
"Risk assessment",
"Significance survey",
"Site Note (Green /Blue) and Certificate(s)",
"Ventilation Assessment",
"Ventilation Assessment Checklist",
"Ventilation Report",
"Welsh - Checklist",
]

View file

@ -1,6 +1,6 @@
from collections import defaultdict
import os
from typing import Dict, List, Optional
from typing import Dict, List, NamedTuple, Optional
from datetime import datetime
import requests
@ -13,6 +13,22 @@ from utils.logger import setup_logger
logger = setup_logger()
class DownloadedFile(NamedTuple):
file_path: str
evidence_category: Optional[str]
created_utc: datetime
class _EvidenceFileGroups(NamedTuple):
core: Dict[CoreFiles, EvidenceFileData]
other: List[EvidenceFileData]
class DownloadedFiles(NamedTuple):
core: List[DownloadedFile]
other: List[DownloadedFile]
class UnauthorizedError(Exception):
pass
@ -33,42 +49,60 @@ class PashubClient:
)
logger.info("Finished initialising CotalityClient")
def get_core_evidence_files_by_job_id(self, job_id: str) -> List[str]:
logger.info(f"Getting Core Evidence Files for job ID {job_id}")
def get_evidence_files_by_job_id(
self, job_id: str, include_other: bool = False
) -> DownloadedFiles:
logger.info(f"Getting evidence files for job ID {job_id}")
evidence_list: List[EvidenceFileData] = self._get_evidence_list(job_id)
logger.info(f"Found {len(evidence_list)} Evidence files to get")
logger.info(f"Found {len(evidence_list)} evidence files")
if not evidence_list:
return []
return DownloadedFiles(core=[], other=[])
saved_files: List[str] = []
core_files: Dict[CoreFiles, EvidenceFileData] = self._select_latest_core_files(
grouped: _EvidenceFileGroups = self._group_into_core_and_other_files(
evidence_list
)
logger.info(f"Number of core files to download is {len(core_files)}")
for _, evidence in core_files.items():
evidence_id = evidence.file_id
if not evidence_id:
core_files: List[DownloadedFile] = []
for _, evidence in grouped.core.items():
if not evidence.file_id:
continue
logger.info(f"Getting metadata for file {evidence.file_name}")
metadata: EvidenceMetadata = self._get_evidence_metadata(
job_id, evidence_id
job_id, evidence.file_id
)
download_url: str = self._build_download_url(metadata, evidence.file_id)
output_dir: str = "/tmp"
file_name: str = evidence.file_name
file_path: str = os.path.join(output_dir, file_name)
file_path: str = os.path.join("/tmp", evidence.file_name)
self._download_file(download_url, file_path)
logger.info("Successfully downloaded file")
saved_files.append(file_path)
core_files.append(
DownloadedFile(
file_path=file_path,
evidence_category=evidence.evidence_category,
created_utc=datetime.fromisoformat(evidence.created_utc),
)
)
return saved_files
other_files: List[DownloadedFile] = []
if include_other:
for evidence in grouped.other:
if not evidence.file_id:
continue
metadata = self._get_evidence_metadata(job_id, evidence.file_id)
download_url = self._build_download_url(metadata, evidence.file_id)
file_path = os.path.join("/tmp", evidence.file_name)
self._download_file(download_url, file_path)
logger.info("Successfully downloaded other file")
other_files.append(
DownloadedFile(
file_path=file_path,
evidence_category=evidence.evidence_category,
created_utc=datetime.fromisoformat(evidence.created_utc),
)
)
return DownloadedFiles(core=core_files, other=other_files)
def get_uprn_by_job_id(self, job_id: str) -> Optional[str]:
logger.info(f"Getting UPRN for job ID {job_id}")
@ -92,30 +126,32 @@ class PashubClient:
)
return None
def _select_latest_core_files(
def _group_into_core_and_other_files(
self,
files: List[EvidenceFileData],
) -> Dict[CoreFiles, EvidenceFileData]:
) -> _EvidenceFileGroups:
grouped: Dict[CoreFiles, List[EvidenceFileData]] = defaultdict(list)
other: List[EvidenceFileData] = []
for file in files:
core_type: Optional[CoreFiles] = get_core_file_type(
file.file_name, file.evidence_category
)
if not core_type:
other.append(file)
continue
grouped[core_type].append(file)
latest_files: Dict[CoreFiles, EvidenceFileData] = {}
latest_core_files: Dict[CoreFiles, EvidenceFileData] = {}
for core_type, group in grouped.items():
if core_type == CoreFiles.RETROFIT_DESIGN_DOC and len(group) > 1:
osm_candidates = [f for f in group if "-OSM-" in f.file_name]
group = osm_candidates if osm_candidates else group
latest = max(group, key=lambda f: datetime.fromisoformat(f.created_utc))
latest_files[core_type] = latest
latest_core_files[core_type] = latest
return latest_files
return _EvidenceFileGroups(core=latest_core_files, other=other)
def _get_evidence_list(self, job_id: str) -> List[EvidenceFileData]:
url = f"{self.base}/jobs/{job_id}/evidence"

View file

@ -11,11 +11,15 @@ from backend.app.db.models.uploaded_file import (
from backend.documents_parser.db_writer import save_epc_property_data
from backend.documents_parser.parser import parse_site_notes_pdf
from backend.pashub_fetcher.core_files import get_file_type_string
from backend.pashub_fetcher.pashub_client import PashubClient, UnauthorizedError
from backend.pashub_fetcher.pashub_client import (
DownloadedFile,
DownloadedFiles,
PashubClient,
UnauthorizedError,
)
from backend.pashub_fetcher.pashub_to_ara_trigger_request import (
PashubToAraTriggerRequest,
)
from backend.pashub_fetcher.sharepoint_subfolders import SharepointSubfolders
from datatypes.epc.domain.epc_property_data import EpcPropertyData
from utils.logger import setup_logger
from utils.s3 import upload_file_to_s3
@ -75,14 +79,16 @@ class PashubService:
logger.info(f"No UPRN found for job {job_id}")
try:
job_files: List[str] = active_client.get_core_evidence_files_by_job_id(
job_id
downloaded: DownloadedFiles = active_client.get_evidence_files_by_job_id(
job_id, include_other=request.get_other_files
)
except UnauthorizedError:
if active_client is not self._pashub_client:
raise
active_client = self._get_coordination_client()
job_files = active_client.get_core_evidence_files_by_job_id(job_id)
downloaded: DownloadedFiles = active_client.get_evidence_files_by_job_id(
job_id, include_other=request.get_other_files
)
if uprn or hubspot_deal_id:
logger.info("Uploading files to s3")
@ -92,29 +98,47 @@ class PashubService:
else FileSourceEnum.COORDINATION_HUB
)
upload_records = self._upload_to_s3_and_update_db(
job_files, uprn, hubspot_deal_id, file_source
downloaded.core, uprn, hubspot_deal_id, file_source
)
self._save_site_notes(upload_records)
# SharePoint upload disabled: pashub sharepoint_link is inconsistent
# (points to property or project unpredictably)
# if request.sharepoint_link:
# self._upload_to_sharepoint(request.sharepoint_link, job_files)
if downloaded.other:
self._upload_to_s3_and_update_db(
downloaded.other,
uprn,
hubspot_deal_id,
file_source,
default_file_type=FileTypeEnum.OTHER.value,
)
for file_path in job_files:
if request.sharepoint_link and request.address:
folder_name = request.address.split("|")[0].strip()
folders = self._sharepoint_client.get_folders_in_path(request.sharepoint_link)
match = next(
(f["name"] for f in folders.get("value", []) if f["name"].lower() == folder_name.lower()),
None,
)
if match is None:
logger.warning(f"SharePoint folder not found for '{folder_name}' in {request.sharepoint_link}")
else:
property_folder_path = f"{request.sharepoint_link}/{match}"
self._upload_to_sharepoint(property_folder_path, downloaded.core + downloaded.other)
for df in downloaded.core + downloaded.other:
try:
os.remove(file_path)
os.remove(df.file_path)
except OSError:
logger.warning(f"Failed to delete temp file {file_path}")
logger.warning(f"Failed to delete temp file {df.file_path}")
return job_files
return [df.file_path for df in downloaded.core + downloaded.other]
def _upload_to_s3_and_update_db(
self,
job_files: List[str],
job_files: List[DownloadedFile],
uprn: Optional[str],
hubspot_deal_id: Optional[str],
file_source: FileSourceEnum,
default_file_type: Optional[str] = None,
) -> List[_FileUploadRecord]:
if not uprn and not hubspot_deal_id:
return []
@ -128,11 +152,11 @@ class PashubService:
file_paths: List[str] = []
uploaded_files: List[UploadedFile] = []
for file_path in job_files:
filename = os.path.basename(file_path)
for df in job_files:
filename = os.path.basename(df.file_path)
file_key = f"{base_path}/{filename}"
upload_file_to_s3(file_path, self._s3_bucket, file_key)
upload_file_to_s3(df.file_path, self._s3_bucket, file_key)
uploaded_file = UploadedFile(
s3_file_bucket=self._s3_bucket,
@ -141,9 +165,9 @@ class PashubService:
uprn=int(uprn) if uprn else None,
hubspot_deal_id=hubspot_deal_id,
file_source=file_source.value,
file_type=get_file_type_string(filename),
file_type=get_file_type_string(filename, df.evidence_category) or default_file_type,
)
file_paths.append(file_path)
file_paths.append(df.file_path)
uploaded_files.append(uploaded_file)
with db_session() as session:
@ -180,11 +204,12 @@ class PashubService:
def _upload_to_sharepoint(
self,
sharepoint_link: str,
job_files: List[str],
property_folder_path: str,
files: List[DownloadedFile],
) -> None:
assessment_path = f"{sharepoint_link}/{SharepointSubfolders.ASSESSMENT.value}"
for file_path in job_files:
filename = file_path.split("/")[-1]
self._sharepoint_client.upload_file(file_path, assessment_path, filename)
for df in files:
filename = os.path.basename(df.file_path)
try:
self._sharepoint_client.upload_file(df.file_path, property_folder_path, filename)
except Exception:
logger.warning(f"Failed to upload {filename} to SharePoint", exc_info=True)

View file

@ -14,6 +14,8 @@ class PashubToAraTriggerRequest(BaseModel):
hubspot_listing_id: Optional[int] = None
hubspot_deal_id: Optional[str] = None
get_other_files: bool = False
@property
def pashub_job_id(self) -> str:
match = re.search(r"/jobs/([^/]+)", self.pashub_link)

View file

@ -183,3 +183,44 @@ def test_core_file_for_osm_fallback_does_not_fire_when_evidence_category_present
# Assert
assert result is None
def test_core_file_for_mcs_compliance_certificate_returns_mcs_compliance_certificate() -> None:
# Arrange
filename = "MCS_cert_job123.pdf"
# Act
result = get_core_file_type(
filename, evidence_category="mcs compliance certificate"
)
# Assert
assert result == CoreFiles.MCS_COMPLIANCE_CERTIFICATE
def test_core_file_for_mcs_compliance_certificate_is_case_insensitive() -> None:
# Arrange
filename = "some_cert.pdf"
# Act
result = get_core_file_type(
filename, evidence_category="MCS Compliance Certificate"
)
# Assert
assert result == CoreFiles.MCS_COMPLIANCE_CERTIFICATE
def test_get_file_type_string_with_mcs_evidence_category_returns_mcs_compliance_certificate() -> (
None
):
# Arrange
filename = "some_cert.pdf"
# Act
result = get_file_type_string(
filename, evidence_category="MCS Compliance Certificate"
)
# Assert
assert result == "mcs_compliance_certificate"

View file

@ -1,9 +1,22 @@
# pyright: reportPrivateUsage=false
from typing import Optional
from unittest.mock import patch
from backend.pashub_fetcher.core_files import CoreFiles
from backend.pashub_fetcher.evidence_file_data import EvidenceFileData
from backend.pashub_fetcher.pashub_client import PashubClient
from backend.pashub_fetcher.evidence_metadata import EvidenceMetadata
from backend.pashub_fetcher.pashub_client import (
DownloadedFile,
DownloadedFiles,
PashubClient,
)
def make_metadata() -> EvidenceMetadata:
return EvidenceMetadata(
container_name="my-container",
blob_uri="https://storage.example.com/blob?sas=token",
)
def make_client() -> PashubClient:
@ -26,11 +39,27 @@ def make_file(
# ---------------------------------------------------------------------------
# _select_latest_core_files
# _group_into_core_and_other_files
# ---------------------------------------------------------------------------
def test_select_latest_core_files_returns_single_retrofit_design_doc() -> None:
def test_group_into_core_and_other_files_classifies_core_and_other_correctly() -> None:
# Arrange
client = make_client()
files = [
make_file(file_name="SiteNote_001.pdf"),
make_file(file_name="some_unknown_document.pdf"),
]
# Act
result = client._group_into_core_and_other_files(files)
# Assert
assert CoreFiles.SITENOTE in result.core
assert [f.file_name for f in result.other] == ["some_unknown_document.pdf"]
def test_group_into_core_and_other_files_returns_single_retrofit_design_doc() -> None:
# Arrange
client = make_client()
files = [
@ -42,13 +71,16 @@ def test_select_latest_core_files_returns_single_retrofit_design_doc() -> None:
]
# Act
result = client._select_latest_core_files(files)
result = client._group_into_core_and_other_files(files)
# Assert
assert result[CoreFiles.RETROFIT_DESIGN_DOC].file_name == "2512-OSM-H21M900-XX-DR-N-A_Lord Nelson Street 018.pdf"
assert (
result.core[CoreFiles.RETROFIT_DESIGN_DOC].file_name
== "2512-OSM-H21M900-XX-DR-N-A_Lord Nelson Street 018.pdf"
)
def test_select_latest_core_files_osm_candidate_wins_over_non_osm() -> None:
def test_group_into_core_and_other_files_osm_candidate_wins_over_non_osm() -> None:
# Arrange - the non-OSM file is newer but should lose to the OSM file
client = make_client()
files = [
@ -65,13 +97,18 @@ def test_select_latest_core_files_osm_candidate_wins_over_non_osm() -> None:
]
# Act
result = client._select_latest_core_files(files)
result = client._group_into_core_and_other_files(files)
# Assert
assert result[CoreFiles.RETROFIT_DESIGN_DOC].file_name == "2512-OSM-H21M900-XX-DR-N-A_Lord Nelson Street 018.pdf"
assert (
result.core[CoreFiles.RETROFIT_DESIGN_DOC].file_name
== "2512-OSM-H21M900-XX-DR-N-A_Lord Nelson Street 018.pdf"
)
def test_select_latest_core_files_picks_latest_when_both_candidates_have_osm() -> None:
def test_group_into_core_and_other_files_picks_latest_when_both_candidates_have_osm() -> (
None
):
# Arrange
client = make_client()
files = [
@ -88,13 +125,62 @@ def test_select_latest_core_files_picks_latest_when_both_candidates_have_osm() -
]
# Act
result = client._select_latest_core_files(files)
result = client._group_into_core_and_other_files(files)
# Assert
assert result[CoreFiles.RETROFIT_DESIGN_DOC].file_name == "2603-OSM-B06M901-XX-DR-N-A_Alvaston Walk 022.pdf"
assert (
result.core[CoreFiles.RETROFIT_DESIGN_DOC].file_name
== "2603-OSM-B06M901-XX-DR-N-A_Alvaston Walk 022.pdf"
)
def test_select_latest_core_files_falls_back_to_latest_when_no_osm_candidates() -> None:
def test_group_into_core_and_other_files_classifies_mcs_cert_as_core() -> None:
# Arrange
client = make_client()
files = [
make_file(
file_name="MCS_cert_job123.pdf",
evidence_category="MCS Compliance Certificate",
),
]
# Act
result = client._group_into_core_and_other_files(files)
# Assert
assert CoreFiles.MCS_COMPLIANCE_CERTIFICATE in result.core
assert result.other == []
def test_group_into_core_and_other_files_picks_most_recent_mcs_cert() -> None:
# Arrange
client = make_client()
files = [
make_file(
file_name="mcs_cert_old.pdf",
evidence_category="MCS Compliance Certificate",
created_utc="2024-01-01T00:00:00",
),
make_file(
file_name="mcs_cert_new.pdf",
evidence_category="MCS Compliance Certificate",
created_utc="2024-06-01T00:00:00",
),
]
# Act
result = client._group_into_core_and_other_files(files)
# Assert
assert (
result.core[CoreFiles.MCS_COMPLIANCE_CERTIFICATE].file_name
== "mcs_cert_new.pdf"
)
def test_group_into_core_and_other_files_falls_back_to_latest_when_no_osm_candidates() -> (
None
):
# Arrange
client = make_client()
files = [
@ -111,7 +197,84 @@ def test_select_latest_core_files_falls_back_to_latest_when_no_osm_candidates()
]
# Act
result = client._select_latest_core_files(files)
result = client._group_into_core_and_other_files(files)
# Assert
assert result[CoreFiles.RETROFIT_DESIGN_DOC].file_name == "retrofit_design_v2.pdf"
assert (
result.core[CoreFiles.RETROFIT_DESIGN_DOC].file_name == "retrofit_design_v2.pdf"
)
# ---------------------------------------------------------------------------
# get_evidence_files_by_job_id
# ---------------------------------------------------------------------------
def test_get_evidence_files_by_job_id_returns_downloaded_files_with_empty_other_when_include_other_false() -> (
None
):
# Arrange
client = make_client()
files = [
make_file(file_name="SiteNote_001.pdf"),
make_file(file_name="unknown_doc.pdf"),
]
# Act
with (
patch.object(client, "_get_evidence_list", return_value=files),
patch.object(client, "_get_evidence_metadata", return_value=make_metadata()),
patch.object(client, "_download_file"),
):
result = client.get_evidence_files_by_job_id("job-1", include_other=False)
# Assert
assert isinstance(result, DownloadedFiles)
assert [df.file_path for df in result.core] == ["/tmp/SiteNote_001.pdf"]
assert result.other == []
def test_get_evidence_files_by_job_id_core_files_carry_evidence_category() -> None:
# Arrange
client = make_client()
files = [
make_file(
file_name="MCS_cert.pdf",
evidence_category="MCS Compliance Certificate",
),
]
# Act
with (
patch.object(client, "_get_evidence_list", return_value=files),
patch.object(client, "_get_evidence_metadata", return_value=make_metadata()),
patch.object(client, "_download_file"),
):
result = client.get_evidence_files_by_job_id("job-1", include_other=False)
# Assert
assert len(result.core) == 1
assert result.core[0].evidence_category == "MCS Compliance Certificate"
def test_get_evidence_files_by_job_id_downloads_other_files_when_include_other_true() -> (
None
):
# Arrange
client = make_client()
files = [
make_file(file_name="SiteNote_001.pdf"),
make_file(file_name="unknown_doc.pdf"),
]
# Act
with (
patch.object(client, "_get_evidence_list", return_value=files),
patch.object(client, "_get_evidence_metadata", return_value=make_metadata()),
patch.object(client, "_download_file"),
):
result = client.get_evidence_files_by_job_id("job-1", include_other=True)
# Assert
assert [df.file_path for df in result.core] == ["/tmp/SiteNote_001.pdf"]
assert [df.file_path for df in result.other] == ["/tmp/unknown_doc.pdf"]

View file

@ -1,17 +1,22 @@
import pytest
from datetime import datetime
from typing import Any, Callable, Optional
from unittest.mock import MagicMock, call, patch
from backend.app.db.models.uploaded_file import FileSourceEnum
from backend.pashub_fetcher.pashub_client import PashubClient, UnauthorizedError
from backend.app.db.models.uploaded_file import FileSourceEnum, FileTypeEnum
from backend.pashub_fetcher.pashub_client import (
DownloadedFile,
DownloadedFiles,
PashubClient,
UnauthorizedError,
)
from backend.pashub_fetcher.pashub_service import PashubService
from backend.pashub_fetcher.pashub_to_ara_trigger_request import (
PashubToAraTriggerRequest,
)
from utils.sharepoint.domna_sharepoint_client import DomnaSharepointClient
FAKE_JOB_LINK = "https://pashub.net/jobs/job-id-123/details"
@ -20,12 +25,16 @@ def make_request(
uprn: Optional[str] = None,
hubspot_deal_id: Optional[str] = None,
sharepoint_link: Optional[str] = None,
get_other_files: bool = False,
address: Optional[str] = None,
) -> PashubToAraTriggerRequest:
return PashubToAraTriggerRequest(
pashub_link=pashub_link,
uprn=uprn,
hubspot_deal_id=hubspot_deal_id,
sharepoint_link=sharepoint_link,
get_other_files=get_other_files,
address=address,
)
@ -43,6 +52,16 @@ def make_service(
)
_DEFAULT_UTC = datetime(2024, 1, 1)
def make_downloaded(core: list[str], other: list[str] = []) -> DownloadedFiles:
return DownloadedFiles(
core=[DownloadedFile(fp, None, _DEFAULT_UTC) for fp in core],
other=[DownloadedFile(fp, None, _DEFAULT_UTC) for fp in other],
)
# ---------------------------------------------------------------------------
# run(): returns file paths
# ---------------------------------------------------------------------------
@ -51,10 +70,9 @@ def make_service(
def test_run_returns_file_paths() -> None:
mock_client = MagicMock(spec=PashubClient)
mock_client.get_uprn_by_job_id.return_value = None
mock_client.get_core_evidence_files_by_job_id.return_value = [
"/tmp/a.pdf",
"/tmp/b.pdf",
]
mock_client.get_evidence_files_by_job_id.return_value = make_downloaded(
core=["/tmp/a.pdf", "/tmp/b.pdf"]
)
service = make_service(pashub_client=mock_client)
@ -64,6 +82,30 @@ def test_run_returns_file_paths() -> None:
assert result == ["/tmp/a.pdf", "/tmp/b.pdf"]
# ---------------------------------------------------------------------------
# run(): returns core + other file paths when get_other_files=True
# ---------------------------------------------------------------------------
def test_run_returns_core_and_other_file_paths() -> None:
# Arrange
mock_client = MagicMock(spec=PashubClient)
mock_client.get_uprn_by_job_id.return_value = None
mock_client.get_evidence_files_by_job_id.return_value = make_downloaded(
core=["/tmp/core.pdf"],
other=["/tmp/other.pdf"],
)
service = make_service(pashub_client=mock_client)
# Act
with patch("backend.pashub_fetcher.pashub_service.os.remove"):
result = service.run(make_request(get_other_files=True))
# Assert
assert result == ["/tmp/core.pdf", "/tmp/other.pdf"]
# ---------------------------------------------------------------------------
# run(): skips upload when neither uprn nor hubspot_deal_id
# ---------------------------------------------------------------------------
@ -72,7 +114,9 @@ def test_run_returns_file_paths() -> None:
def test_run_skips_upload_when_no_uprn_and_no_deal_id() -> None:
mock_client = MagicMock(spec=PashubClient)
mock_client.get_uprn_by_job_id.return_value = None
mock_client.get_core_evidence_files_by_job_id.return_value = ["/tmp/a.pdf"]
mock_client.get_evidence_files_by_job_id.return_value = make_downloaded(
core=["/tmp/a.pdf"]
)
service = make_service(pashub_client=mock_client)
@ -93,10 +137,9 @@ def test_run_skips_upload_when_no_uprn_and_no_deal_id() -> None:
def test_run_uploads_files_to_s3_using_uprn_path() -> None:
mock_client = MagicMock(spec=PashubClient)
mock_client.get_uprn_by_job_id.return_value = None
mock_client.get_core_evidence_files_by_job_id.return_value = [
"/tmp/SiteNote_001.pdf",
"/tmp/Photopack_002.pdf",
]
mock_client.get_evidence_files_by_job_id.return_value = make_downloaded(
core=["/tmp/SiteNote_001.pdf", "/tmp/Photopack_002.pdf"]
)
service = make_service(pashub_client=mock_client, s3_bucket="my-bucket")
@ -132,9 +175,9 @@ def test_run_uploads_files_to_s3_using_uprn_path() -> None:
def test_run_persists_uploaded_file_records_to_db() -> None:
mock_client = MagicMock(spec=PashubClient)
mock_client.get_uprn_by_job_id.return_value = None
mock_client.get_core_evidence_files_by_job_id.return_value = [
"/tmp/SiteNote_001.pdf"
]
mock_client.get_evidence_files_by_job_id.return_value = make_downloaded(
core=["/tmp/SiteNote_001.pdf"]
)
fake_session = MagicMock()
service = make_service(pashub_client=mock_client)
@ -163,9 +206,9 @@ def test_run_persists_uploaded_file_records_to_db() -> None:
def test_run_uses_hubspot_deal_id_path_when_no_uprn() -> None:
mock_client = MagicMock(spec=PashubClient)
mock_client.get_uprn_by_job_id.return_value = None
mock_client.get_core_evidence_files_by_job_id.return_value = [
"/tmp/SiteNote_001.pdf"
]
mock_client.get_evidence_files_by_job_id.return_value = make_downloaded(
core=["/tmp/SiteNote_001.pdf"]
)
service = make_service(pashub_client=mock_client, s3_bucket="my-bucket")
@ -191,9 +234,9 @@ def test_run_uses_hubspot_deal_id_path_when_no_uprn() -> None:
def test_run_parses_and_saves_site_notes_for_rd_sap_site_note_file() -> None:
mock_client = MagicMock(spec=PashubClient)
mock_client.get_uprn_by_job_id.return_value = None
mock_client.get_core_evidence_files_by_job_id.return_value = [
"/tmp/RdSAP_SiteNote_001.pdf"
]
mock_client.get_evidence_files_by_job_id.return_value = make_downloaded(
core=["/tmp/RdSAP_SiteNote_001.pdf"]
)
fake_epc_data = MagicMock()
fake_session = MagicMock()
@ -241,11 +284,15 @@ def test_run_uses_coordination_client_when_pas_401_on_uprn_lookup() -> None:
coord_client = MagicMock(spec=PashubClient)
coord_client.get_uprn_by_job_id.return_value = "99999"
coord_client.get_core_evidence_files_by_job_id.return_value = ["/tmp/a.pdf"]
coord_client.get_evidence_files_by_job_id.return_value = make_downloaded(
core=["/tmp/a.pdf"]
)
factory = MagicMock(return_value=coord_client)
service = make_service(pashub_client=pas_client, coordination_client_factory=factory)
service = make_service(
pashub_client=pas_client, coordination_client_factory=factory
)
with (
patch("backend.pashub_fetcher.pashub_service.upload_file_to_s3"),
@ -256,20 +303,24 @@ def test_run_uses_coordination_client_when_pas_401_on_uprn_lookup() -> None:
assert result == ["/tmp/a.pdf"]
coord_client.get_uprn_by_job_id.assert_called_once()
coord_client.get_core_evidence_files_by_job_id.assert_called_once()
coord_client.get_evidence_files_by_job_id.assert_called_once()
assert factory.call_count == 1
def test_run_uses_coordination_client_when_pas_401_on_file_listing() -> None:
pas_client = MagicMock(spec=PashubClient)
pas_client.get_core_evidence_files_by_job_id.side_effect = UnauthorizedError()
pas_client.get_evidence_files_by_job_id.side_effect = UnauthorizedError()
coord_client = MagicMock(spec=PashubClient)
coord_client.get_core_evidence_files_by_job_id.return_value = ["/tmp/a.pdf"]
coord_client.get_evidence_files_by_job_id.return_value = make_downloaded(
core=["/tmp/a.pdf"]
)
factory = MagicMock(return_value=coord_client)
service = make_service(pashub_client=pas_client, coordination_client_factory=factory)
service = make_service(
pashub_client=pas_client, coordination_client_factory=factory
)
with (
patch("backend.pashub_fetcher.pashub_service.upload_file_to_s3"),
@ -279,7 +330,7 @@ def test_run_uses_coordination_client_when_pas_401_on_file_listing() -> None:
result = service.run(make_request(uprn="12345"))
assert result == ["/tmp/a.pdf"]
coord_client.get_core_evidence_files_by_job_id.assert_called_once()
coord_client.get_evidence_files_by_job_id.assert_called_once()
pas_client.get_uprn_by_job_id.assert_not_called()
@ -302,24 +353,32 @@ def test_run_raises_unauthorized_when_both_clients_401() -> None:
factory = MagicMock(return_value=coord_client)
service = make_service(pashub_client=pas_client, coordination_client_factory=factory)
service = make_service(
pashub_client=pas_client, coordination_client_factory=factory
)
with pytest.raises(UnauthorizedError):
service.run(make_request())
def test_run_persists_coordination_hub_file_source_when_pas_401_on_uprn_lookup() -> None:
def test_run_persists_coordination_hub_file_source_when_pas_401_on_uprn_lookup() -> (
None
):
pas_client = MagicMock(spec=PashubClient)
pas_client.get_uprn_by_job_id.side_effect = UnauthorizedError()
coord_client = MagicMock(spec=PashubClient)
coord_client.get_uprn_by_job_id.return_value = "99999"
coord_client.get_core_evidence_files_by_job_id.return_value = ["/tmp/a.pdf"]
coord_client.get_evidence_files_by_job_id.return_value = make_downloaded(
core=["/tmp/a.pdf"]
)
factory = MagicMock(return_value=coord_client)
fake_session = MagicMock()
service = make_service(pashub_client=pas_client, coordination_client_factory=factory)
service = make_service(
pashub_client=pas_client, coordination_client_factory=factory
)
with (
patch("backend.pashub_fetcher.pashub_service.upload_file_to_s3"),
@ -334,17 +393,23 @@ def test_run_persists_coordination_hub_file_source_when_pas_401_on_uprn_lookup()
assert added[0].file_source == FileSourceEnum.COORDINATION_HUB.value
def test_run_persists_coordination_hub_file_source_when_pas_401_on_file_listing() -> None:
def test_run_persists_coordination_hub_file_source_when_pas_401_on_file_listing() -> (
None
):
pas_client = MagicMock(spec=PashubClient)
pas_client.get_core_evidence_files_by_job_id.side_effect = UnauthorizedError()
pas_client.get_evidence_files_by_job_id.side_effect = UnauthorizedError()
coord_client = MagicMock(spec=PashubClient)
coord_client.get_core_evidence_files_by_job_id.return_value = ["/tmp/a.pdf"]
coord_client.get_evidence_files_by_job_id.return_value = make_downloaded(
core=["/tmp/a.pdf"]
)
factory = MagicMock(return_value=coord_client)
fake_session = MagicMock()
service = make_service(pashub_client=pas_client, coordination_client_factory=factory)
service = make_service(
pashub_client=pas_client, coordination_client_factory=factory
)
with (
patch("backend.pashub_fetcher.pashub_service.upload_file_to_s3"),
@ -359,12 +424,204 @@ def test_run_persists_coordination_hub_file_source_when_pas_401_on_file_listing(
assert added[0].file_source == FileSourceEnum.COORDINATION_HUB.value
# ---------------------------------------------------------------------------
# run(): get_other_files=True → other temp files deleted after run
# ---------------------------------------------------------------------------
def test_run_deletes_other_temp_files_when_get_other_files_true() -> None:
# Arrange
mock_client = MagicMock(spec=PashubClient)
mock_client.get_uprn_by_job_id.return_value = None
mock_client.get_evidence_files_by_job_id.return_value = make_downloaded(
core=["/tmp/core.pdf"],
other=["/tmp/other.pdf"],
)
service = make_service(pashub_client=mock_client)
# Act
with patch("backend.pashub_fetcher.pashub_service.os.remove") as mock_remove:
service.run(make_request(get_other_files=True))
# Assert
mock_remove.assert_any_call("/tmp/core.pdf")
mock_remove.assert_any_call("/tmp/other.pdf")
# ---------------------------------------------------------------------------
# run(): get_other_files=True → other files uploaded to S3
# ---------------------------------------------------------------------------
def test_run_uploads_other_files_to_s3_when_get_other_files_true() -> None:
# Arrange
mock_client = MagicMock(spec=PashubClient)
mock_client.get_uprn_by_job_id.return_value = None
mock_client.get_evidence_files_by_job_id.return_value = make_downloaded(
core=["/tmp/SiteNote_001.pdf"],
other=["/tmp/unknown_file.pdf"],
)
service = make_service(pashub_client=mock_client, s3_bucket="my-bucket")
# Act
with (
patch("backend.pashub_fetcher.pashub_service.upload_file_to_s3") as mock_s3,
patch("backend.pashub_fetcher.pashub_service.db_session"),
patch("backend.pashub_fetcher.pashub_service.os.remove"),
):
service.run(make_request(uprn="12345", get_other_files=True))
# Assert
mock_s3.assert_any_call(
"/tmp/unknown_file.pdf",
"my-bucket",
"documents/uprn/12345/unknown_file.pdf",
)
# ---------------------------------------------------------------------------
# run(): get_other_files=True → other files persisted with file_type OTHER
# ---------------------------------------------------------------------------
def test_run_persists_other_files_with_other_file_type() -> None:
# Arrange
mock_client = MagicMock(spec=PashubClient)
mock_client.get_uprn_by_job_id.return_value = None
mock_client.get_evidence_files_by_job_id.return_value = make_downloaded(
core=[],
other=["/tmp/unknown_file.pdf"],
)
fake_session = MagicMock()
service = make_service(pashub_client=mock_client)
# Act
with (
patch("backend.pashub_fetcher.pashub_service.upload_file_to_s3"),
patch("backend.pashub_fetcher.pashub_service.db_session") as mock_db,
patch("backend.pashub_fetcher.pashub_service.os.remove"),
):
mock_db.return_value.__enter__.return_value = fake_session
service.run(make_request(uprn="12345", get_other_files=True))
# Assert
all_added = [item for c in fake_session.add_all.call_args_list for item in c[0][0]]
assert len(all_added) == 1
assert all_added[0].file_type == FileTypeEnum.OTHER.value
def test_run_persists_mcs_cert_with_mcs_compliance_certificate_file_type() -> None:
# Arrange
mock_client = MagicMock(spec=PashubClient)
mock_client.get_uprn_by_job_id.return_value = None
mock_client.get_evidence_files_by_job_id.return_value = DownloadedFiles(
core=[
DownloadedFile(
"/tmp/MCS_cert.pdf", "MCS Compliance Certificate", datetime(2024, 1, 1)
)
],
other=[],
)
fake_session = MagicMock()
service = make_service(pashub_client=mock_client)
# Act
with (
patch("backend.pashub_fetcher.pashub_service.upload_file_to_s3"),
patch("backend.pashub_fetcher.pashub_service.db_session") as mock_db,
patch("backend.pashub_fetcher.pashub_service.os.remove"),
):
mock_db.return_value.__enter__.return_value = fake_session
service.run(make_request(uprn="12345"))
# Assert
fake_session.add_all.assert_called_once()
added: list[Any] = fake_session.add_all.call_args[0][0]
assert added[0].file_type == FileTypeEnum.MCS_COMPLIANCE_CERTIFICATE.value
# ---------------------------------------------------------------------------
# run(): SharePoint upload
# ---------------------------------------------------------------------------
def test_sharepoint_uploads_all_files_to_property_folder() -> None:
# Arrange
mock_client = MagicMock(spec=PashubClient)
mock_client.get_uprn_by_job_id.return_value = None
mock_client.get_evidence_files_by_job_id.return_value = make_downloaded(
core=["/tmp/core.pdf"],
other=["/tmp/other.pdf"],
)
mock_sharepoint = MagicMock(spec=DomnaSharepointClient)
mock_sharepoint.get_folders_in_path.return_value = {
"value": [{"name": "123 Main St"}]
}
service = make_service(pashub_client=mock_client, sharepoint_client=mock_sharepoint)
# Act
with patch("backend.pashub_fetcher.pashub_service.os.remove"):
service.run(
make_request(
sharepoint_link="Retrofit/Properties",
get_other_files=True,
address="123 Main St | some deal",
)
)
# Assert
mock_sharepoint.upload_file.assert_any_call(
"/tmp/core.pdf", "Retrofit/Properties/123 Main St", "core.pdf"
)
mock_sharepoint.upload_file.assert_any_call(
"/tmp/other.pdf", "Retrofit/Properties/123 Main St", "other.pdf"
)
def test_sharepoint_skips_upload_when_folder_not_found() -> None:
# Arrange
mock_client = MagicMock(spec=PashubClient)
mock_client.get_uprn_by_job_id.return_value = None
mock_client.get_evidence_files_by_job_id.return_value = make_downloaded(
core=["/tmp/core.pdf"]
)
mock_sharepoint = MagicMock(spec=DomnaSharepointClient)
mock_sharepoint.get_folders_in_path.return_value = {
"value": [{"name": "Different Property"}]
}
service = make_service(pashub_client=mock_client, sharepoint_client=mock_sharepoint)
# Act
with (
patch("backend.pashub_fetcher.pashub_service.os.remove"),
patch("backend.pashub_fetcher.pashub_service.logger") as mock_logger,
):
service.run(
make_request(
sharepoint_link="Retrofit/Properties",
address="No Such Property | deal",
)
)
# Assert
mock_sharepoint.upload_file.assert_not_called()
mock_logger.warning.assert_called()
def test_run_warns_and_continues_when_site_notes_parsing_fails() -> None:
mock_client = MagicMock(spec=PashubClient)
mock_client.get_uprn_by_job_id.return_value = None
mock_client.get_core_evidence_files_by_job_id.return_value = [
"/tmp/RdSAP_SiteNote_001.pdf"
]
mock_client.get_evidence_files_by_job_id.return_value = make_downloaded(
core=["/tmp/RdSAP_SiteNote_001.pdf"]
)
service = make_service(pashub_client=mock_client)

View file

@ -10,19 +10,19 @@ from backend.pashub_fetcher.pashub_to_ara_trigger_request import (
)
from backend.pashub_fetcher.handler.handler import handler
if __name__ == "__main__":
BASE_DIR = os.path.dirname(os.path.dirname(__file__))
filepath: str = os.path.join(
BASE_DIR,
"pashub_fetcher",
"The_Guinness_Partnership_AtkinsR_alis_Coordination_Design_Board_1774881298.xlsx",
"local_run_02-06-2026",
"ECO_Approach_Coordination_Design_KN.xlsx",
)
wb = load_workbook(filepath, data_only=True)
ws = wb["filtered_2"]
ws = wb["filtered"]
HEADER_ROW = 3
HEADER_ROW = 1
headers: Dict[str, int] = {}
for col in range(1, ws.max_column + 1):
@ -31,7 +31,7 @@ if __name__ == "__main__":
headers[value.strip()] = col
name_col = headers["Name"]
link_col = headers["PasHub Link"]
link_col = headers["PasHub ID"]
hubspot_deal_id_col = headers["HubSpot ID"]
trigger_requests: List[PashubToAraTriggerRequest] = []
@ -50,7 +50,10 @@ if __name__ == "__main__":
trigger_requests.append(
PashubToAraTriggerRequest(
pashub_link=str(link), hubspot_deal_id=str(hubspot_deal_id)
pashub_link=str(link),
hubspot_deal_id=str(hubspot_deal_id),
address=str(name),
get_other_files=True,
)
)

View file

@ -16,40 +16,44 @@ logger: logging.Logger = logging.getLogger(__name__)
DRY_RUN: bool = False
DEAL_ID_FILTER: frozenset[str] = frozenset(
{
"379452094688",
"379466504437",
"379660170452",
"380016925932",
"379848065216",
"379466504434",
"379452094690",
"379965924567",
"380016925923",
"379792072898",
"379654754502",
"379560262861",
"379969670369",
"379248717001",
"379971468493",
"379999888607",
"379606372580",
"379969603797",
"379967743213",
"379263155434",
"379855267025",
"379889899719",
"379071064307",
"379867925741",
}
)
# DEAL_ID_FILTER: frozenset[str] = frozenset(
# {
# "379452094688",
# "379466504437",
# "379660170452",
# "380016925932",
# "379848065216",
# "379466504434",
# "379452094690",
# "379965924567",
# "380016925923",
# "379792072898",
# "379654754502",
# "379560262861",
# "379969670369",
# "379248717001",
# "379971468493",
# "379999888607",
# "379606372580",
# "379969603797",
# "379967743213",
# "379263155434",
# "379855267025",
# "379889899719",
# "379071064307",
# "379867925741",
# }
# )
DEAL_ID_FILTER = None
EXCEL_PATH: str = os.path.join(
os.path.dirname(__file__),
"united-infrastructure-exports-all-deals-2026-05-14.xlsx",
"local_run_02-06-2026/ECO_Approach_Coordination_Design_KN.xlsx",
)
SHAREPOINT_PROPERTIES_FOLDER: str = ""
def _build_requests(excel_path: str) -> list[PashubToAraTriggerRequest]:
wb = load_workbook(excel_path, data_only=True)
@ -61,10 +65,10 @@ def _build_requests(excel_path: str) -> list[PashubToAraTriggerRequest]:
if header_val is not None:
headers[str(header_val).strip()] = col
pashub_col: int = headers["PasHub link"]
record_id_col: int = headers["Record ID"]
deal_name_col: int = headers["Deal Name"]
deal_stage_col: int = headers["Deal Stage"]
pashub_col: int = headers["PasHub ID"]
record_id_col: int = headers["HubSpot ID"]
deal_name_col: int = headers["Name"]
deal_stage_col: Optional[int] = headers.get("Deal Stage", None)
requests: list[PashubToAraTriggerRequest] = []
@ -77,7 +81,9 @@ def _build_requests(excel_path: str) -> list[PashubToAraTriggerRequest]:
record_id_raw = ws.cell(row=row, column=record_id_col).value
deal_name_raw = ws.cell(row=row, column=deal_name_col).value
deal_stage_raw = ws.cell(row=row, column=deal_stage_col).value
deal_stage_raw = (
ws.cell(row=row, column=deal_stage_col).value if deal_stage_col else None
)
hubspot_deal_id: Optional[str] = (
str(record_id_raw) if record_id_raw is not None else None
@ -95,6 +101,7 @@ def _build_requests(excel_path: str) -> list[PashubToAraTriggerRequest]:
hubspot_deal_id=hubspot_deal_id,
address=address,
deal_stage=deal_stage,
sharepoint_link=SHAREPOINT_PROPERTIES_FOLDER or None,
)
)

View file

@ -109,6 +109,15 @@ class MainHeatingDetail:
main_heating_data_source: Optional[int] = None
condensing: Optional[bool] = None
weather_compensator: Optional[bool] = None
# Community-heating CHP split (RdSAP 10 §C / SAP 10.2 Appendix C):
# when the heat network combines CHP + back-up boilers, the worksheet
# splits heat 35% CHP / 65% boilers and prices each share at its own
# Table 12 fuel-code rate. Populated by the Elmhurst mapper for SAP
# code 302 ("Community heating with CHP") when the §14.1 Community
# Heat Source is "Combined Heat and Power"; None for non-CHP heat
# networks and individually-heated dwellings.
community_heating_chp_fraction: Optional[float] = None
community_heating_boiler_fuel_type: Optional[int] = None
@dataclass

View file

@ -65,6 +65,7 @@ from domain.sap10_calculator.tables.pcdb import heat_pump_record
from datatypes.epc.surveys.elmhurst_site_notes import (
AlternativeWall as ElmhurstAlternativeWall,
BuildingPartDimensions as ElmhurstBuildingPartDimensions,
CommunityHeating,
ElmhurstSiteNotes,
FloorDetails as ElmhurstFloorDetails,
MainHeating as ElmhurstMainHeating,
@ -3823,6 +3824,23 @@ _ELMHURST_MAIN_FUEL_TO_SAP10: Dict[str, int] = {
# main_fuel row for "oil (not community)", which routes via
# `API_FUEL_TO_TABLE_32` → Table 32 code 4 for cost / CO2 / PE.
"Heating oil": 28,
# Elmhurst Summary §14.0 / §15.0 lodging form for SAP 10.2 Table 12
# bulk LPG (£62 standing, 6.74 p/kWh, 0.241 kg CO2/kWh, 1.141 PE).
# 27 = epc_codes.csv main_fuel row for "LPG (not community)", which
# routes via `API_FUEL_TO_TABLE_32` / `API_FUEL_TO_TABLE_12` → fuel
# code 2 (bulk LPG) for cost / CO2 / PE. Distinct from the legacy
# "LPG bulk" label above (API code 6 = "wood logs" — same pre-
# existing oddity as "Oil" → 8; both labels are unused by any live
# fixture). Live form on Elmhurst worksheets is "Bulk LPG".
"Bulk LPG": 27,
# Elmhurst Summary §15.0 "Water Heating Fuel Type" labels for the
# bio-liquid fuels added to the EES dict above. Values are Table 32
# codes verbatim (no API enum collision). Spec: SAP 10.2 Table 12
# (PDF p.189) notes (d)/(e)/(f).
"Bio-liquid HVO from used cooking oil": 71,
"Bio-liquid FAME from animal/vegetable oils": 73,
"Bioethanol": 76,
"B30K": 75,
"Coal": 11,
"Electricity": 30,
"Electricity (off-peak 7hr)": 33,
@ -4024,11 +4042,27 @@ def _elmhurst_secondary_fuel_from_sap_code(
def _elmhurst_sap_control_code(sap_control: str) -> Optional[int]:
"""Extract the SAP code integer from a heating-controls field like
'SAP code 2106, Programmer, room thermostat and TRVs' 2106. The
cascade reads `main_heating_control` as int when present."""
"""Extract the SAP code integer from a heating-controls field.
Two lodgement forms across the Elmhurst Summary corpus:
1. '§14.0 Main Heating Controls Sap: SAP code 2106, Programmer,
room thermostat and TRVs' (individually-heated dwellings).
2. '§14.1 Community Heating Heating Controls SAP: 2306' bare
4-digit integer string (community heating dwellings, per
SAP 10.2 Table 4e Group 3 codes 2301-2314).
Either form yields the cascade-readable int. Returns None when the
lodgement is empty or doesn't carry a recognisable code.
"""
if not sap_control:
return None
m = re.match(r"SAP code\s+(\d+)", sap_control)
return int(m.group(1)) if m else None
if m:
return int(m.group(1))
bare = sap_control.strip()
if bare.isdigit():
return int(bare)
return None
# SAP10.2 Table 4a main-heating-category codes. The cascade reads
@ -4109,6 +4143,59 @@ _LIQUID_FUEL_BOILER_SAP_MAIN_HEATING_CODES: Final[frozenset[int]] = (
frozenset(range(120, 142))
)
# SAP 10.2 Table 4b gas-boiler code range (PDF p.168). Rows 101-119 are
# "Gas boilers (including mains gas, LPG and biogas)" — 101-109 are
# 1998-or-later, 110-114 pre-1998 fan-assisted flue, 115-119 pre-1998
# balanced/open flue. The code identifies the boiler TYPE/efficiency, not
# the specific carrier: the same row applies to mains gas, bulk/bottled
# LPG and biogas alike. The older Elmhurst export lodged §14.0 "Fuel
# Type: Mains gas" explicitly, but the newer form leaves §14.0 "Fuel
# Type" empty and lodges only the SAP code (e.g. 104 condensing combi,
# EES "BGW"). For these, §15.0 "Water Heating Fuel Type" names the
# carrier — a combi/boiler heats space + water from the one appliance —
# so it disambiguates mains-gas-vs-LPG. Codes 120-141 (CPSU + range
# cookers) are already covered by
# `_LIQUID_FUEL_BOILER_SAP_MAIN_HEATING_CODES`.
_GAS_BOILER_SAP_MAIN_HEATING_CODES: Final[frozenset[int]] = (
frozenset(range(101, 120))
)
# SAP10 main-fuel codes in the gas / LPG family — the only carriers a
# Table 4b gas-boiler row (101-119) can have (mains gas, mains gas
# community, bottled/bulk/special-condition LPG). Per
# `_ELMHURST_MAIN_FUEL_TO_SAP10`: mains gas = 26, mains gas community =
# 1, LPG bottled/bulk/special = 5/6/7, "Bulk LPG" = 27. The §15.0
# water-heating-fuel derivation is gated on the resolved fuel being one
# of these so it can't mis-assign electricity from a separate immersion
# (where §15.0 lodges the immersion's fuel, not the boiler's) — that
# case still strict-raises `MissingMainFuelType` to force a mapper fix.
_GAS_LPG_MAIN_FUEL_CODES: Final[frozenset[int]] = frozenset({1, 5, 6, 7, 26, 27})
def _elmhurst_gas_boiler_main_fuel(
sap_main_heating_code: Optional[int],
water_heating_fuel_code: Optional[int],
) -> Optional[int]:
"""Derive a gas/LPG main-fuel code for a Table 4b gas boiler whose
§14.0 "Fuel Type" string is absent (newer Elmhurst export form).
Returns the §15.0 water-heating fuel code when, and only when, the
SAP main-heating code is a Table 4b gas-boiler row (101-119) AND the
§15.0 fuel resolves to a gas/LPG carrier the same combi/boiler
heats space + water, so §15.0 names the boiler's carrier. Returns
None otherwise (non-gas-boiler code, or §15.0 lodges a non-gas fuel
such as an electric immersion), leaving the caller to strict-raise.
Spec: SAP 10.2 Table 4b "Seasonal efficiency for gas and liquid fuel
boilers" (PDF p.168) — rows 101-119 are gas-family boilers.
"""
if (
sap_main_heating_code in _GAS_BOILER_SAP_MAIN_HEATING_CODES
and water_heating_fuel_code in _GAS_LPG_MAIN_FUEL_CODES
):
return water_heating_fuel_code
return None
# Elmhurst §14.0 "Main Heating EES Code" → Table 32 main fuel code.
# Empirically derived from the heating-systems corpus at
@ -4161,9 +4248,133 @@ _ELMHURST_MAIN_HEATING_EES_TO_FUEL_CODE: Final[dict[str, int]] = {
# Wood Logs — Table 32 code 20 (4.23 / 0.028 / 1.046). Corpus
# variant solid fuel 11 (SAP 634).
"RWN": 20,
# Electric storage / direct-acting main heating systems — Table 32
# code 30 (standard electricity; tariff resolved separately from
# `meter_type` per `_rdsap_tariff`). Three EES codes share the
# electricity fuel route:
# WEA — corpus variant electric 11 (SAP 515 = electric warm-air)
# REA — corpus variant electric 12 (SAP 691)
# OEA — corpus variants electric 13 + 14 (SAP 701)
# The §14.0 "Fuel Type" field is absent on these certs (same
# lodging pattern as the solid-fuel block above); the EES code is
# the only fuel discriminator and unambiguously identifies electric
# storage main heating. Fuel cost / CO2 / PE billed via Table 32
# standard-electricity codes (30 high-rate, 31/33/35/40 low-rate
# per tariff).
"WEA": 30,
"REA": 30,
"OEA": 30,
# "No heating system" lodging — Elmhurst §14.0 Main Heating EES =
# NON + SAP code 699. SAP 10.2 §A.2.2 assumes portable electric
# heaters when no heating system is identified, so the fuel routes
# to standard electricity (code 30). Corpus variant "no system".
"NON": 30,
# Bio-liquid main heating fuels — Table 12 / Table 32 codes verbatim
# (the bio-liquid Table 32 codes 71/73/75/76 are not collided by any
# API enum value, so they pass through `unit_price_p_per_kwh` etc.
# unchanged). Spec: SAP 10.2 Table 12 (PDF p.189) notes (d)/(e)/(f).
#
# BFD — bio-liquid HVO from used cooking oil — Table 32 code 71
# (6.79 p/kWh, 0.036 CO2, 1.180 PE). Corpus variant oil 2
# (SAP 127).
# BXE — bio-liquid FAME from animal/vegetable oils — Table 32
# code 73 (6.79 p/kWh, 0.018 CO2, 1.180 PE). Corpus
# variant oil 3 (SAP 128).
# BXF — bio-liquid FAME alt — Table 32 code 73 (same fuel as
# BXE; different SAP code 129). Corpus variant oil 4.
# BZC — bioethanol from any biomass source — Table 32 code 76
# (47.0 p/kWh, 0.105 CO2, 1.472 PE). Corpus variant
# oil 5 (SAP 126).
# B3C — B30K (30% FAME + 70% kerosene) — Table 32 code 75
# (5.49 p/kWh, 0.214 CO2, 1.136 PE). Corpus variant
# oil 6 (SAP 126).
"BFD": 71,
"BXE": 73,
"BXF": 73,
"BZC": 76,
"B3C": 75,
}
# Elmhurst §14.1 "Community Fuel Type" labels mapped to the SAP 10.2
# Table 12 heat-network boiler fuel code (PDF p.189). Used when
# `community_heat_source == "Boilers"` — the upstream fuel determines
# which 51-58 row applies. CHP is fuel-agnostic at the Table 12 cost /
# CO2 / PE level (code 48 carries the same factors irrespective of
# upstream fuel); Heat-pump networks always route to code 41.
#
# Spec-correct codes from SAP 10.2 Table 12:
# 51 = heat from boilers — mains gas
# 52 = heat from boilers — LPG
# 53 = heat from boilers — oil
# 54 = heat from boilers — coal
# 43 = heat from boilers — biomass
_ELMHURST_COMMUNITY_BOILER_FUEL_TO_TABLE_12: Final[dict[str, int]] = {
"Mains Gas": 51,
"Mineral oil or biodiesel": 53,
"Coal": 54,
"Biomass": 43,
}
def _resolve_community_heating_fuel_code(
heat_source: str, community_fuel: str,
) -> Optional[int]:
"""Resolve the SAP 10.2 Table 12 (PDF p.189) heat-network fuel code
from the §14.1 "Community Heat Source" + "Community Fuel Type"
pair. Returns None when the heat-source string isn't recognised
(mapper-coverage gap for a future fixture).
Dispatch table (verified against corpus block 10b/11b/12b/13b):
- "Combined Heat and Power" 48 (heat from CHP; fuel-agnostic)
- "Heat pump" 41 (heat from electric heat pump)
- "Boilers" + upstream fuel 51/52/53/54/43 per
`_ELMHURST_COMMUNITY_BOILER_FUEL_TO_TABLE_12`
"""
if heat_source == "Combined Heat and Power":
return 48
if heat_source == "Heat pump":
return 41
if heat_source == "Boilers":
return _ELMHURST_COMMUNITY_BOILER_FUEL_TO_TABLE_12.get(community_fuel)
return None
# RdSAP 10 §C / SAP 10.2 Appendix C default CHP heat fraction (PDF p.58).
# Spec text verbatim: "If CHP (waste heat or geothermal treat as CHP):
# fraction of heat from CHP = 0.35; CHP overall efficiency 75%; heat to
# power ratio = 2.0; boiler efficiency 80%." Applied when no PCDB
# record overrides — the modal case for non-PCDB community-heated certs.
_RDSAP_COMMUNITY_CHP_FRACTION_DEFAULT: Final[float] = 0.35
def _elmhurst_community_chp_split(
community: Optional[CommunityHeating],
) -> tuple[Optional[float], Optional[int]]:
"""Return the (chp_fraction, boiler_fuel_code) pair for the cascade
to use when computing CHP+boilers heat-network cost / CO2 / PE.
Returns (None, None) when:
- the §14.1 block is absent (individually-heated dwelling);
- the §14.1 Heat Source is not CHP (Boilers-only or Heat-pump
networks bill at a single Table 12 code via the main fuel).
Returns (0.35, boiler_fuel_code) for CHP+boilers configurations.
The boiler fuel code is resolved from the §14.1 Community Fuel
Type via `_ELMHURST_COMMUNITY_BOILER_FUEL_TO_TABLE_12`; per Table
12 PDF p.189 all heat-network-boiler codes 51-58 carry the same
cost rate (4.24 p/kWh) but distinct CO2 / PE factors keyed on the
upstream fuel.
"""
if community is None:
return None, None
if community.community_heat_source != "Combined Heat and Power":
return None, None
boiler_code = _ELMHURST_COMMUNITY_BOILER_FUEL_TO_TABLE_12.get(
community.community_fuel_type,
)
return _RDSAP_COMMUNITY_CHP_FRACTION_DEFAULT, boiler_code
class UnmappedElmhurstLabel(ValueError):
"""An Elmhurst Summary lodged a finite-enum label that the mapper
does not yet know how to translate to the SAP10 cascade enum.
@ -4555,7 +4766,15 @@ def _map_elmhurst_main_heating_2(
def _map_elmhurst_sap_heating(survey: ElmhurstSiteNotes) -> SapHeating:
mh = survey.main_heating
# Community heating dwellings lodge the SAP control code in §14.1
# Community Heating "Heating Controls SAP" (bare 4-digit form, e.g.
# "2306"), not in §14.0 Main Heating "Main Heating Controls Sap".
# Fall through to the §14.1 lodging when §14.0 is empty so the
# cascade reads `main_heating_control` as the lodged Table 4e Group 3
# code instead of defaulting to type 2.
sap_control = mh.heating_controls_sap
if not sap_control and mh.community_heating is not None:
sap_control = mh.community_heating.heating_controls_sap
control = (
sap_control.split(", ", 1)[1]
if sap_control.startswith("SAP code") and ", " in sap_control
@ -4604,6 +4823,19 @@ def _map_elmhurst_sap_heating(survey: ElmhurstSiteNotes) -> SapHeating:
and mh.main_heating_sap_code in _LIQUID_FUEL_BOILER_SAP_MAIN_HEATING_CODES
):
main_fuel_int = water_heating_fuel
# Gas / LPG boilers: SAP 10.2 Table 4b codes 101-119 (PDF p.168)
# identify a gas-family boiler but not the specific carrier (mains
# gas vs LPG vs biogas). The newer Elmhurst export leaves §14.0
# "Fuel Type" empty and lodges only the SAP code (e.g. 104 condensing
# combi, EES "BGW"); the §15.0 "Water Heating Fuel Type" names the
# carrier because the same combi/boiler heats space + water. Adopt it
# only when it resolves to a gas/LPG fuel, so a regular boiler paired
# with an electric immersion (where §15.0 lodges "Electricity") still
# strict-raises rather than mis-billing the gas boiler as electric.
if main_fuel_int is None:
main_fuel_int = _elmhurst_gas_boiler_main_fuel(
mh.main_heating_sap_code, water_heating_fuel,
)
# Solid-fuel main heating: SAP code rows 150-160 (open / closed
# room heaters with boiler) and 600-636 (independent solid-fuel
# boilers) cover multiple distinct fuels under a single Table 4a
@ -4617,6 +4849,22 @@ def _map_elmhurst_sap_heating(survey: ElmhurstSiteNotes) -> SapHeating:
and mh.main_heating_ees in _ELMHURST_MAIN_HEATING_EES_TO_FUEL_CODE
):
main_fuel_int = _ELMHURST_MAIN_HEATING_EES_TO_FUEL_CODE[mh.main_heating_ees]
# Community heating: §14.0 lodges EES='COM' + a Table 4a heat-network
# SAP code (301/302/304) but no §14.0 Fuel Type. The §14.1 Community
# Heating/Heat Network block carries the actual heat source (Boilers
# / CHP / Heat pump) + upstream fuel (Mains Gas / Electricity /
# Mineral oil or biodiesel / Coal) which together resolve the
# Table 12 heat-network fuel code (PDF p.189, codes 41/43/48/51-58).
# Cascade routes through `_is_heat_network_main` (which keys on the
# SAP code) for the DLF and seasonal-efficiency overrides.
if (
main_fuel_int is None
and mh.community_heating is not None
):
main_fuel_int = _resolve_community_heating_fuel_code(
mh.community_heating.community_heat_source,
mh.community_heating.community_fuel_type,
)
heat_emitter_int = _elmhurst_heat_emitter_int(
mh.heat_emitter,
main_floor=survey.floor,
@ -4659,6 +4907,15 @@ def _map_elmhurst_sap_heating(survey: ElmhurstSiteNotes) -> SapHeating:
1 for s in survey.baths_and_showers.showers
if s.outlet_type != "Electric shower"
)
# Community heating CHP-split: RdSAP 10 §C / SAP 10.2 Appendix C
# default for heat networks combining CHP and back-up boilers
# (SAP code 302 "Community heating with CHP" + §14.1 Community Heat
# Source = "Combined Heat and Power"). Per RdSAP 10 PDF p.58: 35%
# heat from CHP, 65% from boilers (default when no PCDB record).
# The cascade prices each share at its own Table 12 fuel-code rate.
chp_fraction, chp_boiler_fuel_int = _elmhurst_community_chp_split(
mh.community_heating,
)
main_1_detail = MainHeatingDetail(
has_fghrs=survey.renewables.flue_gas_heat_recovery_present,
# Prefer SAP integer codes when the Elmhurst string maps
@ -4686,6 +4943,8 @@ def _map_elmhurst_sap_heating(survey: ElmhurstSiteNotes) -> SapHeating:
# The cascade's `seasonal_efficiency` reads this when
# there is no PCDB Table 105/362 record to override.
sap_main_heating_code=mh.main_heating_sap_code,
community_heating_chp_fraction=chp_fraction,
community_heating_boiler_fuel_type=chp_boiler_fuel_int,
)
# §14.1 Main Heating2 — second main system, when lodged. Typically
# services DHW via `Water Heating SapCode 914` ("from second main

View file

@ -246,6 +246,41 @@ class MainHeating2:
main_heating_sap_code: Optional[int] = None
@dataclass
class CommunityHeating:
"""Elmhurst §14.1 "Community Heating/Heat Network" block. Lodged
when the §14.0 Main Heating SAP code identifies a heat-network row
(Table 4a 301-304). Mutually exclusive with `MainHeating2` at the
§14.1 level (the extractor closes §14.0 at whichever §14.1 form
appears first).
The §14.0 "Main Heating SAP Code" identifies the Table 4a category
(301 = community boilers, 302 = CHP + boilers, 304 = community heat
pump), but the fuel that ultimately bills the cascade comes from
the Community Fuel Type field combined with the Community Heat
Source. See SAP 10.2 Table 12 (PDF p.189) heat-network fuel codes:
- Boilers + Mains Gas code 51
- Boilers + Mineral oil code 53
- Boilers + Coal code 54
- Boilers + Biomass code 43
- Combined Heat and Power code 48 (fuel-agnostic)
- Heat pump + Electricity code 41
"""
heating_type: str = "" # "Space and Water Heating"
pcdf_boiler_reference: Optional[str] = None
community_heat_source: str = "" # "Boilers" / "Combined Heat and Power" / "Heat pump"
community_fuel_type: str = "" # "Mains Gas" / "Electricity" / "Mineral oil or biodiesel" / "Coal"
heating_controls_ees: str = ""
heating_controls_sap: str = ""
# SAP 10.2 Appendix C — CHP Fuel Factor lookup label. Drives the
# CHP-vs-boiler heat-fraction split when `community_heat_source ==
# "Combined Heat and Power"`. Absent on non-CHP networks (e.g.
# CH1 boilers-only / CH3 heat-pump only).
chp_fuel_factor: Optional[str] = None
@dataclass
class MainHeating:
heat_emitter: str # e.g. "Radiators"
@ -289,6 +324,11 @@ class MainHeating:
# the §14.1 block is absent OR lodges only placeholder zeros (PCDB-
# only certs). See `MainHeating2` docstring above.
main_heating_2: Optional[MainHeating2] = None
# §14.1 "Community Heating/Heat Network" block — Optional, lodged
# in place of Main Heating2 when the §14.0 SAP code identifies a
# heat-network row (Table 4a 301/302/304). Mutually exclusive with
# `main_heating_2`. None on individually-heated dwellings.
community_heating: Optional[CommunityHeating] = None
@dataclass

153
docs/HANDOVER_ARA_NEXT.md Normal file
View file

@ -0,0 +1,153 @@
# Handover — Ara backend: Property Baseline (SAP calculator) + Modelling
You are picking up a clean, merged baseline. The `ara_first_run` backend rebuild is
**done and shipped**; the next two fronts are (1) wiring the SAP calculator into
Property Baseline, and (2) starting Modelling. This doc is the orientation — the ADRs
and CONTEXT.md are authoritative for decisions; don't re-derive them.
## Where things stand
- The **`ara_first_run` rebuild is complete and merged to `main`** (via
`feature/per-cert-mapper-validation`): the full pipeline spine
**Ingestion → Baseline → Modelling(stub)** on a flat-hexagonal layout with a
per-stage Unit-of-Work. Issues #1129#1138 (parent PRD #1128) are all done.
- **Branch + worktree:** you are on `feature/property-baseline-sap10`, cut from the
up-to-date `feature/per-cert-mapper-validation` (which contains `main` + the merged
ara work + the ongoing per-cert SAP-calculator validation slices). Worktree:
`/workspaces/home/hestia-worktrees/model-assemble-new-backend`. The
`/workspaces/model` worktree holds `feature/per-cert-mapper-validation` itself.
- **PRs go into `feature/per-cert-mapper-validation`, NOT `main` directly** — one PR
per slice, the rhythm used for #1129#1138.
## Read first (authoritative — don't re-derive)
- **ADRs** `docs/adr/`: 0002 (Property aggregate root), 0003 (strict Ingestion→Modelling
separation, amended), 0004 (BaselinePerformance = Lodged+Effective pair, amended for
the standalone table), 0005 (multi-phase Scenarios, per-phase recompute — **governs
Modelling**), 0006/0007 (deterministic kWh / kWh-as-ML-target), 0009+0010
(deterministic SAP calculator + its spec target & validation cohort), 0011 (composable
stage orchestrators, one lambda per use case, stages talk through repos), 0012
(Unit-of-Work per-stage batch transaction).
- **CONTEXT.md** — the glossary; use this vocabulary in code + commits.
- **`ara_backend_design.md`** is a **stale draft PRD** — its architecture sections are
superseded by ADR-0011/0012 (a banner now says so). Trust the ADRs, not it.
## Architecture (current — flat hexagonal at repo root)
```
applications/<lambda>/ thin handler + trigger body + Dockerfile + local_handler
orchestration/ stage orchestrators + AraFirstRunPipeline (deps injected)
domain/ pure aggregates + services
repositories/<agg>/ port (ABC) + adapter (*_postgres_repository / *_s3_repository)
infrastructure/ clients + SQLModel rows (*_table.py) + engine/config
```
Stages communicate **only through repos**, threading just `property_ids` — never an
in-memory hand-off (ADR-0011/0003). Each stage runs its batch in **one Unit of Work and
commits once** (ADR-0012); all-or-nothing per batch, fail noisily → subtask FAILED →
debug & re-run; re-runs are idempotent (replace-by-`property_id`). Ingestion is
fetch-then-write so a DB connection is never held during external IO.
## Key files (note the recent rename: baseline → property_baseline; FirstRun → AraFirstRun)
- `orchestration/ara_first_run_pipeline.py``AraFirstRunPipeline`, `AraFirstRunCommand`,
the `IngestionStage`/`PropertyBaselineStage`/`ModellingStage` Protocols.
- `orchestration/property_baseline_orchestrator.py``PropertyBaselineOrchestrator`
(**this is where the SAP calculator gets wired**).
- `orchestration/ingestion_orchestrator.py`, `orchestration/modelling_orchestrator.py` (stub).
- `domain/property_baseline/``PropertyBaselinePerformance`, `Performance`,
`lodged_performance()`, `Rebaseliner`/`StubRebaseliner`.
- `repositories/property_baseline/` (port + postgres adapter),
`repositories/unit_of_work.py` + `repositories/postgres_unit_of_work.py`.
- `repositories/scenario/`, `repositories/materials/`**empty seam ports** for Modelling.
- `infrastructure/postgres/property_baseline_performance_table.py` — flat-column row.
- `applications/ara_first_run/handler.py``build_first_run_pipeline` wiring +
`_source_clients_from_env` (a seam that **raises** — see Stubs below).
- **SAP calculator (for task 1):** `domain/sap10_calculator/calculator.py`, class
`Sap10Calculator`, returns a `SapResult` (5 quantities + monthly + worksheet audit).
It is mature and heavily validated by the per-cert work on this branch.
## Conventions + gotchas
- **TDD**, one test → one impl; `# Arrange / # Act / # Assert` headers; **commit per
slice** with a spec/ADR citation and the
`Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>` trailer.
- Tests: real ephemeral PostgreSQL via the `db_engine` fixture (JSONB needs real PG).
**Orchestrator/repo unit tests use fakes**`tests/orchestration/fakes.py`
(`FakeUnitOfWork` exposing `property`/`epc`/`solar`/`property_baseline` repos + commit
count). Run with `-p no:cacheprovider`; ignore coverage spam.
- **pyright strict, zero errors.** Known noise to ignore: a `venvPath` warning; the
`moto`-not-installed import errors in `test_postcode_splitter_orchestrator.py` +
`test_user_address_csv_s3_repository.py` (those modules don't collect — `--ignore`
them); and 4 pre-existing failures outside `tests/` (summary_pdf_mapper_chain ×3 +
from_rdsap_schema total_floor_area).
- **Pushing from this worktree:** the VS Code git credential helpers are broken
(missing node binaries), so use a one-shot gh override:
`git -c credential.helper= -c credential.helper='!gh auth git-credential' push`.
## Next task 1 — SAP calculator on Property Baseline (the user expects this to be simple)
Wire `Sap10Calculator` into `PropertyBaselineOrchestrator` to produce **Calculated SAP10
Performance** per property. Per CONTEXT (≈line 100), this is a quantity **distinct from**
Lodged/Effective Performance — surfaced *alongside* them during the validation phase; it
may supersede Effective Performance in a later ADR once parity is confirmed (ADR-0009/0010).
**Grill these two before coding (`/grill-with-docs`):**
1. **Where it sits.** Recommended: a *third* value-set on `PropertyBaselinePerformance`
(`calculated: Performance` + its space/water kWh), persisted as `calculated_*` columns
on `property_baseline_performance`**not** an overwrite of `effective`. Pin the
aggregate shape + table migration in one pass (the table migration is FE-owned/Drizzle —
see `docs/migrations/property-baseline-performance-table.md`).
2. **Failure posture.** The calculator strict-raises (`UnmappedSapCode`, etc.) on certs it
can't yet handle. Running it over a real cohort *surfaces those gaps* — which is the
validation work `feature/per-cert-mapper-validation` exists for. Decide: let the raise
abort the batch (ADR-0012 all-or-nothing), or collect/skip-and-report. This is the main
judgment call; "simple to wire" but it lights up the validation surface.
Then TDD: inject the calculator into `PropertyBaselineOrchestrator`, call it on the
Effective EPC, persist the calculated set in the same unit.
## Next task 2 — Modelling (Recommendations / Optimiser / Plans)
`ModellingOrchestrator.run(property_ids, scenario_ids)` is a **no-op stub**;
`ScenarioRepository` and `MaterialsRepository` are **empty seam ports**. Building this out
is the third stage. ADR-0005 (multi-phase Scenarios, per-phase recompute) governs it.
Relevant CONTEXT terms: Modelling (stage), Scenario, Scenario Phase, Scenario Snapshot,
Optimised Package, Plans, Recommendations, Optimiser Service.
Before coding, grill the port shapes + the Scenario/Materials domain aggregates. Two
known open points:
- **`MaterialsRepository` naming.** A PR reviewer suggested `BuildingMaterialsRepository`;
this was **deliberately deferred to this grill** because "building materials" may
under-describe retrofit measures (a heat pump / ASHP is a *measure/product*, not a
building material). Settle the term (Materials / Measures / Products / BuildingMaterials)
here.
- **Modelling will need a Unit of Work** when it writes Plans — the stub currently takes
no `unit_of_work`; it gains one (ADR-0012) when its body is built.
## Stubs / seams that raise or no-op (do NOT mistake for "done")
- `applications/ara_first_run/handler.py::_source_clients_from_env` — **raises**
`NotImplementedError`. EPC-API / Google-Solar / geospatial-S3 client config + env-var
names + pandas/s3fs deps + Terraform wiring are a separate deploy piece (out of scope so
far). The lambda is not end-to-end runnable until this is filled in.
- `ModellingOrchestrator.run` — no-op.
- `ScenarioRepository` / `MaterialsRepository` — empty ABC ports.
- `StubRebaseliner` — raises `RebaselineNotImplemented` on pre-SAP10 certs (`sap_version
< 10`); ML Rebaselining is not implemented.
- **EPC Energy Derivation** (fuel split + bills + the Ofgem-cap Fuel Rates ETL) is
deferred — kWh is carried on `PropertyBaselinePerformance`, the rest is not.
## Known doc drift to be aware of (flagged, intentionally not auto-fixed)
- **CONTEXT.md term vs code class.** The glossary term is **"Baseline Performance"**; the
code class is **`PropertyBaselinePerformance`** (renamed on PR review). The glossary was
*deliberately* left un-renamed — treat "Baseline Performance" as the spoken concept and
`PropertyBaselinePerformance` as its class. If you want them aligned, rename the term to
"Property Baseline Performance" across CONTEXT + ADR prose (a quick, mechanical change).
## Issues / process
Parent PRD: `gh issue view 1128 --repo Hestia-Homes/Model`. #1129#1138 done (each with a
"Done." comment). New work → new issues (use `/to-issues` or `/triage`), `ready-for-agent`
labelled, parented to #1128.

View file

@ -0,0 +1,109 @@
---
Status: accepted
---
# The `Sap10Calculator` produces Effective Performance (it is the Rebaseliner); Calculated SAP10 Performance is not a persisted third value-set, and is wired in shadow first
Refines [ADR-0004](0004-baseline-performance-lodged-effective-pair.md) (the Lodged/Effective
pair), [ADR-0009](0009-deterministic-sap-calculator.md)/[ADR-0010](0010-sap10-calculator-spec-target-and-validation.md)
(the calculator + the **Calculated SAP10 Performance** term), [ADR-0011](0011-composable-stage-orchestrators.md)
(the `Rebaseliner` seam) and [ADR-0012](0012-unit-of-work-per-stage-batch-transaction.md)
(all-or-nothing per batch). Decided in a `/grill-with-docs` session (2026-06-01) before wiring
`Sap10Calculator` into `PropertyBaselineOrchestrator`.
## Context
The old `model_engine` (`backend/engine/engine.py`) called out to an **ML API**
(`model_api.predict_all` over `BASELINE_MODEL_PREFIXES`) to rebaseline the properties that needed
it. The rebuild replaces that round-trip with the **deterministic `Sap10Calculator`, run live**.
The handover and CONTEXT (line 100) framed **Calculated SAP10 Performance** as a *third* value-set
persisted *alongside* Lodged and Effective (`calculated_*` columns). Walking the baselining
scenarios shows that framing reifies a distinction that does not exist in the domain:
- real lodged SAP10 EPC, no overrides ⇒ Calculated = Lodged = Effective;
- real EPC + property/landlord overrides ⇒ Calculated = Lodged-plus-overrides = Effective;
- estimated EPC (± overrides), or a pre-SAP10 EPC ⇒ Calculated = Effective (no lodged SAP10 to
compare against — Lodged Performance exists only for a *real lodged* EPC).
In every scenario **Effective = Calculated**. There is no third quantity.
## Decision
**The calculator is the mechanism that produces Effective Performance** — i.e. the deterministic
`Rebaseliner` (ADR-0011's seam), superseding the old ML-API rebaseliner. "Calculated SAP10
Performance" is the *name of that output during validation*, **not** a separately-persisted third
value-set. No `calculated_*` columns are added; `property_baseline_performance` keeps its
Lodged/Effective shape (ADR-0004). The ADR-0009 ML model is repositioned as a *future residual head*
over the calculator, not the baseline producer.
**Shadow-first, then promotion.** The calculator still strict-raises (`UnmappedSapCode`,
`MissingMainFuelType`, `UnresolvedPcdbCombiLoss`) on cert mappings it has not yet hardened, and the
strict-typing of `EpcPropertyData` that will close most of those gaps is still pending. A ~40,000
property test cohort is about to flow through baselining. So this lands in two steps:
1. **This slice — shadow.** Performance is still **defined by the input data**: `StubRebaseliner`
keeps producing Effective (`= Lodged` for the only live scenario, real SAP10 + no overrides).
The calculator runs *beside* it, on every Property's Effective EPC, **purely to be battle-tested
in the wild**. It is **not load-bearing**, therefore:
- a calculator raise is **caught and logged at `error`, never aborts the batch** — otherwise one
unmappable cert would lose the load-bearing Lodged/Effective write for the whole batch, and
over a 40k run most batches would never baseline;
- on success, its output is **compared to Lodged and logged, not persisted**`warning` when
`|sap_continuous lodged_sap| > 0.5`, or PEUI / CO2 diverge beyond tolerance (CO2 after the
kg→tonnes conversion). Each log is tagged with the cert's `sap_version` so SAP-10.2 divergence
(a real calculator signal) is separable from older-spec drift (expected — see
[ADR-0010](0010-sap10-calculator-spec-target-and-validation.md) Validation Cohort).
2. **Next slice or two — load-bearing.** When overrides + EPC estimation land (days away),
`StubRebaseliner` is replaced by a calculator-backed `Rebaseliner`: the calculator's output
**becomes Effective Performance**. The failure posture **flips to abort** per ADR-0012 — now that
the calculator *is* the baseline, a silent wrong answer is the expensive outcome, so a raise must
fail the batch noisily. Same exception, opposite handling, because the calculator went from
shadow to load-bearing. The shadow logging is then retired.
## Considered options
- **A third persisted `calculated_*` value-set on `PropertyBaselinePerformance`** (the handover's
recommendation) — rejected: `Effective = Calculated` in every scenario, so the columns would
store a distinction with no domain reality, and the future "supersede effective" promotion would
be a data move instead of nothing.
- **Promote the calculator to drive Effective immediately** — rejected for this one slice: it still
strict-raises on un-hardened mappings, so over the imminent 40k run it would gate the
load-bearing baseline write. Shadow-first surfaces every gap as an aggregatable error log without
blocking baselining.
- **A separate `calculator_shadow` validation table** — held in reserve: log-only is enough while
the calculator is moving and the shadow step is a 12 day stepping stone; we add a queryable table
only if log aggregation proves too weak.
## Consequences
- `property_baseline_performance` is **unchanged** this slice — no migration.
- CONTEXT **Calculated SAP10 Performance**, **Effective Performance**, and **Rebaselining** are
updated: the calculator (not ML) is the rebaseliner mechanism in the rebuilt engine; Calculated is
not a stored third set.
- The shadow runner's broad `except` is deliberate (the point is to discover *what* breaks in the
wild); each caught exception is logged with its type and `property_id`.
- This decision is short-lived in its shadow form by design; the durable half — "the calculator
produces Effective Performance; there is no third value-set" — outlives it.
## Amendment (2026-06-02): shadow collapsed — the calculator is load-bearing now
The shadow stepping-stone was right in shape but wrong in duration: the calculator was ready, and
wiring [Bill Derivation](0014-bill-derivation-from-real-fuel-rates.md) onto its delivered-kWh
breakdown makes it load-bearing for *bills on every property* — so the "shadow until overrides /
estimation land" timeline collapses to now. The durable decision stands (calculator produces
Effective Performance; no third value-set); only the timing changes:
- **`sap_version < 10.2`** → effective performance **is** the calculator's output (the
`StubRebaseliner` floor moves `10.0 → 10.2`; mechanism is the calculator, not ML).
- **`sap_version ≥ 10.2`** → effective = the API's lodged figures; the calculator still runs
**alongside, logging divergence** (the surviving half of the shadow runner) as a validation signal.
- **Failure posture flips to abort:** the calculator is load-bearing for Bill Derivation regardless
of version, so a strict-raise **aborts the batch** (ADR-0012) — the un-mapped cert is fixed
immediately rather than skipped. The shadow's catch-and-log of raises is retired; divergence
*warnings* on `≥ 10.2` certs remain.
The `≥1000-cert parity` gate from ADR-0009/0010 still governs whether the calculator's figures are
*trusted as definitive* for the SAP-10.2 cohort, but it no longer gates *wiring* — pre-10.2 certs
have no current-spec lodged figure to fall back to, so the calculator is the only source there.

View file

@ -0,0 +1,103 @@
---
Status: accepted
---
# Bill Derivation: whole-home annual bill from the calculator's delivered kWh × real Fuel Rates (not SAP prices)
Lifts the bills/fuel-split deferral in [ADR-0004](0004-baseline-performance-lodged-effective-pair.md)
and its migration note, and builds on [ADR-0013](0013-calculator-produces-effective-performance-shadow-first.md)
(the calculator is load-bearing). Decided in a `/grill-with-docs` session (2026-06-02).
## Context
ADR-0004's amendment deferred fuel split + bills "because bills require a current Fuel Rates
source (Ofgem-cap ETL) that does not yet exist." A static snapshot lifts that blocker. The old
`backend/ml_models/AnnualBillSavings.py` is the fragile reference (a blended `PRICE_FACTOR`, two
disagreeing rate sources, a standing-charge precedence bug, a 10× unit slip) — we rewrite, not port.
## Decisions
### 1. The bill is whole-home, composed per end use, from the calculator's delivered kWh
`SAP10 Calculation` already emits delivered (post-efficiency, billable) kWh for every regulated end
use — main/secondary heating, hot water, pumps/fans, lighting, cooling — and computes appliances +
cooking electricity internally (Appendix L L13-L20). **`BillDerivation`** consumes that per-end-use
breakdown and produces per-section costs + a total. The EPC lodges no per-end-use kWh, so the
calculator is the only source — which is why it is **load-bearing for bills regardless of
`sap_version`** (a raise aborts the batch, ADR-0013).
### 2. Bills use real Fuel Rates, not the calculator's `total_fuel_cost_gbp`
The calculator's fuel cost is the SAP-rating notional cost at **RdSAP Table 32 standardised
prices** — deliberately frozen for rating comparability, and ~half the real electricity price
(Table 32 elec ~13 p/kWh vs Ofgem AprJun 2026 cap ~24.7 p/kWh). Billing on it would roughly halve
an electric/heat-pump home's bill. So `BillDerivation` **re-prices** the delivered kWh at current
**Fuel Rates**, and the calculator's `total_fuel_cost_gbp` is used only for the SAP rating.
### 3. Fuel Rates = committed static snapshot, read via `FuelRatesRepository`
A national snapshot (Ofgem-cap period for gas/electricity, DESNZ/NEP for off-gas fuels), keyed by a
canonical **`Fuel`** enum (`MAINS_GAS, ELECTRICITY, ELECTRICITY_OFF_PEAK, OIL, LPG, SMOKELESS,
WOOD_LOGS, WOOD_PELLETS, HEAT_NETWORK`), each entry carrying `unit_rate_p_per_kwh` +
`standing_charge_p_per_day`, plus a top-level `seg_export_p_per_kwh`. The calculator's per-end-use
SAP fuel codes map to this enum via the existing `is_gas_code` / `is_electric_fuel_code` /
`is_liquid_fuel_code` helpers — so the snapshot and the calculator meet at one vocabulary, not raw
SAP codes. Read through a `FuelRatesRepository` port (ADR-0011: a Repo reads stored reference data
by key); an Ofgem-cap ETL automating the refresh is future, behind the same port — not a
prerequisite. National now; the 14 cap regions are a later refinement behind the same port.
### 4. Bill arithmetic
Total = Σ (per-end-use delivered kWh × that end use's fuel unit rate) + per-meter **standing
charges** (metered fuels only — gas/electricity; oil/LPG/solid have none) **SEG** export credit on
PV. Off-peak electricity splits day/night via the calculator's existing Table 12a high/low-rate
fractions.
### 5. Strict-raise on an unpriced fuel
`BillDerivation` **raises** on a fuel it has no rate for — same discipline as the calculator. Two
named gaps surface immediately rather than billing at a wrong default:
- **House coal** — no standard domestic price (its domestic sale is illegal in England).
- **Communal / heat network** — scheme-specific, no national tariff. The one common case (flats);
a heat-network rate model is a named follow-up.
### 6. Persistence: flat per-section columns on `property_baseline_performance`
The energy block lands as **flat typed columns** on the existing row (ADR-0004's flat-column rule
holds — the SAP end-uses are a *fixed enumerable set*, so there is no column explosion and no
variable-shape JSON): per-section `*_kwh` + `*_cost_gbp` (heating, hot water, lighting, appliances,
cooking, pumps/fans), `standing_charges_gbp`, `seg_credit_gbp`, and `total_annual_bill_gbp`. The
production migration is FE-owned (Drizzle); `docs/migrations/` updated.
## Consequences
- `BillDerivation` is named for the operation, **no "Service" suffix** (user preference).
- A `Fuel` enum + a SAP-code→`Fuel` mapping become first-class; `FuelRates` + `FuelRatesRepository`
+ a committed snapshot file are new.
- Carbon emissions are unaffected (they stay on Lodged/Effective Performance from the calculator's
CO2 factors); this ADR is about £ bills only.
- The snapshot goes stale on the Ofgem-cap cadence (quarterly); the file records its period, and the
ETL that automates refresh is the deferred follow-up.
## Deferred / TODO
- **Appliances + cooking kWh** are computed inside `cert_to_inputs` (Appendix L L13-L20) but not
yet threaded onto `SapResult`. Until they are, the `SapResult``EnergyBreakdown` adapter
**stubs them at 0 kWh**, so the bill total currently understates by the unregulated electricity
load. Khalim is adding the fields to `SapResult` directly; the adapter wires the
`APPLIANCES`/`COOKING` sections in as soon as they land.
- **Off-peak (Economy 7) day/night split** — the snapshot carries the E7 day/night rates, but
`FuelRates` exposes single-rate fuels only; the day/night accessor + the calculator's Table 12a
high/low-rate split land in a later slice.
- **Heat-network rate model** — heat-network certs raise `UnpricedFuel` for now (the one common gap).
- **Regional rates + Ofgem-cap ETL** — national snapshot now; both are later refinements behind the
same `FuelRatesRepository` port.
## Considered alternatives
- **Bill from `RenewableHeatIncentive` heating+HW kWh only** (CONTEXT's original scope) — rejected:
the user wants the whole-home bill, and heating+HW omits lighting/appliances/cooking, which only
the calculator supplies.
- **Bill at SAP Table 32 prices** — rejected: standardised rating prices, ~half real electricity.
- **JSON `bill_breakdown` block** — rejected: end-uses are fixed-cardinality, so flat columns are
clean and stay queryable (ADR-0004).

View file

@ -27,17 +27,45 @@ straight lift-and-shift of the columns below.
| `effective_co2_emissions_t_per_yr` | float | tonnes CO₂/yr (whole dwelling) |
| `effective_primary_energy_intensity_kwh_per_m2_yr` | int | kWh/m²/yr |
| `rebaseline_reason` | text | `none` \| `pre_sap10` \| `physical_state_changed` \| `both` |
| `space_heating_kwh` | float | off `renewable_heat_incentive`; deterministic (ADR-0006) |
| `water_heating_kwh` | float | off `renewable_heat_incentive` |
| `space_heating_kwh` | float | EPC `renewable_heat_incentive` recorded demand. **Superseded** by `heating_kwh` (delivered) when the bill block populates; kept until then to avoid an empty-kWh gap, dropped in the population slice. |
| `water_heating_kwh` | float | EPC `renewable_heat_incentive`; **superseded** by `hot_water_kwh`. |
This slice has no ML rebaselining, so `effective_* == lodged_*` and `rebaseline_reason = 'none'`
for every row written (a pre-SAP10 cert raises rather than persisting a wrong-but-plausible row —
see #1135). The `effective_*` columns exist now so the table shape is stable when ML lands.
### Bill block (ADR-0014) — the energy bill, composed per section
## Deferred (follow-up — EPC Energy Derivation + Fuel Rates)
Produced by **Bill Derivation**: the calculator's **delivered** kWh per end use priced at current
**Fuel Rates** (a committed snapshot, not SAP's standardised prices), per section + the total.
Per-section kWh is *delivered fuel* (demand ÷ efficiency — what the household pays for), distinct
from the recorded-demand `space_heating_kwh`/`water_heating_kwh` above which it supersedes.
`fuel_split` and `bills` are **not** in this table yet. They are produced by
`EpcEnergyDerivationService`, which needs a current **Fuel Rates** source (Ofgem-cap ETL) that does
not exist yet. They land together in the follow-up so this table is not migrated twice. Likely
shape: a `bills`-style block (per-fuel kWh + standing charge + SEG) — to be specified in that
slice's migration note.
| Column | Type | Notes |
|---|---|---|
| `fuel_rates_period` | text | which Fuel Rates snapshot priced this bill (e.g. `"2026-04 to 2026-06"`) — provenance |
| `heating_kwh` | float | delivered fuel kWh (main + secondary heating) |
| `heating_cost_gbp` | float | priced at the heating fuel's current rate |
| `hot_water_kwh` | float | |
| `hot_water_cost_gbp` | float | |
| `lighting_kwh` | float | |
| `lighting_cost_gbp` | float | |
| `appliances_kwh` | float | unregulated load — **0 until the appliances/cooking fields land on `SapResult`** (ADR-0014 TODO) |
| `appliances_cost_gbp` | float | |
| `cooking_kwh` | float | unregulated load — 0 until `SapResult` carries it |
| `cooking_cost_gbp` | float | |
| `pumps_fans_kwh` | float | |
| `pumps_fans_cost_gbp` | float | |
| `cooling_kwh` | float | mostly 0 in UK homes; carried for completeness as it affects the bill |
| `cooling_cost_gbp` | float | |
| `standing_charges_gbp` | float | daily standing charge × 365, once per distinct metered fuel (off-gas fuels have none) |
| `seg_credit_gbp` | float | SEG export credit on PV (subtracted) |
| `total_annual_bill_gbp` | float | Σ section costs + standing charges SEG |
The calculator is **load-bearing** (ADR-0013 amendment): for `sap_version < 10.2` the `effective_*`
columns hold the calculator's output (so `effective_* != lodged_*` legitimately); at/above 10.2 they
mirror the lodged figures and divergence is logged. A cert the calculator cannot score aborts the
batch rather than persisting a wrong row.
### Population timing
The bill columns are **defined now so the FE can create them**, but are populated only once the
`SapResult``EnergyBreakdown` adapter + `BillDerivation` wiring land (gated on the appliances /
cooking `SapResult` fields). Until then the SQLModel mirror in `infrastructure/postgres/` adds these
columns as nullable; the Drizzle migration can create them nullable in parallel.

43
domain/fuel_rates/fuel.py Normal file
View file

@ -0,0 +1,43 @@
from __future__ import annotations
from enum import Enum
class Fuel(Enum):
"""A canonical billing fuel — the join key between the calculator's
per-end-use fuel (mapped from SAP fuel codes) and the Fuel Rates snapshot
(ADR-0014). Member names match the snapshot's keys.
``COAL`` (traditional house coal) and ``HEAT_NETWORK`` are carried as
members so a cert lodging them maps to a Fuel, but they have no national
rate pricing them raises ``UnpricedFuel`` (house coal's domestic sale is
illegal in England; heat networks are scheme-specific).
"""
MAINS_GAS = "MAINS_GAS"
ELECTRICITY = "ELECTRICITY"
ELECTRICITY_OFF_PEAK = "ELECTRICITY_OFF_PEAK"
OIL = "OIL"
LPG = "LPG"
COAL = "COAL"
SMOKELESS = "SMOKELESS"
WOOD_LOGS = "WOOD_LOGS"
WOOD_PELLETS = "WOOD_PELLETS"
HEAT_NETWORK = "HEAT_NETWORK"
class UnpricedFuel(ValueError):
"""Bill Derivation was asked for a rate on a fuel the current Fuel Rates
snapshot does not price (ADR-0014).
Raised rather than billing at a wrong default so the gap surfaces
immediately house coal and heat networks have no national rate, and
off-peak electricity needs the day/night split that a later slice adds.
"""
def __init__(self, fuel: Fuel) -> None:
super().__init__(
f"no rate for fuel {fuel.name} in the current Fuel Rates snapshot; "
f"add it to the snapshot or map this end use to a priced fuel"
)
self.fuel = fuel

View file

@ -0,0 +1,46 @@
from __future__ import annotations
from collections.abc import Mapping
from dataclasses import dataclass
from domain.fuel_rates.fuel import Fuel, UnpricedFuel
@dataclass(frozen=True)
class FuelRate:
"""One fuel's current tariff: unit price + daily standing charge.
Off-gas fuels (oil / LPG / solid / wood) carry a ``0.0`` standing charge
they are delivered, not metered, so there is no daily charge.
"""
unit_rate_p_per_kwh: float
standing_charge_p_per_day: float
@dataclass(frozen=True)
class FuelRates:
"""A current Fuel Rates snapshot — the rate per billing Fuel plus the SEG
export credit (ADR-0014). ``period`` records which window it is for, since
a committed snapshot goes stale on the Ofgem-cap (quarterly) cadence.
Pricing a fuel the snapshot does not carry raises ``UnpricedFuel`` rather
than defaulting see [[reference-unmapped-sap-code]] for the same strict
discipline on the calculator side.
"""
period: str
seg_export_p_per_kwh: float
rates: Mapping[Fuel, FuelRate]
def unit_rate_p_per_kwh(self, fuel: Fuel) -> float:
return self._rate(fuel).unit_rate_p_per_kwh
def standing_charge_p_per_day(self, fuel: Fuel) -> float:
return self._rate(fuel).standing_charge_p_per_day
def _rate(self, fuel: Fuel) -> FuelRate:
rate = self.rates.get(fuel)
if rate is None:
raise UnpricedFuel(fuel)
return rate

View file

@ -0,0 +1,58 @@
from __future__ import annotations
from collections.abc import Mapping, Sequence
from dataclasses import dataclass
from enum import Enum
from domain.fuel_rates.fuel import Fuel
class BillSection(Enum):
"""A user-meaningful slice of the annual energy bill — the calculator's raw
end uses folded into the sections the UI shows (ADR-0014)."""
HEATING = "HEATING"
HOT_WATER = "HOT_WATER"
LIGHTING = "LIGHTING"
APPLIANCES = "APPLIANCES"
COOKING = "COOKING"
PUMPS_FANS = "PUMPS_FANS"
@dataclass(frozen=True)
class EnergyLine:
"""One section's delivered energy on one fuel. A section may have more than
one line (e.g. gas main heating + electric secondary heating)."""
section: BillSection
fuel: Fuel
kwh: float
@dataclass(frozen=True)
class EnergyBreakdown:
"""A Property's delivered energy per end use, the input to Bill Derivation —
produced from SAP10 Calculation in a later slice. ``exported_kwh`` is PV
generation exported to the grid, credited at the SEG rate."""
lines: Sequence[EnergyLine]
exported_kwh: float = 0.0
@dataclass(frozen=True)
class BillSectionCost:
"""One section's rolled-up delivered kWh and annual cost (£)."""
kwh: float
cost_gbp: float
@dataclass(frozen=True)
class Bill:
"""A Property's annual energy bill, composed per section plus the per-meter
standing charges and the SEG export credit, and the total (ADR-0014)."""
sections: Mapping[BillSection, BillSectionCost]
standing_charges_gbp: float
seg_credit_gbp: float
total_gbp: float

View file

@ -0,0 +1,71 @@
from __future__ import annotations
from collections import defaultdict
from typing import Final
from domain.fuel_rates.fuel import Fuel
from domain.fuel_rates.fuel_rates import FuelRates
from domain.property_baseline.bill import (
Bill,
BillSection,
BillSectionCost,
EnergyBreakdown,
)
_DAYS_PER_YEAR: Final[float] = 365.0
_PENCE_PER_POUND: Final[float] = 100.0
class BillDerivation:
"""Derives a Property's annual energy Bill by pricing a delivered-energy
breakdown at current Fuel Rates (ADR-0014).
Each end-use line is billed at its fuel's unit rate; **standing charges are
added once per distinct fuel used** (a meter, not an end use off-gas fuels
carry a 0 standing charge so they contribute nothing); the SEG export credit
is subtracted. Deterministic (ADR-0006). Raises ``UnpricedFuel`` (via
``FuelRates``) on a fuel the snapshot does not price.
"""
def __init__(self, fuel_rates: FuelRates) -> None:
self._rates = fuel_rates
def derive(self, breakdown: EnergyBreakdown) -> Bill:
section_kwh: defaultdict[BillSection, float] = defaultdict(float)
section_cost_p: defaultdict[BillSection, float] = defaultdict(float)
fuels_used: set[Fuel] = set()
for line in breakdown.lines:
section_kwh[line.section] += line.kwh
section_cost_p[line.section] += (
line.kwh * self._rates.unit_rate_p_per_kwh(line.fuel)
)
if line.kwh > 0:
fuels_used.add(line.fuel)
sections = {
section: BillSectionCost(
kwh=section_kwh[section], cost_gbp=section_cost_p[section] / _PENCE_PER_POUND
)
for section in section_kwh
}
standing_charges_gbp = (
sum(
(self._rates.standing_charge_p_per_day(fuel) * _DAYS_PER_YEAR for fuel in fuels_used),
0.0,
)
/ _PENCE_PER_POUND
)
seg_credit_gbp = (
breakdown.exported_kwh * self._rates.seg_export_p_per_kwh / _PENCE_PER_POUND
)
total_gbp = (
sum((section.cost_gbp for section in sections.values()), 0.0)
+ standing_charges_gbp
- seg_credit_gbp
)
return Bill(
sections=sections,
standing_charges_gbp=standing_charges_gbp,
seg_credit_gbp=seg_credit_gbp,
total_gbp=total_gbp,
)

View file

@ -0,0 +1,94 @@
from __future__ import annotations
import logging
from typing import TYPE_CHECKING, Optional
from domain.property_baseline.performance import Performance
from domain.property_baseline.rebaseliner import Rebaseliner, RebaselineReason
if TYPE_CHECKING:
from datatypes.epc.domain.epc_property_data import EpcPropertyData
from domain.sap10_calculator.calculator import SapCalculator, SapResult
logger = logging.getLogger(__name__)
# The calculator targets SAP 10.2 (14-03-2025). A cert lodged below this carries
# a superseded methodology and is rebaselined to the calculator's output; at or
# above it, the API's lodged figures are kept and the calculator only validates.
_SAP10_2_FLOOR = 10.2
_SAP_ABS_TOL = 0.5
_REL_TOL = 0.01
_KG_PER_TONNE = 1000.0
def _relative_diff(calculated: float, lodged: float) -> float:
if lodged == 0:
return 0.0 if calculated == 0 else float("inf")
return abs(calculated - lodged) / abs(lodged)
class CalculatorRebaseliner(Rebaseliner):
"""Produces Effective Performance from the deterministic `Sap10Calculator`
(ADR-0013 amendment the calculator is load-bearing).
Runs the calculator on every Property. For a cert lodged under a superseded
methodology (``sap_version < 10.2``) the calculator's output **is** Effective
Performance. At or above 10.2 the API's lodged figures are kept and the
calculator only **logs divergence** (a validation signal). A calculator
strict-raise propagates the batch aborts (ADR-0012) and the un-mapped cert
is fixed immediately.
"""
def __init__(self, calculator: "SapCalculator") -> None:
self._calculator = calculator
def rebaseline(
self, property_id: int, effective_epc: "EpcPropertyData", lodged: Performance
) -> tuple[Performance, RebaselineReason]:
# A raise (UnmappedSapCode, etc.) propagates: the calculator is
# load-bearing, so the batch aborts and the cert is fixed at once.
result: SapResult = self._calculator.calculate(effective_epc)
sap_version: Optional[float] = effective_epc.sap_version
if sap_version is not None and sap_version < _SAP10_2_FLOOR:
return Performance.from_sap_result(result), "pre_sap10"
self._log_divergence(
property_id=property_id, sap_version=sap_version, result=result, lodged=lodged
)
return lodged, "none"
def _log_divergence(
self,
*,
property_id: int,
sap_version: Optional[float],
result: "SapResult",
lodged: Performance,
) -> None:
if abs(result.sap_score_continuous - lodged.sap_score) > _SAP_ABS_TOL:
self._warn(property_id, sap_version, "sap_score", lodged.sap_score, result.sap_score_continuous)
if _relative_diff(result.primary_energy_kwh_per_m2, lodged.primary_energy_intensity) > _REL_TOL:
self._warn(
property_id, sap_version, "primary_energy_intensity",
lodged.primary_energy_intensity, result.primary_energy_kwh_per_m2,
)
calculated_co2_t = result.co2_kg_per_yr / _KG_PER_TONNE
if _relative_diff(calculated_co2_t, lodged.co2_emissions) > _REL_TOL:
self._warn(property_id, sap_version, "co2_emissions", lodged.co2_emissions, calculated_co2_t)
def _warn(
self,
property_id: int,
sap_version: Optional[float],
quantity: str,
lodged: float,
calculated: float,
) -> None:
logger.warning(
"SAP10 calculator divergence on %s for property_id=%s sap_version=%s: "
"lodged=%s calculated=%s",
quantity,
property_id,
sap_version,
lodged,
calculated,
)

View file

@ -1,12 +1,16 @@
from __future__ import annotations
from dataclasses import dataclass
from typing import Optional, TypeVar
from typing import Optional, TYPE_CHECKING, TypeVar
from datatypes.epc.domain.epc import Epc
from datatypes.epc.domain.epc_property_data import EpcPropertyData
if TYPE_CHECKING:
from domain.sap10_calculator.calculator import SapResult
_T = TypeVar("_T")
_KG_PER_TONNE = 1000.0
@dataclass(frozen=True)
@ -24,6 +28,20 @@ class Performance:
co2_emissions: float
primary_energy_intensity: int
@classmethod
def from_sap_result(cls, result: "SapResult") -> "Performance":
"""The four rated quantities, read off a calculator `SapResult`
(ADR-0013): band derived from the score, CO2 converted kgtonnes, PEUI
rounded to the lodged integer scale. The `from_*` factory mirrors
`Epc.from_sap_score`; living on the target keeps the SAP calculator
free of any `property_baseline` dependency."""
return cls(
sap_score=result.sap_score,
epc_band=Epc.from_sap_score(result.sap_score),
co2_emissions=result.co2_kg_per_yr / _KG_PER_TONNE,
primary_energy_intensity=round(result.primary_energy_kwh_per_m2),
)
def _require(value: Optional[_T], field: str) -> _T:
if value is None:

View file

@ -36,20 +36,22 @@ class Rebaseliner(ABC):
@abstractmethod
def rebaseline(
self, effective_epc: EpcPropertyData, lodged: Performance
self, property_id: int, effective_epc: EpcPropertyData, lodged: Performance
) -> tuple[Performance, RebaselineReason]: ...
class StubRebaseliner(Rebaseliner):
"""The no-ML stub for the validation phase.
"""A no-calculator stub for tests that don't want the real calculator.
SAP10 certs pass through untouched Effective Performance equals Lodged,
reason ``"none"``. A pre-SAP10 cert genuinely needs ML rebaselining, which is
not implemented yet (#1135), so it raises rather than fabricating a "none".
reason ``"none"``. A pre-SAP10 cert genuinely needs rebaselining, which this
stub does not do, so it raises rather than fabricating a "none". Production
uses ``CalculatorRebaseliner`` (the calculator is load-bearing ADR-0013
amendment); this stub stays for orchestrator/repo unit tests.
"""
def rebaseline(
self, effective_epc: EpcPropertyData, lodged: Performance
self, property_id: int, effective_epc: EpcPropertyData, lodged: Performance
) -> tuple[Performance, RebaselineReason]:
sap_version = effective_epc.sap_version
if sap_version is not None and sap_version < _SAP10_FLOOR:

View file

@ -0,0 +1,41 @@
from __future__ import annotations
from typing import Final
from domain.fuel_rates.fuel import Fuel
from domain.sap10_calculator.exceptions import UnmappedSapCode
# SAP 10.2 / Table 32 fuel code -> canonical billing Fuel (ADR-0014). Bounded to
# the ~47 Table 32 fuel codes (the keys of `table_12.UNIT_PRICE_P_PER_KWH`) — the
# carrier, NOT the PCDB product, so a thousand PCDB heat pumps all share one code.
# Input is a normalised Table 32 fuel code (the calculator sets `main_fuel_type`
# to Table 32 codes); an unmapped code raises `UnmappedSapCode` rather than
# guessing — a bounded, self-surfacing backlog [[reference-unmapped-sap-code]].
_CODE_TO_FUEL: Final[dict[int, Fuel]] = {
**dict.fromkeys([1, 7], Fuel.MAINS_GAS), # mains gas, grid biogas
**dict.fromkeys([2, 3, 5, 9], Fuel.LPG),
**dict.fromkeys([4, 71, 73, 75, 76], Fuel.OIL), # heating oil + bio-liquids
**dict.fromkeys([11, 15], Fuel.COAL), # house coal, anthracite
**dict.fromkeys([12], Fuel.SMOKELESS),
**dict.fromkeys([20, 21], Fuel.WOOD_LOGS), # logs, chips
**dict.fromkeys([22, 23], Fuel.WOOD_PELLETS),
**dict.fromkeys([30], Fuel.ELECTRICITY), # standard tariff
# 7/10/18-hour off-peak tariffs + 24-hour heating tariff — priced once the
# off-peak day/night slice lands; ELECTRICITY_OFF_PEAK is unpriced until then.
**dict.fromkeys([31, 32, 33, 34, 35, 38, 40], Fuel.ELECTRICITY_OFF_PEAK),
# "heat from ..." community/heat-network + distribution codes (41-58).
**dict.fromkeys(range(41, 59), Fuel.HEAT_NETWORK),
}
def sap_code_to_fuel(code: int) -> Fuel:
"""Map a SAP 10.2 / Table 32 fuel code to its canonical billing Fuel.
Raises ``UnmappedSapCode`` on a code with no single billing carrier e.g.
dual fuel (10) or the grid-export codes (36/60), which are not an end use's
input fuel.
"""
fuel = _CODE_TO_FUEL.get(code)
if fuel is None:
raise UnmappedSapCode("fuel_code", code)
return fuel

View file

@ -41,6 +41,7 @@ Appendix L + U. RdSAP10 Table 32 (p.95) for fuel prices/CO2/PE factors.
from __future__ import annotations
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Final, Optional, TYPE_CHECKING
@ -177,6 +178,14 @@ class CalculatorInputs:
hot_water_kwh_per_yr: float
pumps_fans_kwh_per_yr: float
lighting_kwh_per_yr: float
# Unregulated annual delivered electricity — output-only, NOT fed
# into ECF / cost / CO2 / primary energy / sap_score (regulated
# energy only). Surfaced for ADR-0014 BillDerivation's APPLIANCES +
# COOKING sections. `cooking_kwh_per_yr` is the SAP 10.2 Appendix L
# L20 (p.91) ELECTRICITY figure (138 + 28×N), not the L18 cooking
# heat gain. `appliances_kwh_per_yr` is the L13/L14/L16a annual E_A.
appliances_kwh_per_yr: float
cooking_kwh_per_yr: float
space_heating_fuel_cost_gbp_per_kwh: float
hot_water_fuel_cost_gbp_per_kwh: float
other_fuel_cost_gbp_per_kwh: float
@ -227,6 +236,17 @@ class CalculatorInputs:
pumps_fans_primary_factor: Optional[float] = None
lighting_primary_factor: Optional[float] = None
electric_shower_primary_factor: Optional[float] = None
# SAP 10.2 Appendix C §C3.2 (PDF p.51) — heat-network distribution
# pumping electricity. For community-heating mains the network pump
# energy = 1% of (space + water) heat generated (worksheet (313));
# its CO2 / PE (worksheet (372)/(472)) bill on Table 12d/12e monthly
# electricity factors (fuel code 50) weighted by the monthly heat
# profile. The energy + effective factors are precomputed in
# cert_to_inputs. 0.0 / None for individually-heated certs (no
# distribution loop) leaves the cascade unchanged.
heat_network_distribution_kwh_per_yr: float = 0.0
heat_network_distribution_co2_factor_kg_per_kwh: Optional[float] = None
heat_network_distribution_primary_factor: Optional[float] = None
# Generation offsets — applied as a cost credit against the ECF
# numerator. SAP 10.2 Appendix M: PV self-consumption + export
# collapse to a single credit at the export rate (Table 12 code 60).
@ -356,6 +376,15 @@ class SapResult:
hot_water_kwh_per_yr: float
pumps_fans_kwh_per_yr: float
lighting_kwh_per_yr: float
# Unregulated annual delivered electricity for ADR-0014
# BillDerivation (APPLIANCES + COOKING sections). Output-only — these
# do NOT contribute to ecf / total_fuel_cost_gbp / co2_kg_per_yr /
# primary_energy_kwh_per_yr / sap_score. `cooking_kwh_per_yr` is the
# SAP 10.2 Appendix L L20 (p.91) ELECTRICITY estimate (138 + 28×N);
# the bill adapter should treat it as an electricity carrier (a
# gas-cooker split, if ever needed, is a separate follow-up).
appliances_kwh_per_yr: float
cooking_kwh_per_yr: float
primary_energy_kwh_per_yr: float
primary_energy_kwh_per_m2: float
monthly: tuple[MonthlyEntry, ...]
@ -578,6 +607,13 @@ def calculate_sap_from_inputs(inputs: CalculatorInputs) -> SapResult:
electric_shower_co2 = (
inputs.electric_shower_kwh_per_yr * electric_shower_co2_factor
)
# SAP 10.2 Appendix C §C3.2 (PDF p.51) worksheet (372) — electricity
# for pumping water through a heat network's distribution system.
# Zero for individually-heated certs (factor None → 0.0).
heat_network_distribution_co2 = (
inputs.heat_network_distribution_kwh_per_yr
* (inputs.heat_network_distribution_co2_factor_kg_per_kwh or 0.0)
)
co2 = (
main_heating_co2
+ secondary_heating_co2
@ -585,6 +621,7 @@ def calculate_sap_from_inputs(inputs: CalculatorInputs) -> SapResult:
+ pumps_fans_co2
+ lighting_co2
+ electric_shower_co2
+ heat_network_distribution_co2
)
# SAP 10.2 Appendix M1 §7 — subtract PV CO2 credit. Onsite consumption
# offsets grid imports at the IMPORT CO2 factor (Table 12d weighted
@ -644,6 +681,12 @@ def calculate_sap_from_inputs(inputs: CalculatorInputs) -> SapResult:
+ inputs.lighting_kwh_per_yr * lighting_primary_factor
+ inputs.electric_shower_kwh_per_yr * electric_shower_primary_factor
)
# SAP 10.2 Appendix C §C3.2 (PDF p.51) worksheet (472) — heat-network
# distribution pumping electricity primary energy (CO2 sister above).
heat_network_distribution_primary_kwh = (
inputs.heat_network_distribution_kwh_per_yr
* (inputs.heat_network_distribution_primary_factor or 0.0)
)
# SAP 10.2 Appendix M1 §8: PV onsite consumption credits at IMPORT
# PEF (offsets grid imports); PV exports credit at the EXPORT PEF
# ("electricity sold to grid, PV" — Table 12 code 60 = 0.501). When
@ -678,6 +721,7 @@ def calculate_sap_from_inputs(inputs: CalculatorInputs) -> SapResult:
space_heating_primary_kwh
+ hot_water_primary_kwh
+ other_primary_kwh
+ heat_network_distribution_primary_kwh
- pv_primary_offset_kwh,
)
primary_energy_per_m2 = primary_energy_kwh / tfa if tfa > 0 else 0.0
@ -720,6 +764,8 @@ def calculate_sap_from_inputs(inputs: CalculatorInputs) -> SapResult:
"hot_water_co2_kg_per_yr": hot_water_co2,
"pumps_fans_co2_kg_per_yr": pumps_fans_co2,
"lighting_co2_kg_per_yr": lighting_co2,
"heat_network_distribution_co2_kg_per_yr": heat_network_distribution_co2,
"heat_network_distribution_pe_kwh_per_yr": heat_network_distribution_primary_kwh,
"space_heating_pe_kwh_per_m2": space_heating_primary_kwh / tfa if tfa > 0 else 0.0,
"hot_water_pe_kwh_per_m2": hot_water_primary_kwh / tfa if tfa > 0 else 0.0,
"other_pe_kwh_per_m2": other_primary_kwh / tfa if tfa > 0 else 0.0,
@ -744,6 +790,8 @@ def calculate_sap_from_inputs(inputs: CalculatorInputs) -> SapResult:
hot_water_kwh_per_yr=inputs.hot_water_kwh_per_yr,
pumps_fans_kwh_per_yr=inputs.pumps_fans_kwh_per_yr,
lighting_kwh_per_yr=inputs.lighting_kwh_per_yr,
appliances_kwh_per_yr=inputs.appliances_kwh_per_yr,
cooking_kwh_per_yr=inputs.cooking_kwh_per_yr,
primary_energy_kwh_per_yr=primary_energy_kwh,
primary_energy_kwh_per_m2=primary_energy_per_m2,
monthly=monthly,
@ -751,7 +799,21 @@ def calculate_sap_from_inputs(inputs: CalculatorInputs) -> SapResult:
)
class Sap10Calculator:
class SapCalculator(ABC):
"""The contract a SAP calculator satisfies: an `EpcPropertyData` in, a
typed `SapResult` out. `Sap10Calculator` is the SAP 10.2 implementation;
a future methodology (e.g. SAP 10.3 / a successor) is another subclass.
Consumers (e.g. `CalculatorRebaseliner`) depend on this abstraction, not
on a concrete calculator so the engine can be swapped without touching
them.
"""
@abstractmethod
def calculate(self, epc: "EpcPropertyData") -> SapResult: ...
class Sap10Calculator(SapCalculator):
"""Deterministic SAP 10.2 calculator entry point. Maps an
`EpcPropertyData` to typed `CalculatorInputs` via the RdSAP-driven
`cert_to_inputs` mapper and runs the 12-month worksheet loop.

View file

@ -0,0 +1,265 @@
# SAP calculator — agent guide (start here)
This is the **canonical onboarding doc** for working on the SAP 10.2 /
RdSAP 10 calculator. It is meant to get you productive **without reading
any historical handover**. The `HANDOVER_*.md` files in this directory
are point-in-time session notes (useful for the specific residual they
chase, ignore otherwise). For deep architecture/API see
[`SAP_CALCULATOR.md`](SAP_CALCULATOR.md).
Three things this doc gives you: (1) the **accuracy bar** for the two
input paths, (2) the **debugging loop**, (3) the **tools & pipeline**.
---
## 0. The one-paragraph mental model
A cert's data comes in via one of two front-ends — an **Elmhurst Summary
PDF** (site-notes path) or an **EPC-register API JSON** (API path). Both
map to the same typed `EpcPropertyData`, which feeds a deterministic
cascade that reproduces the RdSAP10 engine. Our **ground truth is the
Elmhurst worksheet PDF** (U985 / P960 / dr87) — the per-line `(1)..(286)`
calculation, not the rounded values the EPC register lodges. We pin the
cascade against the worksheet to **abs = 1e-4 on every line ref**.
---
## 1. Accuracy expectations — site-notes vs API
The worksheet PDF is **always** the target. The EPC register's lodged
SAP/CO2/PE are rounded *and* carry Elmhurst's own residual, so matching
the lodged values is not the goal — matching the worksheet is.
| Path | Input | When a worksheet PDF exists for the cert | API/site-notes-only (no worksheet) |
|---|---|---|---|
| **Site-notes** | Elmhurst Summary PDF → extractor → `from_elmhurst_site_notes` | **abs = 1e-4** on continuous SAP **and every populated line ref** and cost / CO2 / PE | n/a (we always have the worksheet for site-notes fixtures) |
| **API** | register JSON → `from_api_response` | **abs = 1e-4** on continuous SAP vs the worksheet (same bar as site-notes — the two paths must converge) | **±0.5** SAP vs the lodged register value (fallback only) |
Three rules that fall out of this:
- **Cross-mapper parity.** For a cert that has both an API JSON and an
Elmhurst Summary, the two paths must produce SAP within **1e-4 of each
other** *and* of the worksheet. The cascade output (not a structural
EPC diff) is the equivalence check. A divergence localises to one
mapper.
- **No tolerance widening.** A failing 1e-4 pin is a real cascade bug or
a fixture defect — diagnose it, don't relax it. No `rel=`, no `xfail`,
no adaptive ceilings. ΔSAP = 0.07 is **not** "closed".
- **±0.5 is a fallback, not a destination.** It's only for API-only
certs with no worksheet to check against. If you can get a worksheet,
the bar is 1e-4.
Two documented, deliberate exceptions to "match the spec literal" live
in [`SAP_CALCULATOR.md` §8](SAP_CALCULATOR.md) ("Elmhurst-mirrored spec
divergences") — cases where the BRE-approved Elmhurst engine diverges
from the SAP 10.2 text and we mirror the engine. Add a §8 row only with
≥2-cert evidence.
---
## 2. The tools & pipeline
### 2.1 The two PDFs per cert
- **`Summary_NNNNNN.pdf`** — the Elmhurst **site notes / input**. This is
what the assessor lodged: dimensions, fabric, heating system, controls,
cylinder, etc. It is the INPUT, equivalent to the API JSON.
- **The worksheet** — the **ground truth output**, every line ref
`(1)..(286)` to 4 d.p. Three families, all the same format:
- `U985-0001-NNNNNN.pdf` — the 6 gas-combi conformance fixtures.
- `P960-0001-NNNNNN.pdf` — the heating-systems corpus + community heating.
- `dr87-0001-NNNNNN.pdf` — the API-paired cohort ("Additional data with api").
### 2.2 The cascade pipeline (site-notes path)
```python
import subprocess, re
from pathlib import Path
from backend.documents_parser.elmhurst_extractor import ElmhurstSiteNotesExtractor
from datatypes.epc.domain.mapper import EpcPropertyDataMapper
from domain.sap10_calculator.rdsap.cert_to_inputs import (
cert_to_inputs, cert_to_demand_inputs, local_climate_for_cert,
)
from domain.sap10_calculator.calculator import calculate_sap_from_inputs
# 1. Summary PDF -> per-page text (pdftotext -layout, one string per page)
def summary_pdf_to_pages(pdf: Path) -> list[str]:
n = int(re.search(r"Pages:\s+(\d+)",
subprocess.run(["pdfinfo", str(pdf)], capture_output=True, text=True).stdout).group(1))
pages = []
for i in range(1, n + 1):
layout = subprocess.run(
["pdftotext", "-layout", "-f", str(i), "-l", str(i), str(pdf), "-"],
capture_output=True, text=True).stdout
pages.append("\n".join(
tok for line in layout.splitlines() for tok in re.split(r"\s{2,}", line.strip()) if tok))
return pages
pages = summary_pdf_to_pages(Path("sap worksheets/.../Summary_NNNNNN.pdf"))
site_notes = ElmhurstSiteNotesExtractor(pages).extract() # -> ElmhurstSiteNotes
epc = EpcPropertyDataMapper.from_elmhurst_site_notes(site_notes) # -> EpcPropertyData
# 2. Two cascades. RATING = SAP/EI rating (UK-avg climate, region 0).
# DEMAND = Current Carbon / Current PE / Fuel Bill (postcode climate, PCDB Table 172).
rating = calculate_sap_from_inputs(cert_to_inputs(epc))
demand = calculate_sap_from_inputs(cert_to_demand_inputs(epc)) # climate = local_climate_for_cert(epc)
rating.sap_score_continuous # un-rounded SAP — pin THIS, not the integer
rating.total_fuel_cost_gbp
rating.co2_kg_per_yr
demand.primary_energy_kwh_per_yr
```
Shortcut: `Sap10Calculator().calculate(epc)` runs the rating cascade
(`cert_to_inputs``calculate_sap_from_inputs`) in one call.
### 2.3 The API path
Identical from `EpcPropertyData` onward — only the front-end changes:
```python
import json
data = json.loads(Path("tests/domain/sap10_calculator/rdsap/fixtures/golden/<cert>.json").read_text())
epc = EpcPropertyDataMapper.from_api_response(data) # -> EpcPropertyData
# ... same cert_to_inputs / calculate_sap_from_inputs as above
```
### 2.4 Section helpers — intermediate line refs
Every worksheet section has a `<section>_section_from_cert(epc)` helper
returning a typed result with the line-ref values. Use these to inspect
where a residual originates **without** running the whole cascade
(`postcode_climate=` selects rating vs demand):
```python
from domain.sap10_calculator.rdsap.cert_to_inputs import (
water_heating_section_from_cert, # §4 (42)..(65)m
heat_transmission_section_from_cert, # §3 (26)..(37)
internal_gains_section_from_cert, # §5 (66)..(73)
mean_internal_temperature_section_from_cert, # §7 (85)..(94)
space_heating_section_from_cert, # §8 (95)..(99)
fuel_cost_section_from_cert, # §10a (240)..(255)
environmental_section_from_cert, # §12 (261)..(274)
primary_energy_section_from_cert, # §13a (275)..(286)
)
wh = water_heating_section_from_cert(epc)
wh.energy_content_monthly_kwh # (45)m ; wh.output_kwh_per_yr # (62)/(64)
```
(Full table of helpers + line refs is in [`SAP_CALCULATOR.md` §1.3](SAP_CALCULATOR.md).)
### 2.5 Reading the worksheet from the shell
```bash
# Dump a worksheet line ref (e.g. (217)m water-heater monthly efficiency):
pdftotext -layout "sap worksheets/.../P960-0001-NNNNNN.pdf" - | grep -nE "\(217\)|\(62\)|\(210\)"
# Read a Summary input field (controls, cylinder, fuel):
pdftotext -layout "sap worksheets/.../Summary_NNNNNN.pdf" - | grep -niE "cylinder|control|interlock|fuel"
```
### 2.6 Where the test vectors live
| Set | Location | What |
|---|---|---|
| 6 U985 conformance fixtures | `tests/domain/sap10_calculator/worksheet/_elmhurst_worksheet_NNNNNN.py` (+ Summary PDFs in `backend/documents_parser/tests/fixtures/`) | Gas-combi certs, every line ref transcribed as `LINE_*` / `DEMAND_LINE_*` constants. Pinned in `worksheet/test_section_cascade_pins.py` + `worksheet/test_e2e_elmhurst_sap_score.py`. |
| Heating-systems corpus | `sap worksheets/heating systems examples/<variant>/` (Summary + P960) | 41 variants of **one property** with only the heating system changed → any residual is attributable to the heating subsystem. Pinned in `backend/documents_parser/tests/test_heating_systems_corpus.py`. |
| API golden fixtures | `tests/domain/sap10_calculator/rdsap/fixtures/golden/<cert>.json` | Register JSON for the API path. |
| API + worksheet pairs | `sap worksheets/Additional data with api/<cert>/` (Summary + dr87) | Certs that have BOTH an API JSON and a worksheet → cross-mapper parity checks. |
---
## 3. The debugging loop
When a cert's SAP/cost/CO2/PE is off, **never guess a fix** — walk it.
1. **Reproduce & decompose.** Build the epc (extractor+mapper, or a
fixture's `build_epc()`), run both cascades, and see **which of the
four outputs** drifts. Cost/CO2/PE drift with the same sign as energy;
isolate the carrier.
2. **Find the section.** Walk the four metrics back to a worksheet
section: SAP off but cost EXACT often means a demand/gains issue;
cost off but energy EXACT means a price/factor issue; CO2/PE off but
cost EXACT means a factor issue. Use the §2.4 section helpers to get
the cascade's intermediate line refs.
3. **Per-line compare vs the worksheet.** `pdftotext -layout` the
worksheet and compare the cascade's `(45)/(56)/(62)/(210)/(217)m/...`
line-by-line against the PDF. The first diverging line ref is the bug.
4. **Localise to a layer.**
- cascade value present in worksheet but cascade has 0 / wrong → **calculator** gap (a spec rule not wired, or a dispatch gate).
- the input field the worksheet used isn't in `epc`**mapper** (mis-mapped) or **extractor** (didn't capture the Summary field). Audit the Summary PDF for the field first — many lodgements are incomplete and the fixture, not the calculator, is wrong.
5. **Cite the spec.** Find the SAP 10.2 / RdSAP 10 rule (page + line) that
produces the worksheet's number. Confirm the worksheet matches the
spec literal; if it diverges, it's a candidate §8 Elmhurst-mirror
(needs ≥2-cert evidence). **SAP 10.2 only — never 10.3.**
6. **Cross-check vs API (when available).** If the cert has an API JSON
too, run `from_api_response` through the same cascade. If the API path
matches the worksheet but the site-notes path doesn't (or vice-versa),
the bug is in **that mapper**, not the calculator. If both diverge
identically, it's the **calculator/cascade**.
7. **Fix one cause, re-pin smaller.** TDD: one failing AAA test → one
impl → re-pin the (now smaller) residual. A spec-correct fix often
**exposes** the next residual that an offsetting bug was masking —
that's the next slice, not a regression. Don't conflate
`main_heating_category` (often `None` on Elmhurst Table 4b boilers)
with `sap_main_heating_code`.
### Worked shape (real example: oil 6)
Residual +3.05 SAP. (1) HW + space both off. (2) §4 HW efficiency. (3)
worksheet (210) space eff = 75 but Table 4b code 126 = 80; (217)m summer
= 63 = 685 → a 5pp penalty. (4) the Summary lodges control `2101` ("no
thermostatic control of room temperature") → no room thermostat → P960
header "Boiler Interlock: No". (5) RdSAP 10 §3 + SAP 10.2 Table 4c(2):
no room thermostat ⇒ not interlocked ⇒ 5pp Space+DHW. Fix the
`no_interlock` gate → space+HW fuel EXACT, residual collapses to a single
exposed pump cause (Table 4f footnote a) ×1.3) → next slice. Two slices,
fully closed.
---
## 4. Run the suite
```bash
PYTHONPATH=/workspaces/model python -m pytest \
tests/domain/sap10_calculator/ \
backend/documents_parser/tests/ \
--no-cov -q -p no:cacheprovider
```
Conformance pins only:
```bash
PYTHONPATH=/workspaces/model python -m pytest \
tests/domain/sap10_calculator/worksheet/test_section_cascade_pins.py \
tests/domain/sap10_calculator/worksheet/test_e2e_elmhurst_sap_score.py \
backend/documents_parser/tests/test_heating_systems_corpus.py \
--no-cov -q
```
Notes:
- `load_cells` tests pin against the gitignored `*.xlsx` reference
worksheet at repo root; they **skip** when it's absent (CI), run
locally when present.
- All new code passes `pyright` strict, zero errors. Tests use literal
`# Arrange / # Act / # Assert` headers and `abs(x - y) <= tol` (not
`pytest.approx`, which strict-pyright flags).
- Commit one slice per change, with the spec citation in the message.
---
## 5. Spec PDFs on disk
```
domain/sap10_calculator/docs/specs/
sap-10-2-full-specification-2025-03-14.pdf # SAP 10.2 (the methodology)
RdSAP 10 Specification 10-06-2025.pdf # RdSAP 10 (the reduced-data rules)
pcdb10.dat / pcdb_table_*.jsonl # PCDB (boilers, HPs, postcode weather)
```
Pages worth bookmarking: SAP 10.2 §7 MIT (p.28-32), Table 4b boiler eff
(p.168), Table 4c efficiency adjustments (p.169), Table 4e controls
(p.171-174), Table 4f auxiliary energy (p.175), Table 12 factors (p.191),
Appendix U region tables (p.124-127). RdSAP 10 §10 water heating (p.54-56,
incl. §10.7 no-water-heating default), Table 28/29 cylinder defaults,
Table 32 prices (p.95).
```

View file

@ -0,0 +1,276 @@
# Handover — post Slices S0380.153..155
Branch: `feature/per-cert-mapper-validation`. **HEAD `152db1ae`**.
Predecessor: [`HANDOVER_POST_S0380_152.md`](HANDOVER_POST_S0380_152.md).
## TL;DR
Three slices landed. Each addressed a distinct spec rule the cascade
was missing, all surfaced through the same heating-systems corpus
(property 001431 × 41 heating-system variants).
| Slice | Commit | Spec rule closed |
|---|---|---|
| S0380.153 | `e4bf4e70` | SAP 10.2 Table 3 (p.160) — "not separately timed" middle row for solid-fuel boilers (codes 151-161). DHW timer follows the appliance, not a separate programmer. |
| S0380.154 | `5e941b92` | SAP 10.2 §12.4.4 (p.36-37) — back-boiler summer-immersion HW split for codes 156, 158. Cascade now blends winter boiler + summer electric immersion across kWh/cost/CO2/PE/standing-charge fields. |
| S0380.155 | `152db1ae` | SAP 10.2 Table 4a (p.163-164) — heat-pump water column distinct from space column for 10 codes (211/213/215/216/217 + 521/523/525/526/527). Cascade was using space efficiency for HW on these codes. |
Extended handover suite at HEAD: **899 pass, 0 fail.** Pyright
net-zero (43 → 43).
## Disciplines reinforced this session
1. **Per-line walk before spec hypothesis.** S0380.153 found via
dumping SF3's worksheet (59)m row — it showed winter h=5 / summer
h=3 (= "not separately timed"), not the h=3 year-round the cascade
was using. The handover narrative said "SF2 separately-timed-DHW
gate" but the per-line walk revealed the rule applies to ALL
solid-fuel boilers (codes 151-161), not just back-boiler combos.
2. **Bigger slice OK when one spec rule has multiple wire points.**
S0380.154 (§12.4.4) touched HW kWh + cost rate + CO2 factor + PE
factor + standing charges + primary-loss override — five distinct
plumbing points. Doing it as one coherent slice (vs splitting into
"fix kWh first, then fix cost") kept the residual pin monotonic.
3. **Spec correctness > pin stability for Elmhurst-vs-spec gaps.**
The lighting-PE +48.66 cluster (5 variants with identical offset
from off-peak HW immersion) was deferred because Elmhurst uses
Table 12 ANNUAL factor (1.501 PE / 0.136 CO2) for off-peak HW
while spec Table 12d/12e header mandates MONTHLY factors. The
cascade follows spec; the cohort residual stays.
## Current residual state at HEAD `152db1ae`
### Cascade-OK tier (25 variants on pin grid)
Sorted by |ΔSAP_c|:
| Variant | ΔSAP_c | Δcost | ΔPE | Notes |
|---|---:|---:|---:|---|
| oil 1 | **-0.0000** | **-0.00** | **+0.00** | EXACT |
| oil pcdb 1/2 | **+0.0000** | **+0.00** | **+0.00** | EXACT |
| oil pcdb 3 | **+0.0000** | **+0.00** | **-0.00** | EXACT |
| electric 1 | **-0.0000** | **-0.00** | +48.66 | SAP exact, PE Elmhurst quirk |
| solid fuel 5 | **+0.0000** | **+0.00** | +48.66 | SAP exact, same quirk |
| solid fuel 6 | **+0.0000** | **+0.00** | +48.66 | SAP exact, same quirk |
| solid fuel 7 | **-0.0000** | **+0.00** | +48.66 | SAP exact, same quirk |
| solid fuel 8 | **-0.0000** | **+0.00** | +48.66 | SAP exact, same quirk |
| **solid fuel 2** | **-0.0000** | **-0.00** | -1027.51 | **closed by .154** (SAP+cost EXACT; CO2/PE Elmhurst blend artifact) |
| **solid fuel 3** | **-0.0000** | **-0.00** | -0.00 | **closed by .153** (4-metric EXACT) |
| pcdb 1 | -0.0108 | +£0.24 | +5.70 | basically exact |
| **gshp** | **-0.0178** | **+£0.41** | +33.52 | **closed by .155** (HW kWh 841→1138 matches worksheet) |
| ashp | -0.024 | +£0.55 | +36.34 | basically exact |
| solid fuel 4 | +0.085 | -£1.96 | -5.78 | close |
| solid fuel 11 | +0.0912 | -£2.10 | -0.74 | close |
| electric 8 | +0.0941 | -£2.17 | +6.58 | close |
| electric 7 | +0.1017 | -£2.34 | +3.10 | close |
| electric 6 | +0.1081 | -£2.49 | +0.16 | close |
| solid fuel 9 | +0.1072 | -£2.47 | -5.07 | close |
| solid fuel 10 | +0.1134 | -£2.61 | -13.91 | close |
| electric 9 | +0.1199 | -£2.76 | -4.51 | close |
| electric 3 | +0.1215 | -£2.80 | -5.99 | close |
| **electric 2** | **-0.4584** | **+£10.56** | **+443.13** | warm-air HP code 524 — open Cluster C |
| **electric 5** | **-1.1759** | **+£27.09** | **+438.03** | storage code 402 R=0.20 — open |
Σ |ΔSAP_c| across 25 variants ≈ **2.7 SAP points** (was 14.5 at session start,
~6.4 after .150-.152, now ~2.7 = 81% reduction across 6 slices over
two sessions).
### Blocked tier (16 variants — `MissingMainFuelType`)
Unchanged. Community heating × 5, electric storage 11-14, no system,
oil 2-6, pcdb 3.
## Open fronts ranked by leverage
### 1. **electric 5 — SAP -1.18 / cost +£27 / PE +438** (largest open)
Storage heater code 402 (R=0.20, slimline). REGRESSED by S0380.145 +
S0380.151 — pre-S0380.145 was net-zero from offsetting bugs.
Per-line probe at session-end:
- Cascade adjusted MIT[Jan] = 19.10 vs worksheet (93) = 18.61
(cascade +0.49 K higher)
- Cascade base MIT[Jan] = 18.70 vs worksheet (92) = 18.21
(cascade +0.49 K higher — same diff)
- Cascade `control_temperature_adjustment_c` = +0.4 K (Table 4e
code 2402 — correct)
- Per-zone components diverge: cascade T_living = 19.85 vs ws 19.65;
T_h2 = 18.59 vs ws 19.12; T_elsewhere = 17.27 vs ws 17.59.
The diverging components suggest §9 Table 9a/9b off-period reduction
formula differs in Elmhurst for R=0.20 storage heaters. Cascade's
formula:
```
T_sc = (1-R)(T_h - 2) + R(T_e + η·G/H)
if t_off > t_c: u = (T_h - T_sc)(t_off - 0.5·t_c) / 24
```
matches spec verbatim. But the per-zone numbers (T_h2 cascade 18.59
vs ws 19.12) suggest a HEAT LOSS PARAMETER or HLP-formula divergence
upstream of the off-period reduction.
This needs careful spec analysis of Table 9c steps for low-R systems
— may take 1-2 slices. NOT a quick win.
### 2. **electric 2 — SAP -0.46 / cost +£10.56 / PE +443**
Warm-air ASHP code 524 (Space = Water = 170, so S0380.155 fix
doesn't apply). Cascade HW kWh OVER worksheet by 465 kWh (+19%) —
opposite direction from gshp. Distinct spec rule. Probably HW
efficiency cascade for warm-air HPs (Appendix N3 has separate
treatment from Cat 4 hydronic HPs).
### 3. **Lighting-only PE +48.66 cohort cluster (5 variants)** — **DEFERRED**
electric 1, solid fuel 5/6/7/8. All have identical PE +48.66 / CO2
+11.94 offset from off-peak HW immersion. Worksheet uses Table 12
ANNUAL factor (1.501 / 0.136) on the "low-rate cost" line; cascade
uses Table 12d/12e MONTHLY cascade per spec header. Cascade is
spec-correct. Elmhurst applies an undocumented exception for
off-peak HW immersion. Cannot close without violating spec.
### 4. **electric 3 / 6 / 7 / 8 / 9 + solid fuel 9-11 — ΔSAP ±0.09-0.12**
Residual cluster — likely a shared shave-the-residual fix. Probably
the same Elmhurst-vs-spec PE blend artifact as #3 but for the
secondary-heating fraction or similar. Lowest leverage.
### 5. **gshp ΔSAP -0.018 / ΔPE +34** — landed in S0380.155
Sub-tolerance close but not 1e-4. Same Elmhurst-vs-spec PE blend
artifact as #3 (HW from HP is on standard tariff, not off-peak, so
NOT the same off-peak-immersion path — but same monthly-vs-annual
factor pattern). Defer until the cluster fix lands.
## Slice history (this session)
| Slice | HEAD | Scope |
|---|---|---|
| S0380.153 | `e4bf4e70` | SAP 10.2 Table 3 (PDF p.160) middle row "Cylinder thermostat, water heating NOT separately timed" applies to solid-fuel boiler systems. Per §9.2.4 these are "independent solid fuel boilers, open fires with a back boiler and room heaters with a boiler" — the appliance is the timer. New `_TABLE_4A_SOLID_FUEL_BOILER_CODES` frozenset + branch in `_separately_timed_dhw`. SF3 (code 160 + WHC=901): worksheet (59)m winter 64.58 / summer 41.92 matches cascade. ΔSAP +0.30 → -0.0000 EXACT all 4 metrics. SF2 narrows +2.06 → +1.86 (remaining is the §12.4.4 immersion rule). |
| S0380.154 | `5e941b92` | SAP 10.2 §12.4.4 (PDF p.36-37) back-boiler summer-immersion HW split. For Table 4a codes 156 + 158 (back-boiler combos) + WHC ∈ {901, 902, 914} + cylinder, HW splits: winter (Oct-May) at boiler eff + summer (Jun-Sep) at 100% electric immersion. New `_section_12_4_4_summer_immersion_applies(epc, main)` predicate + `_section_12_4_4_hw_blend(...)` returning 5-tuple (annual_hw_fuel_kwh, blended_cost, blended_co2, blended_pe, extra_standing). `_primary_loss_override` zeros (59)m Jun-Sep. Orchestrator wires 4 fields + standing once. SF2 closures: ΔSAP +1.86 → -0.0000 EXACT, Δcost -£42.84 → -£0.00 EXACT; CO2/PE residuals -93/-1027 are Elmhurst summer CO2/PE blend artifacts vs Table 12d/12e. |
| S0380.155 | `152db1ae` | SAP 10.2 Table 4a (PDF p.163-164) heat-pump rows split efficiency into Space and Water columns. Codes 211/213 (Cat 4 GSHP/WSHP ≤35°C: SH 230 / DHW 170), 215/216/217 (Cat 4 gas-fired HP ≤35°C: SH 120-110 / DHW 84-77), and Cat 5 warm-air equivalents 521/523/525/526/527. New `_TABLE_4A_HEAT_PUMP_WATER_EFFICIENCY` 10-code dict consulted in `_water_efficiency_with_category_inherit` before `seasonal_efficiency` fallback. Codes where Space == Water (214/221/223/224/524) unchanged. gshp (code 211) HW kWh 841 → 1138.45 (matches worksheet 1138.46). ΔSAP +0.94 → -0.0178. No regressions on 40 other variants. |
## Standard slice workflow (unchanged)
1. Read spec page + identify rule
2. Probe one cluster variant; verify diagnosis via monkey-patch / direct walk
3. Write failing AAA test (literal `# Arrange / # Act / # Assert`)
4. Implement helper / dispatch entry / mapper extension
5. Re-pin affected variants (DO NOT widen tolerance)
6. Run extended handover suite (command below)
7. Pyright net-zero check (`git stash` → pyright → `git stash pop` → pyright)
8. Commit with spec citation + `Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>`
9. Update `project-heating-systems-corpus` + `MEMORY.md` index
## Test baseline at HEAD `152db1ae`
```bash
PYTHONPATH=/workspaces/model python -m pytest \
backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
backend/documents_parser/tests/test_heating_systems_corpus.py \
backend/documents_parser/tests/test_elmhurst_extractor.py \
backend/documents_parser/tests/test_elmhurst_end_to_end.py \
domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
domain/sap10_calculator/worksheet/tests/test_heat_transmission.py \
domain/sap10_calculator/worksheet/tests/test_internal_gains.py \
domain/sap10_calculator/worksheet/tests/test_solar_gains.py \
domain/sap10_calculator/worksheet/tests/test_dimensions.py \
domain/sap10_calculator/worksheet/tests/test_rating.py \
domain/sap10_calculator/worksheet/tests/test_ventilation.py \
domain/sap10_calculator/worksheet/tests/test_appendix_h_solar.py \
domain/sap10_calculator/worksheet/tests/test_mev.py \
domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
domain/sap10_calculator/tests/test_pcdb_table_322_lookup.py \
domain/sap10_calculator/tests/test_pcdb_table_329_lookup.py \
domain/sap10_calculator/tests/test_table_12a.py \
--no-cov -q
```
Expected: **899 pass, 0 fail.**
## Memories to load (in order)
```
project-heating-systems-corpus # HEAD 152db1ae
feedback-sap-10-2-only-never-10-3 # CRITICAL — never reference SAP 10.3
feedback-software-no-special-handling # CRITICAL — apply spec uniformly
feedback-spec-floor-skepticism # CUTS BOTH WAYS — skeptical of your OWN audit narrative
feedback-worksheet-not-api-reference
feedback-spec-citation-in-commits
feedback-verify-handover-claims
feedback-zero-error-strict # TARGET: ΔSAP_c < 1e-4 vs worksheet
feedback-commit-per-slice
feedback-aaa-test-convention
feedback-e2e-validation-philosophy
feedback-abs-diff-over-pytest-approx
feedback-golden-residuals-near-zero
feedback-one-e-minus-4-across-the-board
reference-unmapped-sap-code
reference-unmapped-api-code
project-oil-price-spec-divergence
```
## What NOT to do
- **Don't reference SAP 10.3** — track 10.2 deliberately
- **Don't widen pin tolerances** — re-pin smaller or find the spec gap
- **Don't add empirical gates** to keep cohort pins stable when a
spec rule clearly applies
- **Don't re-investigate Slices .91..155** — all settled
- **Don't add new helpers to `domain/sap10_ml/`** — on deprecation
path; `domain/sap10_calculator/tables/` is the canonical home
- **Don't treat ΔSAP=0.07 as "closed"** — target is <1e-4 vs worksheet
- **Don't try to close the lighting-PE +48.66 cluster** — it's an
Elmhurst-vs-spec quirk (Elmhurst uses Table 12 annual factor for
off-peak HW immersion while spec Table 12d/12e header mandates
monthly factors). Closing it would violate spec.
- **Don't form a spec hypothesis without per-line data** — walk the
worksheet line-by-line for the failing variant first, then look up
the spec rule. Headline residuals tell you a gap exists; only the
per-line walk tells you which section of the spec it lives in.
## Spec source quick-reference
All under `domain/sap10_calculator/docs/specs/`:
- **SAP 10.2 full spec**: `sap-10-2-full-specification-2025-03-14.pdf`
- **§4** (p.135-137) — water heating worksheet (45..65)
- **§9.2.4** (p.27) — Solid fuel boiler systems (the appliance is
the timer; Table 3 not-separately-timed row applies). **Slice .153.**
- **§9.4.11** (p.30) — Boiler interlock: -5pp to BOTH SH and DHW
- **§9.4.19** (p.34-35) — Storage heater controls
- **§12** (p.45) — Electricity tariff types
- **§12.4.4** (p.36-37) — Solid fuel systems; back-boiler combos
use electric immersion in summer. **Slice .154.**
- **§A.2.2** (~p.189) — Forced-secondary set
- **Appendix D §D2.1 (2)** (p.57) — Eq D1 monthly water eff cascade
- **Appendix F2** (p.63) — 18-hour CPSU: high rate for all other uses
- **Appendix N3** (p.107-109) — Heat pump DHW efficiency cascade
- **Table 3** (p.160) — Primary circuit loss; zero-loss list +
middle row "not separately timed" h=5/h=3. **Slices .152 / .153.**
- **Table 4a** (p.163-170) — heating systems incl. **separate
Space + Water columns for HP rows**. **Slice .155.**
- **Table 4b** (p.168) — gas/liquid boilers seasonal efficiency
- **Table 4e** (p.171-173) — heating system controls + temperature
adjustment column. Group 4 codes 2401/2402/2403 = electric
storage controls (+0.7/+0.4/+0.4 K).
- **Table 4f** (p.174) — pumps + fans
- **Table 9a/9b** (p.183) — utilisation factor + off-period reduction
- **Table 9c** (p.184) — MIT cascade (step 8 = Table 4e adj wired)
- **Table 11** (p.188) — secondary heating fraction
- **Table 12** (p.191) — SAP rating fuel prices + standing charges
- **Table 12a** (p.191) — high/low-rate fraction by system × tariff
- **Table 12d** (p.195) — monthly variation in CO2 factors for
electricity (spec mandates use INSTEAD OF Table 12 annual)
- **Table 12e** (p.196) — monthly variation in PE factors
- **Table 13** (p.197) — high-rate fraction for electric DHW
- **RdSAP 10 spec**: `RdSAP 10 Specification 10-06-2025.pdf`
- **§4.1 Table 5** (p.28) — Ventilation parameters incl.
extract fans age-band default
- **§5** (p.29) — Floor infiltration spec rule
- **§10.11 Table 29** (p.56) — Heating/HW parameters; inaccessible cylinder
- **§19 Table 32** (p.95) — RdSAP10 fuel prices / CO2 / PE
## Good luck.

View file

@ -0,0 +1,303 @@
# Handover — post Slices S0380.156..159
Branch: `feature/per-cert-mapper-validation`. **HEAD `fba45d11`**.
Predecessor: [`HANDOVER_POST_S0380_155.md`](HANDOVER_POST_S0380_155.md).
## TL;DR
Four slices landed — three closed the electric 2 (Cat 5 warm-air ASHP
code 524) cohort entry point, one closed the electric 5 (Cat 7
slimline storage code 402 + 18-hour tariff) entry point. All four
came from the same per-line walk discipline: dump the worksheet
section the residual landed in, identify the diverging line ref,
look up the spec rule.
| Slice | Commit | Spec rule closed |
|---|---|---|
| S0380.156 | `02092c80` | SAP 10.2 Table 3 (p.160) zero-loss list — universal WHC=903 guard at top of `_primary_loss_applies`. Cat-4 HP branch was falsely returning True when WHC=903 means electric immersion (no primary circuit). |
| S0380.157 | `a2a4b682` | SAP 10.2 Table 2b note b) (p.159) "×0.9 if separate DHW timing (boiler / warm-air / HP)". Companion WHC=903 guard at top of `_separately_timed_dhw`. Electric immersion is not in the verbatim system-type list. |
| S0380.158 | `8843df1b` | SAP 10.2 Table 4f (p.174) row "Warm air heating system fans" = SFP × 0.4 × V per footnote e default SFP = 1.5 W/(l/s). New `_TABLE_4A_WARM_AIR_SAP_CODES` frozenset (22 codes) + leaf helper. |
| S0380.159 | `fba45d11` | SAP 10.2 Table 4a (p.166) Cat 7 R splits between Off-peak (codes 402/403 R=0.2) and 24-hour heating tariff (R=0.4). Per §12.4.3 the 18-hour tariff has 18h low-rate availability ≈ continuous charging → routes to the 24-hour Table 4a R sub-row for codes 402/403/405/406. |
Extended handover suite at HEAD: **903 pass, 0 fail.** Pyright net-zero
(43 → 43). Σ |ΔSAP_c| across the 25-variant cohort: **2.87 → 1.21**
(58% reduction across 4 slices over one session).
## Disciplines reinforced this session
1. **Per-line walk → spec → fix.** Every closure came from dumping the
failing variant's worksheet line-by-line:
- .156: cascade `primary_loss_monthly_kwh_annual = 509.98` vs
worksheet (59)m = 0 every month → Table 3 zero-loss line.
- .157: cascade `solar_storage_monthly_kwh_annual = 403.87` vs
worksheet (56) = 448.73 → ratio = 0.9 exactly → Table 2b note b.
- .158: cascade `pumps_fans_kwh_per_yr = 0` vs worksheet (249) =
136.35 → Table 4f warm-air fans = 1.5 × 0.4 × 227.25.
- .159: cascade T_living = 20.12 vs worksheet 19.6519 → all MIT
inputs match → back-solve from Tsc formula isolates R as the
only divergence → Table 4a Cat 7 24-hour sub-row.
2. **Companion WHC=903 fixes.** S0380.156 and .157 both added a guard
at the **top** of a predicate that already had logic for the same
case lower down. The Cat-4 HP early-return was masking the
downstream electric-immersion / electric-water checks. Pattern:
when a predicate has a per-system-category early-return, the
data-shape gate (here WHC=903) needs to come **first**.
3. **Spec docstring flagged the gap before it surfaced.** S0380.159
closed a TODO already noted in the source — the existing
`_RESPONSIVENESS_BY_SAP_CODE` dict comment said "promote to
(sap_code, tariff) lookup when 24-hour fixture surfaces; until
then the off-peak default applies (under-shoots R for the
24-hour case)." The 18-hour fixture surfaces in this corpus.
## Current residual state at HEAD `fba45d11`
### Cascade-OK tier (25 variants on pin grid)
Sorted by |ΔSAP_c|:
| Variant | ΔSAP_c | Δcost | ΔCO2 | ΔPE | Notes |
|---|---:|---:|---:|---:|---|
| oil 1 | **-0.0000** | **-0.00** | **+0.00** | **+0.00** | EXACT |
| oil pcdb 1 | **+0.0000** | **+0.00** | **-0.00** | **+0.00** | EXACT |
| oil pcdb 2 | **+0.0000** | **+0.00** | **-0.00** | **+0.00** | EXACT |
| oil pcdb 3 | **+0.0000** | **+0.00** | **+0.00** | **-0.00** | EXACT |
| electric 1 | **-0.0000** | **-0.00** | +11.95 | +48.66 | SAP/cost exact, lighting-PE quirk |
| solid fuel 5 | **+0.0000** | **+0.00** | +11.95 | +48.66 | same lighting-PE quirk |
| solid fuel 6 | **+0.0000** | **+0.00** | +11.95 | +48.66 | same |
| solid fuel 7 | **-0.0000** | **+0.00** | +11.95 | +48.66 | same |
| solid fuel 8 | **-0.0000** | **+0.00** | +11.95 | +48.66 | same |
| solid fuel 2 | **-0.0000** | **-0.00** | -93.10 | -1027.51 | SAP/cost exact, §12.4.4 blend artifact |
| solid fuel 3 | **-0.0000** | **-0.00** | +0.00 | -0.00 | EXACT |
| pcdb 1 | -0.0108 | +£0.24 | +1.33 | +5.70 | basically exact |
| gshp | -0.0178 | +£0.41 | +7.06 | +33.52 | basically exact |
| ashp | -0.0240 | +£0.55 | +7.33 | +36.34 | basically exact |
| solid fuel 4 | +0.0850 | -£1.96 | -9.31 | -5.78 | cluster |
| solid fuel 11 | +0.0912 | -£2.10 | +10.55 | -0.74 | cluster |
| electric 8 | +0.0941 | -£2.17 | +7.92 | +6.58 | cluster |
| electric 7 | +0.1017 | -£2.34 | +7.64 | +3.10 | cluster |
| **electric 5** | **+0.1081** | **-£2.49** | **+7.30** | **+0.07** | **CLOSED in .159 — was -1.18** |
| electric 6 | +0.1081 | -£2.49 | +7.32 | +0.16 | cluster |
| solid fuel 9 | +0.1072 | -£2.47 | +9.69 | -5.07 | cluster |
| **electric 2** | **-0.1087** | **+£2.50** | **+16.54** | **+97.69** | **CLOSED in .156-.158 — was -0.46** |
| solid fuel 10 | +0.1134 | -£2.61 | +9.31 | -13.91 | cluster |
| electric 9 | +0.1199 | -£2.76 | +6.82 | -4.51 | cluster |
| electric 3 | +0.1215 | -£2.80 | +6.72 | -5.99 | cluster |
**Σ |ΔSAP_c| = 1.21** (was 2.87 at start of S0380.156).
**Σ |ΔCO2| = 267.67 kg** (was ~310).
**Σ |ΔPE| = 1489.96 kWh** (was ~2400 — driven by 5-variant lighting cluster + sf2).
Buckets:
- **EXACT** (|Δ|<1e-4): **11/25** (44%)
- **basically exact** (|Δ|<0.05): 3/25 (ashp/gshp/pcdb1)
- **mid** (0.05..0.3): 11/25 — the 9-variant cluster (electric 3/5/6/7/8/9 + sf 4/9/10/11) + electric 2
- **big** (>=0.3): **0/25** — all variants now under 0.3 SAP
### Blocked tier (16 variants — `MissingMainFuelType`)
Unchanged. Community heating × 5, electric storage 11-14, no system,
oil 2-6, pcdb 3.
## Open fronts ranked by leverage
### 1. The 9-variant cluster — ±0.09..0.13 SAP / £2..£3 / +£7 CO2 / small PE
electric 3, electric 5, electric 6, electric 7, electric 8, electric 9,
solid fuel 4, solid fuel 9, solid fuel 10, solid fuel 11 — plus
similar magnitudes on electric 5 post-.159.
The pattern is uniform:
- ΔSAP +0.085 .. +0.121 (always positive — cascade SAP higher than worksheet)
- Δcost £1.96 .. £2.80 (cascade cost lower — under-charging by ~38 kWh × 7.41 p/kWh)
- ΔCO2 +6.72 .. +10.55 kg (cascade over-emitting)
- ΔPE 13.91 .. +6.58 kWh (small, both signs)
The cost gap is consistent with cascade SH demand being **~38 kWh
lower** than worksheet. Diagnostic from earlier probing on electric
3 / 6: cascade SH_demand ≈ 11050 vs worksheet (98c) ≈ 11088 (diff
38 kWh per variant).
**Hypothesis (per [[feedback-spec-floor-skepticism]] — verify before
accepting)**: prior handover ranked this cluster as "lowest leverage —
probably Elmhurst-vs-spec PE/CO2 monthly factor pattern". After .159
closed electric 5 from ALSO being in this pattern, that hypothesis
looks weaker — the same cluster shape may have a clean spec citation.
Suggested probe:
- Pick one cluster variant (electric 3 or solid fuel 4) and run a
per-line walk against the worksheet (98c)/(211)/(255).
- The £2.50 cost suggests **38 kWh SH demand difference**. Where?
- Check §8 step 10 `Qheat = 0.024 × (Lm η·Gm) × nm` line refs vs
worksheet (98a). If cascade Qheat is off, look upstream at (97).
- Closing this cluster would shave **Σ|ΔSAP| from 1.21 → ~0.10** in
one slice. **Highest leverage today.**
### 2. Electric 2 follow-up — 0.11 SAP / +£2.50 cost / +£16 CO2 / +£98 PE
Cascade now overshoots after .156-.158 wired the missing
primary-loss / storage-loss / warm-air-fan components correctly.
Likely a small upstream SH cascade gap (cascade SH demand +57 kWh
over worksheet — Cat 5 warm-air HP specific). The +136 kWh warm-air
fan electricity bills at 18-hour high rate → adds £18.64 + 18.54 kg
CO2 + 204 kWh PE. The over-shoot of +£2.50 / +£16 CO2 / +£98 PE
roughly matches the magnitude → suggests one of:
- The warm-air fan should bill at a different rate (Table 12a
fraction for ALL_OTHER_USES on 18-hour vs the Appendix F2 rate).
- The fan power should be lower for Cat 5 HPs (different SFP than
the default 1.5 W/(l/s)).
- Some other small Cat 5 / warm-air HP specific rule.
**Suggested probe**: dump the worksheet (249) cost line carefully —
worksheet shows 136.35 × 13.67 = £18.64. Cascade should compute the
same. If Δcost ≠ +£18.64, then the fan kWh is right but the rate is
not. If Δcost = +£18.64, the fan kWh shouldn't be there at all on
this row (maybe it goes to a different (249)b line).
### 3. Lighting-only PE +48.66 cohort — 5 variants — **DEFERRED**
electric 1, solid fuel 5/6/7/8. SAP/cost EXACT; PE +48.66 / CO2
+11.95 from Elmhurst using Table 12 ANNUAL factor for off-peak HW
immersion split. Spec Table 12d/12e header mandates MONTHLY factors.
Closing it violates spec.
### 4. Mapper-extension unblocking (16 blocked variants)
- Community heating × 5 — extend extractor for §14.1 block.
- Electric storage 11-14 — extend `_ELMHURST_MAIN_HEATING_EES_TO_FUEL_CODE`
for EES codes WEA, REA, OEA.
- "No system" — spec-assumed direct electric.
- Oil 2-6 — Table 4b non-oil liquid fuels (HVO/FAME/B30K/bioethanol).
- pcdb 3 — `"Bulk LPG"` mapper dict gap.
Separate from cascade closure work. Each unblock = one mapper slice.
## Slice history (this session)
| Slice | HEAD | Scope |
|---|---|---|
| S0380.156 | `02092c80` | SAP 10.2 Table 3 (PDF p.160) zero-loss list — universal WHC=903 guard at the top of `_primary_loss_applies`. New `_WHC_ELECTRIC_IMMERSION: Final[int] = 903` constant. Pre-slice the Cat 4 HP branch returned True unconditionally when no PCDB record was lodged → electric 2 cascade falsely added ~510 kWh/yr primary loss. Closures electric 2: HW kWh 2849 → 2339, ΔSAP 0.46 → +0.81 (residual swung past — exposed offsetting bugs). Δcost +£10.56 → £18.71, ΔPE +443 → 162. |
| S0380.157 | `a2a4b682` | SAP 10.2 Table 2b note b) (PDF p.159) — companion WHC=903 guard at top of `_separately_timed_dhw`. Pre-slice the Cat 4 HP branch fired before the existing `_is_electric_water` check could route to False; cascade applied ×0.9 to (53) Temperature Factor on an immersion-fed cylinder. Reuses `_WHC_ELECTRIC_IMMERSION`. Closures electric 2: storage loss 403.87 → 448.73 EXACT, HW kWh 2339 → 2384.12 EXACT match worksheet, ΔSAP +0.81 → +0.70. |
| S0380.158 | `8843df1b` | SAP 10.2 Table 4f (PDF p.174) row "Warm air heating system fans" = SFP × 0.4 × V per footnote e default SFP = 1.5 W/(l/s). New `_TABLE_4A_WARM_AIR_SAP_CODES` frozenset (22 codes: Cat 5 HPs 521/523-527 + Cat 9 warm-air 501-515/520) + leaf helper `_table_4f_warm_air_heating_fans_kwh(main, dwelling_volume_m3, has_balanced_mv)`. Footnote-e balanced-MV omission via `_has_balanced_mechanical_ventilation` predicate. Closures electric 2: pumps_fans 0 → 136.35 EXACT, ΔSAP +0.70 → 0.11, Δcost £16.14 → +£2.50. |
| S0380.159 | `fba45d11` | SAP 10.2 Table 4a (PDF p.166) Cat 7 R splits Off-peak vs 24-hour heating tariff sub-rows. Per §12.4.3 the 18-hour tariff has 18h low-rate availability ≈ continuous charging. New `_CONTINUOUS_CHARGING_TARIFFS = {EIGHTEEN_HOUR, TWENTY_FOUR_HOUR}` + `_RESPONSIVENESS_24_HOUR_OVERRIDE_BY_SAP_CODE` (codes 402/403/405/406). `tariff: Optional[Tariff]` parameter added to `_responsiveness`; threaded through both MIT cascade call sites. Closures electric 5: ΔSAP 1.18 → +0.11 (91% reduction), Δcost +£27.09 → £2.49, ΔPE +438 → +0.07 EXACT. Electric 5 now joins the 9-variant cluster pattern. |
## Standard slice workflow (unchanged)
1. Read spec page + identify rule
2. Probe one cluster variant; verify diagnosis via monkey-patch / direct walk
3. Write failing AAA test (literal `# Arrange / # Act / # Assert`)
4. Implement helper / dispatch entry / mapper extension
5. Re-pin affected variants (DO NOT widen tolerance)
6. Run extended handover suite (command below)
7. Pyright net-zero check (`git stash` → pyright → `git stash pop` → pyright)
8. Commit with spec citation + `Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>`
9. Update `project-heating-systems-corpus` + `MEMORY.md` index
## Test baseline at HEAD `fba45d11`
```bash
PYTHONPATH=/workspaces/model python -m pytest \
backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
backend/documents_parser/tests/test_heating_systems_corpus.py \
backend/documents_parser/tests/test_elmhurst_extractor.py \
backend/documents_parser/tests/test_elmhurst_end_to_end.py \
domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
domain/sap10_calculator/worksheet/tests/test_heat_transmission.py \
domain/sap10_calculator/worksheet/tests/test_internal_gains.py \
domain/sap10_calculator/worksheet/tests/test_solar_gains.py \
domain/sap10_calculator/worksheet/tests/test_dimensions.py \
domain/sap10_calculator/worksheet/tests/test_rating.py \
domain/sap10_calculator/worksheet/tests/test_ventilation.py \
domain/sap10_calculator/worksheet/tests/test_appendix_h_solar.py \
domain/sap10_calculator/worksheet/tests/test_mev.py \
domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
domain/sap10_calculator/tests/test_pcdb_table_322_lookup.py \
domain/sap10_calculator/tests/test_pcdb_table_329_lookup.py \
domain/sap10_calculator/tests/test_table_12a.py \
--no-cov -q
```
Expected: **903 pass, 0 fail.**
## Memories to load (in order)
```
project-heating-systems-corpus # HEAD fba45d11
feedback-sap-10-2-only-never-10-3 # CRITICAL — never reference SAP 10.3
feedback-software-no-special-handling # CRITICAL — apply spec uniformly
feedback-spec-floor-skepticism # CUTS BOTH WAYS — skeptical of your OWN audit narrative
feedback-worksheet-not-api-reference
feedback-spec-citation-in-commits
feedback-verify-handover-claims
feedback-zero-error-strict # TARGET: ΔSAP_c < 1e-4 vs worksheet
feedback-commit-per-slice
feedback-aaa-test-convention
feedback-e2e-validation-philosophy
feedback-abs-diff-over-pytest-approx
feedback-golden-residuals-near-zero
feedback-one-e-minus-4-across-the-board
reference-unmapped-sap-code
reference-unmapped-api-code
project-oil-price-spec-divergence
```
## What NOT to do
- **Don't reference SAP 10.3** — track 10.2 deliberately.
- **Don't widen pin tolerances** — re-pin smaller or find the spec gap.
- **Don't add empirical gates** to keep cohort pins stable when a
spec rule clearly applies.
- **Don't re-investigate Slices .91..159** — all settled.
- **Don't add new helpers to `domain/sap10_ml/`** — on deprecation
path; `domain/sap10_calculator/tables/` is the canonical home.
- **Don't treat ΔSAP=0.07 as "closed"** — target is <1e-4 vs worksheet.
- **Don't try to close the lighting-PE +48.66 cluster** — it's an
Elmhurst-vs-spec quirk that closing would violate spec.
- **Don't form a spec hypothesis without per-line data** — walk the
worksheet line-by-line for the failing variant first.
## Spec source quick-reference
All under `domain/sap10_calculator/docs/specs/`:
- **SAP 10.2 full spec**: `sap-10-2-full-specification-2025-03-14.pdf`
- **§4** (p.135-137) — water heating worksheet (45..65)
- **§7** (p.26) — Mean internal temperature
- **§9.2.4** (p.27) — Solid fuel boiler systems
- **§9.4.11** (p.30) — Boiler interlock: -5pp to BOTH SH and DHW
- **§9.4.19** (p.34-35) — Storage heater controls
- **§12.4.3** (p.36) — Electric tariff types (7-hour / 10-hour /
18-hour / 24-hour heating). **Slice .159.**
- **§12.4.4** (p.36-37) — Solid fuel back-boiler summer immersion.
**Slice .154.**
- **§A.2.2** (~p.189) — Forced-secondary set
- **Appendix D §D2.1 (2)** (p.57) — Eq D1 monthly water eff cascade
- **Appendix F2** (p.63) — 18-hour CPSU
- **Appendix N3** (p.107-109) — Heat pump DHW efficiency cascade
- **Table 2b** (p.159) — Cylinder temperature factor + note b ×0.9
rule for boiler/warm-air/HP. **Slice .157.**
- **Table 3** (p.160) — Primary circuit loss; zero-loss list incl.
electric immersion. **Slices .152 / .153 / .156.**
- **Table 4a** (p.163-170) — heating systems + R splits between
Off-peak and 24-hour heating tariff for Cat 7 electric storage.
**Slices .155 / .159.**
- **Table 4b** (p.168) — gas/liquid boilers seasonal efficiency
- **Table 4e** (p.171-173) — heating system controls + temperature
adjustment column. Group 4 storage controls 2401/2402/2403.
- **Table 4f** (p.174) — pumps + fans (incl. warm-air row).
**Slice .158.**
- **Tables 9 / 9a / 9b / 9c** (p.182-184) — heating periods, MIT
cascade, T_sc formula. Used in .159 back-solve.
- **Table 11** (p.188) — secondary heating fraction
- **Table 12** (p.191) — SAP rating fuel prices + standing charges
- **Table 12a** (p.191) — high/low-rate fraction by system × tariff
- **Table 12d/12e** (p.195-196) — monthly variation in CO2/PE factors
- **Table 13** (p.197) — high-rate fraction for electric DHW
- **RdSAP 10 spec**: `RdSAP 10 Specification 10-06-2025.pdf`
- **§4.1 Table 5** (p.28) — Ventilation parameters
- **§5** (p.29) — Floor infiltration spec rule
- **§10.11 Table 29** (p.56) — Heating/HW parameters
- **§19 Table 32** (p.95) — RdSAP10 fuel prices / CO2 / PE
## Good luck.

View file

@ -0,0 +1,295 @@
# Handover — post Slices S0380.160..163
Branch: `feature/per-cert-mapper-validation`. **HEAD `9896644c`**.
Predecessor: [`HANDOVER_POST_S0380_159.md`](HANDOVER_POST_S0380_159.md).
## TL;DR
Four slices landed; the 41-variant controlled-variable heating-systems
corpus closed from Σ|ΔSAP_c| 1.24 → 0 on its 25 cascade-OK variants.
All 25 now SAP / cost / CO2 / PE **EXACT** vs the Elmhurst worksheet
on all 4 metrics, with only `solid fuel 2` open via the S0380.154
summer-immersion-blend artifact. The master doc gained a new §8
"Elmhurst-mirrored spec divergences" section seeded by .163.
| Slice | Commit | Spec rule / engine behaviour closed |
|---|---|---|
| S0380.160 | `af34ad98` | SAP 10.2 Table 5a (PDF p.177) "Central heating pump in heated space" — wet-pump gate. Pre-slice cascade added 7 W pump gain for every non-HP main; row only applies to mains with a water-loop circulation pump (electric storage / solid-fuel room heaters / electric direct-acting are dry → 0 W). |
| S0380.161 | `482ce88b` | SAP 10.2 Table 5a (PDF p.177) "Warm air heating system fans a) c)" — GAIN side = SFP × 0.04 × V. Sister to S0380.158 (kWh side); wires Cat 5 warm-air HP (e.g. electric 2 code 524) + Cat 9 warm-air non-HP. Footnote c) omission when balanced MV present. |
| S0380.162 | `8d465d97` | SAP 10.2 Appendix N3.1 (PDF p.105) "The default heat gain from Table 5a is included via worksheet (70)" for electric HPs. .160 had over-stripped HP pump gain; .162 refines: PCDB Table 362 records keep 0 W (pump in COP per N1.2.1); Cat 5 warm-air HPs keep 0 W (no water pump); Cat 4 HPs without PCDB get 3 W default. |
| S0380.163 | `9896644c` | **First Elmhurst-mirrored spec divergence.** SAP 10.2 Table 12 footnote (t) reads "monthly Table 12e factors should be used" for all electric end-uses; the BRE-approved Elmhurst engine uses Table 12 annual flat (1.501 PE / 0.136 CO2) for the worksheet (278) "Water heating (low-rate cost)" line on dual-rate tariffs. Cascade now mirrors the engine — STANDARD tariff still monthly, dual-rate (7-hour / 10-hour / 18-hour / 24-hour) → annual. |
Extended handover suite at HEAD: **907 pass, 0 fail.** Pyright net-zero
(43 → 43).
## Disciplines reinforced this session
1. **Per-line walk before forming a spec hypothesis.** Every closure
came from dumping the failing variant's worksheet line-by-line:
- .160: cascade `pumps_fans[Jan] = 7.0` vs worksheet (70) = 0 for
electric 3 → Table 5a "Central heating pump" row inapplicable.
- .161: cascade (70) = 0 for electric 2 vs worksheet 13.6350 W =
1.5 × 0.04 × 227.25 → Table 5a warm-air-fan row never wired.
- .162: cascade (70) = 0 for ashp vs worksheet 3.0 W → Appendix
N3.1 default heat gain rule for electric HPs without PCDB.
- .163: cascade HW PE factor 1.5214 vs worksheet 1.5010 →
Elmhurst applies Table 12 annual for low-rate dual-rate billing.
2. **`[[feedback-spec-floor-skepticism]]` cuts both ways.** The
handover post-.159 claimed the lighting-PE +48.66 cohort was
"non-closable per spec" (Table 12 footnote (t) mandates monthly).
Per-line walk revealed: cascade IS spec-correct, Elmhurst diverges,
and per [[feedback-software-no-special-handling]] we mirror the
engine. The user pushed back on the "non-closable" framing and
that pushback was correct — the divergence IS closable, just at
the cost of one documented Elmhurst mirror. New master-doc §8
captures the divergence with criteria for when to add more.
3. **Slice rollback as a debugging tool.** S0380.160 over-stripped
HP pump gain (zeroed for all HPs including ashp/gshp where the
spec applies the Table 5a default). .162 didn't revert .160 — it
refined the predicate with the Appendix N3.1 carve-out, so PCDB-
Table-362 HPs stay at 0 and non-PCDB HPs apply the default.
## Current residual state at HEAD `9896644c`
### Cascade-OK tier (25 variants on pin grid)
**All 25 variants now SAP / cost / CO2 / PE EXACT (|Δ| < 1e-3)** vs the
worksheet, except `solid fuel 2`:
| Variant | ΔSAP_c | Δcost | ΔCO2 | ΔPE | Notes |
|---|---:|---:|---:|---:|---|
| ashp | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| electric 1 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| electric 2 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT (was -0.11 SAP) |
| electric 3 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT (was +0.12 SAP) |
| electric 5 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| electric 6 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| electric 7 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| electric 8 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| electric 9 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| gshp | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| oil 1 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| oil pcdb 1 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| oil pcdb 2 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| oil pcdb 3 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| pcdb 1 | -0.0108 | +£0.24 | +1.33 | +5.70 | sub-tolerance |
| **solid fuel 2** | **±0.0000** | **±0.00** | **-93.10** | **-1027.51** | S0380.154 summer-immersion-blend artifact |
| solid fuel 3 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| solid fuel 4 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| solid fuel 5 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| solid fuel 6 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| solid fuel 7 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| solid fuel 8 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| solid fuel 9 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| solid fuel 10 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| solid fuel 11 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
**Σ|ΔSAP_c| = 0.011** (entirely `pcdb 1`, was 2.87 at start of
S0380.156). 24/25 variants are SAP/cost/CO2/PE EXACT.
### Blocked tier (16 variants — `MissingMainFuelType`)
Unchanged. Community heating × 5, electric storage 11-14, no system,
oil 2-6, pcdb 3.
## Open fronts ranked by leverage
### 1. **`solid fuel 2` — S0380.154 summer-immersion-blend CO2/PE — 93/-1027**
Cascade ΔSAP / Δcost are EXACT but ΔCO2 = 93 kg/yr and ΔPE = 1027
kWh/yr remain. Source: S0380.154 split HW into winter-boiler + Jun-Sep
electric-immersion blend. The blend (`_section_12_4_4_hw_blend`) sets
its own `hw_co2_factor` / `hw_pe_factor` directly — it doesn't route
through `_hot_water_co2_factor_kg_per_kwh` / `_hot_water_primary_factor`
which got the .163 dual-rate annual gate.
Likely a parallel fix: either route the blend through the same Elmhurst-
mirror gate, OR investigate whether Elmhurst applies Table 12d/12e
monthly for the summer-immersion months (the 4 Jun-Sep months) and the
boiler factor for the 8 winter months, vs the cascade's all-monthly
treatment. Per-line walk needed first — `_section_12_4_4_hw_blend`
docstring is at `domain/sap10_calculator/rdsap/cert_to_inputs.py:4870`.
**Highest leverage:** closes the LAST open variant in the corpus
cascade-OK tier. After this, 25/25 EXACT on all 4 metrics.
### 2. **`pcdb 1`0.0108 SAP / +£0.24 / +1.33 CO2 / +5.7 PE**
Sub-tolerance gap. PCDB-listed gas boiler (Table 322 index 716). Not
the same shape as the lighting-PE quirk (different magnitude per kWh).
Probably a Δ in cascade HW or SH computation specific to PCDB Table
322 path. Lower leverage — already < 0.05 SAP.
### 3. **Mapper-extension unblocking (16 blocked variants)**
Separate from cascade closure. Each unblock = one mapper slice:
- Community heating × 5 — extend extractor for §14.1 block.
- Electric storage 11-14 — extend `_ELMHURST_MAIN_HEATING_EES_TO_FUEL_CODE`
for EES codes WEA, REA, OEA.
- "No system" — spec-assumed direct electric.
- Oil 2-6 — Table 4b non-oil liquid fuels (HVO/FAME/B30K/bioethanol).
- pcdb 3 — `"Bulk LPG"` mapper dict gap.
Each variant unblocked becomes a new pin on the corpus residual grid;
closures from there follow the existing per-line-walk discipline.
### 4. **Cohort-2 golden residuals**
`test_golden_fixtures.py` carries PE/CO2 residual pins for 38 cohort-2
certs. The S0380.163 fix (HW PE/CO2 annual on dual-rate) likely
affected several. After S0380.163 ran the golden suite passes (59/59);
verify the pinned residuals are still optimal or could now be tightened
toward zero per [[feedback-golden-residuals-near-zero]]. **Quick check
slice:** loop the golden fixtures, dump current residual vs pinned
residual, re-pin if pinned > actual.
## Slice history (this session)
| Slice | HEAD | Scope |
|---|---|---|
| S0380.160 | `af34ad98` | SAP 10.2 Table 5a (PDF p.177) row 1 "Central heating pump in heated space" wet-pump gate. New `_any_main_system_has_central_heating_pump(epc)` predicate in `internal_gains.py` mirroring `cert_to_inputs._is_wet_boiler_main` (S0380.149's kWh-side gate). Pre-slice the cascade applied 7 W (UNKNOWN-date default) for every non-HP main; per worksheet evidence (electric 3 (70) = 0 every month vs cascade 7 W), dry mains have no central heating pump and the row simply doesn't apply. 10-variant cluster closure: electric 3/5/6/7/8/9 + solid fuel 4/9/10/11 ΔSAP +0.085..+0.121 → ±0.0000 EXACT. |
| S0380.161 | `482ce88b` | SAP 10.2 Table 5a (PDF p.177) row "Warm air heating system fans a) c)" GAIN side = SFP × 0.04 × V W with default SFP 1.5 W/(l/s) per footnote c). Sister to S0380.158 which wired the Table 4f kWh side (136.35 kWh/yr). Per-line walk on electric 2 (Cat 5 ASHP code 524): worksheet (70) = 13.6350 W heating-mask, cascade 0 W. New `_any_main_system_has_warm_air_distribution(epc)` + `_has_balanced_mechanical_ventilation(epc)` predicates + `_TABLE_5A_WARM_AIR_FAN_DEFAULT_SFP_W_PER_L_PER_S = 1.5` constant. Closures electric 2: ΔSAP 0.1087 → 0.0000 EXACT. |
| S0380.162 | `8d465d97` | SAP 10.2 Appendix N3.1 (PDF p.105) "Circulation pump and fan" — "For electric heat pumps: ... The default heat gain from Table 5a is included via worksheet (70)." S0380.160 over-stripped (zeroed for all HPs); .162 refines the HP gate in `_any_main_system_has_central_heating_pump`: PCDB Table 362 records keep 0 W (pump in COP per N1.2.1); Cat 5 warm-air HPs keep 0 W (no water pump; warm-air fan via .161); Cat 4 HPs without PCDB get the Table 5a default per pump age. Closures: ashp ΔSAP 0.0240 → +0.0000 EXACT, Δcost +£0.55 → +£0.00 EXACT, ΔPE +36.34 → +25.51 (residual narrows to HW annual-vs-monthly Elmhurst quirk only); gshp same shape. |
| **S0380.163** | **`9896644c`** | **First Elmhurst-mirrored spec divergence. SAP 10.2 Table 12 footnote (t) (PDF p.189) reads literally would apply Table 12e monthly factors to all electric end-uses including dual-rate HW. The BRE-approved Elmhurst engine applies Table 12 ANNUAL flat (1.501 PE / 0.136 CO2) for the worksheet (278) "Water heating (low-rate cost)" row on dual-rate tariffs. New `tariff: Tariff` parameter on `_hot_water_primary_factor` + `_hot_water_co2_factor_kg_per_kwh`: STANDARD → monthly cascade (unchanged); 7-hour/10-hour/18-hour/24-hour → Table 12 annual flat. 18-variant deferred cohort closure (electric 1/2/3/5/6/7/8/9 + solid fuel 4/5/6/7/8/9/10/11 + ashp + gshp): ΔCO2 +6.31/+11.95 → ±0.0000 EXACT, ΔPE +25.51/+48.66 → ±0.0000 EXACT. All 25 cascade-OK variants now SAP / cost / CO2 / PE EXACT (except solid fuel 2 summer-immersion blend artifact). Master doc gained new §8 "Elmhurst-mirrored spec divergences" section.** |
## Standard slice workflow (unchanged)
1. Read spec page + identify rule (or Elmhurst worksheet pattern)
2. Probe one variant; verify diagnosis via monkey-patch / direct walk
3. Write failing AAA test (literal `# Arrange / # Act / # Assert`)
4. Implement helper / dispatch entry / mapper extension
5. Re-pin affected variants (DO NOT widen tolerance)
6. Run extended handover suite (command below)
7. Pyright net-zero check (`git stash` → pyright → `git stash pop` → pyright)
8. If mirroring Elmhurst against spec literal: add a row to
`SAP_CALCULATOR.md §8 "Elmhurst-mirrored spec divergences"`.
9. Commit with spec citation + `Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>`
10. Update `project-heating-systems-corpus` + `MEMORY.md` index
## Test baseline at HEAD `9896644c`
```bash
PYTHONPATH=/workspaces/model python -m pytest \
backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
backend/documents_parser/tests/test_heating_systems_corpus.py \
backend/documents_parser/tests/test_elmhurst_extractor.py \
backend/documents_parser/tests/test_elmhurst_end_to_end.py \
domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
domain/sap10_calculator/worksheet/tests/test_heat_transmission.py \
domain/sap10_calculator/worksheet/tests/test_internal_gains.py \
domain/sap10_calculator/worksheet/tests/test_solar_gains.py \
domain/sap10_calculator/worksheet/tests/test_dimensions.py \
domain/sap10_calculator/worksheet/tests/test_rating.py \
domain/sap10_calculator/worksheet/tests/test_ventilation.py \
domain/sap10_calculator/worksheet/tests/test_appendix_h_solar.py \
domain/sap10_calculator/worksheet/tests/test_mev.py \
domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
domain/sap10_calculator/tests/test_pcdb_table_322_lookup.py \
domain/sap10_calculator/tests/test_pcdb_table_329_lookup.py \
domain/sap10_calculator/tests/test_table_12a.py \
--no-cov -q
```
Expected: **907 pass, 0 fail.**
## Memories to load (in order)
```
project-heating-systems-corpus # HEAD 9896644c
feedback-sap-10-2-only-never-10-3 # CRITICAL — never reference SAP 10.3
feedback-software-no-special-handling # CRITICAL — informed S0380.163; mirror the engine
feedback-spec-floor-skepticism # cuts both ways: spec-floor AND non-closable framings
feedback-worksheet-not-api-reference
feedback-spec-citation-in-commits
feedback-verify-handover-claims
feedback-zero-error-strict # TARGET: ΔSAP_c < 1e-4 vs worksheet
feedback-commit-per-slice
feedback-aaa-test-convention
feedback-e2e-validation-philosophy
feedback-abs-diff-over-pytest-approx
feedback-golden-residuals-near-zero # relevant for Open Front #4 below
feedback-one-e-minus-4-across-the-board
reference-unmapped-sap-code
reference-unmapped-api-code
project-oil-price-spec-divergence
```
## What NOT to do
- **Don't reference SAP 10.3** — track 10.2 deliberately.
- **Don't widen pin tolerances** — re-pin smaller or find the spec gap.
- **Don't add empirical gates** to keep cohort pins stable when a
spec rule clearly applies. Add Elmhurst-mirror gates ONLY when
worksheet evidence is reproducible across multiple certs.
- **Don't re-investigate Slices .91..163** — all settled.
- **Don't add new helpers to `domain/sap10_ml/`** — on deprecation
path; `domain/sap10_calculator/tables/` is the canonical home.
- **Don't treat ΔSAP=0.07 as "closed"** — target is <1e-4 vs worksheet.
- **Don't add a new SAP_CALCULATOR.md §8 divergence row without per-line
worksheet evidence across ≥2 certs.** The Elmhurst-mirror gate is
the exception, not the rule; default to spec-correct cascade.
## Spec source quick-reference
All under `domain/sap10_calculator/docs/specs/`:
- **SAP 10.2 full spec**: `sap-10-2-full-specification-2025-03-14.pdf`
- **§4** (p.135-137) — water heating worksheet (45..65)
- **§7** (p.26) — Mean internal temperature
- **§9.2.4** (p.27) — Solid fuel boiler systems
- **§9.4.11** (p.30) — Boiler interlock: -5pp to BOTH SH and DHW
- **§9.4.19** (p.34-35) — Storage heater controls
- **§12.4.3** (p.36) — Electric tariff types
- **§12.4.4** (p.36-37) — Solid fuel back-boiler summer immersion.
**Used in Slice .154; the source of `solid fuel 2`'s open
residual (Open Front #1).**
- **§A.2.2** (~p.189) — Forced-secondary set
- **Appendix D §D2.1 (2)** (p.57) — Eq D1 monthly water eff cascade
- **Appendix F2** (p.63) — 18-hour CPSU
- **Appendix N3.1** (p.105) — Heat pump circulation pump GAIN
inclusion rule. **Slice .162.**
- **Appendix N3** (p.107-109) — Heat pump DHW efficiency cascade
- **Table 2b** (p.159) — Cylinder temperature factor + note b ×0.9
rule for boiler/warm-air/HP. **Slice .157.**
- **Table 3** (p.160) — Primary circuit loss; zero-loss list incl.
electric immersion. **Slices .152 / .153 / .156.**
- **Table 4a** (p.163-170) — heating systems + R splits. **Slices
.155 / .159.**
- **Table 4b** (p.168) — gas/liquid boilers seasonal efficiency
- **Table 4e** (p.171-173) — heating system controls + temperature
adjustment column. Group 4 storage controls 2401/2402/2403.
- **Table 4f** (p.174) — pumps + fans (incl. warm-air row).
**Slice .158.**
- **Table 5a** (p.177) — pump + fan GAINS (incl. central heating
pump and warm-air-fan rows). **Slices .160 / .161 / .162.**
- **Tables 9 / 9a / 9b / 9c** (p.182-184) — heating periods, MIT
cascade, T_sc formula.
- **Table 11** (p.188) — secondary heating fraction
- **Table 12** (p.189) — fuel prices + annual CO2/PE factors;
footnotes (s)/(t) point to monthly cascades. **Slice .163
(Elmhurst-mirror divergence).**
- **Table 12a** (p.191) — high/low-rate fraction by system × tariff
- **Table 12d/12e** (p.194-195) — monthly variation in CO2/PE
factors. **Slice .163 (mirrored against literal reading).**
- **Table 13** (p.197) — high-rate fraction for electric DHW
- **RdSAP 10 spec**: `RdSAP 10 Specification 10-06-2025.pdf`
- **§4.1 Table 5** (p.28) — Ventilation parameters
- **§5** (p.29) — Floor infiltration spec rule
- **§10.11 Table 29** (p.56) — Heating/HW parameters
- **§19 Table 32** (p.95) — RdSAP10 fuel prices / CO2 / PE
- **§19.2** (p.94) — RdSAP10 CO2/PE = SAP10.2 Table 12 (defers to
SAP 10.2 §14 for PE calc — confirms footnote (t) applies to
EPC PE block).
## Master doc
The canonical architecture + API + validation doc lives at
[`domain/sap10_calculator/docs/SAP_CALCULATOR.md`](SAP_CALCULATOR.md)
(7 sections + new §8). The S0380.163 slice added §8 as the home for
Elmhurst-mirrored spec divergences; future slices that diverge from
spec literal interpretation should add a §8.x row there.
## Good luck.

View file

@ -0,0 +1,230 @@
# Handover — post Slice S0380.164
Branch: `feature/per-cert-mapper-validation`. **HEAD `<new>`**.
Predecessor: [`HANDOVER_POST_S0380_163.md`](HANDOVER_POST_S0380_163.md).
## TL;DR
S0380.164 closed the **last** open variant in the 25-variant cascade-OK
tier of the heating-systems corpus. `solid fuel 2`'s residual ΔCO2 =
93.10 / ΔPE = 1027.51 (S0380.154 summer-immersion blend artifact) →
±0.0000 EXACT on both. All 25 cascade-OK variants now SAP / cost /
CO2 / PE EXACT vs the Elmhurst worksheet on every metric. Master doc
gained §8.2 "Elmhurst-mirrored summer-immersion CO2/PE double-count"
flagged with the single-cert evidence caveat.
| Slice | Commit | Spec rule / engine behaviour closed |
|---|---|---|
| S0380.164 | `<new>` | **Second Elmhurst-mirrored spec divergence.** SAP 10.2 §12.4.4 (PDF p.36-37) back-boiler combos: spec-literal CO2/PE for summer immersion = Σ wh_summer_m × Table 12d/12e monthly (per Table 12 footnotes s/t). BRE-approved Elmhurst engine adds an extra `S_fuel × Table 12 annual electric` term ON TOP of the monthly cascade for dual-rate tariffs — same shape as §8.1 (S0380.163) but additive. Closure SF2: ΔCO2 93.10 → +0.0000, ΔPE 1027.51 → +0.0000. 25/25 cascade-OK variants now SAP / cost / CO2 / PE EXACT. Documented at `SAP_CALCULATOR.md §8.2` with explicit single-cert evidence flag. |
Extended handover suite at HEAD: **909 pass, 0 fail.** Pyright net-zero
(43 → 43).
## Discipline reinforced this session
1. **Per-line walk first.** SF2's worksheet (264) HW CO2 factor 0.3710
and (278) HW PE factor 1.3771 don't decompose into any single Table
12 / 12d / 12e combination. Back-solving with the cascade's
`W × anth_annual + S × monthly_summer_avg` formula left an unexplained
residual that matched exactly `S_fuel × Table 12 annual electric` on
both metrics. The pattern is the §8.1 (S0380.163) Elmhurst-mirror
applied a second time, additively.
2. **Single-cert evidence handled with discipline.** The corpus has
exactly one §12.4.4 fixture: SF2. `solid fuel 1` (= code 156) is
an empty folder; no other corpus cert exercises a §12.4.4 back-
boiler combo. The handover discipline says "≥2 certs" before
adding a `SAP_CALCULATOR.md §8` row. **User-explicit override:** the
user accepted the single-cert case given (a) clean per-line
evidence (math matches to within rounding); (b) the same shape as
the §8.1 mirror already in place. The new §8.2 row is tagged with
an explicit "⚠ Single-cert evidence" subsection so future agents
know to revisit when a second §12.4.4-eligible cert worksheet
becomes available.
3. **Cost unaffected — only CO2/PE.** The §12.4.4 blend computes cost
cleanly per spec: `W × boiler_price + S × off_peak_low_price`. The
double-count quirk only appears on the CO2 and PE factor lines.
Consistent with Elmhurst's engine where cost flows through
pricing tables (Table 32) while CO2/PE flow through factor tables
(Table 12 / 12d / 12e) — the divergence is in the factor logic, not
the price logic.
## Current residual state at HEAD `<new>`
### Cascade-OK tier (25 variants on pin grid) — **ALL EXACT**
All 25 variants now SAP / cost / CO2 / PE **EXACT** (|Δ| < 1e-3) vs the
worksheet, with the sole remaining residual being `pcdb 1` at
sub-tolerance.
| Variant | ΔSAP_c | Δcost | ΔCO2 | ΔPE | Notes |
|---|---:|---:|---:|---:|---|
| ashp | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| electric 1 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| electric 2 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| electric 3 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| electric 5 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| electric 6 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| electric 7 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| electric 8 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| electric 9 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| gshp | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| oil 1 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| oil pcdb 1 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| oil pcdb 2 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| oil pcdb 3 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| pcdb 1 | -0.0108 | +£0.24 | +1.33 | +5.70 | sub-tolerance |
| **solid fuel 2** | **±0.0000** | **±0.00** | **±0.0000** | **±0.0000** | **EXACT (was -93/-1027 pre-slice)** |
| solid fuel 3 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| solid fuel 4 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| solid fuel 5 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| solid fuel 6 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| solid fuel 7 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| solid fuel 8 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| solid fuel 9 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| solid fuel 10 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
| solid fuel 11 | ±0.0000 | ±0.00 | ±0.00 | ±0.00 | EXACT |
**Σ|ΔSAP_c| = 0.011** (entirely `pcdb 1`). The 41-variant heating-
systems corpus is **closed on its cascade-OK tier**; only sub-tolerance
work and mapper-extension unblocks remain.
### Blocked tier (16 variants — `MissingMainFuelType`)
Unchanged. Community heating × 5, electric storage 11-14, no system,
oil 2-6, pcdb 3.
## Open fronts ranked by leverage
### 1. **`pcdb 1` sub-tolerance — 0.011 SAP / +£0.24 / +1.33 CO2 / +5.7 PE**
The last sub-tolerance gap in the cascade-OK tier. Per-line probe:
- PCDF Index 716 (Potterton oil boiler, 65 % winter / 53 % summer)
- Cascade HW kWh = 7068.41 vs worksheet (219) = 7063.96 → Δ +4.45 kWh
- Δ4.45 × 5.44 p/kWh = £0.242 ≡ Δcost pin ✓
- Δ4.45 × 0.298 kg/kWh = 1.325 kg ≡ ΔCO2 pin ✓
- Δ4.45 × 1.180 kWh/kWh = 5.25 (vs pin +5.70 — close, demand-mode
HW kWh likely differs by ~0.5 from rating-mode)
The 4.45 kWh HW kWh overshoot is a tiny computation diff in the Eq D1
monthly cascade. Worksheet (217)m for pcdb 1:
- Jan-May / Oct-Dec: 54.41 .. 57.00 (Eq D1 weighted between adjusted
60 winter and adjusted 48 summer)
- Jun-Sep: 48.00 (summer eff only, no Eq D1 weighting)
The cascade likely produces slightly different monthly weights or fails
to switch to summer-only on Jun-Sep. Closing this needs a deep dive
into the PCDB-Table-322 Eq D1 cascade for `Cylinder Stat: No` certs
with WHC=901. ~£0.24 + 1.3 kg / 5.7 kWh is essentially noise.
### 2. **Mapper-extension unblocking (16 blocked variants)**
Separate from cascade closure. Each unblock = one mapper slice:
- Community heating × 5 — extend extractor for §14.1 block.
- Electric storage 11-14 — extend `_ELMHURST_MAIN_HEATING_EES_TO_FUEL_CODE`
for EES codes WEA, REA, OEA.
- "No system" — spec-assumed direct electric.
- Oil 2-6 — Table 4b non-oil liquid fuels (HVO/FAME/B30K/bioethanol).
- pcdb 3 — `"Bulk LPG"` mapper dict gap (one-line `_ELMHURST_MAIN_
FUEL_TO_SAP10["Bulk LPG"] = 27`).
Each variant unblocked becomes a new pin on the corpus residual grid;
closures from there follow the existing per-line-walk discipline.
### 3. **Cohort-2 golden residuals**
`test_golden_fixtures.py` carries PE/CO2 residual pins for 38 cohort-2
certs. S0380.164's narrow gate (§12.4.4 + back-boiler combo + dual-rate
+ cylinder + WHC ∈ {901,902,914}) means cohort-2 is unaffected; 59/59
golden tests pass. Quick-check slice: loop the golden fixtures, dump
current residual vs pinned residual, re-pin tighter if pinned > actual.
## Standard slice workflow (unchanged)
1. Read spec page + identify rule (or Elmhurst worksheet pattern)
2. Probe one variant; verify diagnosis via monkey-patch / direct walk
3. Write failing AAA test (literal `# Arrange / # Act / # Assert`)
4. Implement helper / dispatch entry / mapper extension
5. Re-pin affected variants (DO NOT widen tolerance)
6. Run extended handover suite (command below)
7. Pyright net-zero check (`git stash` → pyright → `git stash pop` → pyright)
8. If mirroring Elmhurst against spec literal: add a row to
`SAP_CALCULATOR.md §8 "Elmhurst-mirrored spec divergences"`. The
≥2-cert rule applies unless the new divergence shares its shape with
an already-documented row (S0380.164 was admitted under this
exception with a single-cert flag — S0380.164 is the precedent).
9. Commit with spec citation + `Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>`
10. Update `project-heating-systems-corpus` + `MEMORY.md` index
## Test baseline at HEAD `<new>`
```bash
PYTHONPATH=/workspaces/model python -m pytest \
backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
backend/documents_parser/tests/test_heating_systems_corpus.py \
backend/documents_parser/tests/test_elmhurst_extractor.py \
backend/documents_parser/tests/test_elmhurst_end_to_end.py \
domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
domain/sap10_calculator/worksheet/tests/test_heat_transmission.py \
domain/sap10_calculator/worksheet/tests/test_internal_gains.py \
domain/sap10_calculator/worksheet/tests/test_solar_gains.py \
domain/sap10_calculator/worksheet/tests/test_dimensions.py \
domain/sap10_calculator/worksheet/tests/test_rating.py \
domain/sap10_calculator/worksheet/tests/test_ventilation.py \
domain/sap10_calculator/worksheet/tests/test_appendix_h_solar.py \
domain/sap10_calculator/worksheet/tests/test_mev.py \
domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
domain/sap10_calculator/tests/test_pcdb_table_322_lookup.py \
domain/sap10_calculator/tests/test_pcdb_table_329_lookup.py \
domain/sap10_calculator/tests/test_table_12a.py \
--no-cov -q
```
Expected: **909 pass, 0 fail.**
## Memories to load (in order)
```
project-heating-systems-corpus # HEAD <new>
feedback-sap-10-2-only-never-10-3 # CRITICAL — never reference SAP 10.3
feedback-software-no-special-handling # CRITICAL — informed S0380.163 / .164
feedback-spec-floor-skepticism # cuts both ways
feedback-worksheet-not-api-reference
feedback-spec-citation-in-commits
feedback-verify-handover-claims
feedback-zero-error-strict # TARGET: ΔSAP_c < 1e-4 vs worksheet
feedback-commit-per-slice
feedback-aaa-test-convention
feedback-e2e-validation-philosophy
feedback-abs-diff-over-pytest-approx
feedback-golden-residuals-near-zero
feedback-one-e-minus-4-across-the-board
reference-unmapped-sap-code
reference-unmapped-api-code
project-oil-price-spec-divergence
```
## What NOT to do
- **Don't reference SAP 10.3** — track 10.2 deliberately.
- **Don't widen pin tolerances** — re-pin smaller or find the spec gap.
- **Don't add empirical gates** to keep cohort pins stable when a
spec rule clearly applies. Add Elmhurst-mirror gates ONLY when
worksheet evidence is reproducible across multiple certs OR shares
shape with an already-documented §8 row (the .164 single-cert
precedent).
- **Don't re-investigate Slices .91..164** — all settled.
- **Don't add new helpers to `domain/sap10_ml/`** — on deprecation
path; `domain/sap10_calculator/tables/` is the canonical home.
- **Don't treat ΔSAP=0.07 as "closed"** — target is <1e-4 vs worksheet.
## Master doc
The canonical architecture + API + validation doc lives at
[`domain/sap10_calculator/docs/SAP_CALCULATOR.md`](SAP_CALCULATOR.md)
(7 sections + §8 with .1 and .2 entries). S0380.164 added §8.2 for
the §12.4.4 summer-immersion double-count.
## Good luck.

View file

@ -0,0 +1,193 @@
# Handover — post Slices S0380.164..169
Branch: `feature/per-cert-mapper-validation`. **HEAD `9ed003a5`**.
Predecessor: [`HANDOVER_POST_S0380_164.md`](HANDOVER_POST_S0380_164.md).
## TL;DR
Six slices landed in two phases:
**Phase 1 — cascade closure** (.164 + .165): closed the LAST two open
variants in the cascade-OK tier. `solid fuel 2` ΔCO2/ΔPE 93/-1027 →
±0.0000 (Elmhurst-mirror §12.4.4 summer-immersion double-count, new
§8.2 row). `pcdb 1` Δ0.011 SAP → +0.0000 (§9.4.11 boiler-interlock
-5pp applied AFTER Eq D1 instead of before). The 41-variant heating-
systems corpus now has its 27-variant **cascade-OK tier fully EXACT**
on every metric (Σ|ΔSAP_c| ≈ 0.0001 ≈ floating-point noise).
**Phase 2 — mapper unblocks** (.166.169): 11 of 16 blocked variants
unblocked via 4 mapper-extension slices. 7 EXACT on first try
(pcdb 3, electric 11/12/13/14, oil 2, oil 5); 4 land with cascade-side
residuals pinned as forcing functions (oil 3/4/6, no system).
| Slice | HEAD | Scope | Cascade-OK tier |
|---|---|---|---|
| S0380.164 | `302db131` | **§8.2 Elmhurst-mirror summer-immersion CO2/PE double-count.** §12.4.4 back-boiler HW blend adds `S_fuel × Table 12 annual electric` term on top of Table 12d/e monthly cascade for dual-rate tariffs. Closures: `solid fuel 2` ΔCO2/ΔPE → ±0.0000. Single-cert evidence flagged in new §8.2 row. | 25/25 EXACT |
| S0380.165 | `3de52bcb` | **§9.4.11 boiler-interlock -5pp applied AFTER Eq D1, not before.** Reciprocal Eq D1 weighting is non-linear in η; worksheet (217)m for pcdb 1 matches post-Eq-D1 form to 1e-4. New `interlock_penalty_pp` kwarg on `_apply_water_efficiency`. Closures: `pcdb 1` Δ0.011 SAP / +£0.24 / +1.33 CO2 / +5.70 PE → ±0.0000. | 27/27 EXACT |
| S0380.166 | `589a8631` | **Bulk LPG mapper unblock.** `"Bulk LPG": 27` in `_ELMHURST_MAIN_FUEL_TO_SAP10`. Pcdb 3 EXACT first try. | 28/28 EXACT |
| S0380.167 | `7901dda4` | **Electric storage 11-14 unblock.** WEA / REA / OEA → 30. Electric 11/12/13/14 all EXACT first try. | 32/32 EXACT |
| S0380.168 | `58a95472` | **Bio-liquid oil 2-6 unblock + Table 32 FAME 5.44 → 7.64 (deferred .131 TODO).** BFD/BXE/BXF/BZC/B3C + 4 water labels. oil 2 (HVO) + oil 5 (Bioethanol) EXACT; oil 3/4 (FAME) closed 85 % of cost gap; oil 6 (B30K) carries cascade residual. | 34 EXACT + 3 pinned |
| S0380.169 | `9ed003a5` | **"No system" unblock per §A.2.2.** `"NON": 30`. Cascade lands with small residual pinned. | 34 EXACT + 4 pinned |
Extended handover suite at HEAD: **916 pass, 0 fail.** Pyright net-zero
(43 → 43).
## Current residual state at HEAD `9ed003a5`
### Cascade-OK tier (38 variants, up from 25)
**34 variants EXACT (|Δ| < 1e-3) on all 4 metrics.** 4 variants
carry pinned non-zero residuals (forcing functions):
| Variant | ΔSAP_c | Δcost | ΔCO2 | ΔPE | Notes |
|---|---:|---:|---:|---:|---|
| oil 3 | +2.59 | £62 | 14.58 | 967 | FAME cascade HW kWh diff |
| oil 4 | +2.56 | £57 | 13.35 | 885 | FAME cascade HW kWh diff |
| oil 6 | +3.05 | £70 | 240.66 | 1113 | B30K SH/HW kWh gap |
| no system | +1.18 | £27 | 49.83 | 562 | §A.2.2 portable-electric defaults |
All other 34 variants ±0.0000 on every metric.
### Blocked tier (5 variants — community heating)
Unchanged. `community heating 1/2/3/4/6`. **Deepest remaining tier:**
needs extractor field extraction + new community-fuel-type dispatch
in the mapper + cascade integration for heat-network paths (Table 12c
distribution loss, Table 12 heat-network codes 41/43/48/51/53/54).
## Open fronts ranked by leverage
### 1. **Community heating unblock — 5 variants** (next-session work)
The deepest remaining work. The Elmhurst Summary §14.1 lodges:
Heating Type Space and Water Heating
Community Heat Source Boilers / Combined Heat and Power / Heat pump
Community Fuel Type Mains Gas / Electricity / Mineral oil / Coal / ...
Variant breakdown:
community heating 1 — COM + 301 + Boilers + Mains Gas → code 51
community heating 2 — COM + 302 + CHP + Mains Gas → code 48
community heating 3 — COM + 304 + Heat pump + Electricity → code 41
community heating 4 — COM + 302 + CHP + Mineral oil → code 48 (or 53 if not CHP)
community heating 6 — COM + 302 + CHP + Coal → code 48 (or 54 if not CHP)
The work spans three layers:
1. **Extractor**: extract Community Heat Source + Community Fuel Type
strings into a new `CommunityHeating` dataclass on `ElmhurstSiteNotes`.
2. **Mapper**: `_resolve_community_heating_fuel_code(heat_source, fuel)`
dispatch helper that maps the (heat_source, fuel) pair to the
correct Table 12 heat-network code.
3. **Cascade**: ensure the heat-network path (Table 12c DLF age-band,
PCDB Table 322 records if lodged, Table 12 note (k) DHW-only
half-standing) handles all 5 sub-variants correctly.
Likely 2-3 slices total — start with the extractor + mapper, then
probe each variant's residual and close cascade gaps as they surface.
### 2. **oil 3 / oil 4 (FAME) cascade HW kWh gap**
Cascade HW kWh ~900 less than worksheet for oil 3/4 (FAME boilers).
Probably SAP 10.2 Table 4b code 128/129 has a non-standard summer
efficiency or Eq D1 path that the cascade isn't applying. Per-line
worksheet walk: cascade `_apply_water_efficiency` output vs (219)m
row.
### 3. **oil 6 (B30K) cascade SH + HW kWh gap**
Δcost £70 on identical prices means kWh differs. Likely a different
Table 4b code-126 path. Closes ~£70 / 240 CO2 / 1113 PE.
### 4. **no system §A.2.2 cascade defaults**
ΔSAP +1.18 — cascade thinks dwelling is more efficient than worksheet.
Probable spec gap: §A.2.2 portable-electric defaults
(responsiveness / control-type / Table 11 secondary fraction). Per-
line walk on (210)m / (240) cost factor needed.
## Standard slice workflow (unchanged)
1. Read spec page + identify rule (or Elmhurst worksheet pattern)
2. Probe one variant; verify diagnosis via monkey-patch / direct walk
3. Write failing AAA test (literal `# Arrange / # Act / # Assert`)
4. Implement helper / dispatch entry / mapper extension
5. Re-pin affected variants (DO NOT widen tolerance)
6. Run extended handover suite (command below)
7. Pyright net-zero check (`git stash` → pyright → `git stash pop` → pyright)
8. If mirroring Elmhurst against spec literal: add a row to
`SAP_CALCULATOR.md §8 "Elmhurst-mirrored spec divergences"`. The
≥2-cert rule applies unless the new divergence shares its shape
with an already-documented row (S0380.164 added §8.2 under this
exception with a single-cert flag).
9. Commit with spec citation + `Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>`
10. Update `project-heating-systems-corpus` + `MEMORY.md` index
## Test baseline at HEAD `9ed003a5`
```bash
PYTHONPATH=/workspaces/model python -m pytest \
backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
backend/documents_parser/tests/test_heating_systems_corpus.py \
backend/documents_parser/tests/test_elmhurst_extractor.py \
backend/documents_parser/tests/test_elmhurst_end_to_end.py \
domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
domain/sap10_calculator/worksheet/tests/test_heat_transmission.py \
domain/sap10_calculator/worksheet/tests/test_internal_gains.py \
domain/sap10_calculator/worksheet/tests/test_solar_gains.py \
domain/sap10_calculator/worksheet/tests/test_dimensions.py \
domain/sap10_calculator/worksheet/tests/test_rating.py \
domain/sap10_calculator/worksheet/tests/test_ventilation.py \
domain/sap10_calculator/worksheet/tests/test_appendix_h_solar.py \
domain/sap10_calculator/worksheet/tests/test_mev.py \
domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
domain/sap10_calculator/tests/test_pcdb_table_322_lookup.py \
domain/sap10_calculator/tests/test_pcdb_table_329_lookup.py \
domain/sap10_calculator/tests/test_table_12a.py \
--no-cov -q
```
Expected: **916 pass, 0 fail.**
## Memories to load (in order)
```
project-heating-systems-corpus # HEAD 9ed003a5
feedback-sap-10-2-only-never-10-3
feedback-software-no-special-handling
feedback-spec-floor-skepticism
feedback-worksheet-not-api-reference
feedback-spec-citation-in-commits
feedback-verify-handover-claims
feedback-zero-error-strict
feedback-commit-per-slice
feedback-aaa-test-convention
feedback-e2e-validation-philosophy
feedback-abs-diff-over-pytest-approx
feedback-golden-residuals-near-zero
feedback-one-e-minus-4-across-the-board
feedback-bigger-slices-for-uniform-work
reference-unmapped-sap-code
reference-unmapped-api-code
project-oil-price-spec-divergence
```
## What NOT to do
- **Don't reference SAP 10.3** — track 10.2 deliberately.
- **Don't widen pin tolerances** — re-pin smaller or find the spec gap.
- **Don't add empirical gates** to keep cohort pins stable.
- **Don't re-investigate Slices .91..169** — all settled.
- **Don't add new helpers to `domain/sap10_ml/`** — on deprecation path.
- **Don't treat ΔSAP=0.07 as "closed"** — target is <1e-4 vs worksheet.
## Master doc
The canonical architecture + API + validation doc lives at
[`domain/sap10_calculator/docs/SAP_CALCULATOR.md`](SAP_CALCULATOR.md)
(7 sections + §8 with .1 and .2 entries). S0380.164 added §8.2 for
the §12.4.4 summer-immersion double-count.
## Good luck.

View file

@ -0,0 +1,287 @@
# Handover — post Slices S0380.170..173
Branch: `feature/per-cert-mapper-validation`. **HEAD `e71987c2`**.
Predecessor: [`HANDOVER_POST_S0380_169.md`](HANDOVER_POST_S0380_169.md).
## TL;DR
Four community-heating slices landed. The **blocked tier emptied** for
the first time in the corpus's history (`.170`); cost cascade closed
on the CHP cluster (`.171`); CO2/PE closed on non-CHP variants
(`.172`, `.173`).
All 41 corpus variants now run end-to-end through the cascade:
**36 EXACT + 9 pinned** (oil 3/4/6 + no system + 5 community
heating). The 5 community-heating variants carry forcing-function
residuals scoped to specific Elmhurst-mirror divergences detailed
below.
| Slice | HEAD | Scope |
|---|---|---|
| S0380.170 | `9f0d23ad` | **Community heating mapper unblock.** New `CommunityHeating` dataclass on `ElmhurstSiteNotes.main_heating`; extractor `_extract_community_heating()` reads §14.1 Heat Source × Fuel Type. Mapper `_resolve_community_heating_fuel_code(heat_source, fuel)` dispatches per SAP 10.2 Table 12 (PDF p.189): Boilers+Gas→51, CHP→48, HP+Elec→41, Boilers+Oil→53, Boilers+Coal→54. All 5 variants unblocked; 5 forcing-function residuals pinned. Blocked tier tuple emptied. |
| S0380.171 | `a4b5f4e7` | **RdSAP 10 §C CHP heat-fraction cost split.** New `MainHeatingDetail.community_heating_chp_fraction` + `community_heating_boiler_fuel_type` fields populated by the mapper for SAP code 302. `_fuel_cost_gbp_per_kwh` returns `0.35 × CHP_price + 0.65 × boiler_price` when fields set. CH2/CH4 cost gap £104 → +£0.17 (essentially exact); SAP +4.50 → 0.008. CH6 regressed (-3.52 → -8.03 SAP) — spec-correct fix exposed cert-side DLF=1.0 quirk that only CH6 lodges (in P960 input data, not in Summary). |
| S0380.172 | `36d4bf87` | **Table 4a heat-network heat-source-eff CO2/PE factor scaling.** New `_HEAT_NETWORK_HEAT_SOURCE_EFFICIENCY` dict (301→0.80, 304→3.00 per SAP 10.2 Table 4a PDF p.164). `_heat_network_heat_source_efficiency_scaling(main)` returns 1/eff. Wired into `_main_heating_co2_factor_kg_per_kwh` + `_main_heating_primary_factor` non-electric branches. CH1 CO2/PE -787/-3827 → -126/-967; CH3 CO2/PE +1614/+11879 → +473/+1749. SAP 302 excluded — converges with CHP credit in follow-up. |
| S0380.173 | `e71987c2` | **WHC=901 HW path inherits main fuel for community heating.** New `_is_community_heating_hw_from_main(epc)` predicate (WHC ∈ {901,902,914} + heat-network main + SAP code in heat-source-eff table). `_hot_water_fuel_cost_gbp_per_kwh` gains `inherit_main_for_community_heating` kwarg; HW CO2/PE get top-level branches scaled by 1/heat_source_eff. CH1 PE 967 → **9** (essentially closed); CH3 PE +1749 → 387 (~78%); CH3 CO2 +473 → 86 (~82%). Cost/SAP signs flip on CH1/CH3 — HW matches worksheet exactly, exposing +£12 lighting/standing overage. |
Extended handover suite at HEAD: **926 pass + 1 skipped, 0 fail.**
Pyright net-zero on affected files (32 → 32 across the 4 slices).
## Current residual state at HEAD `e71987c2`
### Cascade-OK tier (41 variants — all populated corpus folders)
**36 variants EXACT (|Δ| < 1e-3) on all 4 metrics.** 9 variants
carry pinned non-zero residuals (forcing functions, ranked by
total magnitude):
| Variant | ΔSAP_c | Δcost | ΔCO2 | ΔPE | Closure driver |
|---|---:|---:|---:|---:|---|
| CH6 (CHP/Coal) | 8.03 | +£185 | 2935 | +7865 | DLF=1.0 in P960 + CHP credit |
| CH4 (CHP/Oil) | 0.008 | +£0.17 | 4397 | +495 | CHP credit (CO2) |
| CH2 (CHP/Gas) | 0.008 | +£0.17 | 1430 | +1506 | CHP credit (CO2 + PE) |
| oil 6 (B30K) | +3.05 | £70 | 241 | 1113 | Table 4b code 126 SH+HW kWh gap |
| oil 3 (FAME) | +2.59 | £62 | 15 | 967 | Table 4b code 128 HW kWh gap |
| oil 4 (FAME) | +2.56 | £57 | 13 | 885 | Table 4b code 129 HW kWh gap |
| CH3 (HP/Elec) | 0.53 | +£12 | 86 | 387 | Lighting/standing + 0.8523 multiplier |
| CH1 (Boilers/Gas) | 0.53 | +£12 | +52 | **9** | Lighting/standing (PE essentially closed) |
| no system | +1.18 | £27 | 50 | 562 | §A.2.2 portable-electric defaults |
### Blocked tier (0 variants)
**Empty for the first time.** All previously blocked variants
(`community heating 1/2/3/4/6`, `electric 11-14`, `oil 2-6`, `no
system`, `pcdb 3`) now cascade-execute. The
`_BLOCKED_BY_MISSING_MAIN_FUEL_TYPE` tuple in
[`test_heating_systems_corpus.py`](../../../backend/documents_parser/tests/test_heating_systems_corpus.py)
is empty; the parametrized raise-test is `pytest.mark.skipif`'d
with reason `"all blocked variants have been unblocked (latest:
S0380.170)"`.
## Open fronts ranked by leverage
### 1. SAP 302 CHP CO2/PE credit cascade (3 variants — CH2, CH4, CH6)
Highest cohort leverage: closes ~8 SAP-equivalent across CH2 / CH4
/ CH6 + their large CO2 / PE residuals simultaneously.
Per spec block 13b PE (PDF p.153) + 12b CO2:
```
Space heating from CHP (307a) × 100 ÷ (462) = ... (463)
less credit emissions (307a)×(461) ÷ (462) = ... (464)
Water heated by CHP (310a) × 100 ÷ (462) = ... (465)
less credit emissions (310a)×(461) ÷ (462) = ... (466)
Heat from heat source 2 [(307b)+(310b)] × 100 ÷
(467b) = ... (468)
```
Per RdSAP 10 §C (PDF p.58) defaults: **CHP overall eff 75%,
heat-to-power ratio 2.0 → heat_eff 50% + electric_eff 25%; boiler
eff 80%**. Verified against CH2/CH4/CH6 worksheet (461)/(462) = 25%
/ 50% exactly.
**Per-line worksheet caveat.** The Elmhurst worksheet (463) energy
column = spec_formula × 0.8523 uniformly across non-CHP heat-
network rows. This 0.8523 multiplier appears in CH1 (467) too (=
spec `(307+310) × 100/80` × 0.8523 → 16717.79 instead of 19614.94).
Mechanism unidentified; not RdSAP 10 / SAP 10.2 spec-derived as
far as the spec PDFs document. **Do per-line walks before forming
hypotheses** per [[feedback-spec-floor-skepticism]]. This may need
a SAP_CALCULATOR.md §8 row.
Implementation sketch: add CHP credit factor + boiler-fuel-code
fields to MainHeatingDetail; the .172 scaling helper already keys
on `_HEAT_NETWORK_HEAT_SOURCE_EFFICIENCY` — add 302 there with
weighted overall eff once the split formula is in place. The
.173 predicate `_is_community_heating_hw_from_main` also gates on
table membership and will pick up SAP 302 automatically.
Likely 2-3 slices: (a) CHP credit + boiler-side eff for SH; (b)
mirror for HW path; (c) Elmhurst 0.8523 multiplier if it turns out
to be load-bearing.
### 2. CH1 / CH3 lighting / standing overage (+£12 cost)
Surfaced by S0380.173 closing the HW path. Cascade cost matches
worksheet exactly on SH + HW, leaves +£12 over on lighting + standing.
Probable mechanisms (in rank order):
1. Standing charge double-count. Worksheet (351) = £120 for code 51
(heat-network). Cascade may also apply the Mains-gas standing
even though water_heating_fuel still lodges code 26 → API code 1.
2. Lighting kWh rate mismatch. Cascade uses `other_fuel_cost_gbp_
per_kwh = 0.1367` (18-hour high) — verify against worksheet (350)
= 282 × 0.1367.
3. (313) electricity-for-heat-distribution kWh stream billed at
wrong rate. Worksheet uses heat-network rate 4.24 for this; check
cascade.
Probably 1 slice once diagnosed via per-line walk. Closes CH1 +
CH3 fully.
### 3. CH6 DLF=1.0 lodging in P960 (cert-side architecture gap)
CH6 P960 input data lodges `Distribution Loss: Two adjoining
dwellings sharing a single heating system` + `Distribution Loss
Value: 0.0`, producing worksheet (306) = 1.0000. CH4 with the
same §14 Summary shape lodges `Distribution Loss: Calculated +
Value: 1.5`, producing (306) = 1.4500.
The DLF distinguisher is NOT in the Summary PDF — only the P960
worksheet input data block. The current architecture only reads
Summary; routing through P960 inverts that.
Two paths forward:
- (a) Extend the Elmhurst Summary extractor to look for any `§17
Additional Information` line — currently neither CH4 nor CH6
Summaries lodge anything here, but if Elmhurst adds it the gap
closes.
- (b) Accept CH6 as a pinned forcing function. Spec-correct
cascade applies DLF=1.45 for age G per Table 12c; CH6's manual
override (per spec §C3.1: "For design-stage SAP assessments, a
DLF of >= 1 can be manually entered") is unmodelable without
P960 access.
Recommend (b) — pin and document.
### 4. oil 3 / oil 4 (FAME) HW kWh gap
Carried over from S0380.168. ΔSAP +2.59/+2.56. Cascade HW kWh
~900 less than worksheet on FAME boilers (Table 4b codes 128/129).
Per-line walk on `_apply_water_efficiency` vs (219)m. Probably 1
slice.
### 5. oil 6 (B30K) SH + HW kWh gap
Carried over from S0380.168. ΔSAP +3.05. Likely Table 4b code-126
path differs. Probably 1 slice.
### 6. "no system" §A.2.2 portable-electric defaults
Carried over from S0380.169. ΔSAP +1.18. Cascade thinks dwelling
more efficient than worksheet. Probable §A.2.2 portable-electric
defaults gap (responsiveness/control/Table 11). Probably 1 slice.
## Critical discipline reinforced last session
**Per-line walk worksheet → spec → fix.** S0380.171 + .172 + .173
each landed via per-line worksheet dumps confirming the spec rule
before implementation. S0380.173 in particular: probing CH3 HW
factors revealed the cascade was billing HW at Mains-gas (Elmhurst
§15.0 placeholder) rather than heat-network rate; per-line walk on
worksheet (342) confirmed the fix direction.
**Spec-floor skepticism cuts BOTH ways.** S0380.171 was framed by
the prior handover as a single closure for CH2/CH4/CH6 ("biggest
leverage by spec-coherent grouping"). The actual implementation
closed CH2/CH4 exactly but REGRESSED CH6 — exposing the cert-side
DLF=1.0 quirk that was previously masked by offsetting bugs. Per
[[feedback-software-no-special-handling]] applied uniformly;
documented as forcing function rather than gated out.
**Gate carefully across SH and HW paths.** S0380.172 + .173 use the
same `_HEAT_NETWORK_HEAT_SOURCE_EFFICIENCY` table to gate the
heat-source-eff scaling. SAP 302 is intentionally absent — when the
CHP credit slice lands, ADD 302 to that table and both SH (via
.172's wiring) and HW (via .173's predicate) auto-activate.
**Cost-side and CO2/PE-side need different efficiencies for heat
networks.** Cost uses heat-network unit price × network_input
(metered at the dwelling boundary). CO2/PE uses Table 12 factor ×
fuel_input = network_input / heat_source_eff. The .172 + .173
scaling helpers express this by pre-scaling the Table 12 factor at
lookup time, leaving the cost path unaffected.
## Standard slice workflow (unchanged)
1. Read spec page + identify rule (or Elmhurst worksheet pattern)
2. Probe one variant; verify diagnosis via monkey-patch / direct walk
3. Write failing AAA test (literal `# Arrange / # Act / # Assert`)
4. Implement helper / dispatch entry / mapper extension
5. Re-pin affected variants (DO NOT widen tolerance)
6. Run extended handover suite (command below)
7. Pyright net-zero check (`git stash` → pyright → `git stash pop` → pyright)
8. If mirroring Elmhurst against spec literal: add a row to
`SAP_CALCULATOR.md §8 "Elmhurst-mirrored spec divergences"`. The
≥2-cert rule applies unless the new divergence shares its shape
with an already-documented row (S0380.164 added §8.2 under this
exception with a single-cert flag).
9. Commit with spec citation + `Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>`
10. Update `project-heating-systems-corpus` + `MEMORY.md` index
## Test baseline at HEAD `e71987c2`
```bash
PYTHONPATH=/workspaces/model python -m pytest \
backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
backend/documents_parser/tests/test_heating_systems_corpus.py \
backend/documents_parser/tests/test_elmhurst_extractor.py \
backend/documents_parser/tests/test_elmhurst_end_to_end.py \
domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
domain/sap10_calculator/worksheet/tests/test_heat_transmission.py \
domain/sap10_calculator/worksheet/tests/test_internal_gains.py \
domain/sap10_calculator/worksheet/tests/test_solar_gains.py \
domain/sap10_calculator/worksheet/tests/test_dimensions.py \
domain/sap10_calculator/worksheet/tests/test_rating.py \
domain/sap10_calculator/worksheet/tests/test_ventilation.py \
domain/sap10_calculator/worksheet/tests/test_appendix_h_solar.py \
domain/sap10_calculator/worksheet/tests/test_mev.py \
domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
domain/sap10_calculator/tests/test_pcdb_table_322_lookup.py \
domain/sap10_calculator/tests/test_pcdb_table_329_lookup.py \
domain/sap10_calculator/tests/test_table_12a.py \
--no-cov -q
```
Expected: **926 pass + 1 skipped, 0 fail.**
## Memories to load (in order)
```
project-heating-systems-corpus # HEAD e71987c2
feedback-sap-10-2-only-never-10-3
feedback-software-no-special-handling
feedback-spec-floor-skepticism
feedback-worksheet-not-api-reference
feedback-spec-citation-in-commits
feedback-verify-handover-claims
feedback-zero-error-strict
feedback-commit-per-slice
feedback-aaa-test-convention
feedback-e2e-validation-philosophy
feedback-abs-diff-over-pytest-approx
feedback-golden-residuals-near-zero
feedback-one-e-minus-4-across-the-board
feedback-bigger-slices-for-uniform-work
reference-unmapped-sap-code
reference-unmapped-api-code
project-oil-price-spec-divergence
```
## What NOT to do
- **Don't reference SAP 10.3** — track 10.2 deliberately.
- **Don't widen pin tolerances** — re-pin smaller or find the spec gap.
- **Don't add empirical gates** to keep cohort pins stable.
- **Don't re-investigate Slices .91..173** — all settled.
- **Don't add new helpers to `domain/sap10_ml/`** — on deprecation path.
- **Don't treat ΔSAP=0.07 as "closed"** — target is <1e-4 vs worksheet.
- **Don't form a spec hypothesis without per-line data** — walk the
worksheet first. The Elmhurst 0.8523 multiplier on heat-network
rows (CH1 (467), CH3 (467), CH2/CH4/CH6 (468)) is unexplained and
may be load-bearing for the CHP credit slice.
- **Don't gate SH and HW paths separately.** The .172 + .173 wiring
shares `_HEAT_NETWORK_HEAT_SOURCE_EFFICIENCY` membership; adding
SAP 302 to that table auto-activates both paths.
## Master doc
The canonical architecture + API + validation doc lives at
[`domain/sap10_calculator/docs/SAP_CALCULATOR.md`](SAP_CALCULATOR.md)
(§8.1 + §8.2 documented). The next CHP-credit slice may add §8.3 if
the Elmhurst 0.8523 multiplier or block-13b PE/CO2 line formulas
turn out to diverge from spec literal.
## Good luck.

View file

@ -0,0 +1,277 @@
# Handover — post Slices S0380.174..176
Branch: `feature/per-cert-mapper-validation`. **HEAD `326066ee`**.
Predecessor: [`HANDOVER_POST_S0380_173.md`](HANDOVER_POST_S0380_173.md).
## TL;DR
Three slices landed on the heating-systems corpus:
- **`.174`** closed §4 (62)m HW for all 5 community-heating variants;
CH1 HW EXACT.
- **`.175`** wired §14.1 Community Heating "Heating Controls SAP" into
the mapper; CH1 + CH3 SAP / cost EXACT.
- **`.176`** added Table 4b combi sub-row fall-through to the (61)m
default gate; **oil 3 + oil 4 FULLY EXACT on all four metrics**.
**41 variants total → 34 EXACT + 7 pinned** (was 36 + 5 pre-`.174`).
Pinned count drops from 9 (in `.173` handover) to 7 because oil 3 +
oil 4 fully closed; CH1 + CH3 reshape from "SAP/cost-pinned" to
"SAP/cost EXACT + CO2/PE pinned" pending the (372) electrical-
distribution Elmhurst factor mystery.
| Slice | HEAD | Scope |
|---|---|---|
| S0380.174 | `4876140a` | **§4 storage + primary loss for community heating.** SAP 10.2 §4 line 1482 "Heat networks": primary circuit loss for insulated pipework + cylinderstat applies. Table 2b note b verbatim system-type list ("boiler / warm air / heat pump") OMITS community heating — ×0.9 multiplier doesn't apply. Three changes: new `_HEAT_NETWORK_PIPEWORK_INSULATION_FRACTION = 1.0`, new branch in `_primary_loss_applies` (heat-network main + WHC ∈ {901, 902, 914} → True), new `_table_2b_note_b_multiplier_applies` predicate gating ×0.9 by system type. CH1 HW useful (62)m closes 2339.24 → 2658.01 EXACT vs ws; HW fuel kWh closes 3391.90 → 3854.12 EXACT; (65)m heat gains 793.51 → 1221.62 EXACT. Cost/SAP signs flipped — exposed pre-existing §7 MIT +0.46 K over-count. |
| S0380.175 | `eda07d12` | **§14.1 Community Heating heating_controls_sap extraction.** All 5 CH variants lodge "Heating Controls SAP: 2306" in §14.1 (bare 4-digit form), not in §14.0 "Main Heating Controls Sap" (which is empty for CH certs). Pre-slice mapper read only §14.0 → `main_heating_control=''` → cascade defaulted to `control_type=2` (off-hours (7, 8)). Code 2306 (Table 4e Group 3) → control_type=3 (off-hours (9, 8)), which closes the +0.46 K MIT (92)m residual that `.174` surfaced. Two changes: `_elmhurst_sap_control_code` accepts bare integer form, `_map_elmhurst_sap_heating` falls through to `mh.community_heating.heating_controls_sap` when §14.0 is empty. CH1 + CH3 ΔSAP_c -1.0572 → +0.0000 EXACT; Δcost +£24.36 → -£0.00 EXACT. |
| S0380.176 | `326066ee` | **Table 4b combi sub-row dispatch for (61)m default.** SAP 10.2 §4 line 7702 + Table 4b row names: codes 128/129/130 are explicit combi sub-rows ("Combi oil boiler, ..."). Pre-slice `_table_3a_combi_loss_default_applies` gated only on `main_heating_category ∈ {1, 2, 3, 6}`; Elmhurst mapper leaves the category None on Table 4b liquid-fuel boilers so the cascade fell through to (61)m=0. Added `_TABLE_4B_COMBI_OR_CPSU_CODES` fall-through (set already exists in symmetric `_primary_loss_applies` Table 4b branch — see `.146`). **oil 3 + oil 4 ALL FOUR METRICS EXACT** (ΔSAP +2.5863/+2.5603 → ±0.0000, Δcost -£62/-£57 → ±0.00, ΔPE -967/-885 → ±0.00). Cohort 9 → 7 pinned. |
Extended handover suite at HEAD: **933 pass + 1 skipped, 0 fail.**
Pyright net-zero on all affected files.
## Current residual state at HEAD `326066ee`
### Cascade-OK tier (41 variants — all populated corpus folders)
**34 variants EXACT** on all four metrics (|ΔSAP| < 1e-3, |Δcost| <
£0.01, |ΔCO2| < 0.1 kg, |ΔPE| < 0.1 kWh):
```
ashp, gshp,
electric 1, 2, 3, 5, 6, 7, 8, 9, 11, 12, 13, 14,
oil 1, oil 2, oil 3, oil 4, oil 5, oil pcdb 1, oil pcdb 2, oil pcdb 3,
pcdb 1, pcdb 3,
solid fuel 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
```
**7 variants pinned**:
| Variant | SAP code | ΔSAP_c | Δcost | ΔCO2 | ΔPE | Closure driver |
|---|---:|---:|---:|---:|---:|---|
| CH6 (CHP/Coal) | 302 | 7.49 | +£172.68 | 2939.67 | +7481.57 | DLF=1.0 P960 quirk + CHP credit |
| oil 6 (B30K) | 126 | +3.05 | £69.79 | 240.66 | 1112.66 | -5pp interlock penalty on non-combi |
| no system | 699 | +1.18 | £27.15 | 49.83 | 562.44 | §A.2.2 portable-electric defaults |
| CH4 (CHP/Oil) | 302 | +0.53 | £12.16 | 4401.85 | +111.58 | SAP 302 CHP credit (CO2) |
| CH2 (CHP/Gas) | 302 | +0.53 | £12.16 | 1435.09 | +1123.01 | SAP 302 CHP credit (CO2 + PE) |
| CH3 (HP/Elec) | 304 | +0.0000 | £0.00 | 98.92 | 457.54 | (372) electrical-distribution CO2/PE + (367) HP scaling |
| CH1 (Boilers/Gas) | 301 | +0.0000 | £0.00 | 23.60 | 208.23 | (372) electrical-distribution CO2/PE Elmhurst factor |
### Blocked tier (0 variants)
**Empty** (since `.170`). `_BLOCKED_BY_MISSING_MAIN_FUEL_TYPE` tuple
in `test_heating_systems_corpus.py` remains empty; the parametrised
raise-test is `pytest.mark.skipif`'d.
## Open fronts ranked by leverage
### 1. SAP 302 CHP CO2/PE credit cascade (3 variants — CH2, CH4, CH6)
Highest cohort leverage: closes ~8 SAP-equivalent across CH6 + the
big CO2 / PE residuals on CH2 / CH4 simultaneously. Spec block 13b
PE (PDF p.153) + 12b CO2:
```
Space heating from CHP (307a) × 100 ÷ (362) = ... (363)
less credit emissions (307a)×(361) ÷ (362) = ... (364)
Water heated by CHP (310a) × 100 ÷ (362) = ... (365)
less credit emissions (310a)×(361) ÷ (362) = ... (366)
Heat from heat source 2 [(307b)+(310b)] × 100 ÷
(467b) = ... (468)
```
RdSAP 10 §C defaults: CHP overall eff 75%, heat-to-power ratio 2.0 →
heat_eff 50% + electric_eff 25%; boiler eff 80%. Verified against
CH2/CH4/CH6 worksheet (461)/(462) = 25% / 50% exactly.
**Per-line walk caveat (unresolved).** The Elmhurst worksheet (463)
energy column = spec_formula × 0.8523 uniformly across non-CHP
heat-network rows. The 0.8523 multiplier appears in CH1 (467) too.
Mechanism unidentified; not RdSAP 10 / SAP 10.2 spec-derived. Walk
the worksheet per-line before forming hypotheses.
Likely 2-3 slices: (a) CHP credit + boiler-side eff for SH; (b)
mirror for HW path; (c) the 0.8523 multiplier if load-bearing. The
`.172` scaling helper already keys on
`_HEAT_NETWORK_HEAT_SOURCE_EFFICIENCY` — add 302 there with weighted
overall eff once the split formula is in place. The `.173` predicate
`_is_community_heating_hw_from_main` also gates on table membership
and will pick up SAP 302 automatically.
### 2. oil 6 (B30K) 5pp interlock cascade (1 variant, ΔSAP +3.05)
Per-line walk in this session: cascade Eq D1 outputs (winter, summer) =
(80, 68) → annual eff ~73% at Jan; ws (217)m Jan = 73.07. Cascade gives
HW kWh 3823.38 vs ws 4099.59. Back-solving the worksheet: applying -5pp
interlock penalty to BOTH winter and summer ((75, 63)) reproduces ws
(217)m Jan = 73.07 EXACTLY.
Cascade `eq_d1_interlock_penalty_pp` is gated on `no_interlock` =
"cylinder thermostat absent". For oil 6 the cert lodges
`cylinder_thermostat = 'Y'` so cascade sets penalty=0. But the
worksheet applies -5pp anyway — likely a different gate for non-PCDB
Table 4b regular boilers vs PCDB Table 105 boilers.
Probable 1-slice fix: extend the interlock-penalty gate to fire for
Table 4b non-combi boiler codes (124-127) when... [need spec citation
on the exact rule — investigate SAP 10.2 §9.4.11 interlock conditions
for Elmhurst's interpretation]. ΔSAP_c +3.05 → ±0.0000 expected;
closes 4 metrics on a single variant.
### 3. "no system" §A.2.2 portable-electric defaults (1 variant, ΔSAP +1.18)
Carried over from `.169`. Cascade thinks dwelling more efficient than
worksheet. Probable §A.2.2 portable-electric defaults gap
(responsiveness / control / Table 11). Probably 1 slice.
### 4. CH1 / CH3 (372)/(472) electrical-distribution CO2/PE (deferred)
Worksheet (372) CO2 factor = 0.1994 (block 11a, rating cascade) and
0.2114 (block 11b, demand cascade). PE factor = 1.7591 / 2.1872.
These don't match any Table 12 / 12d / 12e weighting I could derive
from the SH (307) or (307)+(310) heating-demand monthly profile.
(313) annual = 0.01 × (307) only — confirmed across all 5 CH variants
(NOT 0.01 × ((307)+(310)) as spec text says). Once a factor source is
identified, cascade should add an electricity-for-heat-distribution
contribution to CO2/PE for heat-network mains.
Deferred this session. Either reverse-engineer the Elmhurst formula
from a wider set of variants or find BRE documentation on the (372)
factor convention.
### 5. CH6 DLF=1.0 lodging in P960 (architectural — pinned forever)
P960 input lodges `Distribution Loss: Two adjoining dwellings sharing
a single heating system` + `Distribution Loss Value: 0.0` → ws (306)
= 1.0000. Summary lodges nothing distinguishing CH6 from CH4. Per
spec §C3.1 the manual-DLF override is legal but the Summary doesn't
carry it. Two paths: (a) extend extractor to surface §17 Additional
Information when Elmhurst eventually lodges it; (b) accept as pinned.
Recommendation: **(b) — pin and document**.
## Critical discipline reinforced this session
**Per-line walk worksheet → spec → fix.** All three slices landed via
per-line worksheet dumps confirming the spec rule before
implementation:
- `.174` probed ws (56)/(57)/(59) and back-solved p=1.0 + TF=0.6 from
the spec literal.
- `.175` traced the cascade's MIT divergence to control_type=2 vs
expected 3, then back-solved from worksheet (89) util_rest = 0.9898
+ (90) T_rest = 16.11 confirming off-hours (9, 8).
- `.176` ran the cascade `_apply_water_efficiency` trace to find
`annual=1935.37` (= (45)) being passed when ws uses (62) = 2535.37
(= (45) + (46) + (61)) — exposing the missing combi_loss.
**Spec-floor skepticism cuts BOTH ways.** `.174`'s spec-correct fix
EXPOSED the §7 MIT bug that pre-slice offsetting bugs had masked. The
chain `.174``.175` followed [[feedback-software-no-special-handling]]:
apply spec-correct fix uniformly; the surfaced residual is the next
slice's target, not a regression.
**Pin diagnoses before forming hypotheses.** The (372) Elmhurst factor
0.1994 doesn't match any Table 12 derivation. Rather than guess,
session pivoted to the next-tractable front (oil 3/4 combi loss) which
closed cleanly. The (372) deferred entry documents what's known and
what's tried.
**Don't conflate `main_heating_category` and `sap_main_heating_code`.**
Two `.176` slices ago, similar Elmhurst mapper artifact: the FAME oil
boilers lodge `sap_main_heating_code=128/129` but the mapper leaves
`main_heating_category=None`. Cascade dispatch helpers that gate on
either field must check BOTH. The `_TABLE_4B_COMBI_OR_CPSU_CODES` set
already existed for the symmetric `_primary_loss_applies` branch
(per `.146`); adding the same fall-through to
`_table_3a_combi_loss_default_applies` was a 4-line change with
exact closure on 8 metrics (oil 3 + oil 4 × SAP/cost/CO2/PE).
## Standard slice workflow (unchanged)
1. Read spec page + identify rule (or Elmhurst worksheet pattern)
2. Probe one variant; verify diagnosis via monkey-patch / direct walk
3. Write failing AAA test (literal `# Arrange / # Act / # Assert`)
4. Implement helper / dispatch entry / mapper extension
5. Re-pin affected variants (DO NOT widen tolerance)
6. Run extended handover suite (command below)
7. Pyright net-zero check (`git stash` → pyright → `git stash pop` → pyright)
8. If mirroring Elmhurst against spec literal: add a row to
`SAP_CALCULATOR.md §8 "Elmhurst-mirrored spec divergences"`. The
≥2-cert rule applies unless the new divergence shares its shape
with an already-documented row.
9. Commit with spec citation + `Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>`
10. Update `project-heating-systems-corpus` + `MEMORY.md` index
## Test baseline at HEAD `326066ee`
```bash
PYTHONPATH=/workspaces/model python -m pytest \
backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
backend/documents_parser/tests/test_heating_systems_corpus.py \
backend/documents_parser/tests/test_elmhurst_extractor.py \
backend/documents_parser/tests/test_elmhurst_end_to_end.py \
domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
domain/sap10_calculator/worksheet/tests/test_heat_transmission.py \
domain/sap10_calculator/worksheet/tests/test_internal_gains.py \
domain/sap10_calculator/worksheet/tests/test_solar_gains.py \
domain/sap10_calculator/worksheet/tests/test_dimensions.py \
domain/sap10_calculator/worksheet/tests/test_rating.py \
domain/sap10_calculator/worksheet/tests/test_ventilation.py \
domain/sap10_calculator/worksheet/tests/test_appendix_h_solar.py \
domain/sap10_calculator/worksheet/tests/test_mev.py \
domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
domain/sap10_calculator/tests/test_pcdb_table_322_lookup.py \
domain/sap10_calculator/tests/test_pcdb_table_329_lookup.py \
domain/sap10_calculator/tests/test_table_12a.py \
--no-cov -q
```
Expected: **933 pass + 1 skipped, 0 fail.**
## Memories to load (in order)
```
project-heating-systems-corpus # HEAD 326066ee
feedback-sap-10-2-only-never-10-3
feedback-software-no-special-handling
feedback-spec-floor-skepticism
feedback-worksheet-not-api-reference
feedback-spec-citation-in-commits
feedback-verify-handover-claims
feedback-zero-error-strict
feedback-commit-per-slice
feedback-aaa-test-convention
feedback-e2e-validation-philosophy
feedback-abs-diff-over-pytest-approx
feedback-golden-residuals-near-zero
feedback-one-e-minus-4-across-the-board
feedback-bigger-slices-for-uniform-work
reference-unmapped-sap-code
reference-unmapped-api-code
project-oil-price-spec-divergence
```
## What NOT to do
- **Don't reference SAP 10.3** — track 10.2 deliberately.
- **Don't widen pin tolerances** — re-pin smaller or find the spec gap.
- **Don't add empirical gates** to keep cohort pins stable.
- **Don't re-investigate Slices .91..176** — all settled.
- **Don't add new helpers to `domain/sap10_ml/`** — on deprecation path.
- **Don't treat ΔSAP=0.07 as "closed"** — target is <1e-4 vs worksheet.
- **Don't form a spec hypothesis without per-line data.** The (372)
Elmhurst factor 0.1994 is unexplained; don't bake guesses into the
cascade. Reverse-engineer with more variants first, or find BRE
documentation.
- **Don't conflate `main_heating_category` and `sap_main_heating_code`
in cascade gates.** The Elmhurst mapper leaves `category=None` on
Table 4b liquid-fuel boilers; gates must check both fields.
## Master doc
The canonical architecture + API + validation doc lives at
[`domain/sap10_calculator/docs/SAP_CALCULATOR.md`](SAP_CALCULATOR.md)
(§8.1 + §8.2 documented). If the (372) Elmhurst-factor mystery
resolves and the formula turns out to be an Elmhurst-vs-spec
divergence, add §8.3.
## Good luck.

View file

@ -0,0 +1,172 @@
# Handover — post Slices S0380.177..179 (+ infra/CI work)
Branch: `feature/per-cert-mapper-validation`. **HEAD `af8e0d94`**
(post merge from main). Predecessor:
[`HANDOVER_POST_S0380_176.md`](HANDOVER_POST_S0380_176.md).
## TL;DR
The 41-variant heating-systems corpus is now **36 EXACT + 5 pinned**.
The only remaining residuals are the **5 community-heating (CH) variants**
— all `SAP code 302/301/304` heat-network systems. Everything else
(oil, electric, solid fuel, ASHP/GSHP, PCDB, "no system") is EXACT on
all four metrics (ΔSAP/Δcost/ΔCO2/ΔPE).
Three closure slices + four infra changes landed this session:
| Slice / change | HEAD | Scope |
|---|---|---|
| S0380.177 | `5276282d` | **oil 6 boiler interlock from room-thermostat absence.** Control code 2101 ("no thermostatic control of room temperature") ⇒ no room thermostat ⇒ per RdSAP 10 §3 NOT interlocked despite cylinderstat=Yes (P960 "Boiler Interlock: No") ⇒ SAP 10.2 Table 4c(2) 5pp Space+DHW. New `_BOILER_NO_ROOM_THERMOSTAT_CONTROL_CODES={2101,2102}`; `no_interlock` ORs room-thermostat absence with stored-HW cylinderstat absence; Space 5pp leg now fires for Table 4b non-PCDB boilers. |
| S0380.178 | `c054d712` | **oil 6 circulation pump ×1.3 for absent room thermostat.** SAP 10.2 Table 4f footnote a) (PDF p.175) "Multiply by 1.3 if room thermostat is absent" ⇒ 41 × 1.3 = 53.3 kWh = ws (230c). Closes oil 6 FULLY (same root cause as .177). |
| S0380.179 | `f2062a2f` | **RdSAP 10 §10.7 electric-immersion default for "no system".** Cert lodges water code 999 (NON) + "cylinder present: No", but §10.7 substitutes an electric immersion on a Table 28 row-1 110 L cylinder + Table 29 row-1 insulation. New `_apply_rdsap_no_water_heating_system_default(epc)` rebinds the epc at the top of `cert_to_inputs` when `water_heating_code==999`. One fix closed HW (594 kWh storage loss) AND the downstream space residual (+228, a HW-gains→MIT artifact). Closes "no system" FULLY. |
| appliances+cooking | `2f039aeb` | Threaded `appliances_kwh_per_yr` + `cooking_kwh_per_yr` (Appendix L L13/L14/L16a + L20) onto `SapResult`/`CalculatorInputs` for ADR-0014 BillDerivation. **Output-only, zero rating drift.** |
| test fixes | `0e484aaa` | Fixed 11 pre-existing CI failures from an absorbed PR: `test_appendix_u.py` signature drift + mislabelled "SAP 10.3"→10.2; `test_table_32.py` re-pinned oil(4)=5.44 / FAME(73)=7.64 to the worksheet-canonical values the table actually uses. |
| corpus PDFs | `d1c87d84` | Committed the 82 heating-corpus PDF fixtures (`sap worksheets/heating systems examples/`) so CI can run the residual pins. |
| **test move** | `d7d5084f` | **Moved all 5 calculator test dirs → `tests/domain/sap10_calculator/`** so CI (which collects `tests/`) runs them. SEE "Test layout changed" below — it changes every command. |
## ⚠ Test layout changed this session — commands are different now
The calculator tests **moved** out of `domain/sap10_calculator/.../tests`
into `tests/domain/sap10_calculator/{,worksheet,rdsap,climate,validation}`.
Cross-imports were rewritten `domain.sap10_calculator.worksheet.tests`
`tests.domain.sap10_calculator.worksheet`. Any old handover command
that references `domain/sap10_calculator/worksheet/tests/...` is STALE.
**New full verification command** (replaces the old extended suite):
```bash
PYTHONPATH=/workspaces/model python -m pytest \
tests/domain/sap10_calculator/ \
backend/documents_parser/tests/ \
--no-cov -q -p no:cacheprovider
```
Expected at HEAD: **~2221 pass, 1 skipped, 0 fail** (the 1 skip is the
corpus blocked-variant `skipif`). The cascade-pin / golden / e2e
conformance suites are all under `tests/domain/sap10_calculator/`.
**Two gotchas:**
1. `load_cells` tests (`tests/domain/sap10_calculator/worksheet/test_{dimensions,ventilation,water_heating}.py`) pin against the gitignored `2026-05-19-17-18 RdSap10Worksheet.xlsx` at repo root. `_xlsx_loader.load_cells` `pytest.skip()`s when the xlsx is absent — so they run locally and skip in CI. If you're missing the xlsx locally, those skip (not fail).
2. **Uncommitted `pytest.ini` change** (came in with a main pull) REMOVES `tests/` + `domain/sap10_ml/tests` from `testpaths`. HEAD has them; the working tree strips them. This is NOT a slice change — confirm with the user before committing it, because removing `tests/` would un-collect the moved calculator tests.
## Current residual state at HEAD `af8e0d94`
### 36 variants EXACT (all four metrics < tolerance)
```
ashp, gshp,
electric 1, 2, 3, 5, 6, 7, 8, 9, 11, 12, 13, 14,
oil 1, oil 2, oil 3, oil 4, oil 5, oil 6, oil pcdb 1, oil pcdb 2, oil pcdb 3,
pcdb 1, pcdb 3,
solid fuel 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
no system
```
### 5 community-heating variants pinned
| Variant | SAP code | ΔSAP_c | Δcost | ΔCO2 | ΔPE | Closure driver |
|---|---:|---:|---:|---:|---:|---|
| CH6 (CHP/Coal) | 302 | 7.4942 | +£172.68 | 2939.67 | +7481.57 | SAP 302 CHP credit + DLF=1.0 P960 quirk |
| CH2 (CHP/Gas) | 302 | +0.5277 | £12.16 | 1435.09 | +1123.01 | SAP 302 CHP credit (CO2 + PE) |
| CH4 (CHP/Oil) | 302 | +0.5277 | £12.16 | 4401.85 | +111.58 | SAP 302 CHP credit (CO2) |
| CH3 (HP/Elec) | 304 | +0.0000 | £0.00 | 98.92 | 457.54 | (372) electrical-distribution + HP COP |
| CH1 (Boilers/Gas) | 301 | +0.0000 | £0.00 | 23.60 | 208.23 | (372) electrical-distribution factor |
Blocked tier: **empty**.
## Open fronts ranked by leverage
### 1. SAP 302 CHP CO2/PE credit cascade (3 variants — CH2/CH4/CH6) — HIGHEST
Closes the big CO2/PE residuals on CH2/CH4 AND the 7.49 SAP on CH6
simultaneously. Spec: block 13b PE (PDF p.153) + 12b CO2 — the
displaced-electricity CHP credit lines (worksheet (363)-(366),
(464)/(466)/(468)):
```
Space heating from CHP (307a) × 100 ÷ (362) = ... (363)
less credit emissions (307a)×(361) ÷ (362) = ... (364)
Water heated by CHP (310a) × 100 ÷ (362) = ... (365)
less credit emissions (310a)×(361) ÷ (362) = ... (366)
Heat from heat source 2 [(307b)+(310b)] × 100 ÷ (467b) (468)
```
RdSAP 10 §C defaults (verified vs CH2/CH4/CH6 worksheet (461)/(462)):
CHP overall eff 75%, heat-to-power 2.0 → heat_eff 50% / electric_eff
25%; boiler eff 80%. The `.172` scaling helper already keys on
`_HEAT_NETWORK_HEAT_SOURCE_EFFICIENCY` — add code 302 there once the
split formula is in place; the `.173` predicate
`_is_community_heating_hw_from_main` auto-activates.
**⚠ UNRESOLVED per-line caveat — walk before hypothesising.** The
Elmhurst worksheet (463) energy column = `spec_formula × 0.8523`
uniformly across non-CHP heat-network rows (the 0.8523 also shows in
CH1 (467)). It is NOT RdSAP 10 / SAP 10.2 spec-derived. Per
[[feedback-spec-floor-skepticism]] / [[feedback-software-no-special-handling]],
DUMP the worksheet per-line and reconcile 0.8523 before baking any CHP
formula into the cascade. Likely 2-3 slices.
### 2. CH1/CH3 (372)/(472) electrical-distribution CO2/PE — DEFERRED
CH1/CH3 are SAP + cost EXACT; only CO2/PE remain. Worksheet (372) CO2
factor = 0.1994 (block 11a) / 0.2114 (block 11b); PE = 1.7591 / 2.1872.
These don't match ANY Table 12 / 12d / 12e weighting derivable from the
(307) or (307)+(310) heating-demand monthly profile. (313) annual =
0.01 × (307) ONLY (verified across 5 variants, NOT 0.01 × (307+310) as
the spec text says). **Don't guess** — reverse-engineer the 0.1994
factor from a wider variant set or find BRE documentation first.
### 3. CH6 DLF=1.0 P960 quirk — architectural, likely pin-forever
P960 input lodges `Distribution Loss: Two adjoining dwellings...` +
`Distribution Loss Value: 0.0` → ws (306) = 1.0000, but the Summary
doesn't carry anything distinguishing CH6 from CH4. Per §C3.1 the
manual-DLF override is legal but not surfaced by the Summary.
Recommendation: pin + document once the CHP credit lands.
## Discipline (carried from every prior handover)
- **Per-line walk worksheet → spec → fix.** All 3 slices this session
landed via per-line P960 dumps. Don't form a spec hypothesis without
per-line data (the 0.8523 + 0.1994 factors are the live examples).
- **Spec-floor skepticism cuts BOTH ways** — a spec-correct fix often
EXPOSES the next residual (oil 6 .177→.178; "no system" HW→space).
Apply the spec uniformly; the surfaced residual is the next target.
- **SAP 10.2 ONLY, never 10.3.**
- **Don't conflate `main_heating_category` and `sap_main_heating_code`**
— the Elmhurst mapper leaves `category=None` on Table 4b liquid-fuel
boilers; cascade gates must check both.
- **Target is < 1e-4 vs worksheet** — ΔSAP=0.07 is NOT closed. Re-pin
smaller; never widen tolerance, never xfail.
- **One slice = one commit**, spec citation in the message, trailer
`Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>`.
## Memories to load (in order)
```
project-heating-systems-corpus # HEAD af8e0d94, 36 EXACT + 5 pinned
feedback-sap-10-2-only-never-10-3
feedback-software-no-special-handling
feedback-spec-floor-skepticism
feedback-worksheet-not-api-reference
feedback-spec-citation-in-commits
feedback-verify-handover-claims
feedback-zero-error-strict
feedback-commit-per-slice
feedback-aaa-test-convention
feedback-e2e-validation-philosophy
feedback-abs-diff-over-pytest-approx
feedback-one-e-minus-4-across-the-board
reference-unmapped-sap-code
reference-unmapped-api-code
project-oil-price-spec-divergence
```
## Master doc
Architecture + API + validation: [`SAP_CALCULATOR.md`](SAP_CALCULATOR.md)
(§8 "Elmhurst-mirrored spec divergences" carries .163 HW dual-rate
annual + .164 §12.4.4 summer-immersion). If the CHP 0.8523 multiplier
resolves to an Elmhurst-vs-spec divergence, add §8.3.
## Good luck.

View file

@ -0,0 +1,114 @@
# Handover — post S0380.189 (thermal mass parameter / Table 22)
Point-in-time note. Start from [`AGENT_GUIDE.md`](AGENT_GUIDE.md) for the
methodology, accuracy bar, and pipeline — this doc only records *what this
session did* and *what is open*.
- **Branch:** `feature/per-cert-mapper-validation`
- **HEAD:** `e03f08cd` (S0380.189)
- **Baseline:** `2290 passed, 1 skipped, 0 failed` (the skip is the
gitignored xlsx `load_cells` test). Verify with the §4 suite command.
---
## What this session shipped (S0380.185189)
| Slice | What | Spec |
|---|---|---|
| **.185** | Recorded the CH6 "pin-forever" proof — distribution-loss is an Elmhurst Summary-export gap, not a mapper miss (controlled adjoining-dwellings pair byte-identical in inputs). | — |
| **.186** | Added `test_golden_cert_pe_co2_matches_worksheet` — pins calc PE/CO2 for the 47 worksheet-backed certs against the dr87 `(286)`/`(272)` at full precision (not lodged register values). | Appendix U |
| **.187** | Appendix M1 §3a `D_PV,m` was missing **electric secondary** space heating `(215)m` — under-credited PV on gas-main+electric-secondary+PV certs. | SAP 10.2 App M1 §3a |
| **.188** | `D_PV,m` used the Appendix L **L12 lighting GAIN** (`= E_L×0.85`) instead of the **L10 lighting ELECTRICITY** `(232)`. Closed the whole PV cohort to 1e-4. Same gain-vs-electricity class as the S0380.73 cooking fix. | SAP 10.2 App L L10/L12, M1 §3a |
| **.189** | **Thermal mass parameter** was hardcoded 250 at all 5 §7/§8 call sites. Now `_thermal_mass_parameter_kj_per_m2_k(epc)` per Table 22. | RdSAP 10 §5.16 Table 22 (p.48) |
After .186.188 **all 47 worksheet-backed certs match calc≡worksheet at
<1e-4 on PE and CO2** (the convergence target). See
[`project-golden-coverage-state` memory] for the per-slice detail.
---
## The S0380.189 diagnosis (template for the open work)
Driven entirely by the **per-line walk** (AGENT_GUIDE §3) against a
**user-simulated worksheet** for a 6035-archetype property:
- **Fixture:** `sap worksheets/golden fixture debugging/simulated case 1/`
`Summary_001431 (1).pdf` (input) + `P960-0001-001431 - ….pdf` (worksheet
ground truth). A gas-combi mid-terrace, TFA 128, solid brick **with
internal insulation**, no PV/secondary/cylinder. (The `P960-…` prefix is
just the Elmhurst account id; cert is 001431.) **These PDFs are untracked**
— the user holds them locally.
- **Decompose:** PE +8.78 / SAP 1.76 / CO2 +0.21, *entirely* space-heating
demand (+838 kWh). Fabric `(2637)`, internal gains `(73)`, climate
`(96)m`, HTC `(39)` all EXACT → localised to **§7 MIT** `(92)` +0.71 °C.
- **Root cause:** TMP hardcoded 250; cert is masonry **with internal
insulation** → RdSAP 10 §5.16 Table 22 = **100**. Wrong TMP → time
constant τ=Cm/(3.6·H) ≈40 h not 16 h → §7 temperature reduction too small
→ MIT too high → space heating over-stated.
- **Fix + blast radius:** only golden cert **6035** re-pinned (SAP 6→2, PE
+46.42→+19.16, CO2 +1.07→+0.42). All other fixtures are masonry-no-internal
→ TMP unchanged.
### Two diagnostic traps that cost false starts (read before debugging §7/§8)
1. **The worksheet has TWO blocks per dwelling.** The first is the SAP-rating
block (UK-average climate, region 0); the second, under *"CALCULATION OF
EPC COSTS, EMISSIONS AND PRIMARY ENERGY"*, is the postcode block. The
**demand cascade (`cert_to_demand_inputs`) matches the POSTCODE block.**
Comparing calc HTC/MIT to the UK-avg block shows phantom gaps (e.g. HTC
"10.8"); against the postcode block HTC is exact.
2. **`(84)` Total gains = internal `(73)` + solar `(83)`.** The calc's
`internal_gains_annual_avg_w` is `(73)` only — don't diff it against `(84)`
(a phantom "248 W" gap that is just solar).
Use the §2.4 section helpers (`mean_internal_temperature_section_from_cert`,
`space_heating_section_from_cert`, `internal_gains_section_from_cert`,
`local_climate_for_cert`) for the per-line walk.
---
## OPEN — next slices (ranked)
### 1. Summary-path `main_fuel_type` derivation from the SAP code ← do first
The Elmhurst **Summary PDF has no main-heating fuel field** — only the SAP
code (`14.0 Main Heating1 → Main Heating SAP Code 104`). So
`from_elmhurst_site_notes` leaves `main_fuel_type=''`, and `cert_to_inputs`
raises `MissingMainFuelType` (cert_to_inputs.py `_main_fuel_code`). This
**blocks the Summary path for every gas-combi cert**, including the simulated
case (I injected `main_fuel_type=26` to run the diagnosis).
Fix: derive the fuel in the mapper (or `_main_fuel_code`) from
`sap_main_heating_code` via SAP 10.2 Table 4b (104 = condensing combi **mains
gas**), mirroring the existing strict-raise → derive pattern. Cite the table.
This is the higher-leverage win — it unblocks the whole site-notes gas-combi
population, not one cert.
### 2. Pin the simulated 001431 case end-to-end (after #1)
Once #1 lands, the Summary path runs natively. Add it as a site-notes fixture
and pin every line ref at 1e-4. **NB:** with .189 + injected fuel, PE and CO2
close to 1e-4 but **SAP showed +0.0007** vs a hand-computed target (`100
13.95×ECF` from the 4-dp `(257)`=1.6047). That is almost certainly ECF
rounding in the target, not a real gap — but **verify against the worksheet's
own continuous SAP** before declaring it closed (don't trust my hand value).
### 3. Cert 6035 remaining +19 PE
6035 is API/lodged-only (no worksheet) so it can't be pinned past the lodged
register. The user can reproduce 6035 *exactly* in Elmhurst to get a
worksheet — offer to format the golden JSON
(`tests/domain/sap10_calculator/rdsap/fixtures/golden/6035-7729-2309-0879-2296.json`)
as Elmhurst inputs. With a worksheet, walk the remaining +19 the same way.
### Carry-over (lower priority)
- `transform.py:973` treats `wall_construction in (5,6)` as timber-frame for
the ventilation structural-ACH split, but Table 22 / `rdsap_uvalues`
classify 6 = **system built (masonry)**, only 5/7/8 are timber/cob/park.
Possible latent ventilation-ACH bug — verify before touching.
---
## Process notes
- One slice = one commit, spec citation in the message, `Co-Authored-By:
Claude Opus 4.8` trailer. AAA tests, `abs(x-y) <= tol` (not `pytest.approx`).
- The golden worksheet PE/CO2 pins (.186) re-pin smaller as gaps close — never
widen. The lodged-residual pins (`test_golden_cert_residual_matches_pin`)
carry the API-vs-register residual and move when the calc improves.

View file

@ -1,5 +1,10 @@
# SAP 10.2 / RdSAP 10 calculator — module overview
> **New here? Start with [`AGENT_GUIDE.md`](AGENT_GUIDE.md)** — the
> accuracy bar (site-notes vs API), the debugging loop, and the
> tools/pipeline. This file is the deeper architecture + API reference;
> the `HANDOVER_*` files are point-in-time session notes, not onboarding.
Deterministic, bit-faithful replication of the RdSAP10 calculation engine.
Validated against the 6 Elmhurst U985 worksheet PDFs at **abs=1e-4 on
every line ref** for both the Rating cascade (UK-average climate, used
@ -7,7 +12,9 @@ for the published SAP rating + EI rating) and the Demand cascade
(postcode climate via PCDB Table 172, used for the EPC's published
Current Carbon, Current Primary Energy, and Fuel Bill).
**Current state: 930/930 pins green** (768 rating + 90 demand + 72 e2e).
**Current state: 941/941 pins green** (rating + demand section cascade
pins via `test_section_cascade_pins.py`, plus e2e SapResult + monthly
infiltration ACH pins via `test_e2e_elmhurst_sap_score.py`).
This document is the public API + architecture reference. For fixture
authoring see [`domain/sap10_calculator/README.md`](../../domain/sap10_calculator/README.md).
@ -373,3 +380,154 @@ PCDB10:
Table 105 (gas/oil boilers) domain/sap10_calculator/docs/specs/pcdb_table_105_...
Table 172 (postcode-district weather) domain/sap10_calculator/tables/pcdb/data/pcdb10.dat
```
---
## 8. Elmhurst-mirrored spec divergences
The calculator's contract is **bit-faithful replication of the BRE-approved
Elmhurst rdSAP engine**, not literal compliance with the SAP 10.2 spec
text. The two coincide >99% of the time, but in a few places the
worksheet PDFs from Elmhurst lodge a value that the spec text — read in
isolation — would call wrong. We mirror the engine in those cases and
document the divergence here.
Trigger to ADD a row: cascade matches spec literal interpretation, but
worksheet PDF disagrees, AND the worksheet PDF value is reproducible
across multiple Elmhurst-lodged certs (i.e. it's the engine's behaviour,
not a one-off lodging defect). Per
[[feedback-software-no-special-handling]] / [[feedback-spec-floor-skepticism]]
verify both the worksheet PDF and the cascade output before adding.
### 8.1 HW PE/CO2 factors on dual-rate tariffs use Table 12 annual, not Table 12e/12d monthly
**Slice:** S0380.163.
**Code:**
[`_hot_water_primary_factor`](../rdsap/cert_to_inputs.py),
[`_hot_water_co2_factor_kg_per_kwh`](../rdsap/cert_to_inputs.py).
**Test:** `test_electric_water_heating_factors_use_annual_table_12_on_dual_rate_tariff`.
SAP 10.2 Table 12 footnote (t) (PDF p.189) reads:
> *PE factors for grid electricity vary by month. The average figure
> given in this table is therefore not used directly. Instead the
> monthly factors given in Table 12e should be used in the SAP
> worksheet.*
(Footnote (s) says the same for CO2 / Table 12d.) Read literally this
applies to every electric end-use including dual-rate HW. The cascade
originally followed the literal reading: Σ(HW_m × F_m_12e) / ΣHW_m =
~1.521 PE for 18-hour HW on a winter-skewed demand profile.
The Elmhurst worksheet ((278) "Water heating (low-rate cost)") uses
1.5010 PE / 0.136 CO2 — the Table 12 ANNUAL row — on every dual-rate
tariff cert in the 41-variant controlled-variable corpus. The engine
applies monthly Table 12e for lighting (1.5338 winter-weighted) and
secondary heating (1.5715) on the same certs, but flat Table 12 for the
"low-rate cost" line items (SH main 1 + HW). It's an Elmhurst
implementation choice, not a documented spec exception.
**Cascade rule (post-S0380.163):**
| Tariff | HW PE / CO2 factor source |
|---|---|
| STANDARD | Table 12e / 12d monthly, weighted by HW demand seasonality (per spec literal) |
| 7-hour / 10-hour / 18-hour / 24-hour | Table 12 annual flat (1.501 PE / 0.136 CO2) |
The SH main factor (`_main_heating_primary_factor`) already
matches Elmhurst by accident: for dual-rate tariffs the
`_table_12a_system_for_main` lookup returns None for storage heaters /
electric direct-acting / electric boilers without PCDB → falls through
to `primary_energy_factor(fuel)` annual. STANDARD tariff goes through
the monthly cascade.
### Cohort impact
The 41-variant heating-systems corpus closed its HW PE/CO2 residual on
18 variants (all dual-rate electric HW: electric 1/2/3/5/6/7/8/9, solid
fuel 4/5/6/7/8/9/10/11, ashp, gshp). Each variant moved from PE +25.51
or +48.66 → ±0.0000, CO2 +6.31 or +11.95 → ±0.0000. Cohort-1 ASHP certs
(STANDARD tariff) and the 6 Elmhurst U985 fixtures (gas combi, STANDARD
tariff) are unaffected — they continue to use the monthly cascade.
### 8.2 §12.4.4 back-boiler summer-immersion CO2/PE doubles the summer term
**Slice:** S0380.164.
**Code:**
[`_section_12_4_4_hw_blend`](../rdsap/cert_to_inputs.py).
**Tests:**
`test_section_12_4_4_hw_blend_mirrors_elmhurst_summer_annual_pe_co2_double_count`,
`test_section_12_4_4_hw_blend_standard_tariff_keeps_spec_literal_monthly_cascade`.
SAP 10.2 §12.4.4 (PDF p.36-37) routes DHW through the boiler Oct-May and
an electric immersion Jun-Sep for back-boiler combos (Table 4a codes
156 + 158). The spec-literal CO2/PE formula multiplies summer-immersion
fuel by the Table 12d / 12e monthly cascade (per Table 12 footnotes
(s)/(t)). The BRE-approved Elmhurst engine adds a SECOND term —
`summer_fuel × Table 12 ANNUAL electric factor` — on top of the
monthly cascade for the (264) HW CO2 and (278) HW PE worksheet lines on
dual-rate tariffs. Same shape as §8.1 / S0380.163 but additive rather
than substitutive.
**Cascade rule (post-S0380.164):**
| Tariff | §12.4.4 winter CO2 / PE | §12.4.4 summer immersion CO2 / PE |
|---|---|---|
| STANDARD | `W_fuel × boiler_annual_factor` | `Σ wh_summer_m × Table 12d/e monthly` (spec literal) |
| 7-hour / 10-hour / 18-hour / 24-hour | `W_fuel × boiler_annual_factor` | `Σ wh_summer_m × Table 12d/e monthly` **+ `S_fuel × Table 12 annual electric`** (Elmhurst mirror) |
Cost is computed cleanly per spec (`W_fuel × boiler_price + S_fuel ×
off_peak_low_price`) — the double-count quirk only affects the CO2 and
PE factor lines.
### Cohort impact
The heating-systems corpus has exactly one §12.4.4 fixture: `solid fuel 2`
(Table 4a code 158, anthracite, 18-hour tariff, 110 L cylinder + cyl
thermostat). Pre-slice the cascade carried ΔCO2 = 93.10 kg/yr / ΔPE
= 1027.51 kWh/yr — matching `684.55 kWh × 0.136 CO2` and
`684.55 kWh × 1.501 PE` to within rounding. Post-slice closes to
±0.0000 on all four metrics, completing the cohort closure at 25/25
cascade-OK variants EXACT vs the Elmhurst worksheet.
### ⚠ Single-cert evidence
The §12.4.4 divergence is documented here on **one** worksheet (SF2)
because the corpus has no second §12.4.4 fixture (`solid fuel 1` =
code 156 is an empty folder). The math nonetheless matches the
worksheet to within rounding and aligns with §8.1's S0380.163 mirror
shape (Table 12 annual where spec literal says monthly), so the gate
is implemented under the same `dual-rate → annual on top of monthly`
discipline. If a second §12.4.4-eligible cert worksheet diverges from
this rule it should be raised against this row before re-tuning.
### 8.3 Community-heating CHP uses Table 12f "flexible operation" by default
**Slice S0380.182.** For RdSAP-defaulted community heating with CHP
(SAP code 302) that is **not** in the PCDB, the displaced-electricity
credit (worksheet (364)/(366) CO2 and (464)/(466) PE) needs a Table 12f
(PDF p.196) "fuel factor for electricity generated by CHP". Table 12f
offers three regimes per CHP vintage:
| Regime | CO2 kg/kWh | PE | Note |
|---|---|---|---|
| export only | 0.394 | 2.345 | |
| **flexible operation** | **0.420** | **2.369** | needs assessor evidence |
| standard | 0.348 | 2.149 | "all other operating regimes" |
Table 12f's own notes make **standard** the default ("Standard ... should
be used for all other operating regimes of gas CHP plants") and require
submitted evidence for **flexible**. Yet the BRE-approved Elmhurst rdSAP
engine emits **0.420 / 2.369 (flexible)** for these RdSAP-defaulted
community-CHP certs — verified line-by-line against the CH2 (gas) / CH4
(oil) / CH6 (coal) corpus worksheets (364)/(366)/(464)/(466), all of
which carry 0.4200 CO2 and 2.3690 PE regardless of the community fuel.
RdSAP 10 §C (p.58) is silent on the Table 12f regime, so this is an
engine default not derivable from the spec text.
Per [[feedback-software-no-special-handling]] / [[feedback-worksheet-not-api-reference]]
we mirror the engine: `_TABLE_12F_CHP_FLEXIBLE_{CO2,PE}` in
`cert_to_inputs`. CH2 + CH4 close to <1e-4 on both CO2 and PE with this
factor; "standard" (0.348/2.149) would leave a residual. If a future
PCDB-listed or evidence-backed CHP cert diverges, raise it against this
row before re-tuning.

File diff suppressed because it is too large Load diff

View file

@ -192,8 +192,14 @@ CO2_KG_PER_KWH: Final[dict[int, float]] = {
30: 0.136, 31: 0.136, 32: 0.136, 33: 0.136, 34: 0.136, 35: 0.136,
38: 0.136, 40: 0.136, 39: 0.136, 60: 0.136, 36: 0.136,
# Heat networks
51: 0.210, 52: 0.241, 53: 0.298, 54: 0.375, 55: 0.269,
56: 0.298, 57: 0.036, 58: 0.018,
# Heat-network oil (code 53 "assumes 'gas oil'") and mineral-oil/
# biodiesel boilers (code 56) carry 0.335 kg CO2/kWh per SAP 10.2
# Table 12 (p.189) — NOT the individual-appliance heating-oil factor
# (code 4 = 0.298). (Fixed in S0380.182 when the code-302 CHP CO2
# cascade first exercised heat-network oil; PE 1.180 was already
# correct.)
51: 0.210, 52: 0.241, 53: 0.335, 54: 0.375, 55: 0.269,
56: 0.335, 57: 0.036, 58: 0.018,
41: 0.136, 42: 0.015, 43: 0.029, 44: 0.024,
45: 0.015, 46: 0.011, 47: 0.011, 48: 0.136, 49: 0.136,
50: 0.0,

View file

@ -51,13 +51,17 @@ UNIT_PRICE_P_PER_KWH: Final[dict[int, float]] = {
# BRE technical papers (`docs/specs/sap10 technical papers/`) carry
# no Table 32 errata or fuel-price update, so the change is grounded
# in empirical cross-source evidence rather than a spec citation.
# FAME (code 73) shows the inverse pattern on oil 3/4 worksheets
# (worksheet 7.64 vs spec 5.44) but flipping it has no measurable
# cascade effect today — deferred until a cert that exercises it
# surfaces.
# FAME (code 73) shows the inverse pattern on oil 3/4 worksheets:
# the RdSAP 10 Spec PDF Table 32 lists 5.44 p/kWh but worksheet
# (240) "Space heating - main system 1" for variants oil 3 (EES
# BXE, SAP 128) + oil 4 (EES BXF, SAP 129) lodges 7.64. Slice
# S0380.168 flipped 5.44 → 7.64 to match the worksheet — same
# empirical-divergence justification as the .131 heating-oil flip;
# the Elmhurst engine is the canonical reference per
# [[feedback-software-no-special-handling]].
4: 5.44, # heating oil — see comment above (Slice S0380.131)
71: 7.64, # bio-liquid HVO
73: 5.44, # bio-liquid FAME
73: 7.64, # bio-liquid FAME — Slice S0380.168 flip (5.44 → 7.64)
75: 6.10, # B30K
76: 47.0, # bioethanol
# Solid fuels

View file

@ -59,6 +59,11 @@ _LIQUID_FUEL_WARM_AIR_PUMP_W: Final[float] = 10.0
_WARM_AIR_HEATING_VOLUME_COEFF: Final[float] = 0.04
_PIV_VOLUME_COEFF: Final[float] = 0.12
_BALANCED_MV_NO_HR_VOLUME_COEFF: Final[float] = 0.06
# Table 5a footnote c) default SFP when no PCDB warm-air-unit SFP is
# lodged: "otherwise 1.5 W/(l/s). These values of SFP include an
# in-use factor." Same default as Table 4f footnote e) for the kWh
# side (see `cert_to_inputs._TABLE_4F_WARM_AIR_FAN_DEFAULT_SFP_W_PER_L_PER_S`).
_TABLE_5A_WARM_AIR_FAN_DEFAULT_SFP_W_PER_L_PER_S: Final[float] = 1.5
_HIU_HOURS_PER_DAY: Final[float] = 24.0
_SUMMER_MONTHS: Final[frozenset[int]] = frozenset({6, 7, 8, 9})
@ -658,18 +663,151 @@ def _pump_date_category_from_cert(epc: EpcPropertyData) -> PumpDateCategory:
_HEAT_PUMP_MAIN_HEATING_CATEGORY: Final[int] = 4
def _all_main_systems_are_heat_pumps(epc: EpcPropertyData) -> bool:
"""True iff every lodged main heating system is a heat pump
(category 4). When True, SAP 10.2 Table 5a Note a) zeros the
central-heating-pump GAIN. When False (mixed HP + boiler, or
boiler-only), the non-HP system's pump gain still applies."""
# SAP 10.2 Table 5a row "Central heating pump in heated space" (PDF
# p.177) only applies to mains with a water-loop circulation pump.
# Dry mains — electric storage heaters (Table 4a Cat 7 codes 401-409,
# 421), warm-air heaters without HPs (Cat 9), solid-fuel room heaters
# without back-boilers (codes 631-636 minus the boiler combos at
# 151-161), electric direct-acting heaters — have no primary water
# loop, so the row simply doesn't apply and worksheet (70)m = 0.
#
# Mirrors `cert_to_inputs._WET_BOILER_CODE_RANGES` (Table 4f kWh
# accounting). Kept as a sibling constant here so the worksheet layer
# does not depend on rdsap. Same code-range coverage:
# 101-141 Gas/oil boilers (Table 4b)
# 151-161 Solid-fuel boilers + back-boiler combos (Table 4a)
# 191-196 Electric boilers + CPSU (Table 4a)
_WET_BOILER_SAP_CODE_RANGES: Final[tuple[range, ...]] = (
range(101, 142),
range(151, 162),
range(191, 197),
)
# Heat-emitter types (Table 4d) that imply a wet primary loop —
# radiators (1) and fan-coil units (3) require water-side delivery.
# UFH (2) excluded because it can be wet OR electric (in-screed cable);
# the SAP code or category disambiguates. Warm-air (4) and electric
# storage / direct-acting emitters are dry. Used only as a fallback
# when no SAP code / PCDB index / category is lodged (e.g. the 000490
# hand-built unit-test fixture).
_WET_HEAT_EMITTER_TYPES: Final[frozenset[int]] = frozenset({1, 3})
def _any_main_system_has_central_heating_pump(epc: EpcPropertyData) -> bool:
"""SAP 10.2 Table 5a row "Central heating pump in heated space"
(PDF p.177) predicate for whether the pump-gain row applies.
Identifies wet, non-HP mains by (any of):
- sap_main_heating_code in Table 4a/4b wet-boiler ranges
(gas/oil/solid-fuel/electric boilers)
- main_heating_index_number lodged + category not HP (PCDB
Table 322 gas/oil boiler record)
- main_heating_category in {1, 2} (RdSAP "central heating" with
or without separate HW both wet)
- heat_emitter_type in {1 radiators, 3 fan-coil} (Table 4d wet
emitter types; UFH/2 excluded as it can be electric)
HP mains (category 4) are skipped per Table 5a Note a) "Not
applicable for electric heat pumps from database." Where any
non-HP main qualifies as wet, the pump gain applies (per the
same note's clause about two mains in the same space).
Mirrors `cert_to_inputs._is_wet_boiler_main` see docstring there
for the kWh-side parallel in Table 4f.
Electric heat pump exception per SAP 10.2 Appendix N3.1 (PDF p.105):
"For electric heat pumps: The electricity used by the water
circulation pump or fan is included within the calculated annual
space and hot water heating efficiency and is not included in
worksheet (230c). **The default heat gain from Table 5a is included
via worksheet (70).**" → Cat 4 HPs WITHOUT a PCDB record (Table 4a
default cascade) get the Table 5a default pump gain. Cat 4 HPs
WITH a PCDB record (Table 362) embed the pump gain in the COP
no separate Table 5a gain. Cat 5 warm-air HPs (codes 521/523-527)
distribute via fans, not a water pump handled by the warm-air
fan row of Table 5a (see `_any_main_system_has_warm_air_distribution`).
"""
details = epc.sap_heating.main_heating_details
if not details:
return False
return all(
d.main_heating_category == _HEAT_PUMP_MAIN_HEATING_CATEGORY
for d in details
)
for d in details:
if d.main_heating_category == _HEAT_PUMP_MAIN_HEATING_CATEGORY:
# PCDB Table 362 record → pump electricity AND gain are
# embedded in COP (Appendix N1.2.1); no separate gain row.
if d.main_heating_index_number is not None:
continue
# Cat 5 warm-air HP (codes 521/523-527) → no water pump.
code = d.sap_main_heating_code
if code is not None and code in _TABLE_4A_WARM_AIR_SAP_CODES:
continue
# Cat 4 HP, Table 4a default cascade → apply Table 5a
# pump gain per Appendix N3.1.
return True
code = d.sap_main_heating_code
if code is not None and any(
code in r for r in _WET_BOILER_SAP_CODE_RANGES
):
return True
if d.main_heating_index_number is not None:
return True
if d.main_heating_category in {1, 2}:
return True
if d.heat_emitter_type in _WET_HEAT_EMITTER_TYPES:
return True
return False
# SAP 10.2 Table 4a (PDF p.165-166) warm-air heating SAP codes. The
# Table 5a "Warm air heating system fans" gain (and Table 4f
# electricity row) fire for these mains:
# - Cat 5 (heat pumps with warm-air distribution): 521, 523-527
# - Cat 9 (warm air NOT heat pump): 501-515, 520
# Mirrors `cert_to_inputs._TABLE_4A_WARM_AIR_SAP_CODES` — kept here as
# a sibling so the worksheet layer does not depend on rdsap. Keep in
# sync manually with the cert_to_inputs constant.
_TABLE_4A_WARM_AIR_SAP_CODES: Final[frozenset[int]] = frozenset({
501, 502, 503, 504, 505, 506, 507, 508, 509, 510, 511, 520,
512, 513, 514, 515,
521, 523, 524, 525, 526, 527,
})
# SAP 10.2 Table 5a footnote c) (PDF p.177) for the "Warm air heating
# system fans" row: "If the heating system is a warm air unit and
# there is balanced whole house mechanical ventilation, the gains for
# the warm air system should not be included."
# Mirrors `cert_to_inputs._BALANCED_MV_KIND_NAMES`. Balanced MV kinds
# = MVHR (balanced with HR) + MV (balanced without HR). MEV, PIV from
# outside, and natural ventilation do NOT trigger the omission.
_BALANCED_MV_KIND_NAMES: Final[frozenset[str]] = frozenset({"MVHR", "MV"})
def _any_main_system_has_warm_air_distribution(epc: EpcPropertyData) -> bool:
"""True iff any lodged main heating system distributes heat as warm
air (Table 4a Cat 5 HPs with warm-air dist. + Cat 9 warm-air not
HP) qualifying for the SAP 10.2 Table 5a "Warm air heating
system fans" gain row.
"""
details = epc.sap_heating.main_heating_details
if not details:
return False
for d in details:
code = d.sap_main_heating_code
if code is not None and code in _TABLE_4A_WARM_AIR_SAP_CODES:
return True
return False
def _has_balanced_mechanical_ventilation(epc: EpcPropertyData) -> bool:
"""SAP 10.2 Table 5a footnote c) / Table 4f footnote e) balanced-MV
gate: True when the cert lodges either MVHR or MV (both balanced).
Mirrors `cert_to_inputs._has_balanced_mechanical_ventilation`.
"""
sv = getattr(epc, "sap_ventilation", None)
if sv is None:
return False
name = getattr(sv, "mechanical_ventilation_kind", None)
return name in _BALANCED_MV_KIND_NAMES
def internal_gains_from_cert(
@ -725,26 +863,49 @@ def internal_gains_from_cert(
daylight_factor=c_daylight,
)
# SAP 10.2 Table 5a Note a) (PDF p.177): the central-heating-pump
# GAIN is "Not applicable for electric heat pumps from database".
# Zero only when EVERY lodged main heating system is an HP — when
# any non-HP system (gas boiler, oil boiler, etc.) is present, its
# circulation pump still contributes 3/7/10 W per the pump's
# installation date (Table 5a row 1). Cert 000565 lodges HP main 1
# + gas boiler main 2 → 3 W gain (worksheet line 70 confirms
# 3.0000 W in 8 winter months, 0 in summer). Cert 0380 (HP-only)
# → 0 W gain (worksheet line 70 confirms 0 every month).
if _all_main_systems_are_heat_pumps(epc):
pump_w = 0.0
else:
# SAP 10.2 Table 5a row "Central heating pump in heated space"
# (PDF p.177) — the gain applies only to mains with a water-loop
# circulation pump. Excludes:
# (i) HP mains per Table 5a Note a) "Not applicable for electric
# heat pumps from database" (cert 0380 HP-only → 0 W),
# (ii) Dry mains with no primary water loop — electric storage
# heaters (Cat 7), warm-air heaters (Cat 9), solid-fuel room
# heaters without back-boilers, electric direct-acting.
# Worksheet (70)m = 0 across the 41-variant controlled-
# variable corpus for every dry main; see
# `_any_main_system_has_central_heating_pump`.
# Mixed HP + wet-boiler mains (cert 000565: HP main 1 + gas boiler
# main 2) DO carry the gain via the non-HP main's pump (worksheet
# line 70 confirms 3.0000 W in 8 winter months, 0 in summer).
if _any_main_system_has_central_heating_pump(epc):
pump_w = central_heating_pump_w(
date_category=_pump_date_category_from_cert(epc)
)
# Liquid-fuel + warm-air + PIV + MV + HIU branches default to zero for
# the combi-gas-natural-vent population; future slices will detect them
else:
pump_w = 0.0
# SAP 10.2 Table 5a row "Warm air heating system fans a) c)" (PDF
# p.177): SFP × 0.04 × V W, heating-season only per footnote a),
# omitted when balanced whole-house MV is present per footnote c).
# Default SFP 1.5 W/(l/s) per footnote c) — no PCDB warm-air-unit
# SFP lookup yet. Sister to the Table 4f kWh-side wiring in
# `_table_4f_warm_air_heating_fans_kwh` (S0380.158). Cohort
# entry point: heating-systems corpus electric 2 (code 524 ASHP
# warm-air, V=227.25 m³, no MV → 13.6350 W matches worksheet (70)).
if (
_any_main_system_has_warm_air_distribution(epc)
and not _has_balanced_mechanical_ventilation(epc)
):
warm_air_fan_w = warm_air_heating_fan_w(
sfp_w_per_l_per_s=_TABLE_5A_WARM_AIR_FAN_DEFAULT_SFP_W_PER_L_PER_S,
dwelling_volume_m3=dwelling_volume_m3,
)
else:
warm_air_fan_w = 0.0
# Liquid-fuel + PIV + MV + HIU branches default to zero for the
# combi-gas-natural-vent population; future slices will detect them
# from epc.main_heating_details + epc.mechanical_ventilation.
pumps_fans = pumps_fans_monthly_w(
heating_season_w=pump_w,
heating_season_w=pump_w + warm_air_fan_w,
year_round_w=0.0,
)

View file

@ -67,7 +67,9 @@ def handler(body: dict[str, Any], context: Any) -> None:
logger.info(
f"Triggering MagicPlan fetcher for HubSpot deal ID {hubspot_deal_id}"
)
_trigger_magicplan_fetcher(sqs_client, hubspot_deal, listing, hubspot_deal_id)
_trigger_magicplan_fetcher(
sqs_client, hubspot_deal, listing, hubspot_deal_id
)
else:
# Deal already in db, check whether anything has changed
logger.info(
@ -119,13 +121,18 @@ def handler(body: dict[str, Any], context: Any) -> None:
logger.info(
f"Triggering MagicPlan fetcher for HubSpot deal ID {hubspot_deal_id}"
)
_trigger_magicplan_fetcher(sqs_client, hubspot_deal, listing, hubspot_deal_id)
_trigger_magicplan_fetcher(
sqs_client, hubspot_deal, listing, hubspot_deal_id
)
print("done")
def _trigger_magicplan_fetcher(
sqs_client: Any, hubspot_deal: Dict[str, str], listing: Optional[dict[str, str]], hubspot_deal_id: str
sqs_client: Any,
hubspot_deal: Dict[str, str],
listing: Optional[dict[str, str]],
hubspot_deal_id: str,
) -> None:
message_body = {
"address": hubspot_deal.get("dealname"),
@ -136,9 +143,7 @@ def _trigger_magicplan_fetcher(
QueueUrl=get_settings().MAGICPLAN_SQS_URL,
MessageBody=json.dumps(message_body),
)
logger.info(
f"Sent message to MagicPlan queue. MessageId: {response['MessageId']}"
)
logger.info(f"Sent message to MagicPlan queue. MessageId: {response['MessageId']}")
def _trigger_pashub_fetcher(
@ -148,7 +153,7 @@ def _trigger_pashub_fetcher(
"pashub_link": hubspot_deal["pashub_link"],
"address": None, # potentially available from Listing, leave as None for now
"hubspot_deal_id": deal_id,
"sharepoint_link": hubspot_deal.get("sharepoint_link", None),
# "sharepoint_link": hubspot_deal.get("sharepoint_link", None), # Don't send sharepoint link for now as they are inconsistent
"uprn": hubspot_deal.get("national_uprn", None),
"landlord_property_id": hubspot_deal.get("owner_property_id", None),
"deal_stage": hubspot_deal.get("deal_stage", None),

View file

@ -43,7 +43,7 @@ class PropertyBaselineOrchestrator:
effective_epc = prop.effective_epc
lodged = lodged_performance(effective_epc)
effective, reason = self._rebaseliner.rebaseline(
effective_epc, lodged
property_id, effective_epc, lodged
)
rhi = _require_rhi(effective_epc)
baseline = PropertyBaselinePerformance(

View file

@ -0,0 +1,27 @@
{
"period": "2026-04 to 2026-06",
"basis": "GB national average; Ofgem price cap (gas/electricity), DESNZ/NEP May 2026 (off-gas fuels)",
"sources": {
"gas_electricity": "Ofgem energy price cap unit rates and standing charges, announced 2026-02-25, cap period Apr-Jun 2026",
"off_gas": "DESNZ QEP petroleum table (oil, May 2026) + Nottingham Energy Partnership May 2026 comparison (LPG, smokeless, wood)",
"seg": "Solar Energy UK SEG league table, updated 2026-05-12"
},
"seg_export_p_per_kwh": 15.0,
"fuels": {
"MAINS_GAS": { "unit_rate_p_per_kwh": 5.74, "standing_charge_p_per_day": 29.09 },
"ELECTRICITY": { "unit_rate_p_per_kwh": 24.67, "standing_charge_p_per_day": 57.21 },
"ELECTRICITY_OFF_PEAK": { "day_p_per_kwh": 29.73, "night_p_per_kwh": 13.89, "standing_charge_p_per_day": 56.99 },
"OIL": { "unit_rate_p_per_kwh": 9.16, "standing_charge_p_per_day": 0.0 },
"LPG": { "unit_rate_p_per_kwh": 17.61, "standing_charge_p_per_day": 0.0 },
"SMOKELESS": { "unit_rate_p_per_kwh": 10.0, "standing_charge_p_per_day": 0.0 },
"WOOD_LOGS": { "unit_rate_p_per_kwh": 8.83, "standing_charge_p_per_day": 0.0 },
"WOOD_PELLETS": { "unit_rate_p_per_kwh": 7.99, "standing_charge_p_per_day": 0.0, "_note": "bagged pellets; blown bulk is 6.76 p/kWh" },
"COAL": null,
"HEAT_NETWORK": null
},
"_gaps": {
"COAL": "no standard domestic price (traditional house coal sale for domestic use is illegal in England)",
"HEAT_NETWORK": "scheme-specific; no national tariff or price-cap unit rate",
"ELECTRICITY_OFF_PEAK": "day/night split; priced once the off-peak slice adds the day/night accessor"
}
}

View file

@ -0,0 +1,17 @@
from __future__ import annotations
from abc import ABC, abstractmethod
from domain.fuel_rates.fuel_rates import FuelRates
class FuelRatesRepository(ABC):
"""Reads the current Fuel Rates used to price a Property's bill (ADR-0014).
A Repo, not a Fetcher (ADR-0011): it reads stored reference data, no live
API call. The adapter backs onto a committed static snapshot today; an
Ofgem-cap ETL is a future adapter behind this same port.
"""
@abstractmethod
def get_current(self) -> FuelRates: ...

View file

@ -0,0 +1,43 @@
from __future__ import annotations
import json
from pathlib import Path
from typing import Any, Optional
from domain.fuel_rates.fuel import Fuel
from domain.fuel_rates.fuel_rates import FuelRate, FuelRates
from repositories.fuel_rates.fuel_rates_repository import FuelRatesRepository
_DEFAULT_SNAPSHOT = Path(__file__).parent / "data" / "fuel_rates_2026_q2.json"
class StaticFileFuelRatesRepository(FuelRatesRepository):
"""Reads Fuel Rates from a committed JSON snapshot (ADR-0014).
Only **single-rate** fuels (those lodging a ``unit_rate_p_per_kwh``) are
exposed. Off-peak (day/night) and the unpriced gaps (null entries house
coal, heat network) are skipped, so pricing them raises ``UnpricedFuel``.
The day/night accessor for off-peak lands in a later slice.
"""
def __init__(self, snapshot_path: Optional[Path] = None) -> None:
self._snapshot_path = snapshot_path or _DEFAULT_SNAPSHOT
def get_current(self) -> FuelRates:
payload: dict[str, Any] = json.loads(self._snapshot_path.read_text())
fuels: dict[str, Any] = payload["fuels"]
rates: dict[Fuel, FuelRate] = {}
for name, entry in fuels.items():
if entry is None:
continue # an unpriced gap (house coal / heat network)
if "unit_rate_p_per_kwh" not in entry:
continue # off-peak day/night — priced in a later slice
rates[Fuel[name]] = FuelRate(
unit_rate_p_per_kwh=float(entry["unit_rate_p_per_kwh"]),
standing_charge_p_per_day=float(entry["standing_charge_p_per_day"]),
)
return FuelRates(
period=str(payload["period"]),
seg_export_p_per_kwh=float(payload["seg_export_p_per_kwh"]),
rates=rates,
)

Some files were not shown because too many files have changed in this diff Show more