feat(test): case-20 cascade fixture + close its CO2 via E7 per-end-use codes

Locks sim case 20 (storage heaters + Detailed RR + loose-jacket cylinder)
as a golden vector: _elmhurst_worksheet_001431_case20.build_epc() routes the
Summary PDF through extractor → mapper → calculator, registered in
test_e2e_elmhurst_sap_score with all 11 SapResult headline pins at 1e-4.
Slices 1-2 (window extractor, RR stud walls) already pinned 10 of them exactly;
this slice closes the last one, co2_kg_per_yr (was 3797.62 vs worksheet (272)
3815.4060).

Root cause: on a dual-rate (E7) meter the CO2 path ignored the tariff's
high/low Table-12 electricity codes that the cost path already uses:
  - Secondary (direct-acting portable heaters, on-peak) keyed the monthly
    Table 12d cascade on standard code 30 (0.15405) instead of the E7 HIGH
    code 32 → (263) 0.1616. SAP 10.2 Table 12a Grid 1 direct-acting electric
    is 100% high-rate; mirrors the cost side billing it at 15.29 p/kWh.
  - Main storage heaters fell through `_table_12a_system_for_main`=None to
    the FLAT annual factor (0.136) rather than the dual-rate LOW code: per
    the Table 12a design intent ("storage … 100% low rate") they charge
    off-peak → E7 LOW code 31 → (261) 0.1357.
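
As a rough cross-check, the two factor swaps above can be reconciled against the CO2 gap using the worksheet fuel totals (211) and (215). This is only a sketch: the factors are the rounded values quoted in this message, so the predicted gap should land within rounding of the observed one (roughly half a kg/yr), not match it exactly. The variable names are illustrative, not calculator identifiers.

```python
# Worksheet fuel totals for case 20, kWh/year.
main_kwh = 16892.6072       # (211) storage heaters
secondary_kwh = 2981.0483   # (215) portable electric heaters

# CO2 factors in kg/kWh, as quoted (rounded) in the root-cause notes.
main_delta = main_kwh * (0.1357 - 0.136)               # flat annual -> E7 LOW code 31
secondary_delta = secondary_kwh * (0.1616 - 0.15405)   # code 30 -> E7 HIGH code 32

predicted_gap = main_delta + secondary_delta
observed_gap = 3815.4060 - 3797.62  # worksheet (272) minus pre-fix result

print(f"predicted {predicted_gap:.2f} kg/yr, observed {observed_gap:.2f} kg/yr")
```

The secondary correction dominates; the main-heating correction is small and of opposite sign because the LOW-code factor sits just below the flat annual one.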

case-20 co2 now EXACT. 2433 calculator + 112 golden + documents_parser tests
pass — no dual-meter/storage cohort regression; pyright strict net-zero (32=32).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Khalim Conn-Kowlessar 2026-06-06 11:23:10 +00:00
parent 1ed6d06804
commit 7dfe3f2c99
3 changed files with 150 additions and 1 deletions


@@ -3048,14 +3048,24 @@ def _main_heating_co2_factor_kg_per_kwh(
         if monthly is None:
             return _co2_factor_kg_per_kwh(main)
         return monthly
+    codes = _TARIFF_HIGH_LOW_FUEL_CODES_TABLE_12.get(tariff)
     system = _table_12a_system_for_main(main)
     if system is None:
+        # An electric main on a dual tariff with no Table 12a Grid 1 row is
+        # an off-peak STORAGE system (storage heaters / electric storage
+        # boiler / CPSU): it charges 100% off-peak per the Table 12a design
+        # intent, so its monthly CO2 factor is the dual-rate LOW code
+        # cascade, NOT the flat annual factor. case-20 storage on E7:
+        # code 31 → (261) 0.1357, vs the 0.136 annual fallback.
+        if codes is not None:
+            low_only = _effective_monthly_co2_factor(main_fuel_monthly_kwh, codes[1])
+            if low_only is not None:
+                return low_only
         return _co2_factor_kg_per_kwh(main)
     try:
         high_frac = space_heating_high_rate_fraction(system, tariff)
     except NotImplementedError:
         return _co2_factor_kg_per_kwh(main)
-    codes = _TARIFF_HIGH_LOW_FUEL_CODES_TABLE_12.get(tariff)
     if codes is None:
         return _co2_factor_kg_per_kwh(main)
     high_code, low_code = codes
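
The body of `_effective_monthly_co2_factor` is not part of this diff. A minimal sketch of what a demand-weighted monthly factor could look like, assuming it is a kWh-weighted mean of twelve monthly factors and returns None when there is nothing to weight (which would explain the fallback to the annual factor above). The function below takes the factors directly rather than a fuel code, and the numbers in the usage lines are placeholders, not Table 12d values.

```python
from typing import Optional, Sequence


def effective_monthly_co2_factor(
    monthly_kwh: Sequence[float],
    monthly_factors: Sequence[float],
) -> Optional[float]:
    """kWh-weighted mean of 12 monthly CO2 factors (kg/kWh); hypothetical helper."""
    if len(monthly_kwh) != 12 or len(monthly_factors) != 12:
        raise ValueError("expected 12 monthly values")
    total_kwh = sum(monthly_kwh)
    if total_kwh <= 0:
        return None  # no demand to weight by; caller falls back to the annual factor
    weighted = sum(k * f for k, f in zip(monthly_kwh, monthly_factors))
    return weighted / total_kwh


# Placeholder inputs: winter-heavy demand and slightly higher winter factors.
demand = [300.0, 250.0, 200.0, 100.0, 50.0, 0.0, 0.0, 0.0, 50.0, 100.0, 200.0, 300.0]
factors = [0.14, 0.14, 0.13, 0.13, 0.12, 0.12, 0.12, 0.12, 0.13, 0.13, 0.14, 0.14]
winter_weighted = effective_monthly_co2_factor(demand, factors)
```

Because heating demand concentrates in winter, the weighted result sits above the plain average of the twelve factors, which is why a monthly cascade and a flat annual factor can disagree.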
@@ -3522,6 +3532,18 @@ def _secondary_heating_co2_factor_kg_per_kwh(
     not the 0.136 electricity flat that the pre-S0380.70 hardcoded
     `_STANDARD_ELECTRICITY_FUEL_CODE` path produced."""
     code = _secondary_fuel_code(epc)
+    if code == _STANDARD_ELECTRICITY_FUEL_CODE:
+        # Secondary electric heaters are direct-acting (used on demand,
+        # daytime) → on-peak. On a dual-rate meter they draw HIGH-rate
+        # electricity, so the monthly Table 12d CO2 cascade keys on the
+        # tariff's HIGH code, not the standard all-day code 30, mirroring
+        # the cost side billing secondary at the high rate (e.g. 15.29 p on
+        # E7). case-20 secondary on E7: code 32 → (263) 0.1616, vs the
+        # 0.15405 a code-30 weighting gives. STANDARD-tariff certs have no
+        # dual codes → code 30 unchanged.
+        dual_codes = _TARIFF_HIGH_LOW_FUEL_CODES_TABLE_12.get(_rdsap_tariff(epc))
+        if dual_codes is not None:
+            code = dual_codes[0]
     monthly = _effective_monthly_co2_factor(secondary_fuel_monthly_kwh, code)
     if monthly is not None:
         return monthly
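
The code-selection rule in this hunk can be isolated as a small pure function. The sketch below is illustrative only: the tariff keys ("E7", "STANDARD"), the table, and the function name are assumptions, and only the E7 pair (32 high, 31 low) and standard code 30 come from this commit.

```python
STANDARD_ELECTRICITY_CODE = 30

# (high, low) Table 12 codes per dual-rate tariff. Hypothetical table holding
# only the E7 pair quoted in this commit; STANDARD deliberately has no entry.
TARIFF_HIGH_LOW_CODES: dict[str, tuple[int, int]] = {"E7": (32, 31)}


def secondary_co2_code(fuel_code: int, tariff: str) -> int:
    """Pick the Table 12 code used to key the secondary-heating CO2 cascade."""
    if fuel_code != STANDARD_ELECTRICITY_CODE:
        return fuel_code  # non-electric secondary: untouched
    dual = TARIFF_HIGH_LOW_CODES.get(tariff)
    if dual is None:
        return fuel_code  # single-rate meter: stay on code 30
    return dual[0]  # direct-acting heaters draw high-rate electricity


print(secondary_co2_code(30, "E7"), secondary_co2_code(30, "STANDARD"))
```

Keeping the selection separate from the factor lookup makes the STANDARD-tariff no-op easy to test on its own.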


@@ -0,0 +1,111 @@
"""Mapper-driven cascade pin against the Elmhurst P960-0001-001431
"simulated case 20" worksheet: a storage-heater dwelling with a
Detailed (type-2) room-in-roof, a loose-jacket hot-water cylinder, and a
multi-building-part shell.

Like 000565 / the _rr cases, this fixture does NOT hand-build the
EpcPropertyData: it routes the Summary PDF through
ElmhurstSiteNotesExtractor + from_elmhurst_site_notes so the SAP-result
pin grid exercises the WHOLE extractor + mapper + calculator pipeline.

This case was generated to validate three fronts in one worksheet:

- Detailed room-in-roof gables: a "Sheltered" gable (U=0.92) and a
  "Connected" gable (U=0.00, excluded). The cascade already pins both.
- Window §11 layout where "Double between 2002 and 2021" wraps and the
  Area cell splits onto its own line (fixed in the extractor; see
  test_summary_001431_case20_extracts_all_five_section11_windows).
- Detailed-RR "Stud Wall" surfaces lodged at Default U-value 0.00:
  internal knee walls the worksheet excludes from §3 and (31) (fixed in
  the mapper: drop only the U=0 studs, keep positive-U ones).

Source: user-simulated PDFs at `sap worksheets/golden fixture debugging/
simulated case 20/`. The Summary is mirrored into the tracked
`backend/documents_parser/tests/fixtures/Summary_001431_case20.pdf` so the
test runs without depending on the unstaged workspace.

Cert shape: Main + Extension 1, solid brick as-built (Main 220 mm / Ext1
240 mm), 2 storeys + Detailed room-in-roof on the Main, suspended
uninsulated ground floor (Main) + above-partially-heated floor (Ext1),
electric storage heaters (SAP code 402, control 2402, automatic charge
control, Economy-7 dual meter), portable electric secondary heaters (SAP
code 693), mains-gas water heating (code 911) with a loose-jacket
cylinder + thermostat, one instantaneous electric shower, no PV.

Worksheet pin targets (P960-0001-001431 block 1, existing dwelling SAP):

- SAP rating 44 (258); continuous 43.6322; ECF 4.0397 (257)
- Total fuel cost £1810.1556 (255)
- Total CO2 3815.4060 kg/year (272)
- Space heating 19873.6555 kWh/year (98c)
- Main 1 fuel 16892.6072 kWh/year (211)
- Secondary fuel 2981.0483 kWh/year (215)
- Hot water fuel 4326.0619 kWh/year (219)
- Lighting 246.3083 kWh/year (232)
- Pumps/fans 0.0 kWh/year (231)

Per [[feedback-zero-error-strict]] + [[feedback-e2e-validation-philosophy]]:
pins are abs=1e-4 against the worksheet PDF. The pin values live in
`test_e2e_elmhurst_sap_score._FIXTURE_PINS`.
"""

from __future__ import annotations

import re
import subprocess
from pathlib import Path
from typing import Final

from backend.documents_parser.elmhurst_extractor import ElmhurstSiteNotesExtractor
from datatypes.epc.domain.epc_property_data import EpcPropertyData
from datatypes.epc.domain.mapper import EpcPropertyDataMapper

# parents[0]=worksheet/, [1]=sap10_calculator/, [2]=domain/, [3]=tests/,
# [4]=repo root.
_SUMMARY_PDF: Final[Path] = (
    Path(__file__).resolve().parents[4]
    / "backend" / "documents_parser" / "tests" / "fixtures"
    / "Summary_001431_case20.pdf"
)


def _summary_pdf_to_textract_style_pages(pdf_path: Path) -> list[str]:
    """Convert a Summary PDF into the per-page text format the
    ElmhurstSiteNotesExtractor expects (label\\nvalue sequences). Mirror
    of the helper in `test_summary_pdf_mapper_chain.py` / the other
    `_elmhurst_worksheet_*` fixtures.
    """
    info = subprocess.run(
        ["pdfinfo", str(pdf_path)], capture_output=True, text=True, check=True,
    ).stdout
    m = re.search(r"Pages:\s+(\d+)", info)
    if m is None:
        raise RuntimeError(f"Could not parse page count from {pdf_path}")
    page_count = int(m.group(1))
    pages: list[str] = []
    for i in range(1, page_count + 1):
        layout = subprocess.run(
            [
                "pdftotext", "-layout", "-f", str(i), "-l", str(i),
                str(pdf_path), "-",
            ],
            capture_output=True, text=True, check=True,
        ).stdout
        tokens: list[str] = []
        for line in layout.splitlines():
            if not line.strip():
                tokens.append("")
                continue
            parts = [p for p in re.split(r"\s{2,}", line.strip()) if p]
            tokens.extend(parts)
        pages.append("\n".join(tokens))
    return pages


def build_epc() -> EpcPropertyData:
    """Route the simulated case-20 Summary through extractor + mapper.

    No hand-built EpcPropertyData: the extractor and mapper are part of
    the test target.
    """
    pages = _summary_pdf_to_textract_style_pages(_SUMMARY_PDF)
    site_notes = ElmhurstSiteNotesExtractor(pages).extract()
    return EpcPropertyDataMapper.from_elmhurst_site_notes(site_notes)
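
The tokenisation step inside `_summary_pdf_to_textract_style_pages` can be tried without `pdfinfo` or `pdftotext` installed. The sketch below lifts the same split rule (runs of two or more spaces separate cells, blank lines become empty tokens) into a standalone function; `layout_to_tokens` and the sample text are made up for illustration and are not part of the fixture.

```python
import re


def layout_to_tokens(layout: str) -> list[str]:
    """Split `pdftotext -layout` style text into label/value tokens."""
    tokens: list[str] = []
    for line in layout.splitlines():
        if not line.strip():
            tokens.append("")  # keep blank lines as empty tokens
            continue
        # Two or more spaces mark a column gap; single spaces stay inside a cell.
        tokens.extend(p for p in re.split(r"\s{2,}", line.strip()) if p)
    return tokens


sample = "Main heating    Storage heaters\n\nTariff      Dual"
print(layout_to_tokens(sample))
# ['Main heating', 'Storage heaters', '', 'Tariff', 'Dual']
```

Splitting on two-plus spaces is what keeps multi-word labels such as "Main heating" intact while still separating them from the value in the next column.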


@@ -44,6 +44,7 @@ from tests.domain.sap10_calculator.worksheet import (
     _elmhurst_worksheet_001431_case5 as _w001431_case5,
     _elmhurst_worksheet_001431_case6 as _w001431_case6,
     _elmhurst_worksheet_001431_case7 as _w001431_case7,
+    _elmhurst_worksheet_001431_case20 as _w001431_case20,
 )
 from tests.domain.sap10_calculator.worksheet._elmhurst_fixtures import (
     ALL_FIXTURES as _ELMHURST_FIXTURES,
@@ -278,6 +279,20 @@ _FIXTURE_PINS: Final[dict[str, FixtureCascadePins]] = {
         lighting_kwh_per_yr=357.6571,
         pumps_fans_kwh_per_yr=356.0,
     ),
+    # Mapper-driven: Summary_001431_case20.pdf → extractor → mapper →
+    # calculator. Storage heaters (SAP 402 / control 2402, Economy-7) +
+    # Detailed room-in-roof (Sheltered + Connected gables, U=0 stud walls)
+    # + loose-jacket cylinder. Pins are worksheet Block 1 line refs.
+    "001431_case20": FixtureCascadePins(
+        sap_score=44, sap_score_continuous=43.6322, ecf=4.0397,
+        total_fuel_cost_gbp=1810.1556, co2_kg_per_yr=3815.4060,
+        space_heating_kwh_per_yr=19873.6555,
+        main_heating_fuel_kwh_per_yr=16892.6072,
+        secondary_heating_fuel_kwh_per_yr=2981.0483,
+        hot_water_kwh_per_yr=4326.0619,
+        lighting_kwh_per_yr=246.3083,
+        pumps_fans_kwh_per_yr=0.0,
+    ),
 }
@@ -296,6 +311,7 @@ _FIXTURE_MODULES: Final[dict[str, ModuleType]] = {
     "001431_case5": _w001431_case5,
     "001431_case6": _w001431_case6,
     "001431_case7": _w001431_case7,
+    "001431_case20": _w001431_case20,
 }
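
How the pins are asserted is not shown in this diff. As a sketch of the abs=1e-4 comparison the fixture docstring describes, assuming a plain per-field absolute tolerance: `Pins` and `pin_mismatches` below are hypothetical stand-ins for `FixtureCascadePins` and the real test harness, and only two of the eleven fields are modelled.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Pins:
    """Hypothetical two-field stand-in for FixtureCascadePins."""
    co2_kg_per_yr: float
    ecf: float


def pin_mismatches(actual: Pins, expected: Pins, abs_tol: float = 1e-4) -> list[str]:
    """Return the names of fields that differ by more than abs_tol."""
    return [
        f.name
        for f in fields(expected)
        if abs(getattr(actual, f.name) - getattr(expected, f.name)) > abs_tol
    ]


worksheet = Pins(co2_kg_per_yr=3815.4060, ecf=4.0397)
pre_fix = Pins(co2_kg_per_yr=3797.62, ecf=4.0397)  # CO2 value before this commit
print(pin_mismatches(pre_fix, worksheet))
# ['co2_kg_per_yr']
```

Reporting field names rather than a single pass/fail makes it clear which headline value drifted, which is how the co2_kg_per_yr gap in this slice would surface.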