mirror of
https://github.com/Hestia-Homes/Model.git
synced 2026-06-30 13:10:47 +00:00
Validate SAP calculator vs Elmhurst; fix reduced-field window U; add accuracy harness
Reduced-field window U: heat_transmission derived the synthesised-window raw U from u_window(all None) -> the 2.5 placeholder regardless of glazing. Now routes the (uniform) glazing_type code through u_window (RdSAP Table 24) so e.g. double pre-2002 reads 2.8, not 2.5. Only the pre-SAP10 reduced-field path is affected (21.0.1 certs carry per-window U upstream) — the RdSAP-21.0.1 corpus gauge is unchanged at 66.9% within-0.5.

test_real_cert_sap_accuracy: pin uprn_10002468137 (RdSAP-17.1, all-electric storage heaters) at SAP 61, validated against Elmhurst on identical inputs (dual off-peak immersion, 110 L cylinder, 2 baths). Our engine reproduces Elmhurst's fuel cost to the penny; lodged 55 is the old SAP-2012 schema.

Tooling to grow the accuracy corpus:
- scripts/fetch_real_life_epc_sample.py — capture a cert by UPRN into the corpus.
- scripts/compare_epc_paths.py — diff gov-API vs Elmhurst-summary EpcPropertyData and run both through the engine, localising mapper vs calculator differences.
- skill validate-cert-sap-accuracy — the end-to-end loop (capture -> Elmhurst inputs -> human builds -> compare -> reconcile -> pin in the test).
- skill epc-to-elmhurst-rdsap-inputs reference: corrected immersion (code 1=dual), cylinder size (code 2 = Normal/110 L), and bath-count (WWHRS sub-tab) mappings.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
parent
140ad39898
commit
5c11fd35c8
8 changed files with 415 additions and 20 deletions
|
|
@ -179,13 +179,14 @@ UPRN 10002468137 — lodged 55, engine 62.
|
|||
| `water_heating_code` | 901 = From main heating system (Elmhurst "Boiler Circulator"); **903 = Electric immersion, off-peak → Elmhurst "Water Heater" category** (NOT Boiler Circulator) |
|
||||
| `water_heating_fuel` | as Fuel codes above (29 = off-peak) |
|
||||
| `has_hot_water_cylinder` | → "Hot Water Cylinder Present" |
|
||||
-| `cylinder_size` | band: 1=Small, 2=Medium, 3=Large |
+| `cylinder_size` | **code 2 = Normal / 110 L, code 3 = Medium / 160 L, code 4 = Large / 210 L** (RdSAP 10 §10.5 Table 28; source: `cert_to_inputs.py` `_CYLINDER_SIZE_CODE_TO_LITRES`). In Elmhurst pick the **litre value**, NOT the label — "Normal" = 110 L. |
|
||||
| `cylinder_insulation_type` | **1 = factory Foam, 2 = loose Jacket** (source: `cert_to_inputs.py` `_CYLINDER_INSULATION_TYPE_LOOSE_JACKET = 2`) |
|
||||
| `cylinder_insulation_thickness` | mm (38 mm ≈ factory foam; jackets 80 mm+) |
|
||||
-| `immersion_heating_type` | 1 = single |
+| `immersion_heating_type` | **code 1 = DUAL, code 2 = SINGLE** (source: `cert_to_inputs.py` ~L5288, per RdSAP 10 §10.5 "assume dual on a dual/off-peak meter" + the API cohort). ⚠️ Do NOT read 1 as "single" — single vs dual flips the Table 13 high-rate fraction and can swing the SAP score several points (e.g. cert 10002468137: dual 0.131 → SAP 61, single 0.571 → SAP 57). Storage-heater / off-peak certs are almost always code 1 = dual. |
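
The two corrected mappings can be sketched as lookups. This is a minimal illustration, not the code in `cert_to_inputs.py`: the dictionary and function names are hypothetical, the code values and litre sizes are copied from the table above, and the two high-rate fractions are the figures quoted for cert 10002468137 only, not a general Table 13 implementation.

```python
from __future__ import annotations

# Hypothetical names; values copied from the mapping table above.
CYLINDER_SIZE_LITRES = {2: 110, 3: 160, 4: 210}  # lodged code -> litres
IMMERSION_TYPE = {1: "dual", 2: "single"}        # lodged code -> immersion type

# High-rate fractions quoted for cert 10002468137 (assumption: example only).
HIGH_RATE_FRACTION_EXAMPLE = {"dual": 0.131, "single": 0.571}


def decode_cylinder(cylinder_size_code: int, immersion_code: int) -> tuple[int, str]:
    """Return (litres, immersion type); raises KeyError for an unmapped code."""
    return CYLINDER_SIZE_LITRES[cylinder_size_code], IMMERSION_TYPE[immersion_code]


def blended_price(high_fraction: float, high_p: float, low_p: float) -> float:
    """Average p/kWh when `high_fraction` of the energy is bought at the high rate."""
    return high_fraction * high_p + (1.0 - high_fraction) * low_p


litres, immersion = decode_cylinder(2, 1)
print(litres, immersion)
# Illustrative tariff (assumption): 15.29p high rate, 5.5p low rate.
for kind, fraction in HIGH_RATE_FRACTION_EXAMPLE.items():
    print(kind, round(blended_price(fraction, 15.29, 5.5), 2))
```

Reading code 1 as "single" would move the water-heating energy from the lower blended price to the higher one, which is why the table warns against it.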
|
||||
|
||||
- **Community Hot Water**: 0 unless lodged.
|
||||
- **Solar Water Heating**: `solar_water_heating` Y/N.
|
||||
- **Number of baths** (Elmhurst tab: **Water Heating → WWHRS sub-tab → "Total no. of Baths"**, NOT the main Water Heating sub-tab): the gov-API derives it from `sap_heating.instantaneous_wwhrs` ROOM counts — `number_baths = rooms_with_bath_and_or_shower + rooms_with_bath_and_mixer_shower`. ⚠️ Elmhurst defaults this to 0; set it to the derived count or the gov-API and Elmhurst hot-water demand diverge (e.g. cert 10002468137: 2 baths = +165 kWh HW ≈ +£11 ≈ +0.7 SAP). Keep WWHRS itself **No**.
|
||||
- **WWHRS**: ⚠️ `sap_heating.instantaneous_wwhrs` holds **bath/shower ROOM
|
||||
counts** (ADR-0028: `rooms_with_bath_and_or_shower`, `rooms_with_mixer_shower_no_bath`,
|
||||
`rooms_with_bath_and_mixer_shower`) — it is **NOT** a heat-recovery device.
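
The bath-count rule above can be sketched in a few lines. This assumes the room counts arrive as a plain dict keyed by the three field names; the real `sap_heating.instantaneous_wwhrs` object may be a dataclass, and treating a missing count as 0 is also an assumption. The function name is hypothetical.

```python
from __future__ import annotations


def number_of_baths(instantaneous_wwhrs: dict[str, int | None]) -> int:
    """Sum the two bath-bearing room counts; a missing or None count is taken as 0."""
    bath_keys = (
        "rooms_with_bath_and_or_shower",
        "rooms_with_bath_and_mixer_shower",
    )
    return sum(int(instantaneous_wwhrs.get(key) or 0) for key in bath_keys)


rooms = {
    "rooms_with_bath_and_or_shower": 1,
    "rooms_with_mixer_shower_no_bath": 3,  # shower-only rooms do not count as baths
    "rooms_with_bath_and_mixer_shower": 1,
}
print(number_of_baths(rooms))
```

The result is the value to enter as "Total no. of Baths" on the WWHRS sub-tab.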
|
||||
|
|
|
|||
89
.claude/skills/validate-cert-sap-accuracy/SKILL.md
Normal file
|
|
@ -0,0 +1,89 @@
|
|||
---
|
||||
name: validate-cert-sap-accuracy
|
||||
description: Run the end-to-end loop that validates this repo's SAP calculator against accredited Elmhurst Energy for one real EPC certificate, then locks the result into the regression test corpus. Capture a cert by UPRN → generate Elmhurst inputs → (human builds it in Elmhurst) → diff the gov-API vs Elmhurst EpcPropertyData and run both through our engine → reconcile to convergence → pin the agreed SAP score in the accuracy test. Use when validating/expanding SAP-calculator accuracy against Elmhurst, adding a cert to the accuracy corpus, or when the user wants to "check a cert against Elmhurst" / "add another accuracy test".
|
||||
---
|
||||
|
||||
# Validate cert SAP accuracy (gov-API ↔ Elmhurst)
|
||||
|
||||
Separates **calculator** correctness from **mapper** fidelity by computing the
|
||||
same property two ways and reconciling them, then freezes the agreed score as
|
||||
a regression pin. Files land in the corpus location so the suite grows.
|
||||
|
||||
Sample home for every cert: `backend/epc_api/json_samples/real_life_examples/<schema>/uprn_<uprn>/`
|
||||
(`epc.json`, `elmhurst_inputs.md`, `elmhurst_summary.pdf`, `elmhurst_worksheet.pdf`).
|
||||
|
||||
## Workflow
|
||||
|
||||
1. **Capture the cert** (gov-EPC API → saved json + our engine's score):
   ```
   PYTHONPATH=/workspaces/model python scripts/fetch_real_life_epc_sample.py <uprn>
   ```
   Writes `real_life_examples/<schema>/uprn_<uprn>/epc.json` and prints schema,
   lodged rating, and our engine's SAP + per-end-use kWh. Note the schema:
   only RdSAP schemas map today (full SAP `SAP-Schema-*` is partial).

2. **Generate the Elmhurst input sheet** — invoke the **`epc-to-elmhurst-rdsap-inputs`**
   skill on the UPRN. It writes `elmhurst_inputs.md` next to the json, page by
   page, with the code→value mappings (cylinder, immersion, baths, glazing, …).

3. **Human builds it in Elmhurst** from `elmhurst_inputs.md`, then exports the
   **Summary PDF** and the **SAP-10.2 worksheet PDF**, saving them in the sample
   dir as **`elmhurst_summary.pdf`** and **`elmhurst_worksheet.pdf`**. (This is
   the only manual step — Elmhurst is the accredited ground truth.)

4. **Compare the two paths**:
   ```
   PYTHONPATH=/workspaces/model python scripts/compare_epc_paths.py <uprn>
   ```
   Builds `EpcPropertyData` from the gov-API json AND from the Elmhurst summary
   (`parse_site_notes_pdf`), deep-diffs them, runs BOTH through `Sap10Calculator`,
   and prints Elmhurst's own worksheet SAP (258). Reading it:
   - **Our engine on Elmhurst inputs ≈ Elmhurst's worksheet SAP** → calculator is
     correct (it reproduces accredited Elmhurst on identical inputs).
   - **gov-API SAP vs Elmhurst-PDF SAP gap** → input differences only. The field
     diff localises them.

5. **Reconcile to convergence.** Triage each field diff (use the
   `epc-to-elmhurst-rdsap-inputs` skill's `reference/mapping.md` for code
   semantics — cylinder code 2=110 L, immersion code 1=dual, baths on the WWHRS
   sub-tab, etc.):
   - **Elmhurst data-entry error** (e.g. swapped floor dims, wrong cylinder/
     immersion, missing baths, wrong postcode/region) → fix in Elmhurst, re-export,
     re-run step 4.
   - **gov-API mapper gap** (e.g. lodged alt-wall dropped) → a real per-cert-mapper
     fix; flag it (Khalim's domain) — don't tune to mask it.
   - **Genuine ground-truth question** (what the property *actually* is) → the
     assessor/user settles it; align both sides to the lodged data.
   Target: gov-API and a correctly-built Elmhurst within ~0.5 SAP. Cosmetic /
   representation diffs (codes vs strings, empty `EnergyElement` lists) are noise.

6. **Lock it in.** Once converged on a value you trust, add a case to
   `tests/domain/sap10_calculator/test_real_cert_sap_accuracy.py`:
   ```python
   RealCertExpectation(
       schema="<schema>", sample="uprn_<uprn>",
       cert_num="<cert>", sap_score=<converged engine score>,
   )
   ```
   with a comment recording the ground truth + what reconciled it. If a known
   engine bug still blocks it, use `known_bug_xfail="…"` (strict xfail) instead
   of widening. Run `pytest tests/domain/sap10_calculator/test_real_cert_sap_accuracy.py`
   — it must pass (or xfail with the documented reason).
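
The convergence target in step 5 and the integer pin in step 6 can be checked with a small helper. This is a sketch under assumptions: the function names are hypothetical, and half-up rounding of the continuous score is assumed here rather than confirmed from the engine.

```python
from decimal import ROUND_HALF_UP, Decimal


def rounded_sap(continuous_score: float) -> int:
    """Round a continuous SAP score to an integer (assumption: half-up)."""
    return int(Decimal(str(continuous_score)).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def converged(gov_api_continuous: float, elmhurst_sap: float, tolerance: float = 0.5) -> bool:
    """True when the gov-API path is within `tolerance` SAP points of Elmhurst."""
    return abs(gov_api_continuous - elmhurst_sap) <= tolerance


# Figures from UPRN 10002468137: gov-API 60.92 vs Elmhurst 61.
print(converged(60.92, 61), rounded_sap(60.92))
```

Only pin the integer once `converged` holds; a gap larger than the tolerance means going back to the triage in step 5.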
|
||||
|
||||
## Notes
|
||||
|
||||
- The sample dir IS the corpus entry — capturing + saving the PDFs there is all
|
||||
the "expand the tests" bookkeeping needed; step 6 is what activates it.
|
||||
- `sap_score` pins the gov-API engine's integer SAP (the production path). Add
|
||||
per-end-use kWh pins to the same `RealCertExpectation` later (worksheet-
|
||||
validated) to tighten coverage.
|
||||
- Don't tune the mapper to a single cert — pin the observed value and fix mapper
|
||||
gaps generically, guarded by the RdSAP-21.0.1 corpus gauge
|
||||
(`tests/infrastructure/epc_client/test_sap_accuracy_corpus.py`).
|
||||
|
||||
## Worked example
|
||||
|
||||
UPRN **10002468137** (`RdSAP-Schema-17.1`): gov-API 60.92, Elmhurst 61 — converged
|
||||
after aligning dual immersion, 110 L cylinder, and 2 baths. Pinned `sap_score=61`.
|
||||
The journey closed an off-peak-water-heating bug (Table 13) and a reduced-field
|
||||
window-U bug; the calculator matched Elmhurst's cost to the penny throughout.
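
To see how much the reduced-field window-U fix matters, the sketch below applies the 0.04 m²K/W curtain resistance used in `heat_transmission.py` to the old placeholder (2.5) and to the Table 24 value for double pre-2002 glazing (2.8). The function name is hypothetical; the formula and constant are the ones in this commit's diff.

```python
CURTAIN_RESISTANCE_M2K_PER_W = 0.04  # same value as _WINDOW_CURTAIN_RESISTANCE_M2K_PER_W


def effective_window_u(u_raw: float, resistance: float = CURTAIN_RESISTANCE_M2K_PER_W) -> float:
    """Effective window U after adding the curtain/blind resistance in series."""
    return 1.0 / (1.0 / u_raw + resistance)


for label, u_raw in (("placeholder", 2.5), ("double pre-2002", 2.8)):
    print(f"{label}: raw {u_raw} -> effective {effective_window_u(u_raw):.3f}")
```

The effective U rises from about 2.27 to about 2.52 W/m²K, so the old placeholder under-counted window heat loss for this glazing type.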
|
||||
Binary file not shown.
Binary file not shown.
|
|
@ -41,7 +41,7 @@ from __future__ import annotations
|
|||
|
||||
from dataclasses import dataclass
|
||||
from decimal import ROUND_HALF_UP, Decimal
|
||||
-from typing import Any, Final, Optional
+from typing import Any, Final, Optional, Sequence, Tuple
|
||||
|
||||
from datatypes.epc.domain.epc_property_data import (
|
||||
EpcPropertyData,
|
||||
|
|
@ -126,6 +126,48 @@ _DEFAULT_STOREY_HEIGHT_M: Final[float] = 2.5
|
|||
# SAP10.2 §3.2 curtain/blind thermal resistance applied to windows (and
|
||||
# roof windows) — turns raw window U into the worksheet's (27) effective U.
|
||||
_WINDOW_CURTAIN_RESISTANCE_M2K_PER_W: Final[float] = 0.04
|
||||
|
||||
# SAP10 glazing-type code (the cascade enum used on `SapWindow.glazing_type`,
|
||||
# see solar_gains `_G_PERPENDICULAR_BY_GLAZING_TYPE`) → the `u_window` glazing
|
||||
# category + the install-year band the code implies. Used to derive the raw
|
||||
# window U for SYNTHESISED (reduced-field) windows that carry no per-window
|
||||
# U lodgement — previously these all fell to `u_window`'s all-None placeholder
|
||||
# (2.5), regardless of glazing, under-counting window heat loss vs RdSAP Table
|
||||
# 24 (e.g. double pre-2002 should be 2.8, not 2.5).
|
||||
_GLAZING_CODE_TO_UWINDOW: Final[dict[int, Tuple[str, Optional[int]]]] = {
    1: ("single", None),
    2: ("double", 2002),  # double 2002-2022
    3: ("double", None),  # double pre-2002 (None → pre-2002 row)
    4: ("double", None),  # double low-E soft-coat
    5: ("secondary", None),
    6: ("triple", None),  # triple pre-2002 default
    7: ("double", None),  # double, known data
    8: ("triple", None),  # triple, known data
    9: ("triple", 2002),  # triple 2002-2022
    10: ("triple", None),  # triple pre-2002
    11: ("secondary", None),
    12: ("secondary", None),
    13: ("double", 2022),  # double 2022+
    14: ("triple", 2022),  # triple 2022+
    15: ("single", None),
}


def _synthesised_window_u_raw(windows: Optional[Sequence[SapWindow]]) -> float:
    """Raw (pre-curtain) window U for reduced-field windows with no per-window
    U lodgement. Derives glazing category + install-year band from the
    (uniform) synthesised `glazing_type` code and routes through `u_window`
    (RdSAP Table 24), rather than the all-None 2.5 placeholder."""
    if not windows:
        return u_window(installed_year=None, glazing_type=None, frame_type=None)
    w = windows[0]
    code = w.glazing_type
    glaze, year = (
        _GLAZING_CODE_TO_UWINDOW.get(code, ("double", None))
        if isinstance(code, int)
        else ("double", None)
    )
    return u_window(installed_year=year, glazing_type=glaze, frame_type=w.frame_material)

|
||||
# RdSAP10 §15 "Rounding of data" (p.66): "All element areas (gross)
|
||||
# including window areas and conservatory wall area: 2 d.p." plus
|
||||
# "U-values: 2 d.p.". This is the data-passed-to-SAP-calculator
|
||||
|
|
@ -632,8 +674,10 @@ def heat_transmission_from_cert(
|
|||
)
|
||||
windows_w_per_k_total += a_w * u_eff_w
|
||||
else:
|
||||
-window_u_raw = window_avg_u_value if (window_avg_u_value or 0) > 0 else u_window(
-    installed_year=None, glazing_type=None, frame_type=None
+window_u_raw = (
+    window_avg_u_value
+    if (window_avg_u_value or 0) > 0
+    else _synthesised_window_u_raw(epc.sap_windows)
|
||||
)
|
||||
window_u = (
|
||||
1.0 / (1.0 / window_u_raw + _WINDOW_CURTAIN_RESISTANCE_M2K_PER_W)
|
||||
|
|
|
|||
135
scripts/compare_epc_paths.py
Normal file
|
|
@ -0,0 +1,135 @@
|
|||
"""Compare the two EpcPropertyData source paths for one real cert, to
|
||||
separate MAPPER fidelity from CALCULATOR correctness.
|
||||
|
||||
For a cert captured under
|
||||
``backend/epc_api/json_samples/real_life_examples/<schema>/uprn_<uprn>/``
|
||||
this:
|
||||
1. builds `EpcPropertyData` from the gov-EPC API json (`epc.json`), and
|
||||
2. builds `EpcPropertyData` from the Elmhurst summary PDF
|
||||
(`elmhurst_summary.pdf`) via `parse_site_notes_pdf`,
|
||||
then deep-diffs the two and runs BOTH through `Sap10Calculator`. Where the
|
||||
two objects match, any SAP gap is the calculator; where they differ, it's
|
||||
input mapping / data entry. If `elmhurst_worksheet.pdf` is present its
|
||||
printed SAP rating (258) is shown as the ground truth.
|
||||
|
||||
USAGE
|
||||
-----
|
||||
PYTHONPATH=/workspaces/model python scripts/compare_epc_paths.py <uprn>
|
||||
|
||||
Part of the `validate-cert-sap-accuracy` workflow — see that skill.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import dataclasses
|
||||
import json
|
||||
import re
|
||||
import sys
|
||||
from pathlib import Path
|
||||
from typing import Any, Optional
|
||||
from unittest.mock import patch
|
||||
|
||||
import httpx
|
||||
|
||||
from backend.documents_parser.parser import parse_site_notes_pdf
|
||||
from datatypes.epc.domain.epc_property_data import EpcPropertyData
|
||||
from datatypes.epc.domain.mapper import EpcPropertyDataMapper
|
||||
from domain.sap10_calculator.calculator import Sap10Calculator
|
||||
|
||||
_ROOT = Path("backend/epc_api/json_samples/real_life_examples")
|
||||
|
||||
|
||||
def _find_sample_dir(uprn: str) -> Path:
    matches = list(_ROOT.glob(f"*/uprn_{uprn}"))
    if not matches:
        raise SystemExit(
            f"no sample dir for UPRN {uprn} under {_ROOT} — capture it first "
            f"with scripts/fetch_real_life_epc_sample.py {uprn}"
        )
    return matches[0]


def _gov_api_epc(epc_json: Path) -> EpcPropertyData:
    data = json.loads(epc_json.read_text())

    def _mock(*_a: object, **_k: object) -> httpx.Response:
        return httpx.Response(
            200, json={"data": data}, request=httpx.Request("GET", "x")
        )

    # Route the raw payload through the real mapper (httpx mocked, no network).
    with patch("httpx.get", side_effect=_mock):
        from infrastructure.epc_client.epc_client_service import EpcClientService

        return EpcClientService(auth_token="t").get_by_certificate_number("x")


def _elmhurst_printed_sap(worksheet_pdf: Path) -> Optional[int]:
    if not worksheet_pdf.exists():
        return None
    import fitz  # pymupdf

    text = "\n".join(p.get_text() for p in fitz.open(str(worksheet_pdf)))
    for line in text.splitlines():
        if "SAP rating" in line and "(258)" in line:
            # value sits immediately before the "(258)" line ref
            match = re.search(r"(\d+)\s*\(258\)", line)
            if match:
                return int(match.group(1))
    return None


def _deep_diff(a: Any, b: Any, prefix: str, out: list[str]) -> None:
    if dataclasses.is_dataclass(a) and dataclasses.is_dataclass(b):
        for f in dataclasses.fields(a):
            _deep_diff(getattr(a, f.name), getattr(b, f.name), f"{prefix}.{f.name}", out)
    elif isinstance(a, list) and isinstance(b, list):
        if len(a) != len(b):
            out.append(f" {prefix}: LEN {len(a)} vs {len(b)}")
        for i, (x, y) in enumerate(zip(a, b)):
            _deep_diff(x, y, f"{prefix}[{i}]", out)
    elif a != b:
        out.append(f" {prefix}: API={a!r} ELM={b!r}")
|
||||
|
||||
|
||||
def compare(uprn: str) -> None:
    sample = _find_sample_dir(uprn)
    print(f"=== {sample} ===")
    gov = _gov_api_epc(sample / "epc.json")

    summary = sample / "elmhurst_summary.pdf"
    elm: Optional[EpcPropertyData] = None
    if summary.exists():
        elm = parse_site_notes_pdf(str(summary))
    else:
        print(" (no elmhurst_summary.pdf yet — gov-API side only)")

    rg = Sap10Calculator().calculate(gov)
    print("\nOUR ENGINE:")
    print(
        f" gov-API inputs → SAP {rg.sap_score} ({rg.sap_score_continuous:.2f})"
        f" HW {rg.hot_water_kwh_per_yr:.0f} kWh cost £{rg.total_fuel_cost_gbp:.2f}"
    )
    if elm is not None:
        re_ = Sap10Calculator().calculate(elm)
        print(
            f" Elmhurst-PDF inputs → SAP {re_.sap_score} ({re_.sap_score_continuous:.2f})"
            f" HW {re_.hot_water_kwh_per_yr:.0f} kWh cost £{re_.total_fuel_cost_gbp:.2f}"
        )
        printed = _elmhurst_printed_sap(sample / "elmhurst_worksheet.pdf")
        if printed is not None:
            print(f" Elmhurst's OWN engine (worksheet 258): {printed}")
        diffs: list[str] = []
        _deep_diff(gov, elm, "epc", diffs)
        print(f"\nFIELD DIFFS gov-API vs Elmhurst ({len(diffs)}):")
        print("\n".join(diffs) if diffs else " (none — paths identical)")
|
||||
|
||||
|
||||
def main() -> None:
    if len(sys.argv) != 2:
        raise SystemExit(__doc__)
    compare(sys.argv[1])


if __name__ == "__main__":
    main()
|
||||
130
scripts/fetch_real_life_epc_sample.py
Normal file
|
|
@ -0,0 +1,130 @@
|
|||
"""Capture a real EPC certificate by UPRN for the SAP accuracy test suite.
|
||||
|
||||
Resolves a UPRN to its latest lodged certificate via the GOV.UK EPB
|
||||
register, downloads the full ``data`` payload (the exact shape
|
||||
``EpcPropertyDataMapper.from_api_response`` consumes), and freezes it
|
||||
under the schema-bucketed sample tree the accuracy test reads:
|
||||
|
||||
backend/epc_api/json_samples/real_life_examples/<schema_type>/uprn_<uprn>/epc.json
|
||||
|
||||
It also prints the lodged SAP rating and what ``Sap10Calculator``
|
||||
currently produces, so a new case can be added to
|
||||
``tests/domain/sap10_calculator/test_real_cert_sap_accuracy.py`` with
|
||||
the right ``schema`` / ``sap_score`` straight away.
|
||||
|
||||
USAGE
|
||||
-----
|
||||
PYTHONPATH=/workspaces/model python scripts/fetch_real_life_epc_sample.py <uprn> [<uprn> ...]
|
||||
|
||||
Token is read from ``backend/.env`` (``OPEN_EPC_API_TOKEN``, falling
|
||||
back to ``EPC_AUTH_TOKEN``). Re-running overwrites the sample.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import os
|
||||
import pathlib
|
||||
import sys
|
||||
from typing import Any
|
||||
|
||||
import httpx
|
||||
from dotenv import load_dotenv
|
||||
|
||||
_BASE = "https://api.get-energy-performance-data.communities.gov.uk"
|
||||
_SAMPLES_ROOT = pathlib.Path(
    "backend/epc_api/json_samples/real_life_examples"
)


def _headers() -> dict[str, str]:
    load_dotenv("backend/.env")
    token = os.environ.get("OPEN_EPC_API_TOKEN") or os.environ["EPC_AUTH_TOKEN"]
    return {"Authorization": f"Bearer {token}", "Accept": "application/json"}


def _latest_cert_number(uprn: int, headers: dict[str, str]) -> str:
    resp = httpx.get(
        f"{_BASE}/api/domestic/search",
        params={"uprn": uprn},
        headers=headers,
        timeout=30.0,
    )
    resp.raise_for_status()
    rows: list[dict[str, Any]] = resp.json().get("data", [])
    if not rows:
        raise SystemExit(f"UPRN {uprn}: no certificates found")
    latest = max(rows, key=lambda r: r["registrationDate"])
    return str(latest["certificateNumber"])


def _fetch_cert_data(cert_num: str, headers: dict[str, str]) -> dict[str, Any]:
    resp = httpx.get(
        f"{_BASE}/api/certificate",
        params={"certificate_number": cert_num},
        headers=headers,
        timeout=30.0,
    )
    resp.raise_for_status()
    data: dict[str, Any] = resp.json()["data"]
    return data
|
||||
|
||||
|
||||
def _report(uprn: int, cert_num: str, data: dict[str, Any]) -> None:
    """Print lodged rating + current calculator output for the captured cert."""
    from infrastructure.epc_client.epc_client_service import EpcClientService
    from domain.sap10_calculator.calculator import Sap10Calculator
    from unittest.mock import patch

    def _mock(*_a: object, **_k: object) -> httpx.Response:
        return httpx.Response(
            200, json={"data": data}, request=httpx.Request("GET", "x")
        )

    print(f" schema_type : {data.get('schema_type')}")
    print(f" lodged rating : {data.get('energy_rating_current')}")

    service = EpcClientService(auth_token="test-token")
    try:
        with patch("httpx.get", side_effect=_mock):
            epc = service.get_by_certificate_number(cert_num)
    except ValueError as exc:
        # Full-SAP (vs RdSAP) certs aren't supported by the mapper, so the
        # calculator front-end can't consume them. Captured for reference
        # but NOT addable to the RdSAP accuracy suite.
        print(f" NOT MAPPABLE : {exc}")
        return
    result = Sap10Calculator().calculate(epc)

    print(f" calc sap_score : {result.sap_score}")
    print(f" space_heating_kwh : {result.space_heating_kwh_per_yr:.4f}")
    print(f" main_heating_kwh : {result.main_heating_fuel_kwh_per_yr:.4f}")
    print(f" hot_water_kwh : {result.hot_water_kwh_per_yr:.4f}")
    print(f" co2_kg_per_yr : {result.co2_kg_per_yr:.4f}")
|
||||
|
||||
|
||||
def capture(uprn: int) -> None:
    headers = _headers()
    cert_num = _latest_cert_number(uprn, headers)
    data = _fetch_cert_data(cert_num, headers)

    schema_type = str(data.get("schema_type") or "unknown-schema")
    out_dir = _SAMPLES_ROOT / schema_type / f"uprn_{uprn}"
    out_dir.mkdir(parents=True, exist_ok=True)
    out = out_dir / "epc.json"
    out.write_text(json.dumps(data, indent=2))

    print(f"UPRN {uprn} -> cert {cert_num}")
    print(f" wrote : {out}")
    _report(uprn, cert_num, data)


def main() -> None:
    if len(sys.argv) < 2:
        raise SystemExit(__doc__)
    for arg in sys.argv[1:]:
        capture(int(arg))


if __name__ == "__main__":
    main()
|
||||
|
|
@ -107,25 +107,21 @@ _EXPECTATIONS: Final[tuple[RealCertExpectation, ...]] = (
|
|||
),
|
||||
# UPRN 10002468137 → cert 0215-2818-7357-9703-2145. RdSAP-Schema-17.1,
|
||||
# all-electric high-heat-retention storage heaters on Economy 7, solid-
|
||||
-    # brick uninsulated end-terrace. Ground truth is Elmhurst RdSAP10 = 60,
-    # reproduced on identical inputs (summary + full SAP 10.2 worksheet saved
-    # alongside: elmhurst_summary.pdf / elmhurst_worksheet.pdf). The engine
-    # produces 62 — a +2 over-rating localised to OFF-PEAK WATER HEATING:
-    # the worksheet (lines 243-246) prices the 7-hour off-peak immersion at a
-    # Table 13 split (19.36% @ 15.29p high + 80.64% @ 5.5p low), but the engine
-    # prices 100% at the 5.5p low rate, under-costing the bill (£595.68 vs
-    # £629.67) → lower ECF (2.69 vs 2.84) → SAP 62 not 60. (Space heating 100%
-    # off-peak IS correct for storage heaters — the worksheet agrees.) Strict
-    # xfail until the off-peak water-heating rate split is implemented.
+    # brick uninsulated end-terrace. Validated against Elmhurst RdSAP10 on
+    # identical (lodged) inputs: dual off-peak immersion, 110 L Normal cylinder,
+    # 2 baths → Elmhurst 61, our engine 60.92 (cost £620.38 vs Elmhurst £619.37
+    # — within £1; the residual is the 3.4 m² alt-wall the gov-API mapper drops).
+    # Evidence saved alongside: elmhurst_summary.pdf / elmhurst_worksheet.pdf.
+    # The +2 over-rating first seen (62) was closed by main's Table 13 off-peak
+    # water-heating fix (PR #1217) plus the reduced-field window-U fix (u_window
+    # all-None fallback → glazing-aware raw U, heat_transmission.py). Calculator
+    # confirmed exact: fed Elmhurst's own inputs it reproduces Elmhurst's cost
+    # to the penny. (lodged 55 is the old SAP-2012 schema — not comparable.)
     RealCertExpectation(
         schema="RdSAP-Schema-17.1",
         sample="uprn_10002468137",
         cert_num="0215-2818-7357-9703-2145",
-        sap_score=60,
-        known_bug_xfail=(
-            "off-peak (7-hour) water-heating high/low rate split not applied — "
-            "engine prices 100% at the low rate; see elmhurst_worksheet.pdf (243-246)"
-        ),
+        sap_score=61,
|
||||
),
|
||||
)
|
||||
|
||||
|
|
|
|||