Closes the cohort-2 API-path +0.42..+0.44 cluster (certs 0300/9380
closed to <1e-4; cert 1536 partially closed +0.4445 → +0.0015 — a
sub-2e-3 secondary tail remains for Slice S0380.42).
Root cause: per `datatypes/epc/domain/epc_codes.csv` the GOV.UK API
schema RdSAP-Schema-21.0.0 defines `glazed_type=1` as "double glazing
installed before 2002 in EAW, 2003 in SCT, 2006 NI". Three cohort-2
certs (0300/1536/9380) lodge this code with `glazing_gap=16+` and
description "Fully double glazed" — but the API mapper passed the
raw code straight through to SapWindow.glazing_type, and:
1. `_api_glazing_transmission` had no (1, "16+") entry, so the
U-value lookup returned None and the cascade defaulted to U=2.5
instead of the spec-correct U=2.7 (RdSAP 10 Table 24 row 2,
PVC/wooden frame, 16+ gap = 2.7).
2. The cascade's `_G_LIGHT_BY_GLAZING_CODE` table is keyed on the
SAP 10.2 Table 6b enum (the Elmhurst extractor produces this
enum via `_ELMHURST_GLAZING_LABEL_TO_SAP10`), where code 1 means
"single glazed" (g_L=0.90). Passing RdSAP 21 code 1 straight
through therefore gave the cascade the wrong g_L for the daylight
factor calculation: 0.90 instead of the spec value 0.80.
Both gaps closed in one slice because they're the same misinterpretation:
- `_API_GLAZING_TYPE_TO_TRANSMISSION` and
`_API_GLAZING_TYPE_GAP_TO_TRANSMISSION` now alias code 1 as a schema
sibling of code 3; both resolve to RdSAP 10 Table 24 row 2 ("DG
pre-2002 / unknown install date"). Per-gap entries cover the full
6mm=3.1 / 12mm=2.8 / 16+=2.7 row; type-only fallback uses the 12mm
default U=2.8.
- New `_API_TO_SAP10_CASCADE_GLAZING_CODE = {1: 2}` remap is applied
in `_api_sap_window` AFTER the U-value lookup, so
`SapWindow.glazing_type` carries the SAP 10.2 cascade enum (code 2 =
DG pre-2002 air-filled, g_L=0.80) while the U lookup stays keyed on
the raw GOV.UK API code. The cohort-1 codes 2/3/13/14 already coincide
with the cascade table's intended SAP 10.2 g_L values, so no remap
entry is required for them; only divergent codes get a remap.
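The ordering described above (U-value lookup on the raw API code first, cascade remap second) can be sketched as follows. This is a minimal illustration, not the repository's implementation: the table and function names are hypothetical stand-ins, the gap keys ("6", "12", "16+") are an assumed encoding, and only the values quoted in this message are filled in.

```python
from typing import Dict, Optional, Tuple

# Hypothetical stand-ins for the tables named in this commit message.
# RdSAP 10 Table 24 row 2 U-values per glazing gap, as quoted above.
# The gap key strings are an assumption for illustration.
ROW_2_U_BY_GAP: Dict[str, float] = {"6": 3.1, "12": 2.8, "16+": 2.7}

# API codes 1 and 3 are treated as siblings that both resolve to row 2.
U_BY_TYPE_AND_GAP: Dict[Tuple[int, str], float] = {
    (code, gap): u for code in (1, 3) for gap, u in ROW_2_U_BY_GAP.items()
}
# Type-only fallback uses the 12mm default.
U_BY_TYPE: Dict[int, float] = {1: 2.8, 3: 2.8}

# Only codes whose meaning differs between the two enums get a remap entry.
API_TO_CASCADE_CODE: Dict[int, int] = {1: 2}

# Cascade-side light transmittance for the two codes discussed above.
G_LIGHT_BY_CASCADE_CODE: Dict[int, float] = {1: 0.90, 2: 0.80}


def resolve_window(api_code: int, gap: Optional[str]) -> Tuple[Optional[float], int]:
    """Return (U-value, cascade glazing code) for a raw API glazing code."""
    # Step 1: the U-value lookup is keyed on the RAW API code.
    u_value = None
    if gap is not None:
        u_value = U_BY_TYPE_AND_GAP.get((api_code, gap))
    if u_value is None:
        u_value = U_BY_TYPE.get(api_code)
    # Step 2: remap to the cascade enum only AFTER the U lookup.
    cascade_code = API_TO_CASCADE_CODE.get(api_code, api_code)
    return u_value, cascade_code


u, code = resolve_window(1, "16+")
print(u, code, G_LIGHT_BY_CASCADE_CODE[code])
```

Remapping before the lookup would send code 1 to the U table as code 2, which is why the order matters in this sketch.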
Test impact:
- Cohort-2 API path: 34/38 → 36/38 at 1e-4 (0300 at +4.8e-5 and 9380
at -5e-6; both move from _COHORT_2_API_OPEN to _COHORT_2_API_CLOSED).
- Cert 1536 pin updated from 66.337334 to 65.894324; ws Δ now +0.0015
(was +0.4445). The same root-cause fix accounts for most of the gap;
the residual tail has a distinct cause and is left for the next slice.
- Cert 2102 unchanged (-6.30 residual, secondary-heating routing gap).
- Cohort-1 (9 ASHP certs) unaffected: 9/9 still < 1e-4 on both paths.
Test suite: 750 pass + 0 fail. Pyright net-zero per touched file.
Spec citations:
- RdSAP-Schema-21.0.0 glazed_type=1 → datatypes/epc/domain/epc_codes.csv
- RdSAP 10 Specification §8.2 Table 24 (p.49) row 2 "Double glazed:
Installed England/Wales before 2002 / Scotland before 2003 /
N. Ireland before 2006" — U=2.7 (PVC/wooden, 16+ gap).
- SAP 10.2 Table 6b: DG air-filled g_L=0.80 (vs single 0.90).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Model Repository
This repository contains the code for the data science and machine learning products used by Hestia.
The folders in this repository correspond to services that can be used independently or imported as part of a larger application.
Getting Started
Prerequisites
Dev Container Setup
This repo uses a Docker Compose-based dev container. The model-backend service joins a shared-dev Docker network so it can communicate with other local services (e.g. a frontend container) running on your machine.
VS Code users: The initializeCommand in devcontainer.json creates the shared-dev network automatically before the container starts. No manual step required — just open the repo and select Reopen in Container.
Non-VS Code / CI workflows: Run the following once before starting the container:
make dev-setup
This is idempotent and safe to re-run if the network already exists.
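For readers curious what an idempotent network setup can look like, here is a minimal Python sketch. It is an assumption about the general approach, not the contents of the `dev-setup` Makefile target: the function name `ensure_network` and the injectable `run` parameter are hypothetical, and the only external commands used are the standard `docker network inspect` and `docker network create`.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes a command and returns its exit code. It is injectable so the
# logic can be exercised without Docker installed.
Runner = Callable[[Sequence[str]], int]


def _run(cmd: Sequence[str]) -> int:
    """Run a command quietly and return its exit code."""
    completed = subprocess.run(
        list(cmd), stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return completed.returncode


def ensure_network(name: str = "shared-dev", run: Runner = _run) -> bool:
    """Create the Docker network if it is missing.

    Returns True if the network was created, False if it already existed.
    """
    # `docker network inspect` exits non-zero when the network does not exist.
    if run(["docker", "network", "inspect", name]) == 0:
        return False
    if run(["docker", "network", "create", name]) != 0:
        raise RuntimeError(f"could not create Docker network {name!r}")
    return True
```

Calling `ensure_network()` a second time returns `False` without attempting another create, which is the behaviour "safe to re-run" describes.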
Folders
backend/
This folder contains the code for the FastAPI backend service, which gives the frontend an interface to much of the functionality in this repository.
model_data/
This folder contains code related to reading and preparing assessment model data, including extracting EPC attributes.
Testing
All tests can be run against the configuration in pytest.ini by running
pytest
This will run the complete panel of tests and report on coverage in the locations specified by the pytest.ini file.
To run the tests for a specific service, e.g. model_data, run
pytest --cov-config=model_data/.coveragerc --cov=model_data
This will produce the test results and coverage reports for that service.