Two compounding bugs were over-counting the SAP 10.2 §4 (56)m cylinder
storage loss by ~76 kWh/yr across all 17 cylinder-with-immersion
corpus variants (cascade HW kWh 2460.40 vs worksheet 2384.12):
(1) **Extractor gap.** Elmhurst Summary §15.1 "Hot Water Cylinder"
block lodges `Cylinder Size` / `Insulation Thickness` but NOT
`Cylinder Thermostat`. The thermostat is lodged separately in
§16 "Recommendations" as `Cylinder thermostat (Already installed)`.
The extractor only searched §15.1, so `cylinder_thermostat`
resolved to None for every variant on property 001431. The
cascade then defaulted `has_cylinder_thermostat=False`, applying
SAP 10.2 Table 2b's ×1.3 "no thermostat" multiplier.
(2) **Cascade spec gap.** `_separately_timed_dhw` returned True for
any cylinder-lodged cert regardless of HW fuel. Per SAP 10.2
Table 2b note b) (PDF p.159):

> "Multiply Temperature Factor by 0.9 if there is separate time
> control of domestic hot water (boiler systems, warm air systems
> and heat pump systems)"

Electric immersion is NOT in the bracketed list: the ×0.9 reduction
is restricted to boiler / warm-air / HP systems. Pre-slice, the
cascade over-applied ×0.9 on electric-immersion certs.
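
For the extractor gap in (1), a fallback lookup might look like the sketch below. It is illustrative only: the function name, the idea of passing each section as a plain string, and the `Cylinder Thermostat: Yes/No` field format are assumptions, not the repository's actual extractor. Only the §16 label `Cylinder thermostat (Already installed)` is taken from the description above.

```python
import re
from typing import Optional


def extract_cylinder_thermostat(section_15_1: str, section_16: str) -> Optional[bool]:
    """Return True/False when the thermostat is lodged, None when it is not found.

    Hypothetical helper: both arguments are the raw text of one summary section.
    """
    # Primary location (assumed field format): an explicit entry in section 15.1.
    match = re.search(
        r"Cylinder Thermostat\s*:\s*(Yes|No)", section_15_1, re.IGNORECASE
    )
    if match:
        return match.group(1).lower() == "yes"

    # Fallback: section 16 lists the thermostat as an already-installed measure.
    if re.search(
        r"Cylinder thermostat\s*\(Already installed\)", section_16, re.IGNORECASE
    ):
        return True

    # Not lodged anywhere we know about: leave the decision to the caller.
    return None


# Made-up sample text, shaped like the blocks described above.
sample_15_1 = "Hot Water Cylinder\nCylinder Size: Medium\nInsulation Thickness: 50 mm"
sample_16 = "Recommendations\nCylinder thermostat (Already installed)"
found = extract_cylinder_thermostat(sample_15_1, sample_16)
```

Returning `None` rather than `False` when nothing is found keeps "not lodged" distinct from "lodged as absent", so a downstream default stays visible instead of being applied silently.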
Combined, the cascade computed TF = 0.60 × 1.3 × 0.9 = 0.702 vs the
worksheet's TF = 0.60 (base — thermostat present, immersion exempt).
After both fixes the cascade HW kWh matches the worksheet's (64) at
1e-3 precision (2384.116 vs 2384.12).
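
The temperature-factor arithmetic can be reproduced in a few lines. This is a minimal sketch under stated assumptions: the function names, the system-kind labels, and the predicate's signature are hypothetical and do not mirror the cascade's real code. Only the 0.60 base and the ×1.3 / ×0.9 multipliers come from the figures quoted above.

```python
from typing import Optional

# Table 2b multipliers as quoted in this commit message.
NO_THERMOSTAT_FACTOR = 1.3
SEPARATE_TIMING_FACTOR = 0.9

# Hypothetical labels for the system kinds listed in note b).
SEPARATELY_TIMED_SYSTEMS = {"boiler", "warm_air", "heat_pump"}


def separately_timed_dhw(has_cylinder: bool, hw_system: Optional[str]) -> bool:
    """Apply the x0.9 reduction only to the system kinds named in note b)."""
    return has_cylinder and hw_system in SEPARATELY_TIMED_SYSTEMS


def temperature_factor(
    base: float, has_cylinder_thermostat: bool, separately_timed: bool
) -> float:
    """Scale the base temperature factor by the two Table 2b adjustments."""
    tf = base
    if not has_cylinder_thermostat:
        tf *= NO_THERMOSTAT_FACTOR
    if separately_timed:
        tf *= SEPARATE_TIMING_FACTOR
    return tf


# Before the fixes: thermostat defaulted to absent, x0.9 applied to every cylinder.
tf_before = temperature_factor(0.60, has_cylinder_thermostat=False, separately_timed=True)

# After the fixes: thermostat found, electric immersion excluded from the x0.9 rule.
tf_after = temperature_factor(
    0.60,
    has_cylinder_thermostat=True,
    separately_timed=separately_timed_dhw(True, "electric_immersion"),
)

print(round(tf_before, 3), round(tf_after, 3))
```

Because the storage loss scales linearly with the temperature factor, the ratio `tf_before / tf_after` (1.17) is the factor by which the loss was overstated before the fixes.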
Corpus impact (16 cylinder-with-immersion variants on 18-hour meter):

| variant | SAP_c residual (before → after) | Cost shift |
|--------------|------------:|-----------:|
| electric 1 | -0.20 → -0.06 | -£3.34 |
| electric 2 | -1.27 → +0.47 | -£4.44 |
| electric 3 | +2.42 → +2.55 | -£2.91 |
| electric 5 | -0.06 → +0.07 | -£3.06 |
| electric 6 | +1.19 → +1.33 | -£3.20 |
| electric 7 | +1.14 → +1.29 | -£3.35 |
| electric 8 | -0.41 → -0.26 | -£3.50 |
| electric 9 | -0.24 → -0.12 | -£2.91 |
| solid fuel 4-11 | -0.45..-0.09 → -0.29..+0.10 | -£3 to -£4 |
The HW kWh line closes cleanly; some SAP residuals sign-flip slightly
because the cascade's now-correct HW kWh exposes the SH+Sec demand
mismatch for storage heaters (electric 3/6/7 — open driver is the
Table 11 `main_heating_category=None` default for codes 401/402,
queued for a mapper-side slice).
Tests:
- new AAA test `test_separately_timed_dhw_excludes_electric_immersion_per_table_2b_note_b`
- 16 corpus pins re-tightened (8 electric + 8 solid fuel)
Extended handover suite: 883 pass (was 882; +1 new test), 0 fail.
Pyright net-zero on touched files (43 → 43 errors, all pre-existing).
Per [[feedback-spec-citation-in-commits]] +
[[feedback-spec-floor-skepticism]] (the "HW +76 kWh uniform overcount"
across 17 variants traced to TWO spec-citable defaults the cascade
was getting wrong, not a precision floor).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

| Name | | |
|---|---|---|
| .devcontainer | ||
| .github/workflows | ||
| .idea | ||
| .vscode | ||
| applications | ||
| asset_list | ||
| backend | ||
| backlog | ||
| datatypes | ||
| deployment/terraform | ||
| docs/adr | ||
| domain | ||
| epr_data_exports | ||
| etl | ||
| infrastructure | ||
| model_data/requirements | ||
| orchestration | ||
| recommendations | ||
| repositories | ||
| scripts | ||
| sfr/principal_pitch | ||
| survey_report | ||
| tests | ||
| utilities | ||
| utils | ||
| .coveragerc | ||
| .dockerignore | ||
| .gitignore | ||
| __init__.py | ||
| ara_backend_design.md | ||
| BaseUtility.py | ||
| CLAUDE.md | ||
| conftest.py | ||
| CONTEXT.md | ||
| devcontainer.sh | ||
| Dockerfile.test | ||
| Dockerfile.test.dockerignore | ||
| Makefile | ||
| MEMORY.md | ||
| package-lock.json | ||
| package.json | ||
| pyproject.toml | ||
| pyrightconfig.json | ||
| pytest.ini | ||
| README.md | ||
| run_lambda_local.sh | ||
| serverless.yml | ||
| test.requirements.txt | ||
| tox.ini | ||
| UBIQUITOUS_LANGUAGE.md | ||
Model Repository
This repository contains the code pertaining to the development of the data science and machine learning products being utilised by Hestia.
The folders in this repository relate to services that can be used independently or imported and used as part of a larger application.
Getting Started
Prerequisites
Dev Container Setup
This repo uses a Docker Compose-based dev container. The model-backend service joins a shared-dev Docker network so it can communicate with other local services (e.g. a frontend container) running on your machine.
VS Code users: The initializeCommand in devcontainer.json creates the shared-dev network automatically before the container starts. No manual step required — just open the repo and select Reopen in Container.
Non-VS Code / CI workflows: Run the following once before starting the container:
make dev-setup
This is idempotent and safe to re-run if the network already exists.
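
As an illustration of what an idempotent network setup step could look like, here is a hedged Python sketch. The Makefile target itself is not shown in this README, so everything here is an assumption: the function names are hypothetical, and the approach (inspect first, create only if missing) is simply one common way to get this behaviour using the standard `docker network inspect` and `docker network create` commands.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes a command and returns its exit code; injectable for testing.
Runner = Callable[[Sequence[str]], int]


def _run(cmd: Sequence[str]) -> int:
    """Run a command quietly and return its exit code."""
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode


def ensure_network(name: str = "shared-dev", run: Runner = _run) -> bool:
    """Create the Docker network only if it is missing.

    Returns True when the network was created, False when it already existed.
    """
    # `docker network inspect` exits with 0 only when the network exists.
    if run(["docker", "network", "inspect", name]) == 0:
        return False
    if run(["docker", "network", "create", name]) != 0:
        raise RuntimeError(f"could not create Docker network {name!r}")
    return True
```

Calling `ensure_network()` twice in a row creates the network on the first call and does nothing on the second, which is the re-run safety described above.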
Folders
backend/
This folder contains the code for the FastAPI backend service, which gives the frontend an interface to much of the functionality in this repository.
model_data/
This folder contains code related to reading and preparing assessment model data, including extracting EPC attributes.
Testing
All tests can be run against the configuration in pytest.ini by running:
pytest
This will run the complete test suite and report on coverage in the locations specified by the pytest.ini file.
To run tests for a specific service, e.g. model_data, run:
pytest --cov-config=model_data/.coveragerc --cov=model_data
This will produce the test results and coverage reports for that service.