Two coupled fixes that together close the +903 kWh (45)m
energy-content over-count on cert 000565. Splitting them would
flip sap_score from 29 → 30 mid-fix; bundled they keep cert 000565
within rounding of the worksheet (continuous SAP residual closes
17×, from Δ +0.60 to Δ −0.035).
## 1. Elmhurst extractor — §1x.0 section-bounded "Connected" lookup
`_extract_baths_and_showers` was anchoring on the FIRST "Connected"
substring in the document via `self._lines.index("Connected")`.
Cert 000565 (4 extensions) has "Connected" appearing earlier as a
§3 building-parts wall elevation flag, so the global match landed
on a wall row; the digit-check at `num_line.isdigit()` failed
immediately on the "0.00" wall length and the shower roster came
back empty.
Both `1x.0 Baths and Showers` and `18.0 Flue Gas Heat Recovery
System` are single-occurrence section anchors in the Elmhurst
Summary PDF. Routing the "Connected" lookup through `_section_
lines(...)` bounds the search to the §1x.0 block, so multi-
extension certs no longer lose the shower roster.
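
A minimal sketch of the difference between the two lookups, on a toy line list rather than real Elmhurst output. The `section_lines` helper below is a hypothetical stand-in for `_section_lines(...)` (its real signature is not shown in this message), and the line contents are illustrative assumptions.

```python
def section_lines(lines: list[str], start: str, end: str) -> list[str]:
    """Return the lines strictly between two single-occurrence anchors.

    Hypothetical stand-in for the extractor's `_section_lines(...)`.
    """
    begin = lines.index(start) + 1
    stop = lines.index(end, begin)
    return lines[begin:stop]


# Illustrative document: a §3 wall row carries "Connected" before §1x.0 does.
doc = [
    "3.0 Building Parts",
    "Connected",
    "0.00",
    "1x.0 Baths and Showers",
    "Connected",
    "1",
    "Electric shower",
    "18.0 Flue Gas Heat Recovery System",
]

# Global lookup: lands on the wall row, so the digit check fails.
global_idx = doc.index("Connected")
global_ok = doc[global_idx + 1].isdigit()

# Section-bounded lookup: only the §1x.0 block is searched.
block = section_lines(
    doc, "1x.0 Baths and Showers", "18.0 Flue Gas Heat Recovery System"
)
bounded_idx = block.index("Connected")
bounded_ok = block[bounded_idx + 1].isdigit()
```

With the bounded search, the line after "Connected" is a shower count rather than a wall length, so the roster parse can proceed.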
## 2. SAP 10.2 §10a line (247a) — electric shower cost in fallback path
SAP 10.2 §10a (PDF p.145) worksheet line (247a):
    Energy for instantaneous electric shower(s)
        (64a) × fuel price × 0.01 = (247a)
    Total energy cost
        (240)...(242) + (245)...(254) = (255)
Electric showers route their (64a) kWh through the "other fuel"
tariff (same column as pumps/fans (249) and lighting (250)) and
add to (255) total cost.
`calculator.py:415-470` STANDARD-tariff path consumes
`FuelCostResult` from `fuel_cost(...)` which already plumbs
`instant_shower_cost_gbp` (worksheet/fuel_cost.py:214). The
fallback scalar path at `calculator.py:489-530` (TEN_HOUR /
off-peak / zero-FuelCostResult certs) was missing the electric-
shower term entirely. Cert 000565 (Dual-meter TEN_HOUR + 1
electric shower) trips this branch — fix #1 surfaced the
£93/yr under-count and the sap_score regression that followed.
Fix: add
    electric_shower_cost = (
        inputs.electric_shower_kwh_per_yr
        * inputs.other_fuel_cost_gbp_per_kwh
    )
into the `total_cost = max(0, ...)` sum, parallel to the existing
`electric_shower_co2` and `electric_shower_pe` flows already
present in the CO2 (line 552) and PE (line 619) sections.
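
A rough sketch of how the fallback sum behaves with the shower term included. `FallbackInputs` and `fallback_total_cost` are hypothetical names used only for illustration, `other_costs_gbp` stands in for every term the real sum already contained, and the tariff is assumed to be held in GBP per kWh (so the worksheet's 0.01 pence-to-pounds factor is already folded in).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FallbackInputs:
    """Hypothetical subset of the calculator inputs used by the fallback path."""

    other_costs_gbp: float  # all cost terms that were already being summed
    electric_shower_kwh_per_yr: float  # worksheet line (64a)
    other_fuel_cost_gbp_per_kwh: float  # "other fuel" tariff, GBP per kWh


def fallback_total_cost(inputs: FallbackInputs) -> float:
    # Line (247a): shower kWh priced at the "other fuel" tariff.
    electric_shower_cost = (
        inputs.electric_shower_kwh_per_yr * inputs.other_fuel_cost_gbp_per_kwh
    )
    # Clamp at zero, mirroring the `total_cost = max(0, ...)` form.
    return max(0.0, inputs.other_costs_gbp + electric_shower_cost)


# Illustrative numbers only, not taken from cert 000565.
without_shower = fallback_total_cost(FallbackInputs(1000.0, 0.0, 0.2))
with_shower = fallback_total_cost(FallbackInputs(1000.0, 500.0, 0.2))
```

The difference between the two results is exactly `kwh × other_fuel_cost`, which is the property the new calculator test pins.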
## Why bundled
SAP 10.2 Appendix J §J2 step 2a (PDF p.81) routes baths via
`N_bath = 0.13 N + 0.19` when a shower is present, `0.35 N + 0.50`
when no shower is present — a 2.67× swing in (42b)m that
compounds into (45)m energy content. The extractor fix closes
(45)m to EXACT (1286.3266 = 1286.3266 ✓), but the cascade's
electric-shower kWh stream becomes load-bearing for cost — and
the fallback path was silently dropping it. Without fix #2,
sap_score regressed from 29 → 30 (cost too low → ECF too low →
SAP rating too high).
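
To make the size of the bath-count swing concrete, the two formulas quoted above can be evaluated side by side. The occupancy `N = 2.0` is an illustrative assumption, not the occupancy of cert 000565, and `n_bath` is a hypothetical helper name.

```python
def n_bath(occupancy: float, shower_present: bool) -> float:
    """Daily bath count from the two Appendix J formulas quoted above."""
    if shower_present:
        return 0.13 * occupancy + 0.19
    return 0.35 * occupancy + 0.50


N = 2.0  # illustrative occupancy, assumed for this sketch
baths_with_shower = n_bath(N, shower_present=True)
baths_without_shower = n_bath(N, shower_present=False)

# Ratio of the "no shower" bath count to the "shower present" bath count.
ratio = baths_without_shower / baths_with_shower
```

The ratio depends on N: it sits between 0.50/0.19 (about 2.63) at N = 0 and 0.35/0.13 (about 2.69) as N grows, so a missed shower inflates the bath count by roughly the same factor for any occupancy.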
## Cert 000565 movements at HEAD (post-S0380.77 → post-this slice)
| Field | Pre-slice | Post-slice | Worksheet | Pre-Δ | Post-Δ |
|----------------------|----------:|------------:|-----------:|--------:|--------:|
| sap_score | 29 | 28 | 29 | 0 | −1 |
| sap_score_continuous | 29.1090 | 28.4735 | 28.5087 | +0.60 | **−0.035** |
| ecf | 5.3256 | 5.3904 | 5.3866 | −0.06 | **+0.004** |
| total_fuel_cost_gbp | 4627.10 | 4683.39 | 4680.26 | −53.16 | **+3.13** |
| co2_kg | 6616.0 | 6480.6 | 6447.6 | +168.4 | +32.94 |
| hot_water_kwh | 5154.0 | 4014.6 | 3755.0 | +1399 | +259.6 |
| space_heating_kwh | 58725.8 | 58793.0 | 59008.4 | −282.6 | −215.4 |
| main_heating_fuel | 34544.6 | 34584.1 | 34710.8 | −166.2 | −126.7 |
| (45)m sum | 2189.38 | **1286.33**| 1286.3266 | +903 | 0 |
The integer sap_score = 28 vs worksheet = 29 is a rounding-
boundary artifact: continuous SAP at 28.4735 rounds DOWN, just
0.0265 below the 28.5 threshold (and 0.035 below the worksheet's
28.5087). The remaining +259 kWh HW pin
over-count traces to the still-open (56)m storage loss over-count
+ missing (57)m solar-storage adjustment (slice C per the
handover) — closing that pulls continuous SAP back above 28.5 and
restores integer 29.
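
The rounding-boundary effect can be checked with a few lines of arithmetic on the table values. This sketch assumes the integer rating is the continuous value rounded to the nearest whole number with halves rounding up; `to_integer_sap` is a hypothetical helper, not the calculator's own function.

```python
import math


def to_integer_sap(continuous: float) -> int:
    """Round to the nearest integer, halves up (assumed rounding rule)."""
    return math.floor(continuous + 0.5)


post_slice = 28.4735  # sap_score_continuous after this slice
worksheet = 28.5087  # worksheet value from the table

post_slice_int = to_integer_sap(post_slice)
worksheet_int = to_integer_sap(worksheet)

# Distance from the post-slice value to the rounding threshold and to the worksheet.
gap_to_threshold = 28.5 - post_slice
gap_to_worksheet = worksheet - post_slice
```

The two continuous values sit on opposite sides of 28.5, which is why a residual of a few hundredths of a point still shows up as a one-point integer difference.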
## Tests
- `test_summary_000565_extractor_finds_electric_shower_in_section_1x_0`
(test_summary_pdf_mapper_chain.py) — pins extractor finds the
Electric shower in §1x.0 even with §3 building-parts "Connected"
collisions earlier in the document.
- `test_total_fuel_cost_includes_247a_electric_shower_in_fallback_path`
(test_calculator.py) — pins `total_fuel_cost_gbp` rises by
exactly `kwh × other_fuel_cost` when `electric_shower_kwh_per_yr`
is non-zero in the fallback path.
Test baseline: 547 → 570 pass (+23: 3 new tests across the 4 modified
files plus indirect knock-ons in golden fixtures); 9 → 10 expected
`test_sap_result_pin[000565-*]` fails (now includes the integer
`sap_score` until slice C closes the remaining +259 kWh HW
residual). Pyright net-zero on all 4 touched files (50 baseline =
50 after).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
| Name |
|---|
| .devcontainer |
| .github/workflows |
| .idea |
| .vscode |
| applications |
| asset_list |
| backend |
| backlog |
| datatypes |
| deployment/terraform |
| docs/adr |
| domain |
| epr_data_exports |
| etl |
| infrastructure |
| model_data/requirements |
| orchestration |
| recommendations |
| repositories |
| scripts |
| sfr/principal_pitch |
| survey_report |
| tests |
| utilities |
| utils |
| .coveragerc |
| .dockerignore |
| .gitignore |
| __init__.py |
| ara_backend_design.md |
| BaseUtility.py |
| CLAUDE.md |
| conftest.py |
| CONTEXT.md |
| devcontainer.sh |
| Dockerfile.test |
| Dockerfile.test.dockerignore |
| Makefile |
| MEMORY.md |
| package-lock.json |
| package.json |
| pyproject.toml |
| pyrightconfig.json |
| pytest.ini |
| README.md |
| run_lambda_local.sh |
| serverless.yml |
| test.requirements.txt |
| tox.ini |
| UBIQUITOUS_LANGUAGE.md |
# Model Repository
This repository contains the code for the data science and machine learning products used by Hestia.
The folders in this repository correspond to services that can be used independently, or imported and used as part of a larger application.
## Getting Started
### Prerequisites
### Dev Container Setup
This repo uses a Docker Compose-based dev container. The `model-backend` service joins a `shared-dev` Docker network so it can communicate with other local services (e.g. a frontend container) running on your machine.
VS Code users: The `initializeCommand` in `devcontainer.json` creates the `shared-dev` network automatically before the container starts. No manual step is required; just open the repo and select Reopen in Container.
Non-VS Code / CI workflows: Run the following once before starting the container:

    make dev-setup

This is idempotent and safe to re-run if the network already exists.
## Folders
### backend/
This folder contains the code for the FastAPI backend service, which gives the frontend an interface to much of the functionality in this repository.
### model_data/
This folder contains code related to reading and preparing assessment model data, including extracting EPC attributes.
## Testing
All tests can be run against the configuration in `pytest.ini` by running

    pytest

This runs the complete test suite and reports coverage for the locations specified in `pytest.ini`.
To run the tests for a specific service, e.g. `model_data`, run

    pytest --cov-config=model_data/.coveragerc --cov=model_data

This produces the test results and coverage report for that service.