mirror of https://github.com/Hestia-Homes/Model.git synced 2026-06-08 11:17:27 +00:00

History

Khalim Conn-Kowlessar 15613309df slice S-B24: parse measured U from full-SAP wall description Full SAP assessments (~15% of corpus, 4 403 of 30 000 scanned bulk-zip certs) lodge a measured/calculated wall U-value per BS EN ISO 6946 in walls[i].description, e.g. "Average thermal transmittance 0.18 W/m²K". These certs typically have wall_construction, wall_insulation_type and construction_age_band all None, which the cascade defaults previously resolved to U = 1.5 (uninsulated cavity at band E). RdSAP 10 §5.3: "U values are obtained from … the construction type, date of construction and, where applicable, thickness of additional insulation" — but a measured value supersedes the cascade. Corpus U-value distribution among parsed: median 0.21, mean 0.225, range 0.06-1.84 80% at U ≈ 0.2 (Part L-compliant new-builds) 10% at U ≈ 0.1 (passivhaus / very low) 7% at U ≈ 0.3 (older retrofitted full-SAP) 3% in the tail (conversions, edge cases) Per affected cert (100 m² new-build at U 1.5 → 0.21): walls_w_per_k drops 129 → 21 W/K PEUI drops ≈ 120 kWh/m² Implementation: - _measured_u_from_description() regex-parses the phrase from the wall description; returns None on no-match or non-numeric so the cascade fall-through is preserved. - u_wall checks the measured value FIRST, before any cascade logic. - No range cap — calculator mirrors what the assessor lodged, per the "deterministic except for input errors" principle. Parse failure falls through cleanly. Parity probe at 300 certs, seed=7: headlines unchanged. Direct check on the sample: 0/300 certs carry an "Average thermal transmittance" description. The v18a parquet filters full-SAP certs out somewhere upstream, so this slice is invisible in the parquet-based probe. The slice's correctness is proved by: - 4 unit tests in test_rdsap_uvalues.py (tracer + regression on ordinary descriptions + parse-failure fallback + filled-cavity description still routes correctly) - 1 end-to-end test in test_heat_transmission.py exercising a synthetic full-SAP cert through heat_transmission_from_cert - All 274 domain tests passing, no regressions Follow-up tooling: a bulk-zip-based parity probe that doesn't filter to the parquet's subset is needed to measure this slice's corpus impact. Separate dig. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>		2026-05-18 20:50:39 +00:00
..
domain	slice S-B24: parse measured U from full-SAP wall description	2026-05-18 20:50:39 +00:00
fetchers	added potential file scaffolding:	2026-05-15 10:56:53 +00:00
repos	added potential file scaffolding:	2026-05-15 10:56:53 +00:00
utils	added potential file scaffolding:	2026-05-15 10:56:53 +00:00
README.md	added potential file scaffolding:	2026-05-15 10:56:53 +00:00

README.md

Shared packages

Workspace packages consumed by services/*. Each package is its own Python distribution with its own pyproject.toml; services import via the workspace dependency mechanism ({ workspace = true }).

Package	Purpose
`domain/`	Shared domain types — `Property`, `BaselinePerformance`, `Plan`, `Scenario`, `EpcPropertyData`, etc. No persistence, no IO, no business logic.
`repos/`	Persistence layer — one repo per aggregate. Owns the SQL. Depends on `domain`.
`fetchers/`	External API clients (gov EPC, Ofgem, Google Solar, etc.). Depend on `domain` for response shapes.
`utils/`	Cross-cutting infra — logging, S3, CloudWatch URL builders, SQS task helpers.

Adding a new shared package

Only when a real second consumer materialises. Don't pre-shatter (repos-epc, repos-property, ...) — split when a deployment needs to drop a dep, not before.

See ../ara_backend_design.md §11 for the broader monorepo layout and ../CONTEXT.md for the domain glossary that names the types living in domain/.