mirror of
https://github.com/Hestia-Homes/Model.git
synced 2026-06-08 11:17:27 +00:00
Full SAP assessments (~15% of corpus, 4 403 of 30 000 scanned bulk-zip certs) lodge a measured/calculated wall U-value per BS EN ISO 6946 in walls[i].description, e.g. "Average thermal transmittance 0.18 W/m²K". These certs typically have wall_construction, wall_insulation_type and construction_age_band all None, which the cascade defaults previously resolved to U = 1.5 (uninsulated cavity at band E).

RdSAP 10 §5.3: "U values are obtained from … the construction type, date of construction and, where applicable, thickness of additional insulation", but a measured value supersedes the cascade.

Corpus U-value distribution among parsed: median 0.21, mean 0.225, range 0.06-1.84
- 80% at U ≈ 0.2 (Part L-compliant new-builds)
- 10% at U ≈ 0.1 (passivhaus / very low)
- 7% at U ≈ 0.3 (older retrofitted full-SAP)
- 3% in the tail (conversions, edge cases)

Per affected cert (100 m² new-build at U 1.5 → 0.21):
- walls_w_per_k drops 129 → 21 W/K
- PEUI drops ≈ 120 kWh/m²

Implementation:
- _measured_u_from_description() regex-parses the phrase from the wall description; returns None on no-match or non-numeric so the cascade fall-through is preserved.
- u_wall checks the measured value FIRST, before any cascade logic.
- No range cap: the calculator mirrors what the assessor lodged, per the "deterministic except for input errors" principle. Parse failure falls through cleanly.

Parity probe at 300 certs, seed=7: headlines unchanged. Direct check on the sample: 0/300 certs carry an "Average thermal transmittance" description. The v18a parquet filters full-SAP certs out somewhere upstream, so this slice is invisible in the parquet-based probe.
The slice's correctness is proved by:
- 4 unit tests in test_rdsap_uvalues.py (tracer + regression on ordinary descriptions + parse-failure fallback + filled-cavity description still routes correctly)
- 1 end-to-end test in test_heat_transmission.py exercising a synthetic full-SAP cert through heat_transmission_from_cert
- All 274 domain tests passing, no regressions

Follow-up tooling: a bulk-zip-based parity probe that doesn't filter to the parquet's subset is needed to measure this slice's corpus impact. Separate dig.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Shared packages
Workspace packages consumed by services/*. Each package is its own Python distribution with its own pyproject.toml; services import via the workspace dependency mechanism ({ workspace = true }).
| Package | Purpose |
|---|---|
| domain/ | Shared domain types: Property, BaselinePerformance, Plan, Scenario, EpcPropertyData, etc. No persistence, no IO, no business logic. |
| repos/ | Persistence layer: one repo per aggregate. Owns the SQL. Depends on domain. |
| fetchers/ | External API clients (gov EPC, Ofgem, Google Solar, etc.). Depend on domain for response shapes. |
| utils/ | Cross-cutting infra: logging, S3, CloudWatch URL builders, SQS task helpers. |
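
The split between domain/ and repos/ in the table above might look roughly like the sketch below. It is illustrative only: the `Property` fields, the `PropertyRepo` class, the table schema and the use of SQLite are all assumptions made for the example, not the packages' actual contents.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


# domain/: a plain value type with no persistence or IO.
# The field names are hypothetical.
@dataclass(frozen=True)
class Property:
    uprn: str
    postcode: str


# repos/: one repo per aggregate; the SQL lives here and depends on domain.
class PropertyRepo:
    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS property "
            "(uprn TEXT PRIMARY KEY, postcode TEXT)"
        )

    def save(self, prop: Property) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO property (uprn, postcode) VALUES (?, ?)",
            (prop.uprn, prop.postcode),
        )

    def get(self, uprn: str) -> Optional[Property]:
        row = self._conn.execute(
            "SELECT uprn, postcode FROM property WHERE uprn = ?", (uprn,)
        ).fetchone()
        return Property(*row) if row else None
```

The dependency runs one way only: the repo imports the domain type, while the domain type knows nothing about storage.
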
Adding a new shared package
Only when a real second consumer materialises. Don't pre-shatter (repos-epc, repos-property, ...) — split when a deployment needs to drop a dep, not before.
See ../ara_backend_design.md §11 for the broader monorepo layout and ../CONTEXT.md for the domain glossary that names the types living in domain/.