Model

mirror of https://github.com/Hestia-Homes/Model.git synced 2026-06-30 13:10:47 +00:00

Author	SHA1	Message	Date
Daniel Roth	03dc0a3eef	add local handler and missing requirement	2026-06-15 15:03:07 +00:00
Khalim Conn-Kowlessar	1f26703dc5	feat(epc-prediction): geo-proximity weighting, per-component (#1227) Folds a haversine distance kernel into the categorical-mode weighting so a nearer neighbour counts for more, applied ONLY to the components that showed a clear distance signal in the corpus pre-check (age band, wall + floor construction, glazing: homes built/retrofitted together cluster). Roof construction showed no decay and is excluded; heating keeps its coherent donor. Predictor stays pure: weights come from target.coordinates vs each Comparable.coordinates (resolved at the boundary); geo is OFF when the target has no coords, neutral for a neighbour with none. Scale chosen on the harness: _GEO_SCALE_KM=0.1 is the gate-safe optimum (0.05 lifts the corpus more but regresses fixture floor_construction). Corpus (150pc/514, geo off->on): age 0.564->0.572, age_pm1 0.841->0.847, wall 0.902->0.912, floor_con 0.786->0.796, glazing 0.667->0.673; roof unchanged. Fixture: glazing 0.5278->0.5833 (floor ratcheted), all else held. Refactored recency into a reusable _recency_weights vector composed via _combine, so similarity/recency/geo factors multiply uniformly. Fixture ships a committed _coordinates.json (OGL OS OpenData; build script carries it from the corpus sidecar on rebuild) so the gate exercises geo without S3. This is the per-component method applied to geography ([[feedback_per_component_best_method]]). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-15 14:58:42 +00:00
Daniel Roth	9b21cc5512	remove breaking init file	2026-06-15 14:52:48 +00:00
Khalim Conn-Kowlessar	fdc314c857	feat(epc-prediction): thread coordinates onto Comparable + target (#1227) Adds coordinates: Optional[Coordinates] to Comparable and PredictionTarget (data carriers; the pure predictor stays IO-free), and wires load_corpus to read an optional _coordinates.json sidecar ({uprn: [lon, lat]}) and populate each Comparable from its cert's uprn; iter_predictions threads the held-out target's coordinates through. Absent sidecar -> geo-weighting stays off (no behaviour change yet; weighting lands next slice). fetch_corpus_coordinates now writes the sidecar into the corpus dir. load_corpus populates 99% of corpus comparables. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-15 14:46:01 +00:00
Jun-te Kim	140ad39898	Map full-SAP code-based heating systems via sap_main_heating_code 🟩 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 14:40:59 +00:00
Jun-te Kim	345154c6b7	Map full-SAP measured ventilation: air permeability, MV kind, sheltered sides 🟩 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 14:37:52 +00:00
Daniel Roth	9d56cd7c1e	Merge pull request #1234 from Hestia-Homes/feature/deploy-sharepoint-renamer Deploy sharepoint renamer: Correct dockerfile imports	2026-06-15 15:35:55 +01:00
Khalim Conn-Kowlessar	95719dd587	feat(geospatial): batch coordinates_for_uprns lookup (#1227) Adds GeospatialRepository.coordinates_for_uprns(uprns) -> dict, a batch coordinate lookup returning only covered UPRNs. The S3 adapter overrides it to read the meta once, group UPRNs by their covering partition, and read each partition once for all the UPRNs it covers; co-located (closely-numbered) UPRNs share a partition, so an EPC Prediction cohort is typically one or two reads instead of one per neighbour. Default port impl is a per-UPRN loop. Feeds the EPC Prediction geo-proximity work: a cohort's UPRNs resolve to coordinates in a couple of reads (validated at corpus scale: 170 partition reads for 2683 UPRNs). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-15 14:35:32 +00:00
Daniel Roth	b31db4b58b	correct Dockerfile imports	2026-06-15 14:29:04 +00:00
Khalim Conn-Kowlessar	c0a1bcac95	feat(epc-prediction): resolve corpus UPRN coordinates from S3 (#1227 signal check) One-time utility: resolves every corpus cert's uprn -> WGS84 lon/lat from the OS Open-UPRN parquet (DATA_BUCKET/spatial/) via boto3, grouping UPRNs by their covering partition so each ~1.7MB partition is read at most once (the efficient batch lookup we intend to add to GeospatialRepository). Caches {uprn:[lon,lat]} locally for the validation harness. Resolved 2609/2683 corpus UPRNs (97%). Signal pre-check result (does intra-postcode proximity predict components?): intra-postcode distances are non-trivial (median 44m, p90 138m, max ~1km), and nearer neighbours match the target markedly better on age band (0.63 at <20m -> 0.16 at >300m), wall, glazing and floor construction. Roof shows no decay. => geo-proximity is worth building, per-component (strongest for age, the weakest fabric component). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-15 14:28:39 +00:00
Jun-te Kim	c035d17f2b	Map full-SAP certs end-to-end through the dispatch ladder and pin observed score 🟩 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 14:25:48 +00:00
Daniel Roth	0ed17cfd39	Merge branch 'main' into feature/deploy-sharepoint-renamer	2026-06-15 14:24:10 +00:00
Jun-te Kim	acd0ed485d	Map full-SAP energy source, mains-gas inference and lighting bulbs 🟩 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 14:23:31 +00:00
Jun-te Kim	cb4d080da2	Map full-SAP heating systems onto the domain SapHeating model 🟩 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 14:18:01 +00:00
Jun-te Kim	6226575086	Merge pull request #1232 from Hestia-Homes/feature/deploy-sharepoint-renamer Sharepoint renamer: fix terraform issue and add dry_run option	2026-06-15 15:13:02 +01:00
Jun-te Kim	125ff6f4dd	Merge remote-tracking branch 'origin/main' into feature/hyde_make_it_more_accurate_with_tests # Conflicts: # datatypes/epc/domain/mapper.py	2026-06-15 14:12:38 +00:00
Daniel Roth	8b27a5fda2	correct lambda name	2026-06-15 14:08:40 +00:00
Daniel Roth	1af9d84f94	Merge branch 'main' into feature/deploy-sharepoint-renamer	2026-06-15 14:07:27 +00:00
Daniel Roth	963b7d70fe	fix terraform error and pass handler bool for dry runs	2026-06-15 14:06:54 +00:00
Khalim Conn-Kowlessar	4afab2c3d8	feat(epc-prediction): roof-insulation +/-1-bucket reporting Adds roof_insulation_thickness_pm1 (mirrors construction_age_band_pm1, issue #1222): adjacent RdSAP thickness buckets (0/NI,12mm..400mm+) carry near-identical roof U-values, so an off-by-one bucket is a SAP-neutral hit. 'ND' (no-data) is off the ordered scale, so only an exact match counts there. Honest measurement of SAP-relevant roof-insulation quality. Corpus (150pc/514): exact 49.3% -> +/-1 53.7% (the misses are often multiple buckets or ND, so the band gain is smaller than age's). Fixture: exact == +/-1 (0.4118), as its misses are all >1 bucket; gate floor added at 0.4118. Also fixes two pre-existing pyright errors in the touched test file (_epc main_fuel_type/main_heating_control were Optional but the MainHeatingDetail attributes are non-optional Union[int, str]). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-15 14:04:18 +00:00
Jun-te Kim	5a3228ab5e	Merge pull request #1217 from Hestia-Homes/feature/per-cert-mapper-validation Feature/per cert mapper validation	2026-06-15 15:03:05 +01:00
Jun-te Kim	5ebeb71090	Back-solve habitable-room count from full-SAP measured living area 🟩 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 13:58:03 +00:00
Khalim Conn-Kowlessar	fffb07d04b	test(harness): re-pin golden-cert plans to the gain-maximising packages Three more pre-existing failures (present at `9ee38211`, before this branch's recent commits; same family as the orchestration multi-measure re-pin), namely golden-cert plan expectations that predate the ASHP generator (ADR-0025) and the optimiser folding forced dependencies into candidate gain (ADR-0016): - test_console: a multi-measure plan now leads with air_source_heat_pump, not cavity_wall_insulation (which is dropped: its forced ventilation makes the pair net-negative). Assert a measure actually in the package. - test_report 0330: package is now {solid_floor_insulation, air_source_heat_pump}; cavity_wall + forced mechanical_ventilation correctly excluded. - test_report 0036: gain-maximising package is now {solid_floor_insulation, low_energy_lighting}. Same verified-correct optimiser evolution as `077e3a39` (cavity_wall +2.9 SAP alone but its forced fabric→ventilation dep drags the pair net-negative). Re-pin to the actual packages + their trigger fields; the forced wall→vent edge stays covered by test_measure_dependency / test_optimiser. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-15 13:57:27 +00:00
Jun-te Kim	af26688846	Derive heat-loss perimeter and party-wall length from full-SAP measured wall areas 🟩 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 13:56:31 +00:00
Khalim Conn-Kowlessar	7f48495ed5	feat(epc-prediction): surface CO2 + PEI calculator floors in the report (#1228) The validation report showed only the SAP calculator floor (calc(actual) vs lodged), so the headline PEI MAE (~40 kWh/m2) read as prediction error when much of it is the calculator's own API-path residual. Adds the CO2 + PEI floors alongside SAP. Diagnostic (150pc/514): PEI floor MAE 15.73 (calc(actual) vs lodged) vs SAP floor 1.57; calc(actual)/lodged PEI ratio ~1.06 (mean +10.7, ~+6% over-estimate). That RULES OUT the suspected gross unit/definition mismatch (a unit bug would be ~2x/3.6x, not 1.06) and reframes #1228: the PEI gap is a modest calculator bias (~16 floor, calc-branch) plus a larger prediction-sensitivity term (~24): PEI is far more prediction-sensitive than SAP. CO2 floor 0.20 t. Script-only; no gate impact. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-15 13:55:20 +00:00
Khalim Conn-Kowlessar	06a66b3dd9	feat(epc-prediction): coherent heating donor selection (#1225) Heating sub-fields can't be field-moded without breaking system coherence, so the whole SapHeating cluster is now copied as a unit from a single coherent donor rather than inherited from the structural template: the neighbour matching the cohort's modal heating signature (main fuel + category + cylinder presence), most recent among the matches (recent cert = current system). Including cylinder presence in the signature is load-bearing: it protects has_hot_water_cylinder + cylinder_insulation (a bare fuel+cat signature regressed them). Corpus (150pc/514): heating_main_control 66.3 -> 73.9% (+7.6, the target), main_fuel 92.8 -> 96.9, category 90.7 -> 95.7, water_fuel 92.8 -> 96.3, water_code 88.5 -> 95.3, has_cylinder 81.1 -> 89.7, secondary 36.2 -> 42.0. SAP MAE vs lodged 7.08 -> 6.00 (calculator floor 1.57). cylinder_insulation -13.6 corpus (tiny-n) but +33pp on the fixture; AC requires control up + fuel/category hold + SAP not worsened, all met. Gate (36-target fixture): zero regression; ratcheted main_category 0.8889->0.9444, main_control 0.7500->0.8056, water_fuel 0.9167->0.9722, water_code 0.8889->0.9444, cylinder_insulation_type 0.1667->0.5000. This is the per-component heating method ([[feedback_per_component_best_method]]): coherent donor, never field-mode. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-15 13:48:15 +00:00
Jun-te Kim	8746eabb70	Fail loud on unmapped full-SAP opening-type codes 🟩 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 13:48:14 +00:00
Jun-te Kim	dde98fb684	Collapse full-SAP roof-window openings onto sap_roof_windows 🟩 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 13:46:32 +00:00
Khalim Conn-Kowlessar	077e3a3947	test(orchestration): re-pin multi-measure plan to the gain-maximising package The optimiser-package expectation was stale: it predated the optimiser folding a triggered measure's forced dependency into its candidate gain (ADR-0016). The run considers ALL measures (considered_measures defaults to None, i.e. no restriction), so once the ASHP bundle became SAP-beneficial (ADR-0025) the gain-maximising package shifted. Verified the new package is CORRECT, not a regression: on the test EPC, cavity-wall insulation earns +2.9 SAP alone but its forced fabric→ventilation dependency (ADR-0016) drags the wall+ventilation pair to a NET −1.8 SAP (−0.9 on top of the ASHP package), so the gain-maximising Optimiser correctly excludes the wall and its forced ventilation. Update the expected set to {air_source_heat_pump, suspended_floor_insulation, low_energy_lighting, secondary_heating_removal} and drop the wall/vent-specific assertions; the forced wall→ventilation edge is covered by test_measure_dependency / test_optimiser; this integration test keeps its end-to-end optimise→persist→telescope coverage on the chosen package. Pre-existing failure (present before this branch's recent commits), outside the handover regression gate. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-15 13:46:22 +00:00
Jun-te Kim	36929accf7	Collapse full-SAP door openings onto door count and area-weighted U-value 🟩 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 13:39:53 +00:00
Khalim Conn-Kowlessar	d762b25808	feat(epc-prediction): recency-weighted glazing mode (#1223) Per-component method: glazing type is now the recency-weighted cohort mode applied to every predicted window, rather than copied from the template. Glazing is retrofitted over a dwelling's life (single -> double), so a recent neighbour reflects the current state, the same family as roof-insulation thickness. Recency is the CORRECT weighting here: plain moding regressed the fixture (-5.6pp) and was previously reverted; similarity weighting also regressed it; recency improves BOTH (window geometry stays on the template, only the glazing categorical moves). modal_glazing_type: corpus (150pc/514) 60.7 -> 66.7% (+6.0pp); fixture 0.5000 -> 0.5278 (floor ratcheted up). Heating, geometry residuals and all other components unchanged. Refactored _recency_weighted_mode to a reusable _recency_weighted_choice(value_of) shared by roof insulation + glazing. Closes the #1223 per-component approach: floor-area (median estimate) + glazing (recency) shipped as distinct best-fit methods rather than a global recency template, which would have disturbed the coherence-coupled heating cluster. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-15 13:35:03 +00:00
Jun-te Kim	70460935b8	Collapse full-SAP window openings onto the engine's sap_windows model 🟩 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 13:32:20 +00:00
Khalim Conn-Kowlessar	4fdc23f83d	test(worksheet): pin simulated case 38 — mains-gas secondary reproduces worksheet exactly The realistic re-generation of case 37 (code-117 gas boiler, control 2102, + a MAINS-GAS condensing gas-fire secondary code 611, vs case 37's biogas 605). The full extractor -> mapper -> calculator pipeline reproduces the worksheet's SAP-rating block EXACTLY: continuous SAP 60.9152 (Δ 2e-5) and (272) CO2 5801.0770 (Δ ~0). This confirms the boiler-efficiency / control-2102 −5pp interlock / secondary-fuel handling are all correct, and that case 37's +7 gap was purely the biogas sub-fuel the Summary export cannot carry. Summary mirrored into backend/documents_parser/tests/fixtures so the pin runs without the unstaged workspace. PE not pinned — it is a separate DPER block (different scope) already guarded by the corpus PE gauge. Worksheet harness 47/47 unchanged; pyright net-zero. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-15 13:31:36 +00:00
Khalim Conn-Kowlessar	51cdc25ce8	feat(epc-prediction): cohort-median floor-area estimate (#1223) Per-component method, not a global template change: the predicted floor area is now the cohort median (the MAD-minimising point estimate of the target's size) rather than whichever structural template's own area. The calculator derives heat loss from building-part geometry, not this scalar, so decoupling them is safe and the scalar becomes a better size estimate. floor_area mean|.|: corpus (150pc/514 targets) 10.62 -> 10.48; fixture 12.2175 -> 11.8983 (ceiling ratcheted down). No other component moves. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-15 13:30:33 +00:00
Jun-te Kim	0eaf87b106	Carry full-SAP measured fabric U-value descriptions into the domain model 🟩 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 13:10:05 +00:00
Jun-te Kim	c3fd9a6872	Map full-SAP cert identity and scalar fields to EpcPropertyData 🟩 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 13:05:38 +00:00
Daniel Roth	17420408e4	Merge pull request #1230 from Hestia-Homes/feature/deploy-sharepoint-renamer Deploy sharepoint renamer	2026-06-15 13:45:52 +01:00
Daniel Roth	b9cbea367d	correct import in test file	2026-06-15 12:21:32 +00:00
Jun-te Kim	0079752eab	investigation with hyde values	2026-06-15 12:13:11 +00:00
Jun-te Kim	5923f8d072	Parse full-SAP SAP-Schema-17.1 certificate payloads 🟥 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 12:11:26 +00:00
Daniel Roth	a6050fc1c7	remove tests/ from pytest.ini	2026-06-15 12:04:33 +00:00
Daniel Roth	0fc81da4cf	move input files out of scripts/	2026-06-15 11:14:09 +00:00
Daniel Roth	5c314e2914	move tests out of scripts/	2026-06-15 11:11:08 +00:00
Daniel Roth	38b9e63844	revert pytest.ini	2026-06-15 11:02:48 +00:00
Daniel Roth	beb4e5d0d9	Move SharePoint renamer logic from scripts/ into orchestrator and app-root handler	2026-06-15 11:01:51 +00:00
Daniel Roth	8cb0e986e6	Deploy SharePoint renamer as Lambda with SQS trigger 🟩	2026-06-15 10:52:52 +00:00
Daniel Roth	b3e9d858d9	SharePoint renamer Lambda handler stub created 🟥	2026-06-15 10:49:01 +00:00
Daniel Roth	383b8b0c37	SharePoint renamer build_canonical_filename behaviour verified by tests 🟩	2026-06-15 10:48:17 +00:00
Daniel Roth	9daf6a8668	Merge pull request #1221 from Hestia-Homes/improve-sharepoint-renamer Sharepoint renamer recursively looks for files in subfolders	2026-06-15 11:18:03 +01:00
Khalim Conn-Kowlessar	c11eb46b8a	fix(modelling): HHR overlay sets off-peak immersion type so HW Table 13 applies The HHR-storage HeatingOverlay (ADR-0024) added an off-peak electric immersion cylinder but never set `immersion_heating_type`, so the overlaid cert left it None. The calculator then could not resolve `immersion_single` for the SAP 10.2 Table 13 HW high-rate split and billed hot water 100% at the off-peak low rate: £127.41 vs the relodged after-cert's £169.39, overstating the overlay's SAP by +1.26 (CO2/PE matched, isolating it to the HW cost path). Add `immersion_heating_type` to HeatingOverlay, route it through `_fold_heating` (it lives on `sap_heating`), and set it to 1 (single off-peak immersion) on the HHR overlay to match the relodged reference. Closes both `test_hhr_storage_overlay_reproduces_the_relodged_after_*` cascade pins (electric-storage and no-system befores share the after). Pre-existing failure (present before this branch's recent commits), outside the handover regression gate. Full modelling suite 220 pass, pyright net-zero. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-15 06:53:14 +00:00
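
Commit 1f26703dc5 above describes folding a haversine distance kernel into a weighted categorical mode. The following is a minimal sketch of that idea, not the repository's code. The kernel shape (exponential decay) and the helper names `haversine_km`, `geo_weights` and `weighted_mode` are assumptions for illustration; only the 0.1 km scale, the "off when the target has no coordinates" rule, the "neutral for a neighbour with none" rule, and the multiplication of factors come from the commit message. "Neutral" is interpreted here as a factor of 1.0.

```python
import math
from collections import defaultdict
from typing import Hashable, List, Optional, Sequence, Tuple

Coordinates = Tuple[float, float]  # (lon, lat) in degrees, WGS84

GEO_SCALE_KM = 0.1          # scale quoted in the commit message
EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(a: Coordinates, b: Coordinates) -> float:
    """Great-circle distance between two (lon, lat) points in km."""
    lon1, lat1 = map(math.radians, a)
    lon2, lat2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def geo_weights(target: Optional[Coordinates],
                neighbours: Sequence[Optional[Coordinates]],
                scale_km: float = GEO_SCALE_KM) -> List[float]:
    """One weight per neighbour; exponential decay is an ASSUMED kernel."""
    if target is None:
        # Geo weighting is off: every neighbour gets the identity factor.
        return [1.0] * len(neighbours)
    return [
        1.0 if c is None  # no coordinates: factor of 1.0 (assumed "neutral")
        else math.exp(-haversine_km(target, c) / scale_km)
        for c in neighbours
    ]


def weighted_mode(values: Sequence[Hashable],
                  *factors: Sequence[float]) -> Hashable:
    """Mode of `values`, each vote scaled by the product of its factors."""
    totals = defaultdict(float)
    for i, value in enumerate(values):
        totals[value] += math.prod(f[i] for f in factors)
    return max(totals, key=totals.get)
```

Because the factors are multiplied per neighbour, a recency or similarity vector can be passed alongside the geo vector without changing `weighted_mode`.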

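Commit 95719dd587 above describes a batch lookup that groups UPRNs by their covering partition so each partition is read once. A rough sketch of that grouping follows, under stated assumptions: the metadata layout (a list of `(min_uprn, max_uprn, key)` ranges sorted by `min_uprn`), the `read_partition` callable and the function name `batch_coordinates` are all hypothetical, since the log does not show the real S3 adapter or its metadata format.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Mapping, Tuple

Coordinates = Tuple[float, float]        # (lon, lat)
PartitionMeta = Tuple[int, int, str]     # (min_uprn, max_uprn, key); ASSUMED layout


def batch_coordinates(
    uprns: Iterable[int],
    meta: List[PartitionMeta],
    read_partition: Callable[[str], Mapping[int, Coordinates]],
) -> Dict[int, Coordinates]:
    """Resolve UPRNs to coordinates, reading each covering partition once.

    `meta` must be sorted by min_uprn. Only covered UPRNs are returned.
    """
    starts = [m[0] for m in meta]
    by_partition: Dict[str, List[int]] = defaultdict(list)

    for uprn in set(uprns):  # de-duplicate before grouping
        i = bisect_right(starts, uprn) - 1  # last partition starting <= uprn
        if i >= 0 and uprn <= meta[i][1]:
            by_partition[meta[i][2]].append(uprn)

    found: Dict[int, Coordinates] = {}
    for key, wanted in by_partition.items():
        rows = read_partition(key)  # one read per partition
        for uprn in wanted:
            if uprn in rows:
                found[uprn] = rows[uprn]
    return found
```

The number of reads is bounded by the number of distinct partitions touched rather than by the number of UPRNs, which is the property the commit message reports at corpus scale.
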
7203 commits