Commit graph

716 commits

Author SHA1 Message Date
Khalim Conn-Kowlessar
53d9f21f73 fix(modelling): offer ASHP when the catalogue has no ASHP row
The ASHP bundle is priced from the rate sheet (ADR-0025); the catalogue
row is read only for its material id, which is nullable end-to-end. The
live `material` catalogue has no `air_source_heat_pump` row, so
`products.get` raised `ValueError: no active product` and aborted every
ASHP-eligible property.

Add `ProductNotFound(ValueError)` + a concrete `ProductRepository
.get_optional`, raise the typed error from both repos, and have
`_ashp_option` look the row up optionally — a missing row now yields an
ASHP Option with `material_id=None` rather than crashing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 14:55:41 +00:00
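The optional-lookup pattern described in 53d9f21f73 can be sketched roughly as follows. `ProductNotFound` and `get_optional` are named in the commit message; the in-memory repository, the `Product` shape and `ashp_material_id` are hypothetical stand-ins for illustration, not the project's code.

```python
from dataclasses import dataclass
from typing import Optional


class ProductNotFound(ValueError):
    """Typed error; still a ValueError so existing handlers keep working."""


@dataclass(frozen=True)
class Product:
    measure_type: str
    material_id: Optional[int]  # nullable end-to-end, per the commit


class InMemoryProductRepository:
    """Hypothetical stand-in for the real product repositories."""

    def __init__(self, rows: dict[str, Product]) -> None:
        self._rows = rows

    def get(self, measure_type: str) -> Product:
        try:
            return self._rows[measure_type]
        except KeyError:
            raise ProductNotFound(f"no active product: {measure_type}") from None

    def get_optional(self, measure_type: str) -> Optional[Product]:
        # Concrete helper: turn the typed error into None.
        try:
            return self.get(measure_type)
        except ProductNotFound:
            return None


def ashp_material_id(products: InMemoryProductRepository) -> Optional[int]:
    # A missing catalogue row yields material_id=None instead of an exception.
    row = products.get_optional("air_source_heat_pump")
    return row.material_id if row is not None else None
```

Subclassing `ValueError` means callers that already catch `ValueError` are unaffected, while new callers can catch the narrower type.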
KhalimCK
90bed458f4
Merge pull request #1238 from Hestia-Homes/feature/epc-prediction
Feature/epc prediction
2026-06-16 21:58:40 +08:00
Khalim Conn-Kowlessar
7ca1f815f6 refactor(epc-prediction): PR review — rename ComparableProperty, relocate PredictionTarget
Two review points from @dancafc:

1) Rename the `Comparable` dataclass → `ComparableProperty` (it models one
   comparable *property*; the collection stays `ComparableProperties`). Applied
   across domain, repositories, orchestration, harness, scripts, and tests with a
   word-boundary rename so `ComparableProperties` is untouched.

2) Move `PredictionTarget` out of comparable_properties.py into prediction_target.py
   (where `PredictionTargetAttributes` + `build_prediction_target` already live).
   comparable_properties.py now imports it; no import cycle (prediction_target no
   longer depends on comparable_properties). Importers updated.

92 tests pass across the touched suites; pyright strict clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 13:34:44 +00:00
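The word-boundary rename mentioned in 7ca1f815f6 can be illustrated with a regular expression. This is a sketch of the idea only; the helper name `rename_comparable` is hypothetical and the commit does not say which tool performed the rename.

```python
import re

# \b matches only between a word character and a non-word character, so
# "Comparable" inside "ComparableProperties" (followed by "P") is not matched.
_COMPARABLE = re.compile(r"\bComparable\b")


def rename_comparable(source: str) -> str:
    """Rename the standalone identifier, leaving longer identifiers alone."""
    return _COMPARABLE.sub("ComparableProperty", source)
```

Because the replacement itself contains no standalone `Comparable`, running the rename twice leaves the text unchanged.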
Jun-te Kim
b78e9c7768 Merge branch 'main' of https://github.com/Hestia-Homes/Model into feature/hyde_make_it_more_accurate_with_tests 2026-06-16 09:17:33 +00:00
Khalim Conn-Kowlessar
419e340477 test(worksheet): pin simulated case 43 at 1e-4 (RR + dry-line + mixed roof)
Golden regression fixture for the multi-feature dwelling that surfaced the
two Elmhurst-extractor bugs in a33707f8. case 43 is a 2-BP mid-terrace with
a DETAILED room-in-roof (two slopes, two flat ceilings, party + exposed
gables, two common walls), a MIXED-insulation multi-section roof (Main
insulated + Extension uninsulated 2.30), a DRY-LINED extension solid wall,
a mains-gas boiler (102 / control 2106) and a House-coal solid-fuel
secondary (633).

Routes the Summary PDF through the WHOLE extractor + mapper + calculator
pipeline (no hand-built EpcPropertyData) and pins the §3 fabric + SAP-rating
block at abs=1e-4: (29a) walls 74.5800, (30) roof 38.5008, (33) fabric
172.7844, continuous SAP 73.2332 = (258), CO2 3518.3037 = (272). Guards the
detailed-RR slope/common_wall surfaces, the dry-lining R=0.17 adjustment,
and the per-part mixed-roof billing together. Summary mirrored to
backend/documents_parser/tests/fixtures/Summary_001431_case43.pdf; provider
module mirrors the _case6/_case21 pattern, assertion in
test_section_cascade_pins. Harness 47/47; regression = the 3 pre-existing
fails; pyright net-zero.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 08:26:05 +00:00
Khalim Conn-Kowlessar
8a70d22278 test(corpus): ratchet SAP ceiling 1.00->0.99 (detailed-RR common walls)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 06:23:11 +00:00
Khalim Conn-Kowlessar
e4adab0e88 test(corpus): ratchet SAP floor 0.65->0.67, ceiling 1.08->1.00
Lock in the detailed-RR slope + stud-wall gain (corpus within-0.5
67.3% -> 67.5%, MAE 1.020 -> 0.987). The corpus is a fixed 1000-cert
deterministic gauge, so the thresholds track measured HEAD with a small
margin per the ratchet convention.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 05:57:27 +00:00
Khalim Conn-Kowlessar
b55b969b84 fix(water-heating): use lodged cylinder_heat_loss declared-loss factor
The gov API lodges a manufacturer's declared cylinder loss factor
(kWh/day) in `sap_heating.cylinder_heat_loss`, in which case it leaves
the cylinder volume / insulation type / thickness None. That field was
undeclared on the 21.0.x schemas, so `from_dict` dropped it — then
`_cylinder_storage_loss_override` hit its insulation-None / volume-None
guards and returned None, dropping the §4 storage loss ENTIRELY. The
dwelling over-rated (the declared loss is typically ~1.5 kWh/day ≈
550 kWh/yr).

SAP 10.2 §4 branch a) (PDF p.136): when the declared loss factor is
known, storage loss (50) = (48) declared loss × (49) Table-2b
temperature factor — replacing the Table 2 V×L×VF computation.

- declare `cylinder_heat_loss` on RdSapSchema21_0_0/21_0_1.SapHeating +
  EpcPropertyData.SapHeating; thread through the 21.0.x mappers.
- `cylinder_storage_loss_monthly_kwh` gains `declared_loss_kwh_per_day`:
  when set, combined_55 = declared × TF (volume/insulation unused).
- `_cylinder_storage_loss_override` resolves the declared loss BEFORE the
  insulation/volume guards (the gov omits those when the loss is lodged).

12 /tmp certs carry it (mean |err| 3.00 -> 2.51; the clean ones close
hard, e.g. 2360 2.65 -> 0.30, 0245 2.25 -> 0.53). Corpus within-0.5
67.0% -> 67.3% (MAE 1.025 -> 1.020); /tmp 71.2% -> 71.4% (0.889 ->
0.882). Worksheet harness 47/47; regression = only the 3 pre-existing
fails; pyright net-zero.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 05:27:47 +00:00
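The declared-loss branch in b55b969b84 amounts to daily loss = declared loss × temperature factor, scaled by the days in each month. The sketch below assumes that monthly scaling and uses a hypothetical function name; the temperature factor is passed in rather than looked up, since the Table 2b values are not reproduced here.

```python
from typing import Optional

DAYS_IN_MONTH = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)


def declared_storage_loss_monthly_kwh(
    declared_loss_kwh_per_day: Optional[float],
    temperature_factor: float,
) -> Optional[list[float]]:
    """Monthly storage loss from a manufacturer's declared loss factor.

    Returns None when no declared loss is lodged, so a caller can fall
    back to the volume / insulation route instead.
    """
    if declared_loss_kwh_per_day is None:
        return None
    daily_kwh = declared_loss_kwh_per_day * temperature_factor
    return [daily_kwh * days for days in DAYS_IN_MONTH]
```

With a declared loss of 1.5 kWh/day and an illustrative factor of 1.0, the twelve months sum to 547.5 kWh, in line with the "≈ 550 kWh/yr" quoted in the commit.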
Khalim Conn-Kowlessar
7cfd54129b fix(mapper): read the dropped rafter_insulation_thickness API field
Roofs lodged insulated at rafters carry their thickness in a DEDICATED
gov-EPC API field, `rafter_insulation_thickness` (e.g. "225mm"), while
`roof_insulation_thickness` stays None (rafters aren't loft joists). That
field was undeclared on the 21.0.x schemas, so `from_dict` silently
dropped it — the rafter certs only *looked* redacted (roof EER 2-4 =
insulated, yet no thickness), and the cascade fell to the Table 18 col (2)
unknown default (2.30), badly under-rating them.

- declare `rafter_insulation_thickness` on RdSapSchema21_0_0/21_0_1 +
  EpcPropertyData.SapBuildingPart (mirrors the existing
  sloping_ceiling_/flat_roof_insulation_thickness dropped-field handling).
- thread it through `from_rdsap_schema_21_0_0/21_0_1` (older schemas get
  None via getattr).
- `heat_transmission` prefers `rafter_insulation_thickness` over
  `roof_insulation_thickness` when the part is at-rafters, so the measured
  RdSAP 10 §5.11.2 Table 16 column (2) row applies (225 mm → 0.25).

Completes the rafters roof fix: with the real thickness read, the rafter
certs are recovered rather than over-stated — cert 3100-8675-0922-8628
(band E, rafters 225mm) +8.93 → +0.43 SAP. Corpus within-0.5 67.0%
(MAE 1.025) and /tmp 71.2% (MAE 0.889) — both NET ABOVE the pre-rafters
baseline (66.9% / 70.6%). Worksheet harness 47/47; regression = only the
3 pre-existing fails; pyright net-zero.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 05:04:39 +00:00
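The field preference described in 7cfd54129b can be sketched as a small selector. The field names come from the commit message; the `RoofPart` dataclass and the function name are hypothetical, and the fallback to `roof_insulation_thickness` when the rafter field is empty is an assumption about what "prefers" means.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RoofPart:
    roof_insulation_location: Optional[int]  # 1 = at rafters (gov-API int)
    roof_insulation_thickness: Optional[str]
    rafter_insulation_thickness: Optional[str]


def thickness_for_u_value(part: RoofPart) -> Optional[str]:
    """Pick the thickness field that should drive the roof U-value lookup."""
    at_rafters = part.roof_insulation_location == 1
    if at_rafters and part.rafter_insulation_thickness is not None:
        return part.rafter_insulation_thickness
    return part.roof_insulation_thickness
```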
Khalim Conn-Kowlessar
5d556faf86 fix(roof): bill at-rafters insulation on RdSAP 10 Table 16/18 column (2)
`u_roof` only implemented the joist column, so roofs lodged insulated at
rafters (`roof_insulation_location == 1`) were mis-billed at the joist U
on both the API and Summary paths — under-stating loss, over-rating SAP.

RdSAP 10 §5.11.2 Table 16 (spec p.42-43) gives a distinct "insulation at
rafters" column (2): the rafter cavity is shallower than a loft void, so
the same depth yields a higher U (200 mm: rafters 0.29 vs joists 0.21).
§5.11 Table 18 (p.45) likewise carries a rafters column (2) for unknown /
as-built thickness (footnote (1): "The value from the table applies for
unknown and as built") — band A-D = 2.30, E = 1.50, F = 0.68, diverging
from the joist column's 100 mm-equivalent 0.40 default (footnote (4)).

- add `_ROOF_RAFTERS_BY_THICKNESS` (Table 16 col 2) + `_ROOF_RAFTERS_BY_AGE`
  (Table 18 col 2) to rdsap_uvalues; `u_roof` selects them via a new
  `insulation_at_rafters` flag (ignored for flat / sloping-ceiling roofs).
- `heat_transmission` derives the flag PER BUILDING PART from
  `roof_insulation_location` (gov-API int 1 / Summary "R Rafters"), which
  also fixes the multi-part dedup-roof-join problem: each part's own
  location now drives its U, replacing the unattributable joined
  `epc.roofs[]` description.

Worksheet-validated to 1e-4: simulated case 41 (4-bp — Ext1 rafters 200mm
→ 0.29, Ext3 rafters As-Built band F → 0.68; roof total 24.8350) and case
42 (6 variants — rafters 50mm → 0.88, rafters unknown band C → 2.30,
joists/none unchanged). Case 40 stays exact (roof 35.340, total 441.1606);
worksheet harness 47/47.

Corpus within-0.5 66.9% → 66.5% (gates 0.65/1.08 hold) — a spec-correct
shift, NOT a regression: all 15 corpus rafter certs carry redacted (None)
thickness yet lodge roof EER 2-4 (insulated), so the open API blanked a
specified thickness and the spec's unknown-rafter 2.30 default correctly
over-states them. Recovery needs a roof-EER→thickness inference on the
API path (follow-up), not a change to the U-table.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 04:42:44 +00:00
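A minimal sketch of the at-rafters lookup that 5d556faf86 describes: a known thickness reads the thickness column, otherwise the age-band default applies. The dictionaries hold only the values quoted in these commit messages (the real tables have more rows), and the names are hypothetical rather than the module's own.

```python
from typing import Optional

# Partial "insulation at rafters" column: only the entries quoted in the log.
RAFTERS_U_BY_THICKNESS_MM = {50: 0.88, 200: 0.29, 225: 0.25}

# Unknown / as-built thickness defaults by age band, as quoted in the log.
RAFTERS_U_BY_AGE_BAND = {
    "A": 2.30, "B": 2.30, "C": 2.30, "D": 2.30, "E": 1.50, "F": 0.68,
}


def u_roof_at_rafters(thickness_mm: Optional[int], age_band: str) -> float:
    """U-value for a roof insulated at rafters (sketch, partial tables)."""
    if thickness_mm is not None:
        return RAFTERS_U_BY_THICKNESS_MM[thickness_mm]
    return RAFTERS_U_BY_AGE_BAND[age_band]
```

For example, 200 mm at rafters gives 0.29 regardless of age band, while an unknown thickness in band C falls to the 2.30 default.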
Khalim Conn-Kowlessar
f66e2cb020 docs(epc-prediction): module README + end-to-end showcase test
README at domain/epc_prediction/README.md — the flow diagram, where each piece
lives, links to the ADRs/CONTEXT/handover/migration note, and a runnable test
command. The team's entry point.

tests/e2e/test_epc_prediction_e2e.py — the whole gap-fill flow against the REAL
Postgres Unit of Work + EPC/Property repositories + EpcComparablePropertiesRepository
+ EpcPrediction, with only the three external HTTP clients faked (EPC API,
geospatial S3, Solar). Proves: EPC-less Property → Ingestion predicts from its
postcode cohort → persists to the predicted slot → reloaded Property resolves
effective_epc via source_path == "predicted". The canonical "see it in action".

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 04:13:30 +00:00
Khalim Conn-Kowlessar
5727ac53c1 feat(epc-prediction): slice-5e ingestion wiring (gate → predict → persist)
Wire EPC Prediction gap-fill into IngestionOrchestrator (ADR-0031). When the
predictor collaborators are injected (ComparablesRepo + PredictionAttributesReader
+ EpcPrediction), an EPC-less Property is predicted from its postcode cohort and
persisted to the predicted slot; the eligibility gate (unknown property_type) and
"a lodged EPC is never predicted over" both hold. The two-phase contract is kept:
prediction attributes (Landlord Overrides) resolve in the unit prep phase, the
cohort fetch + select + predict run in the no-unit IO phase, persistence in the
write phase. All three collaborators are OPTIONAL — unwired, ingestion behaves
exactly as before (existing tests unchanged).

3 tests (predict+persist, gate, lodged-wins); 228 pass across orchestration +
epc_prediction + repositories; pyright strict clean. Production composition-root
wiring (real ComparableProperties + override-attributes adapters) is part of the
Jun-te handover.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 04:03:02 +00:00
Khalim Conn-Kowlessar
f2f954f459 feat(epc-prediction): slice-5d target assembly + eligibility gate
build_prediction_target assembles an EPC-less Property's PredictionTarget from
its identity (postcode), resolved coordinates, and Landlord-Override attributes
(property_type / built_form / wall_construction). The eligibility GATE: a Property
whose property_type is unknown returns None — never sized from a mixed-type
cohort (ADR-0031). property_type is the hard cohort filter.

The override attributes are read through a PredictionTargetAttributesReader port
(stub seam) — the real adapter (a read over property_overrides) is being built
separately by the team; ingestion wiring depends on the abstraction and tests
substitute a fake. 2 tests (assembly + gate); pyright strict clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 03:56:57 +00:00
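The eligibility gate in f2f954f459 reduces to "no property type, no target". The sketch below is a simplification under stated assumptions: the real `build_prediction_target` reads attributes through a port and also carries coordinates, whereas this hypothetical `build_target_sketch` takes plain arguments.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TargetSketch:
    postcode: str
    property_type: str
    built_form: Optional[str] = None
    wall_construction: Optional[str] = None


def build_target_sketch(
    postcode: str,
    property_type: Optional[str],
    built_form: Optional[str] = None,
    wall_construction: Optional[str] = None,
) -> Optional[TargetSketch]:
    # Gate: property_type is the hard cohort filter, so an unknown type
    # means the property is not eligible for prediction at all.
    if property_type is None:
        return None
    return TargetSketch(postcode, property_type, built_form, wall_construction)
```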
Khalim Conn-Kowlessar
fd43cf2d23 feat(epc-prediction): slice-5c predicted-EPC persistence slot
Add a `source` discriminator (lodged | predicted) to the EPC store so a Property
holds a lodged EPC and a predicted one (EPC Prediction gap-fill) at once
(ADR-0031). EpcRepository.save gains source="lodged"; idempotent delete is now
per-source (a predicted save no longer wipes lodged, and vice versa);
get_for_property/get_for_properties filter lodged; new get_predicted_for_property
/ get_predicted_for_properties read predicted. PropertyPostgresRepository.get +
get_many hydrate Property.predicted_epc, so the predicted picture reaches the
modelling read (both load via get_many). FakeEpcRepo mirrors the dual slot.

EpcPropertyModel gains `source` (default "lodged"); the test DB builds from the
SQLModel mirror so this is exercised without the prod migration. The matching
Drizzle change (column + per-(property_id,source) uniqueness) is the team's to
action before merge — docs/MIGRATION_NOTE_predicted_epc_source.md.

3 store tests (coexist, idempotent predicted re-save leaves lodged, lodged-only
has no predicted) + property-repo wiring; 85 pass across affected suites; new
code pyright-clean (2 pre-existing wwhrs errors in epc_property_table untouched).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 03:50:19 +00:00
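The dual-slot behaviour in fd43cf2d23 can be pictured with an in-memory store keyed on (property_id, source). This is an illustrative fake only: the method names follow the commit message, but the class name, signatures and storage are assumptions.

```python
from typing import Any, Optional


class DualSlotEpcStore:
    """In-memory sketch: one EPC per (property_id, source) pair."""

    def __init__(self) -> None:
        self._rows: dict[tuple[int, str], Any] = {}

    def save(self, property_id: int, epc: Any, source: str = "lodged") -> None:
        # Overwriting the same key makes a re-save idempotent per source;
        # the other source's row is never touched.
        self._rows[(property_id, source)] = epc

    def get_for_property(self, property_id: int) -> Optional[Any]:
        return self._rows.get((property_id, "lodged"))

    def get_predicted_for_property(self, property_id: int) -> Optional[Any]:
        return self._rows.get((property_id, "predicted"))
```

Keying on the pair mirrors the per-(property_id, source) uniqueness the commit asks for in the database.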
Khalim Conn-Kowlessar
6979607ace feat(epc-prediction): slice-5b ComparableProperties repo port + adapter
Build the cohort IO port ADR-0029 deferred (ADR-0031 slice-5b):
`ComparablePropertiesRepository.candidates_for(postcode) -> list[Comparable]`,
with an EPC-API + geospatial adapter that lists the postcode's lodged certs
(search_by_postcode), fetches + maps each (get_by_certificate_number), and
resolves their UPRNs to coordinates in ONE batched read. Register metadata the
cert doesn't carry (address, registration date) is threaded off the search row;
a UPRN-less or unparseable-date cert is kept, just uncoordinated / unweighted.
The domain select_comparables then filters these candidates into the cohort.

Thin CohortEpcClient / CohortGeospatial Protocols keep the adapter testable
against fakes; EpcClientService + GeospatialS3Repository satisfy them
structurally (no changes). 3 tests; pyright strict clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 03:40:59 +00:00
Khalim Conn-Kowlessar
086187ddc7 feat(epc-prediction): slice-5a predicted source path on Property
Add a `predicted_epc` slot to the Property aggregate and a "predicted" branch to
SourcePath / source_path / effective_epc (ADR-0031 decisions 1+3). A
neighbour-synthesised EpcPropertyData resolves as the Effective EPC ONLY when
there is neither a lodged EPC nor Site Notes — a real source always wins
(prediction is last-resort gap-fill). The slot is distinct from `epc` so a
predicted picture coexists with any lodged one (provenance is structural, not a
flag on EpcPropertyData); downstream consumers are untouched.

3 tests: predicted resolves when sole source; lodged EPC wins over predicted;
Site Notes win over predicted. 10/10 green, pyright strict clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 03:33:47 +00:00
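The precedence rule in 086187ddc7 (a real source always wins) can be expressed as a single predicate. The sketch uses a hypothetical `PropertySketch` with opaque slots; how the real aggregate orders a lodged EPC against Site Notes is not stated in the commit and is not modelled here.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class PropertySketch:
    epc: Optional[Any] = None            # lodged EPC
    site_notes: Optional[Any] = None
    predicted_epc: Optional[Any] = None  # neighbour-synthesised picture


def resolves_to_predicted(prop: PropertySketch) -> bool:
    """True only when the prediction is the sole available source."""
    return (
        prop.epc is None
        and prop.site_notes is None
        and prop.predicted_epc is not None
    )
```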
Khalim Conn-Kowlessar
be3e51bae9 feat(epc-prediction): geo-proximity-weighted floor-area median
Size the predicted dwelling from the geo-proximity-weighted median of the
cohort's floor areas rather than the plain median: homes built together share a
footprint, so a nearer neighbour's area should count for more (the same street
signal #1227 already wired into age / wall / glazing). Reuses `_geo_weights` and
adds `_weighted_median`, which reduces exactly to `statistics.median` under
uniform weights (geo off / no target coordinates) — including the even-count
midpoint average — so the MAD-minimising guarantee is preserved.

Measured over the 514-target SAP-10.2 corpus (leave-one-out):
  floor_area MAE  10.48 -> 9.73 m²   MAPE 13.2% -> 12.2%

Re-baselines the n=36 fixture floor_area ceiling 11.8983 -> 12.0378 (a method
change, not a loosening; the small fixture subset moved +0.14 the other way as
sample noise while the population improved decisively). The ceiling still pins
the new deterministic value exactly, so the tighten-only ratchet resumes.

Investigation ruling out the adjacent floor-area levers (kept in the follow-up):
lowering minimum_cohort (9.78-10.03, worse), hard same-form filter (10.19),
mean instead of median (10.68), constant bias correction (10.47),
extension-conditioning (oracle 9.50, not worth the misclassification cost) and
room-in-roof conditioning/additive (RiR is a confound for large multi-part
outliers — RiR area is only ~21% of total, and the increment breaks the homes
already predicted exactly). Remaining cohort lever is built-form soft-weighting,
gated on a denser corpus.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 00:08:05 +00:00
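One way to write a weighted median that reduces to `statistics.median` under uniform weights, including the even-count midpoint average that be3e51bae9 calls out. This is a sketch, not the project's `_weighted_median`; it assumes strictly positive weights.

```python
import math
from typing import Sequence


def weighted_median(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted median; assumes every weight is strictly positive."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal length")
    pairs = sorted(zip(values, weights))
    half = sum(weight for _, weight in pairs) / 2.0
    cumulative = 0.0
    for i, (value, weight) in enumerate(pairs):
        cumulative += weight
        # Exactly half the mass lies at or below this value: average it with
        # the next one, matching the plain median's even-count midpoint.
        if math.isclose(cumulative, half) and i + 1 < len(pairs):
            return (value + pairs[i + 1][0]) / 2.0
        if cumulative > half:
            return value
    return pairs[-1][0]
```

With areas `[70, 80, 90, 100]` and equal weights the result is 85.0, the same as `statistics.median`; giving one neighbour a dominant weight pulls the result to that neighbour's area.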
Khalim Conn-Kowlessar
b2b6f8e954 fix(mapper): map Elmhurst "Value known" cylinder to measured volume (code 6)
The Elmhurst Summary §15.1 lodges "Cylinder Size: Value known" with the
measured volume in the "Cylinder Volume (l)" line — the Summary-path
equivalent of the gov-API "Exact" descriptor. The mapper had no entry for
"Value known" so `_elmhurst_cylinder_size_code` raised UnmappedElmhurstLabel,
and even once mapped the measured volume was never threaded through, so the
cascade dropped the cylinder storage loss (~468 kWh/yr) from (219) water
heating on every measured-volume-cylinder Summary.

Per RdSAP 10 §10.5 Table 28 (p.55) a measured cylinder volume is used
directly. Map "Value known" → cascade code 6 (Exact) and thread the §15.1
"Cylinder Volume (l)" value into SapHeating.cylinder_volume_measured_l, which
`_cylinder_volume_l_from_code` (cert_to_inputs.py:5281) already reads for
code 6 — mirroring the gov-API path (mapper.py:1575/1885).

Pins simulated case 39 (P960-0001-001431): an age-A mid-terrace on direct-
acting electric room heaters (SAP code 691, cat 10, control 2602) with
electric-immersion DHW off a 117 L "Value known" cylinder. The full
extractor→mapper→calculator cascade now reproduces the worksheet's SAP-rating
block EXACTLY — SAP value 36.6365 (band F) and (272) CO2 2056.0731 kg/yr,
with (219) water heating 2637.5049 and (255) total energy cost 1802.0039.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 23:57:25 +00:00
Jun-te Kim
5c11fd35c8 Validate SAP calculator vs Elmhurst; fix reduced-field window U; add accuracy harness
Reduced-field window U: heat_transmission derived the synthesised-window
raw U from u_window(all None) -> the 2.5 placeholder regardless of glazing.
Now routes the (uniform) glazing_type code through u_window (RdSAP Table 24)
so e.g. double pre-2002 reads 2.8, not 2.5. Only the pre-SAP10 reduced-field
path is affected (21.0.1 certs carry per-window U upstream) — the RdSAP-21.0.1
corpus gauge is unchanged at 66.9% within-0.5.

test_real_cert_sap_accuracy: pin uprn_10002468137 (RdSAP-17.1, all-electric
storage heaters) at SAP 61, validated against Elmhurst on identical inputs
(dual off-peak immersion, 110 L cylinder, 2 baths). Our engine reproduces
Elmhurst's fuel cost to the penny; lodged 55 is the old SAP-2012 schema.

Tooling to grow the accuracy corpus:
- scripts/fetch_real_life_epc_sample.py — capture a cert by UPRN into the corpus.
- scripts/compare_epc_paths.py — diff gov-API vs Elmhurst-summary EpcPropertyData
  and run both through the engine, localising mapper vs calculator differences.
- skill validate-cert-sap-accuracy — the end-to-end loop (capture -> Elmhurst
  inputs -> human builds -> compare -> reconcile -> pin in the test).
- skill epc-to-elmhurst-rdsap-inputs reference: corrected immersion (code 1=dual),
  cylinder size (code 2 = Normal/110 L), and bath-count (WWHRS sub-tab) mappings.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 15:26:11 +00:00
Khalim Conn-Kowlessar
aea2d7150f test(epc-prediction): re-baseline modal_glazing floor after main merge
main's 'ND' multiple_glazing_type mapper fix (361abc12) changes the mapped
ground-truth glazing for one fixture cert, so modal_glazing_type re-baselines
0.5833 -> 0.5556 (21/36 -> 20/36). A mapper change shifts the deterministic
fixture rates like a fixture change does — re-baseline, not a prediction
regression. All other component floors + residual ceilings unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 15:04:34 +00:00
Khalim Conn-Kowlessar
0b2827e9ff Merge remote-tracking branch 'origin/main' into feature/epc-prediction 2026-06-15 15:03:27 +00:00
Khalim Conn-Kowlessar
1f26703dc5 feat(epc-prediction): geo-proximity weighting, per-component (#1227)
Folds a haversine distance kernel into the categorical-mode weighting so a
nearer neighbour counts for more — applied ONLY to the components that showed
a clear distance signal in the corpus pre-check (age band, wall + floor
construction, glazing: homes built/retrofitted together cluster). Roof
construction showed no decay and is excluded; heating keeps its coherent
donor. Predictor stays pure: weights come from target.coordinates vs each
Comparable.coordinates (resolved at the boundary); geo is OFF when the target
has no coords, neutral for a neighbour with none.

Scale chosen on the harness: _GEO_SCALE_KM=0.1 is the gate-safe optimum
(0.05 lifts the corpus more but regresses fixture floor_construction).
Corpus (150pc/514, geo off->on): age 0.564->0.572, age_pm1 0.841->0.847,
wall 0.902->0.912, floor_con 0.786->0.796, glazing 0.667->0.673; roof
unchanged. Fixture: glazing 0.5278->0.5833 (floor ratcheted), all else held.

Refactored recency into a reusable _recency_weights vector composed via
_combine, so similarity/recency/geo factors multiply uniformly. Fixture ships
a committed _coordinates.json (OGL OS OpenData; build script carries it from
the corpus sidecar on rebuild) so the gate exercises geo without S3.

This is the per-component method applied to geography ([[feedback_per_component_best_method]]).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 14:58:42 +00:00
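A rough sketch of geo-proximity weighting as described in 1f26703dc5: a haversine distance feeds a decay kernel, and the kernel weights a categorical mode. The commit does not state the kernel's shape, so the exponential decay here is an assumption, as is reading "neutral" for an uncoordinated neighbour as a weight of 1.0. Only the 0.1 km scale and the `[lon, lat]` ordering come from the log.

```python
import math
from collections import defaultdict
from typing import Hashable, Optional, Sequence

EARTH_RADIUS_KM = 6371.0
GEO_SCALE_KM = 0.1  # scale quoted in the commit message

LonLat = tuple[float, float]


def haversine_km(a: LonLat, b: LonLat) -> float:
    """Great-circle distance between two (lon, lat) points in kilometres."""
    lon1, lat1 = math.radians(a[0]), math.radians(a[1])
    lon2, lat2 = math.radians(b[0]), math.radians(b[1])
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def geo_weight(target: Optional[LonLat], neighbour: Optional[LonLat]) -> float:
    # Assumption: missing coordinates on either side contribute weight 1.0.
    if target is None or neighbour is None:
        return 1.0
    # Assumption: exponential decay with distance.
    return math.exp(-haversine_km(target, neighbour) / GEO_SCALE_KM)


def weighted_mode(values: Sequence[Hashable], weights: Sequence[float]) -> Hashable:
    """Value with the largest total weight."""
    totals: dict[Hashable, float] = defaultdict(float)
    for value, weight in zip(values, weights):
        totals[value] += weight
    return max(totals, key=lambda value: totals[value])
```

Under this kernel one neighbour a few tens of metres away can outvote two neighbours a kilometre away, which is the "nearer neighbour counts for more" behaviour the commit describes.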
Khalim Conn-Kowlessar
fdc314c857 feat(epc-prediction): thread coordinates onto Comparable + target (#1227)
Adds coordinates: Optional[Coordinates] to Comparable and PredictionTarget
(data carriers — the pure predictor stays IO-free), and wires load_corpus to
read an optional _coordinates.json sidecar ({uprn: [lon, lat]}) and populate
each Comparable from its cert's uprn; iter_predictions threads the held-out
target's coordinates through. Absent sidecar -> geo-weighting stays off (no
behaviour change yet — weighting lands next slice). fetch_corpus_coordinates
now writes the sidecar into the corpus dir. load_corpus populates 99% of
corpus comparables.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 14:46:01 +00:00
Jun-te Kim
345154c6b7 Map full-SAP measured ventilation: air permeability, MV kind, sheltered sides 🟩
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 14:37:52 +00:00
Khalim Conn-Kowlessar
95719dd587 feat(geospatial): batch coordinates_for_uprns lookup (#1227)
Adds GeospatialRepository.coordinates_for_uprns(uprns) -> dict — a batch
coordinate lookup returning only covered UPRNs. The S3 adapter overrides it
to read the meta once, group UPRNs by their covering partition, and read each
partition once for all the UPRNs it covers; co-located (closely-numbered)
UPRNs share a partition, so an EPC Prediction cohort is typically one or two
reads instead of one per neighbour. Default port impl is a per-UPRN loop.

Feeds the EPC Prediction geo-proximity work: a cohort's UPRNs resolve to
coordinates in a couple of reads (validated at corpus scale: 170 partition
reads for 2683 UPRNs).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 14:35:32 +00:00
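The batching idea in 95719dd587 (group UPRNs by covering partition, read each partition once) might look like the sketch below. The partition layout is entirely hypothetical: partitions are assumed to be identified by their lowest UPRN, and `read_partition` stands in for whatever the S3 adapter actually does.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import Callable, Iterable, Mapping

LonLat = tuple[float, float]


class PartitionedCoordinates:
    """Sketch of a batch lookup over range-partitioned UPRN data."""

    def __init__(
        self,
        partition_starts: Iterable[int],
        read_partition: Callable[[int], Mapping[int, LonLat]],
    ) -> None:
        self._starts = sorted(partition_starts)
        self._read_partition = read_partition
        self.reads = 0  # counts partition reads, for illustration

    def coordinates_for_uprns(self, uprns: Iterable[int]) -> dict[int, LonLat]:
        # Group the requested UPRNs by the partition that covers them.
        by_partition: dict[int, list[int]] = defaultdict(list)
        for uprn in uprns:
            index = bisect_right(self._starts, uprn) - 1
            if index >= 0:
                by_partition[self._starts[index]].append(uprn)
        # Read each partition once; return only the UPRNs it actually covers.
        found: dict[int, LonLat] = {}
        for start, wanted in by_partition.items():
            rows = self._read_partition(start)
            self.reads += 1
            for uprn in wanted:
                if uprn in rows:
                    found[uprn] = rows[uprn]
        return found
```

Closely numbered UPRNs land in the same group, so the number of reads tracks the number of partitions touched rather than the number of neighbours.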
Jun-te Kim
c035d17f2b Map full-SAP certs end-to-end through the dispatch ladder and pin observed score 🟩
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 14:25:48 +00:00
Jun-te Kim
cb4d080da2 Map full-SAP heating systems onto the domain SapHeating model 🟩
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 14:18:01 +00:00
Jun-te Kim
125ff6f4dd Merge remote-tracking branch 'origin/main' into feature/hyde_make_it_more_accurate_with_tests
# Conflicts:
#	datatypes/epc/domain/mapper.py
2026-06-15 14:12:38 +00:00
Daniel Roth
1af9d84f94 Merge branch 'main' into feature/deploy-sharepoint-renamer 2026-06-15 14:07:27 +00:00
Daniel Roth
963b7d70fe fix terraform error and pass handler bool for dry runs 2026-06-15 14:06:54 +00:00
Khalim Conn-Kowlessar
4afab2c3d8 feat(epc-prediction): roof-insulation +/-1-bucket reporting
Adds roof_insulation_thickness_pm1 (mirrors construction_age_band_pm1, issue
#1222): adjacent RdSAP thickness buckets (0/NI,12mm..400mm+) carry near-
identical roof U-values, so an off-by-one bucket is a SAP-neutral hit. 'ND'
(no-data) is off the ordered scale, so only an exact match counts there.
Honest measurement of SAP-relevant roof-insulation quality.

Corpus (150pc/514): exact 49.3% -> +/-1 53.7% (the misses are often multiple
buckets or ND, so the band gain is smaller than age's). Fixture: exact ==
+/-1 (0.4118) — its misses are all >1 bucket; gate floor added at 0.4118.

Also fixes two pre-existing pyright errors in the touched test file
(_epc main_fuel_type/main_heating_control were Optional but the
MainHeatingDetail attributes are non-optional Union[int, str]).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 14:04:18 +00:00
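The +/-1-bucket rule in 4afab2c3d8 can be sketched as an index comparison on an ordered scale, with off-scale values such as 'ND' counting only on an exact match. The function name is hypothetical and the scale is passed in by the caller, since the full bucket list is not spelled out in the commit.

```python
from typing import Sequence


def within_one_bucket(predicted: str, actual: str, ordered_scale: Sequence[str]) -> bool:
    """True for an exact match, or for adjacent buckets on the ordered scale."""
    if predicted == actual:
        return True
    # A value off the ordered scale (e.g. 'ND') only ever matches exactly.
    if predicted not in ordered_scale or actual not in ordered_scale:
        return False
    distance = abs(ordered_scale.index(predicted) - ordered_scale.index(actual))
    return distance <= 1
```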
Jun-te Kim
5a3228ab5e
Merge pull request #1217 from Hestia-Homes/feature/per-cert-mapper-validation
Feature/per cert mapper validation
2026-06-15 15:03:05 +01:00
Khalim Conn-Kowlessar
fffb07d04b test(harness): re-pin golden-cert plans to the gain-maximising packages
Three more pre-existing failures (present at 9ee38211, before this branch's
recent commits; same family as the orchestration multi-measure re-pin) —
golden-cert plan expectations that predate the ASHP generator (ADR-0025)
and the optimiser folding forced dependencies into candidate gain (ADR-0016):

- test_console: a multi-measure plan now leads with air_source_heat_pump,
  not cavity_wall_insulation (which is dropped — its forced ventilation makes
  the pair net-negative). Assert a measure actually in the package.
- test_report 0330: package is now {solid_floor_insulation, air_source_heat_
  pump}; cavity_wall + forced mechanical_ventilation correctly excluded.
- test_report 0036: gain-maximising package is now {solid_floor_insulation,
  low_energy_lighting}.

Same verified-correct optimiser evolution as 077e3a39 (cavity_wall +2.9 SAP
alone but its forced fabric→ventilation dep drags the pair net-negative).
Re-pin to the actual packages + their trigger fields; the forced wall→vent
edge stays covered by test_measure_dependency / test_optimiser.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 13:57:27 +00:00
Khalim Conn-Kowlessar
06a66b3dd9 feat(epc-prediction): coherent heating donor selection (#1225)
Heating sub-fields can't be field-moded without breaking system coherence,
so the whole SapHeating cluster is now copied as a unit from a single
coherent donor rather than inherited from the structural template: the
neighbour matching the cohort's modal heating signature (main fuel +
category + cylinder presence), most recent among the matches (recent cert =
current system). Including cylinder presence in the signature is load-bearing
— it protects has_hot_water_cylinder + cylinder_insulation (a bare fuel+cat
signature regressed them).

Corpus (150pc/514): heating_main_control 66.3 -> 73.9% (+7.6, the target),
main_fuel 92.8 -> 96.9, category 90.7 -> 95.7, water_fuel 92.8 -> 96.3,
water_code 88.5 -> 95.3, has_cylinder 81.1 -> 89.7, secondary 36.2 -> 42.0.
SAP MAE vs lodged 7.08 -> 6.00 (calculator floor 1.57). cylinder_insulation
-13.6 corpus (tiny-n) but +33pp on the fixture; AC requires control up +
fuel/category hold + SAP not worsened, all met.

Gate (36-target fixture): zero regression; ratcheted main_category
0.8889->0.9444, main_control 0.7500->0.8056, water_fuel 0.9167->0.9722,
water_code 0.8889->0.9444, cylinder_insulation_type 0.1667->0.5000. This is
the per-component heating method ([[feedback_per_component_best_method]]):
coherent donor, never field-mode.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 13:48:15 +00:00
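Coherent donor selection as described in 06a66b3dd9 can be sketched in two steps: find the cohort's modal heating signature, then take the most recent neighbour that matches it. The signature fields come from the commit message; the `Neighbour` shape, the string stand-in for the heating cluster and the tie-breaking behaviour are assumptions.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Sequence


@dataclass(frozen=True)
class Neighbour:
    main_fuel: str
    main_category: int
    has_cylinder: bool
    registered: date
    heating: str  # stand-in for the whole heating cluster

    @property
    def signature(self) -> tuple[str, int, bool]:
        # Cylinder presence is part of the signature, per the commit.
        return (self.main_fuel, self.main_category, self.has_cylinder)


def pick_heating_donor(cohort: Sequence[Neighbour]) -> Neighbour:
    """Most recent neighbour carrying the cohort's modal heating signature."""
    if not cohort:
        raise ValueError("cohort must not be empty")
    counts = Counter(n.signature for n in cohort)
    # most_common keeps first-seen order among equal counts (an arbitrary
    # but deterministic tie-break for this sketch).
    modal_signature = counts.most_common(1)[0][0]
    matches = [n for n in cohort if n.signature == modal_signature]
    return max(matches, key=lambda n: n.registered)
```

Copying the donor's cluster whole keeps fuel, category, controls and cylinder fields mutually consistent, which per-field moding cannot guarantee.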
Khalim Conn-Kowlessar
077e3a3947 test(orchestration): re-pin multi-measure plan to the gain-maximising package
The optimiser-package expectation was stale: it predated the optimiser
folding a triggered measure's forced dependency into its candidate gain
(ADR-0016). The run considers ALL measures (considered_measures defaults
to None — no restriction), so once the ASHP bundle became SAP-beneficial
(ADR-0025) the gain-maximising package shifted.

Verified the new package is CORRECT, not a regression: on the test EPC,
cavity-wall insulation earns +2.9 SAP alone but its forced fabric→
ventilation dependency (ADR-0016) drags the wall+ventilation pair to a
NET −1.8 SAP (−0.9 on top of the ASHP package), so the gain-maximising
Optimiser correctly excludes the wall and its forced ventilation. Update
the expected set to {air_source_heat_pump, suspended_floor_insulation,
low_energy_lighting, secondary_heating_removal} and drop the wall/vent-
specific assertions — the forced wall→ventilation edge is covered by
test_measure_dependency / test_optimiser; this integration test keeps its
end-to-end optimise→persist→telescope coverage on the chosen package.

Pre-existing failure (present before this branch's recent commits), outside
the handover regression gate.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 13:46:22 +00:00
Khalim Conn-Kowlessar
d762b25808 feat(epc-prediction): recency-weighted glazing mode (#1223)
Per-component method: glazing type is now the recency-weighted cohort mode
applied to every predicted window, rather than copied from the template.
Glazing is retrofitted over a dwelling's life (single -> double), so a
recent neighbour reflects the current state — same family as roof-insulation
thickness. Recency is the CORRECT weighting here: plain moding regressed the
fixture (-5.6pp) and was previously reverted; similarity weighting also
regressed it; recency improves BOTH (window geometry stays on the template,
only the glazing categorical moves).

modal_glazing_type: corpus (150pc/514) 60.7 -> 66.7% (+6.0pp); fixture
0.5000 -> 0.5278 (floor ratcheted up). Heating, geometry residuals and all
other components unchanged. Refactored _recency_weighted_mode to a reusable
_recency_weighted_choice(value_of) shared by roof insulation + glazing.

Closes the #1223 per-component approach: floor-area (median estimate) +
glazing (recency) shipped as distinct best-fit methods rather than a global
recency template, which would have disturbed the coherence-coupled heating
cluster.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 13:35:03 +00:00
Khalim Conn-Kowlessar
4fdc23f83d test(worksheet): pin simulated case 38 — mains-gas secondary reproduces worksheet exactly
The realistic re-generation of case 37 (code-117 gas boiler, control 2102,
+ a MAINS-GAS condensing gas-fire secondary code 611, vs case 37's biogas
605). The full extractor -> mapper -> calculator pipeline reproduces the
worksheet's SAP-rating block EXACTLY: continuous SAP 60.9152 (Δ 2e-5) and
(272) CO2 5801.0770 (Δ ~0). This confirms the boiler-efficiency /
control-2102 −5pp interlock / secondary-fuel handling are all correct, and
that case 37's +7 gap was purely the biogas sub-fuel the Summary export
cannot carry.

Summary mirrored into backend/documents_parser/tests/fixtures so the pin
runs without the unstaged workspace. PE not pinned — it is a separate
DPER block (different scope) already guarded by the corpus PE gauge.
Worksheet harness 47/47 unchanged; pyright net-zero.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 13:31:36 +00:00
Khalim Conn-Kowlessar
51cdc25ce8 feat(epc-prediction): cohort-median floor-area estimate (#1223)
Per-component method, not a global template change: the predicted floor
area is now the cohort median (the MAD-minimising point estimate of the
target's size) rather than the area of whichever structural template was
selected. The
calculator derives heat loss from building-part geometry, not this scalar,
so decoupling them is safe and the scalar becomes a better size estimate.

floor_area mean|.|: corpus (150pc/514 targets) 10.62 -> 10.48; fixture
12.2175 -> 11.8983 (ceiling ratcheted down). No other component moves.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 13:30:33 +00:00
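Note on 51cdc25ce8: the claim that the cohort median is the MAD-minimising point estimate can be checked numerically. The sketch below is illustrative only; the cohort areas are made up and `mean_abs_error` is a hypothetical helper, not a function from the repository.

```python
from statistics import mean, median


def mean_abs_error(estimate, areas):
    """Mean absolute deviation of the cohort areas from a single point estimate."""
    return mean(abs(area - estimate) for area in areas)


# Made-up cohort floor areas in m^2, with one large outlier.
areas = [62.0, 68.0, 71.0, 75.0, 140.0]

median_error = mean_abs_error(median(areas), areas)
mean_error = mean_abs_error(mean(areas), areas)

# The median is less affected by the outlier than the arithmetic mean.
print(f"median estimate {median(areas):.1f} -> mean|.| {median_error:.2f}")
print(f"mean estimate   {mean(areas):.1f} -> mean|.| {mean_error:.2f}")
```

On this toy cohort the median estimate gives the smaller mean absolute error, which is the property the commit relies on.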
Daniel Roth
b9cbea367d correct import in test file 2026-06-15 12:21:32 +00:00
Jun-te Kim
0079752eab investigation with hyde values 2026-06-15 12:13:11 +00:00
Daniel Roth
5c314e2914 move tests out of scripts/ 2026-06-15 11:11:08 +00:00
Khalim Conn-Kowlessar
c11eb46b8a fix(modelling): HHR overlay sets off-peak immersion type so HW Table 13 applies
The HHR-storage HeatingOverlay (ADR-0024) added an off-peak electric
immersion cylinder but never set `immersion_heating_type`, so the overlaid
cert left it None. The calculator then could not resolve `immersion_single`
for the SAP 10.2 Table 13 HW high-rate split and billed hot water 100% at
the off-peak low rate — £127.41 vs the relodged after-cert's £169.39,
overstating the overlay's SAP by +1.26 (CO2/PE matched, isolating it to the
HW cost path).

Add `immersion_heating_type` to HeatingOverlay, route it through
`_fold_heating` (it lives on `sap_heating`), and set it to 1 (single
off-peak immersion) on the HHR overlay to match the relodged reference.
Closes both `test_hhr_storage_overlay_reproduces_the_relodged_after_*`
cascade pins (electric-storage and no-system befores share the after).

Pre-existing failure (present before this branch's recent commits), outside
the handover regression gate. Full modelling suite 220 pass, pyright net-
zero.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 06:53:14 +00:00
Khalim Conn-Kowlessar
718455e971 feat(epc-prediction): physical-similarity-weighted categorical mode (#1224)
ADR-0029 decision 5: survivors were treated equally; now each neighbour's
vote in the cohort mode decays with its distance from the cohort's physical
centre (floor area from the median, age band from the modal band), so the
mode leans on the most representative neighbours instead of being swayed by
size/era outliers. Scales (size 20 m^2, age weight 0.5) chosen on the
validation corpus; the tight size kernel is load-bearing (looser scales
regress floor_insulation on the fixture).

Corpus (181 SAP-10.2 targets): wall_insulation 83.4->86.2%,
roof_construction 86.2->87.3%, floor_construction 78.8->81.2%,
floor_insulation 92.9->94.1%; net +7.5pp gained vs -1.1pp (two 1-cert dips,
both held on the fixture). Geometry/residuals untouched (template unchanged).

Gate (36-target fixture): zero regression across all 24 floors/ceilings;
ratcheted wall_insulation_type 0.7778->0.8333, floor_construction
0.7500->0.8125, floor_insulation 0.9062->0.9375. Dead _mode/_int_mode
removed (superseded by the weighted variants).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 10:46:51 +00:00
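Note on 718455e971: a minimal sketch of a physical-similarity-weighted mode, under stated assumptions. The commit gives the scales (size 20 m^2, age weight 0.5) but not the kernel, so the exponential decay, the additive distance, the dict-shaped neighbours and the name `similarity_weighted_mode` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import exp
from statistics import median

SIZE_SCALE_M2 = 20.0  # size scale quoted in the commit message
AGE_WEIGHT = 0.5      # age weight quoted in the commit message


def similarity_weighted_mode(neighbours, value_of):
    """Weighted mode where each vote decays with distance from the cohort centre.

    neighbours: list of dicts with 'floor_area' (m^2) and 'age_band' (int index).
    value_of:   callable returning the categorical value for a neighbour, or None.
    """
    centre_area = median(n["floor_area"] for n in neighbours)
    centre_band = Counter(n["age_band"] for n in neighbours).most_common(1)[0][0]
    weights = defaultdict(float)
    for n in neighbours:
        value = value_of(n)
        if value is None:
            continue  # neighbours that lodge no value do not vote
        # Assumed distance: scaled size gap plus weighted age-band gap.
        distance = (abs(n["floor_area"] - centre_area) / SIZE_SCALE_M2
                    + AGE_WEIGHT * abs(n["age_band"] - centre_band))
        weights[value] += exp(-distance)  # assumed exponential kernel
    return max(weights, key=weights.get) if weights else None


cohort = [
    {"floor_area": 70.0, "age_band": 5, "wall": "cavity"},
    {"floor_area": 71.0, "age_band": 5, "wall": "cavity"},
    {"floor_area": 72.0, "age_band": 5, "wall": None},
    {"floor_area": 73.0, "age_band": 5, "wall": None},
    {"floor_area": 150.0, "age_band": 2, "wall": "solid"},
    {"floor_area": 160.0, "age_band": 2, "wall": "solid"},
    {"floor_area": 170.0, "age_band": 2, "wall": "solid"},
]
result = similarity_weighted_mode(cohort, lambda n: n["wall"])
```

In this made-up cohort a plain count would pick "solid" (three votes to two), while the weighted vote favours "cavity" because the "solid" votes come from size and era outliers.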
Khalim Conn-Kowlessar
07051b9401 feat(epc-prediction): per-prediction confidence signal (#1226)
Adds PredictionConfidence (cohort size + per-component agreement = the
modal value's share among neighbours that lodge one) and
EpcPrediction.confidence(), a compute-only signal so downstream can flag
low-confidence components (ADR-0029 open item: 'confidence signal').

Sanity check on the 40-postcode corpus (1068 component predictions):
agreement is strongly predictive of correctness — pooled hit-rate 21.9%
(<0.5) / 46.7% (0.5-0.7) / 73.6% (0.7-0.9) / 95.5% (>=0.9); point-biserial
corr(agreement, correct) = 0.582. Cohort size tracks too (<6 -> 68.4%,
>=20 -> 96.0%). Surfacing / persistence is a separate HITL follow-up.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 10:35:59 +00:00
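Note on 07051b9401: the per-component agreement is described as the modal value's share among neighbours that lodge one. A minimal sketch of that definition follows; the function name `agreement` and the list-of-values input are assumptions, not the `PredictionConfidence` API.

```python
from collections import Counter


def agreement(values):
    """Share of the modal value among neighbours that lodge a value at all."""
    lodged = [v for v in values if v is not None]
    if not lodged:
        return None  # no neighbour lodges this component, so there is no signal
    _, modal_count = Counter(lodged).most_common(1)[0]
    return modal_count / len(lodged)


# Three neighbours lodge a wall type, one does not; the mode holds 2 of 3 votes.
share = agreement(["cavity", "cavity", "solid", None])
```

Here `share` is 2/3, which would fall in the 0.5-0.7 bucket reported in the commit message.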
Khalim Conn-Kowlessar
ffaedd8d14 feat(epc-prediction): ±1-band age scoring + window_count cosmetic (#1222)
Measurement honesty so we optimise SAP-relevant accuracy, not SAP-neutral
misses (ADR-0030 Component Accuracy):
- Add construction_age_band_pm1: an exact-or-adjacent-band hit. Adjacent
  RdSAP age bands carry near-identical U-values, so an off-by-one is
  ~SAP-neutral. Full corpus: exact 78.5% but ±1-band 91.7% (fixture
  63.9% -> 83.3%) — most age misses are adjacent.
- Drop window_count from the gate's residual ceilings (cosmetic): the
  predicted picture clusters at a mapper-default 4 windows vs actuals 1-21,
  but total_window_area (the SAP-relevant signal) stays tight at ~3.4 m2.

Gate: + construction_age_band_pm1 floor 0.8333; window_count no longer gated.

Closes #1222

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 10:01:20 +00:00
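Note on ffaedd8d14: the exact-or-adjacent-band hit can be sketched as below. The ordered band string `BANDS` and the function name are assumptions for illustration; the scorer's real representation of age bands is not shown in the commit.

```python
BANDS = "ABCDEFGHIJKLM"  # assumed ordered age-band labels, oldest to newest


def age_band_hit_pm1(predicted, actual):
    """True when the predicted band equals the actual band or sits next to it."""
    return abs(BANDS.index(predicted) - BANDS.index(actual)) <= 1


def hit_rate(pairs):
    """Fraction of (predicted, actual) pairs counted as a +/-1-band hit."""
    return sum(age_band_hit_pm1(p, a) for p, a in pairs) / len(pairs)


# Made-up predictions: one exact hit, one adjacent miss, one two-band miss.
rate = hit_rate([("C", "C"), ("C", "D"), ("C", "E")])
```

The adjacent miss counts as a hit under this metric and the two-band miss does not, so the toy rate is 2/3.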
Khalim Conn-Kowlessar
ac77624d67 test(pv-battery): pin SAP cost-neutrality on export-capable standard tariff
End-to-end API-path regression pin for the battery behaviour validated by
the user-simulated Elmhurst worksheet pair (cert 001431 "simulated case
35/36", 5 kWh, export-capable, mains-gas, standard tariff). The official
SAP rating ("10a. Fuel costs - using Table 12 prices") values PV used-in-
dwelling and PV exported identically at 13.19 p/kWh (export code 60 ==
import code 30, ADR-0010), so a battery only redistributes PV between two
equally-priced lines: worksheet PV credit (252) = -455.6458 and SAP (258)
= 88.0859 are IDENTICAL with/without the battery (ΔSAP = 0).

Two tests over the committed RdSAP-21.0.1 corpus:
- standard tariff (meter 2): toggling the battery holds continuous SAP
  EXACTLY constant, while at least one cert's primary energy DOES respond
  (proving the App-M1 §3c β-split is wired, not a dropped battery).
- off-peak tariff (meter != 2): the battery STRICTLY raises SAP, because
  self-consumed PV displaces high-rate import (15.29) above the 13.19
  export credit — confirming the standard-tariff neutrality is a price
  coincidence, not a no-op.

Guards table_32 export price (code 60) and the battery β-split against
silent regression. Complements the unit-level β tests in
test_photovoltaic.py.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 09:51:34 +00:00
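Note on ac77624d67: the neutrality argument is plain arithmetic, sketched below. The prices (13.19 and 15.29 p/kWh) come from the commit message; the kWh figures and the helper `pv_credit_pence` are made up for illustration and are not the calculator's code.

```python
from decimal import Decimal


def pv_credit_pence(used_kwh, exported_kwh, used_price, export_price):
    """PV credit as used-in-dwelling kWh plus exported kWh, each at its own price."""
    return used_kwh * used_price + exported_kwh * export_price


EXPORT = Decimal("13.19")         # export price, code 60
STANDARD_IMPORT = Decimal("13.19")  # standard-tariff import price, code 30
HIGH_RATE = Decimal("15.29")      # off-peak tariff high-rate import price

# A battery shifts 500 kWh from "exported" to "used in dwelling" (made-up split).
without_battery = (Decimal(1000), Decimal(2000))
with_battery = (Decimal(1500), Decimal(1500))

standard_delta = (pv_credit_pence(*with_battery, STANDARD_IMPORT, EXPORT)
                  - pv_credit_pence(*without_battery, STANDARD_IMPORT, EXPORT))
off_peak_delta = (pv_credit_pence(*with_battery, HIGH_RATE, EXPORT)
                  - pv_credit_pence(*without_battery, HIGH_RATE, EXPORT))
```

With equal prices the shift changes nothing (`standard_delta` is zero); with the high-rate price the credit grows by 500 kWh times the 2.10 p/kWh gap, which matches the direction both tests pin.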
Khalim Conn-Kowlessar
a5b7310911 feat(epc-prediction): recency-weighted mode for roof insulation (ADR-0029/0030)
Investigated recency-weighting (weight cohort votes by an exponential decay
in cert age). Key finding: it must be SELECTIVE. On the validation corpus it
HURTS permanent categoricals (wall 91.2->89.5, age 78.5->75.7 — discards
still-valid data) but clearly HELPS time-varying ones, where a recent
neighbour reflects the current physical state:
  roof_insulation_thickness  56.7 -> 60.7%  corpus   (+4pp)
                             29.4 -> 41.2%  fixture  (+12pp)

So apply a recency-weighted mode only to roof_insulation_thickness (loft
top-ups happen over time); keep the plain mode for permanent categoricals.
tau = 4yr (~2.8yr half-life); falls back to plain mode when no registration
dates are lodged. Gate floor ratcheted 0.2941 -> 0.4118.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 09:45:22 +00:00
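Note on a5b7310911: a minimal sketch of a recency-weighted mode with tau = 4 years and the plain-mode fallback. The function name, the (value, registration date) tuple input, and the handling of partly dated cohorts (undated votes are ignored when at least one vote is dated) are assumptions, not the repository's `_recency_weighted_mode`.

```python
from collections import defaultdict
from datetime import date
from math import exp

TAU_YEARS = 4.0  # decay constant from the commit; half-life = tau * ln 2, about 2.8 years


def recency_weighted_mode(votes, today):
    """votes: iterable of (value, registration_date or None) pairs."""
    votes = [(value, registered) for value, registered in votes if value is not None]
    dated = [(value, registered) for value, registered in votes if registered is not None]
    weights = defaultdict(float)
    if dated:
        for value, registered in dated:
            age_years = (today - registered).days / 365.25
            weights[value] += exp(-age_years / TAU_YEARS)  # newer certs vote louder
    else:
        # Fallback: no registration dates lodged, so every vote counts equally.
        for value, _ in votes:
            weights[value] += 1.0
    return max(weights, key=weights.get) if weights else None


today = date(2026, 1, 1)
dated_votes = [(100, date(2010, 1, 1)), (100, date(2011, 1, 1)), (270, date(2025, 1, 1))]
undated_votes = [(100, None), (100, None), (270, None)]
```

With dates, the single recent 270 mm vote outweighs two 100 mm votes that are fifteen or more years old; without dates the plain mode returns 100.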
Khalim Conn-Kowlessar
9dd23477ac feat(epc-prediction): cohort-mode roof + floor insulation (ADR-0030)
These independent fabric categoricals were template-copied; mode them like
the construction categoricals. Verified mode beats template before applying.
Big fixture win on roof insulation thickness (doubled), floor insulation
neutral-to-positive:
  roof_insulation_thickness  14.7% -> 29.4%  (gate floor ratcheted up)
  floor_insulation           90.6% (unchanged on the fixture)

Glazing type was tried too (+1.6pp on the 40-postcode corpus) but REGRESSED
the 36-target fixture (0.50 -> 0.44) — the gate caught it. Glazing moding is
marginal/noisy, so it's left on the template; revisit with a larger corpus.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 09:37:45 +00:00
Khalim Conn-Kowlessar
e3a2720e5c feat(epc-prediction): Tier-1 ratcheting Component Accuracy gate (ADR-0030)
The committed CI gate: run the calculator-free leave-one-out scorer over the
frozen anonymised fixture (36 SAP-10.2 targets) and assert each per-component
classification rate / geometry residual is no worse than a committed baseline.
Prediction is deterministic + the fixture frozen, so the numbers reproduce
exactly — a failure is a real regression, never sample noise.

- 19 rate floors + 5 residual ceilings, seeded at the currently-measured
  values; they only ever tighten (no-widening ethos on an aggregate).
- Calculator-FREE — component floors are the real gate; the end-to-end
  SAP/carbon/PE guards stay out (their floor is the separate API-path
  calculator workstream).
- Skips with a message when the fixture is absent.

25 parametrized assertions, all green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 09:19:39 +00:00
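Note on e3a2720e5c: the ratcheting idea (rates must not fall below a floor, residuals must not rise above a ceiling, and baselines only ever tighten) can be sketched as follows. The helper names and dict layout are hypothetical; the two baseline values are borrowed from later commits in this log purely as example numbers.

```python
def gate_failures(measured, floors, ceilings):
    """Return a message for every metric that is worse than its committed baseline."""
    failures = []
    for name, floor in floors.items():
        if measured[name] < floor:
            failures.append(f"{name}: {measured[name]:.4f} is below floor {floor:.4f}")
    for name, ceiling in ceilings.items():
        if measured[name] > ceiling:
            failures.append(f"{name}: {measured[name]:.4f} is above ceiling {ceiling:.4f}")
    return failures


def ratchet(floors, ceilings, measured):
    """Tighten baselines towards the measured values; never widen them."""
    new_floors = {name: max(floor, measured[name]) for name, floor in floors.items()}
    new_ceilings = {name: min(ceiling, measured[name]) for name, ceiling in ceilings.items()}
    return new_floors, new_ceilings


floors = {"roof_insulation_thickness": 0.2941}
ceilings = {"floor_area": 12.2175}
improved = {"roof_insulation_thickness": 0.4118, "floor_area": 11.8983}

passed = gate_failures(improved, floors, ceilings)
floors, ceilings = ratchet(floors, ceilings, improved)

# A later run that slips back to the old roof rate now fails the tightened floor.
regressed = {"roof_insulation_thickness": 0.2941, "floor_area": 11.8983}
failed = gate_failures(regressed, floors, ceilings)
```

In a pytest suite each floor and ceiling would typically be its own parametrized assertion, so a regression names the exact component that moved.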
Khalim Conn-Kowlessar
008c1922c4 feat(epc-prediction): anonymised Tier-1 fixture + builder (ADR-0030)
The committed gate needs frozen, reproducible data without dumping real UK
addresses into the repo. Add:
- harness anonymise_payload + stable_hash: hash street address + cert number
  into opaque, dedup-stable tokens; blank secondary address lines + post_town;
  keep postcode + all component/lodged fields (gov data is OGL). Unit-tested.
- scripts/build_epc_prediction_fixture.py: curate qualifying postcodes (>=1
  SAP 10.2 target + >=2 distinct addresses) from the local scratch corpus,
  anonymise, freeze under tests/fixtures/epc_prediction/.
- The frozen fixture: 15 postcodes / 280 certs / 36 SAP-10.2 targets.
  Verified no plaintext address_line_1 and post_town all blank.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 09:17:27 +00:00
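Note on 008c1922c4: one way such an anonymiser could work, as a hedged sketch. The function names match the commit message, but the bodies, the salt, the normalisation, the token length and every field name other than `address_line_1` and `post_town` are assumptions rather than the harness's actual code.

```python
import hashlib


def stable_hash(value, salt="fixture"):
    """Map a string to an opaque token that is stable across runs and dedup-safe."""
    normalised = " ".join(value.upper().split())  # same address -> same token
    digest = hashlib.sha256(f"{salt}:{normalised}".encode("utf-8")).hexdigest()
    return digest[:16]


def anonymise_payload(cert):
    """Hash identifying fields, blank secondary ones, keep everything else."""
    out = dict(cert)
    out["address_line_1"] = stable_hash(cert["address_line_1"])
    out["certificate_number"] = stable_hash(cert["certificate_number"])
    for key in ("address_line_2", "address_line_3", "post_town"):
        if key in out:
            out[key] = ""
    return out


# Made-up certificate payload; field names besides address_line_1/post_town are assumed.
cert = {
    "address_line_1": "1 Example Street",
    "address_line_2": "Flat 2",
    "post_town": "Exampleton",
    "postcode": "AB1 2CD",
    "certificate_number": "0000-0000-0000-0000-0000",
    "wall_insulation_type": "cavity",
}
anonymised = anonymise_payload(cert)
```

Because the token depends only on the normalised input, two certificates lodged at the same address still collapse to one address after anonymisation, which is what the distinct-address curation rule needs.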