mirror of
https://github.com/Hestia-Homes/Model.git
synced 2026-06-30 13:10:47 +00:00
docs(epc-prediction): handover for the accuracy backlog + geo work
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
parent d8f015fb0e
commit da3fc92d53
1 changed files with 144 additions and 0 deletions
docs/HANDOVER_EPC_PREDICTION.md
@@ -0,0 +1,144 @@
# EPC Prediction: handover

Branch `feature/epc-prediction` @ `d8f015fb` (37 commits ahead of `origin/main`; local-only,
not pushed). Tree clean. All ranked backlog items (#1222–1228) closed.

## What this is
Deterministic **neighbour synthesis** that predicts a structured `EpcPropertyData`
for an EPC-less UK home from its postcode cohort of neighbours, so it flows through
the modelling pipeline. NOT ML. The validation methodology and harness are built; the
remaining work is a measurable accuracy backlog.

## READ FIRST (hold the full state)
- Memory `project_epc_prediction`: the spine (design, every commit, metrics, the
  open fronts, gotchas). Read it first.
- `docs/adr/0029-…` (design, 6 forks) and `docs/adr/0030-…component-first.md`
  (validation methodology; internalise: predict components, SAP/carbon/PE are a
  calculator-floored *secondary* guard).
- Memory `feedback_per_component_best_method`: THE load-bearing principle this
  session established (see below).
- Convention memories: `feedback_aaa_test_convention`,
  `feedback_abs_diff_over_pytest_approx`, `feedback_commit_per_slice`,
  `feedback_bigger_slices_for_uniform_work`.

## The methodology (ADR-0030)
- **Component Accuracy is the PRIMARY signal**: predicted vs API-actual components,
  calculator-free. SAP/CO₂/PE vs lodged is SECONDARY and calculator-floored.
- Source cohort keeps ALL cert vintages; only held-out validation TARGETS are
  SAP 10.2 (`sap_version == 10.2`).
- The committed **Tier-1 gate** (`tests/domain/epc_prediction/test_component_accuracy_gate.py`)
  runs the calculator-free scorer over the frozen anonymised fixture
  (`tests/fixtures/epc_prediction/`, 36 SAP-10.2 targets) and asserts per-component
  ratchet floors. Deterministic → exact. **Tighten-only**: when you improve a
  component, bump its floor in the same commit. A *mapper or fixture change*
  re-baselines floors (not a regression); document it.

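The tighten-only ratchet described above can be pictured roughly as follows. This is a minimal sketch, not the committed gate: `FLOORS`, `check_floors`, and `ratchet` are hypothetical names, and the numbers are illustrative rather than the real floors.

```python
# Hypothetical floors (percent); the real ones live in the committed gate test.
FLOORS: dict[str, float] = {"wall": 90.0, "glazing": 65.0}


def check_floors(
    scores: dict[str, float], floors: dict[str, float], tol: float = 1e-9
) -> list[str]:
    """Return the components whose accuracy fell below their floor."""
    # Explicit difference check, in the spirit of the abs-diff convention.
    return [name for name, floor in floors.items() if floor - scores[name] > tol]


def ratchet(floors: dict[str, float], scores: dict[str, float]) -> dict[str, float]:
    """Tighten-only: a floor may rise to the new score but never drops."""
    return {
        name: max(floor, scores.get(name, floor)) for name, floor in floors.items()
    }
```

A failing component shows up by name, and re-running `ratchet` after an improvement yields the floors to commit alongside it. A deliberate re-baseline after a mapper or fixture change would bypass `ratchet` and be documented instead.
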
## THE PRINCIPLE that drove this session
**Give each component its own best-fit synthesis method; never force one global
mechanism on all of them.** Validated head-to-head on the harness:
- Permanent fabric categoricals (wall, age) → **physical-similarity-weighted mode**
  (size×age toward cohort centre).
- Time-varying components (roof insulation, glazing) → **recency-weighted mode**.
- Coherence-coupled cluster (heating) → **coherent whole-cluster donor**, NEVER
  field-moded.
- Point-estimate scalar (floor area) → **cohort median** (MAD-minimising).
- Geo-varying components (age, wall, floor, glazing) → additionally **geo-proximity
  weighted**; roof showed no geo signal → excluded.

All live in `domain/epc_prediction/epc_prediction.py` as composable weight vectors
(`_similarity_weights` × `_recency_weights` × `_geo_weights`, combined via `_combine`,
fed to `_weighted_mode`).

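To make the composition concrete, here is a minimal sketch of multiplying per-neighbour weight vectors and taking a weighted mode. The functions `combine` and `weighted_mode` are illustrative stand-ins, not the actual `_combine` / `_weighted_mode` in the repo, whose signatures are not shown in this handover; the tie-break rule (first value seen wins) is an assumption.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def combine(*vectors: Sequence[float]) -> list[float]:
    """Elementwise product of equal-length per-neighbour weight vectors."""
    if not vectors:
        raise ValueError("need at least one weight vector")
    length = len(vectors[0])
    if any(len(vector) != length for vector in vectors):
        raise ValueError("weight vectors must have equal length")
    combined = [1.0] * length
    for vector in vectors:
        combined = [c * w for c, w in zip(combined, vector)]
    return combined


def weighted_mode(values: Sequence[Hashable], weights: Sequence[float]) -> Hashable:
    """Value with the largest total weight; ties go to the first value seen."""
    if not values:
        raise ValueError("empty cohort")
    totals: dict[Hashable, float] = defaultdict(float)
    for value, weight in zip(values, weights):
        totals[value] += weight
    # dicts keep insertion order and max() returns the first maximum,
    # so the result is deterministic for a fixed cohort order.
    return max(totals, key=totals.__getitem__)


# Four neighbours: similarity favours "cavity", recency favours "solid".
walls = ["cavity", "solid", "cavity", "solid"]
similarity = [1.0, 0.5, 1.0, 0.5]
recency = [0.2, 1.0, 0.2, 1.0]
```

With similarity alone the mode is `"cavity"`; multiplying in the recency vector flips it to `"solid"`. That is why each component picks which vectors to combine rather than sharing one global recipe.
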
## Closed this session (#1222 was done before; #1223–1228 this session)
- **#1226** per-prediction confidence (`PredictionConfidence`, compute-only;
  agreement strongly predicts correctness, r=0.582).
- **#1224** physical-similarity-weighted categorical mode (wall_insul/roof/floor +1–3pp).
- **#1223** per-component, NOT a global recency template: floor-area→cohort median +
  glazing→recency mode. (A global recency template was rejected because it disturbed the
  coherence-coupled heating cluster.)
- **#1225** coherent heating donor (modal signature = fuel+category+cylinder, recency
  tie-break). Biggest SAP lever: control 66→74%, SAP MAE 7.08→6.00 pre-merge.
- **#1228** PEI investigation: DISPROVED the unit-bug hypothesis (calc/lodged ratio
  1.06); reframed as calc floor + prediction-sensitivity. Report now surfaces CO₂/PEI
  calc floors. (Open calc-branch remnant; largely closed by the main merge, see below.)
- **#1227** geo-proximity weighting: grilled, signal-checked (STRONG GO, esp. age),
  built per-component. Batch `GeospatialRepository.coordinates_for_uprns`, coords
  threaded onto `Comparable`/`PredictionTarget`, haversine kernel (`_GEO_SCALE_KM=0.1`,
  gate-safe optimum). Intra-postcode lift modest (cohort = 1 postcode); the bigger
  prize is cross-postcode expansion (deferred, needs dense corpus).
- **Corpus grown 40→150 postcodes** (`6e9f8312`); roof-insulation ±1 reporting.
- **Merged `origin/main`** (96 commits of calculator/mapper gap fixes, `0b2827e9`).

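The coherent heating donor from #1225 can be sketched as below. The `HeatingCluster` fields and `pick_heating_donor` are hypothetical, and the reading of "recency tie-break" (the most recently lodged cert among those carrying a modal signature) is an assumption, not confirmed by the source.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class HeatingCluster:
    """Hypothetical heating fields that must stay mutually consistent."""

    fuel: str
    category: str
    has_cylinder: bool
    control: str  # travels with the donor; never moded field by field
    lodged: date

    @property
    def signature(self) -> tuple[str, str, bool]:
        return (self.fuel, self.category, self.has_cylinder)


def pick_heating_donor(cohort: list[HeatingCluster]) -> HeatingCluster:
    """Take the whole cluster from one neighbour carrying the modal signature."""
    if not cohort:
        raise ValueError("empty cohort")
    counts = Counter(cluster.signature for cluster in cohort)
    top = max(counts.values())
    modal = {signature for signature, n in counts.items() if n == top}
    candidates = [cluster for cluster in cohort if cluster.signature in modal]
    # Assumed tie-break: the most recently lodged certificate wins.
    return max(candidates, key=lambda cluster: cluster.lodged)


cohort = [
    HeatingCluster("gas", "boiler", True, "programmer", date(2015, 3, 1)),
    HeatingCluster("gas", "boiler", True, "programmer+trvs", date(2021, 6, 1)),
    HeatingCluster("electric", "storage", False, "manual", date(2023, 1, 1)),
]
donor = pick_heating_donor(cohort)
```

Because the donor is copied whole, the predicted control always belongs to a heating system that actually exists in the cohort. Per-field moding cannot guarantee that.
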
## Current metrics (post-merge, 150-postcode corpus, 514 SAP-10.2 targets)
Component Accuracy (calculator-free, %): wall 91.2, wall_insul 79.0, age 57.2 (±1 84.7),
roof_construction 78.2, floor_construction 79.6, heating_fuel 96.9, heating_category
95.7, heating_control 73.9, water_fuel 96.3, water_code 95.3, has_cylinder 89.7,
cylinder_insul 52.4, secondary 42.0, roof_insul 49.3 (±1 53.7), floor_insul 94.7,
room_in_roof 96.5, glazing 67.3, pv 98.8, solar 99.8.

Floor area: **MAE 10.48 m² / MAPE 13.2% / typical (median actual) 61 m²** (cohort
median, unweighted).

End-to-end vs lodged (SECONDARY, calculator-floored):
SAP pred MAE 6.25 / **calc floor 0.95** (was 1.57 pre-merge, orig 3.25; the calc
fixes nearly validated the calculator, so the gap is now almost all prediction);
CO₂ 0.61 / floor 0.18; PEI 39.6 / floor 13.7.

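For readers new to the report, the figures above could be computed along these lines. This is a sketch under two assumptions: that "±1" means "within one ordinal band" of the actual value, and that MAPE is the mean of per-target relative errors. The function names are illustrative, not the scorer's API.

```python
def exact_and_within_one(predicted: list[int], actual: list[int]) -> tuple[float, float]:
    """Percent of exact matches and percent within one ordinal band."""
    n = len(actual)
    exact = sum(p == a for p, a in zip(predicted, actual))
    near = sum(abs(p - a) <= 1 for p, a in zip(predicted, actual))
    return 100.0 * exact / n, 100.0 * near / n


def mae_and_mape(predicted: list[float], actual: list[float]) -> tuple[float, float]:
    """Mean absolute error and mean absolute percentage error."""
    errors = [abs(p - a) for p, a in zip(predicted, actual)]
    mae = sum(errors) / len(errors)
    mape = 100.0 * sum(e / a for e, a in zip(errors, actual)) / len(errors)
    return mae, mape


# Age bands encoded as ordinal indices (illustrative data only).
exact_pct, near_pct = exact_and_within_one([1, 2, 3, 5], [1, 3, 3, 2])
mae, mape = mae_and_mape([55.0, 70.0], [50.0, 80.0])
```

The ±1 figure is only meaningful for ordinal components such as age band or roof-insulation thickness, which matches where the report quotes it.
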
## Key files
- `domain/epc_prediction/epc_prediction.py`: `EpcPrediction.predict` (median floor
  area + per-component weighted modes + glazing + heating donor + overrides).
- `domain/epc_prediction/comparable_properties.py`: `select_comparables` ladder;
  `Comparable`/`PredictionTarget` (carry `coordinates`).
- `domain/epc_prediction/prediction_comparison.py`: `compare_prediction` (25 signals).
- `domain/epc_prediction/validation.py`: `iter_predictions` + `evaluate_component_accuracy`
  (one scorer, calculator-free).
- `harness/epc_prediction_corpus.py`: `load_corpus` (+ `_coordinates.json` sidecar),
  `load_coordinates`, `anonymise_payload`.
- `repositories/geospatial/`: `GeospatialRepository.coordinates_for_uprns` (batch).
- `scripts/validate_epc_prediction.py` (full report), `build_epc_prediction_fixture.py`,
  `fetch_epc_prediction_corpus.py`, `fetch_corpus_coordinates.py`.

## Open fronts (ranked)
1. **Geo-weighted floor-area median**: measured quick win, MAE 10.48→**9.77**,
   MAPE 13.2→12.2%. Swap `_median_floor_area` for a geo-weighted median (reuse
   `_geo_weights`); gate-check + ratchet the floor_area ceiling. Smallest next slice.
2. **Cross-postcode geo expansion**: the real geo payoff (distance-weighted cohort
   beyond the single postcode). Needs a *densely-sampled* corpus (current 150 are
   scattered, so a target's true geo-neighbours aren't in-corpus). Design grilled;
   build a dense corpus first.
3. **Slice-5 production wiring**: `ComparableProperties` repo + the
   `ModellingOrchestrator` owning the EPC *estimation* + distance calcs (a deliberate
   shift from ADR-0029, which put the fallback in Ingestion). WRITE AN ADR when this
   lands (it reverses where the fallback lives). Add a provenance marker
   (`EpcPropertyData` has no predicted/source field yet).
4. Weak components with headroom only via NEW signals: age 57% / roof_insul 49%
   (method-exhausted: confirmed recency/similarity/plain all tie-or-worse);
   cylinder_insul / secondary are tiny-n.

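Open front 1 could look roughly like the sketch below: a haversine distance, a distance-decay kernel, and a weighted median. The 0.1 km scale matches the `_GEO_SCALE_KM` value quoted in this handover, but the exponential kernel shape and every function name here are assumptions; the real `_geo_weights` may differ.

```python
import math

EARTH_RADIUS_KM = 6371.0
GEO_SCALE_KM = 0.1  # scale quoted in the handover; the kernel shape is assumed

Coord = tuple[float, float]  # (latitude, longitude) in degrees


def haversine_km(a: Coord, b: Coord) -> float:
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def geo_weights(
    target: Coord, neighbours: list[Coord], scale_km: float = GEO_SCALE_KM
) -> list[float]:
    """Assumed exponential decay: nearer neighbours weigh more."""
    return [math.exp(-haversine_km(target, n) / scale_km) for n in neighbours]


def weighted_median(values: list[float], weights: list[float]) -> float:
    """Smallest value at which cumulative weight reaches half the total."""
    if not values:
        raise ValueError("empty cohort")
    half = sum(weights) / 2
    running = 0.0
    for value, weight in sorted(zip(values, weights)):
        running += weight
        if running >= half:
            return value
    raise ValueError("weights must be finite and non-negative")


target = (51.5000, -0.1000)
neighbours = [(51.5001, -0.1000), (51.5002, -0.1000), (51.5100, -0.1000)]
areas = [60.0, 70.0, 120.0]
```

In this toy cohort the plain median is 70 m², while the geo-weighted median drops to 60 m² because the 120 m² home sits about 1.1 km away and gets almost no weight.
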
## How to run
- Token + S3 creds: `set -a; . backend/.env; set +a` (AWS creds mounted at `~/.aws`).
- Tests: `PYTHONPATH=. python -m pytest tests/domain/epc_prediction tests/harness/test_epc_prediction_corpus.py tests/repositories/geospatial -o addopts="" -p no:cacheprovider -q`
- Full report: `PYTHONPATH=. python scripts/validate_epc_prediction.py` (corpus
  `/tmp/epc_prediction_corpus`).
- Gate is just a pytest test (deterministic, calculator-free).
- pyright strict, zero new errors, on every touched file.

## In-flight / gotchas
- **Corpus lives in `/tmp/epc_prediction_corpus`** (gitignored; 150 postcodes / 3719 certs +
  `_coordinates.json`). Backed up to `/workspaces/home/epc_prediction_corpus_backup`
  (persistent host mount; survives container rebuild, `/tmp` does NOT). Coords backup
  at `/workspaces/home/epc_prediction_corpus_coords_backup.json`. If `/tmp` is wiped,
  restore from the backup before running the full report.
- **Coordinates**: OS Open-UPRN parquet lives under `DATA_BUCKET/spatial/` (boto3; s3fs NOT
  installed, so read via `get_object`→BytesIO; `boto3.client` needs
  `# pyright: ignore[reportUnknownMemberType, reportUnknownVariableType]`). The cert
  payload carries `uprn` (the join key). The committed fixture ships `_coordinates.json`
  (OGL OS OpenData) so the gate exercises geo without S3.
- **NEVER commit** the API token, `/tmp` corpus, or the coords cache. The
  `tests/fixtures/epc_prediction` one is anonymised + intentional.
- Conventions: AAA test headers; `abs(x-y) <= tol` not `pytest.approx`; commit per
  slice (stage by name, watch untracked); ADR-cite in commit messages; class is
  `EpcPrediction` (no "Service").
- Per-item workflow: implement TDD red→green on this branch → run the harness →
  record before/after → ratchet gate floors → `gh issue comment` impact → close.
- The merge is **local, not pushed**: push only if asked.
- Update memory `project_epc_prediction` as state changes.

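The `get_object`→BytesIO read mentioned in the coordinates gotcha can be sketched as follows. `fetch_object_bytes` is a hypothetical helper; the bucket and key are left as parameters because the handover does not give the full object key.

```python
import io
from typing import Any


def fetch_object_bytes(client: Any, bucket: str, key: str) -> io.BytesIO:
    """Download one S3 object into memory via get_object (no s3fs needed)."""
    response = client.get_object(Bucket=bucket, Key=key)
    # The "Body" entry is a streaming object; read() pulls it fully into memory.
    return io.BytesIO(response["Body"].read())


# Usage sketch (not executed here: needs boto3, AWS credentials, and a
# parquet reader such as pandas with pyarrow):
#
#   import boto3
#   import pandas as pd
#   client = boto3.client("s3")
#   frame = pd.read_parquet(fetch_object_bytes(client, bucket, key))
```

Passing the client in keeps the helper testable with a stub and keeps the pyright ignore on the `boto3.client` call site rather than spread through the code.
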