From b677448fa048fe1254d3ba8da8cd883539f31da7 Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar
Date: Tue, 16 Jun 2026 04:05:00 +0000
Subject: [PATCH] docs(epc-prediction): slice-5f production-wiring handover for Jun-te
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The gap-fill is wired end-to-end (slices 5a-5e) behind seams; this note is
what's left to switch it on in production: (1) implement the
PredictionTargetAttributesReader stub over property_overrides — with the
override-value → API-code mapping select_comparables needs; (2) run the
epc_property.source Drizzle migration; (3) pass the three optional
collaborators at the IngestionOrchestrator composition root. Plus the open
Validation-Cohort exclusion (no code path exists yet — exclude on
source_path == "predicted" when one is built) and the anomaly dual-use
pointer.

No code change: the validation exclusion has no consumer to attach to today,
and the structural signal (source_path == "predicted") already exists from
slice-5a.

Co-Authored-By: Claude Opus 4.8
---
 docs/HANDOVER_EPC_PREDICTION_WIRING.md | 94 ++++++++++++++++++++++++++
 1 file changed, 94 insertions(+)
 create mode 100644 docs/HANDOVER_EPC_PREDICTION_WIRING.md

diff --git a/docs/HANDOVER_EPC_PREDICTION_WIRING.md b/docs/HANDOVER_EPC_PREDICTION_WIRING.md
new file mode 100644
index 00000000..06acbd0e
--- /dev/null
+++ b/docs/HANDOVER_EPC_PREDICTION_WIRING.md
@@ -0,0 +1,94 @@
+# EPC Prediction — production wiring handover (for Jun-te)
+
+The EPC Prediction **gap-fill** is wired end-to-end behind seams, with one real
+dependency stubbed: reading an EPC-less Property's resolved Landlord Overrides.
+This note is what's needed to finish it once your `property_overrides` read path
+lands. Design is **ADR-0031**; terms in **CONTEXT.md** (EPC Prediction, Effective
+EPC, EPC Anomaly Flag).
+
+## What's already built (slices 5a–5e, all on `feature/epc-prediction`)
+
+- **5a** `Property.predicted_epc` slot + a `"predicted"` `source_path` /
+  `effective_epc` branch — used only when there's no lodged EPC and no Site Notes
+  (a real source always wins).
+- **5b** `ComparablePropertiesRepository.candidates_for(postcode)` +
+  `EpcComparablePropertiesRepository` adapter (postcode search → per-cert fetch →
+  batched UPRN→coords). Composes with `EpcClientService` + `GeospatialS3Repository`.
+- **5c** EPC store `source` discriminator (`lodged` | `predicted`) so the two
+  coexist per property; `get_predicted_for_property` / `_for_properties`;
+  `PropertyPostgresRepository` hydrates `predicted_epc`. **Needs a DB migration —
+  see `docs/MIGRATION_NOTE_predicted_epc_source.md`.**
+- **5d** `build_prediction_target(identity, coords, attributes)` + the eligibility
+  **gate** (unknown `property_type` → not predicted). Override attributes come
+  through the `PredictionTargetAttributesReader` port (the stub).
+- **5e** `IngestionOrchestrator` wiring: when the three prediction collaborators
+  are injected, an EPC-less Property is predicted from its cohort and persisted to
+  the predicted slot. The collaborators are **optional** — unwired, ingestion is
+  unchanged.
+
+## Your part — three things
+
+### 1. Implement `PredictionTargetAttributesReader` (the stub)
+
+`repositories/property/prediction_target_attributes_reader.py` defines the port:
+`attributes_for(property_id) -> PredictionTargetAttributes` (property_type,
+built_form, wall_construction). Build the adapter as a read over the
+`property_overrides` fact layer (the finaliser writes it via
+`PropertyOverrideRepository.upsert_all`; you're adding the read side).
+
+**Code-space gotcha.** `select_comparables` filters
+`comparable.epc.property_type == target.property_type`, and the cohort EPCs carry
+gov **API codes** (e.g. `"0"`/`"2"`). Landlord Overrides resolve to enum *value*
+strings (e.g. `"House"`). Your adapter must map override value → the API-code
+space, or `property_type` will never match and every cohort comes back empty.
+Same for `built_form`. (`domain/epc/property_type.py`, `built_form_type.py` are
+the enums; `datatypes/epc/domain/epc_codes.csv` has the code table.)
+`property_type` unresolved → return `PredictionTargetAttributes(property_type=None)`
+so the gate skips the Property.
+
+### 2. Run the Drizzle migration
+
+`epc_property.source` column — full spec in
+`docs/MIGRATION_NOTE_predicted_epc_source.md` (column + default `'lodged'` +
+relax any `property_id` uniqueness to `(property_id, source)`).
+
+### 3. Wire the collaborators at the composition root
+
+Wherever `IngestionOrchestrator(...)` is constructed for the real run, pass the
+three optional kwargs:
+
+```python
+IngestionOrchestrator(
+    ...,
+    comparables_repo=EpcComparablePropertiesRepository(epc_client, geospatial_repo),
+    prediction_attributes_reader=...,  # your PredictionTargetAttributesReader adapter (item 1)
+    epc_prediction=EpcPrediction(),
+)
+```
+
+That's the on-switch. Until all three are passed, ingestion ignores prediction.
+
+## One open item — Validation Cohort exclusion
+
+A predicted-source Property has **no real lodged record**, so it must not be
+scored as if it did (CONTEXT: Validation Cohort; ADR-0031 dec-3). There is **no
+Validation-Cohort code path today** to exclude it from — when one is built (or in
+any QA that compares `calc(effective_epc)` vs lodged), exclude on the structural
+signal:
+
+```python
+if prop.source_path == "predicted":
+    continue  # predicted EPC — no ground truth to validate against
+```
+
+Note too: `PropertyBaselinePerformance.lodged` is derived from `effective_epc`
+regardless of source (`property_baseline_orchestrator` → `lodged_performance`), so
+for a predicted Property that "lodged" is synthesised, not real. Decide whether
+baseline should null/flag it for predicted properties when this lands.
+
+## Anomaly dual-use (later, not now)
+
+Slice-5 is gap-fill only (`epc is None`). The slot model already supports
+predicting for *every* Property to compare predicted vs lodged (**EPC Anomaly
+Flags**) — see ADR-0031 dec-4. Reuses the same `ComparableProperties` repo + the
+predicted slot.
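
For item 1 (the code-space gotcha), the override value to API-code translation could be sketched roughly as below. This is a minimal illustration under assumptions, not the real adapter: the `PredictionTargetAttributes` dataclass is a stand-in that only mirrors the three field names from the port, the `overrides` mapping and its keys are hypothetical, `to_api_code` and `attributes_from_overrides` are made-up helper names, and the code tables are unverified placeholders. The real pairs should come from `datatypes/epc/domain/epc_codes.csv`, and `wall_construction` is passed through untouched only because the note does not say whether it needs mapping.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class PredictionTargetAttributes:
    """Stand-in for the real class; only the three field names are from the port."""
    property_type: Optional[str] = None
    built_form: Optional[str] = None
    wall_construction: Optional[str] = None


# PLACEHOLDER tables: these pairings are illustrative and unverified.
# Load the real value -> code pairs from datatypes/epc/domain/epc_codes.csv.
PROPERTY_TYPE_CODES: Mapping[str, str] = {"House": "0", "Flat": "2"}
BUILT_FORM_CODES: Mapping[str, str] = {"Detached": "1"}


def to_api_code(value: Optional[str], table: Mapping[str, str]) -> Optional[str]:
    """Translate an enum value string to its API code; None if missing or unknown."""
    if value is None:
        return None
    return table.get(value)


def attributes_from_overrides(overrides: Mapping[str, str]) -> PredictionTargetAttributes:
    """Build target attributes in API-code space from resolved override values.

    `overrides` is a hypothetical {field_name: enum value string} mapping for
    one Property, standing in for whatever the property_overrides read returns.
    """
    return PredictionTargetAttributes(
        # An unresolved or unmappable property_type becomes None, so the
        # eligibility gate skips the Property instead of matching nothing.
        property_type=to_api_code(overrides.get("property_type"), PROPERTY_TYPE_CODES),
        built_form=to_api_code(overrides.get("built_form"), BUILT_FORM_CODES),
        # Assumption: passed through unmapped.
        wall_construction=overrides.get("wall_construction"),
    )
```

Returning `None` for an unknown value, rather than passing the raw string through, keeps a mapping gap visible at the gate instead of silently producing an empty cohort.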