docs(epc-prediction): slice-5f production-wiring handover for Jun-te

The gap-fill is wired end-to-end (slices 5a-5e) behind seams; this note covers what's left to switch it on in production:

(1) implement the PredictionTargetAttributesReader stub over property_overrides, including the override-value → API-code mapping select_comparables needs;
(2) run the epc_property.source Drizzle migration;
(3) pass the three optional collaborators at the IngestionOrchestrator composition root.

It also records the open Validation-Cohort exclusion (no code path exists yet; exclude on source_path == "predicted" when one is built) and the anomaly dual-use pointer.

No code change: the validation exclusion has no consumer to attach to today, and the structural signal (source_path == "predicted") already exists from slice-5a.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 13:10:47 +00:00 · 2026-06-16 04:05:00 +00:00 · 2026-06-16 04:05:00 +00:00 · b677448fa0
commit b677448fa0
parent 5727ac53c1
1 changed file with 94 additions and 0 deletions
--- a/docs/HANDOVER_EPC_PREDICTION_WIRING.md
+++ b/docs/HANDOVER_EPC_PREDICTION_WIRING.md
@@ -0,0 +1,94 @@
+# EPC Prediction — production wiring handover (for Jun-te)
+
+The EPC Prediction **gap-fill** is wired end-to-end behind seams, with one real
+dependency stubbed: reading an EPC-less Property's resolved Landlord Overrides.
+This note is what's needed to finish it once your `property_overrides` read path
+lands. Design is **ADR-0031**; terms in **CONTEXT.md** (EPC Prediction, Effective
+EPC, EPC Anomaly Flag).
+
+## What's already built (slices 5a–5e, all on `feature/epc-prediction`)
+
+- **5a** `Property.predicted_epc` slot + a `"predicted"` `source_path` /
+  `effective_epc` branch — used only when there's no lodged EPC and no Site Notes
+  (a real source always wins).
+- **5b** `ComparablePropertiesRepository.candidates_for(postcode)` +
+  `EpcComparablePropertiesRepository` adapter (postcode search → per-cert fetch →
+  batched UPRN→coords). Composes with `EpcClientService` + `GeospatialS3Repository`.
+- **5c** EPC store `source` discriminator (`lodged` | `predicted`) so the two
+  coexist per property; `get_predicted_for_property` / `_for_properties`;
+  `PropertyPostgresRepository` hydrates `predicted_epc`. **Needs a DB migration —
+  see `docs/MIGRATION_NOTE_predicted_epc_source.md`.**
+- **5d** `build_prediction_target(identity, coords, attributes)` + the eligibility
+  **gate** (unknown `property_type` → not predicted). Override attributes come
+  through the `PredictionTargetAttributesReader` port (the stub).
+- **5e** `IngestionOrchestrator` wiring: when the three prediction collaborators
+  are injected, an EPC-less Property is predicted from its cohort and persisted to
+  the predicted slot. The collaborators are **optional** — unwired, ingestion is
+  unchanged.
+
+## Your part — three things
+
+### 1. Implement `PredictionTargetAttributesReader` (the stub)
+
+`repositories/property/prediction_target_attributes_reader.py` defines the port:
+`attributes_for(property_id) -> PredictionTargetAttributes` (property_type,
+built_form, wall_construction). Build the adapter as a read over the
+`property_overrides` fact layer (the finaliser writes it via
+`PropertyOverrideRepository.upsert_all`; you're adding the read side).
+
+**Code-space gotcha.** `select_comparables` filters
+`comparable.epc.property_type == target.property_type`, and the cohort EPCs carry
+gov **API codes** (e.g. `"0"`/`"2"`). Landlord Overrides resolve to enum *value*
+strings (e.g. `"House"`). Your adapter must map override value → the API-code
+space, or `property_type` will never match and every cohort comes back empty.
+Same for `built_form`. (`domain/epc/property_type.py`, `built_form_type.py` are
+the enums; `datatypes/epc/domain/epc_codes.csv` has the code table.)
+If `property_type` cannot be resolved, return
+`PredictionTargetAttributes(property_type=None)` so the gate skips the Property.
+
+### 2. Run the Drizzle migration
+
+Add the `epc_property.source` column. The full spec is in
+`docs/MIGRATION_NOTE_predicted_epc_source.md`: the column, a default of `'lodged'`,
+and relaxing any `property_id` uniqueness to `(property_id, source)`.
+
+### 3. Wire the collaborators at the composition root
+
+Wherever `IngestionOrchestrator(...)` is constructed for the real run, pass the
+three optional kwargs:
+
+```python
+IngestionOrchestrator(
+    ...,  # existing arguments, unchanged
+    comparables_repo=EpcComparablePropertiesRepository(epc_client, geospatial_repo),
+    prediction_attributes_reader=overrides_reader,  # placeholder name: your property_overrides adapter
+    epc_prediction=EpcPrediction(),
+)
+```
+
+That's the on-switch. Until all three are passed, ingestion ignores prediction.
+
+## One open item — Validation Cohort exclusion
+
+A predicted-source Property has **no real lodged record**, so it must not be
+scored as if it did (CONTEXT: Validation Cohort; ADR-0031 dec-3). There is **no
+Validation-Cohort code path today** to exclude it from — when one is built (or in
+any QA that compares `calc(effective_epc)` vs lodged), exclude on the structural
+signal:
+
+```python
+if prop.source_path == "predicted":
+    continue  # predicted EPC — no ground truth to validate against
+```
+
+Note too: `PropertyBaselinePerformance.lodged` is derived from `effective_epc`
+regardless of source (`property_baseline_orchestrator` → `lodged_performance`), so
+for a predicted Property the `lodged` value is synthesised, not real. Decide
+whether baseline should null or flag it for predicted Properties when this lands.
+
+## Anomaly dual-use (later, not now)
+
+Slice-5 is gap-fill only (`epc is None`). The slot model already supports
+predicting for *every* Property to compare predicted vs lodged (**EPC Anomaly
+Flags**); see ADR-0031 dec-4. That would reuse the same
+`ComparablePropertiesRepository` and the predicted slot.
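
The code-space gotcha in section 1 of the handover can be illustrated with a small sketch. It is not the real adapter and makes several assumptions: `PredictionTargetAttributes` is a stand-in dataclass with the three fields the note names, `OverridesAttributesReader` and `to_api_code` are hypothetical names, the resolved overrides are passed in as a plain dict instead of being read from `property_overrides`, and the value/code pairs are placeholders that must be replaced with the entries in `datatypes/epc/domain/epc_codes.csv`.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class PredictionTargetAttributes:
    """Stand-in for the real type; field names follow the handover note."""
    property_type: Optional[str] = None
    built_form: Optional[str] = None
    wall_construction: Optional[str] = None


# PLACEHOLDER tables (assumption): take the real value -> code pairs
# from datatypes/epc/domain/epc_codes.csv before using anything like this.
PROPERTY_TYPE_CODES = {"House": "0", "Flat": "2"}
BUILT_FORM_CODES = {"Detached": "1"}


def to_api_code(value: Optional[str], table: Mapping[str, str]) -> Optional[str]:
    """Map an override enum value to its API code; None if absent or unknown."""
    if value is None:
        return None
    return table.get(value)


class OverridesAttributesReader:
    """Hypothetical adapter: resolved overrides come from an in-memory mapping
    here, where the real one would read the property_overrides fact layer."""

    def __init__(self, resolved_overrides: Mapping[str, Mapping[str, str]]):
        self._overrides = resolved_overrides

    def attributes_for(self, property_id: str) -> PredictionTargetAttributes:
        row = self._overrides.get(property_id, {})
        return PredictionTargetAttributes(
            # An unresolved or unmapped property_type becomes None,
            # which is what lets the eligibility gate skip the Property.
            property_type=to_api_code(row.get("property_type"), PROPERTY_TYPE_CODES),
            built_form=to_api_code(row.get("built_form"), BUILT_FORM_CODES),
            # The note does not say whether wall_construction needs mapping;
            # it is passed through unchanged here (assumption).
            wall_construction=row.get("wall_construction"),
        )
```

Returning `None` for an unmapped value, instead of passing the raw override string through, keeps a mapping gap visible at the gate. A raw `"House"` would silently fail to match any cohort EPC and produce an empty cohort.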