docs(epc-prediction): slice-5f production-wiring handover for Jun-te

The gap-fill is wired end-to-end (slices 5a-5e) behind seams; this note is what's left to switch it on in production:

(1) implement the PredictionTargetAttributesReader stub over property_overrides, including the override-value → API-code mapping select_comparables needs;
(2) run the epc_property.source Drizzle migration;
(3) pass the three optional collaborators at the IngestionOrchestrator composition root.

It also covers the open Validation-Cohort exclusion (no code path exists yet; exclude on source_path == "predicted" when one is built) and the anomaly dual-use pointer.

No code change: the validation exclusion has no consumer to attach to today, and the structural signal (source_path == "predicted") already exists from slice-5a.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 13:10:47 +00:00 · 2026-06-16 04:05:00 +00:00 · 2026-06-16 04:05:00 +00:00 · b677448fa0
commit b677448fa0
parent 5727ac53c1
1 changed file with 94 additions and 0 deletions
--- a/docs/HANDOVER_EPC_PREDICTION_WIRING.md
+++ b/docs/HANDOVER_EPC_PREDICTION_WIRING.md
@@ -0,0 +1,94 @@
 # EPC Prediction — production wiring handover (for Jun-te)
 The EPC Prediction **gap-fill** is wired end-to-end behind seams, with one real
 dependency stubbed: reading an EPC-less Property's resolved Landlord Overrides.
 This note is what's needed to finish it once your `property_overrides` read path
 lands. Design is **ADR-0031**; terms in **CONTEXT.md** (EPC Prediction, Effective
 EPC, EPC Anomaly Flag).
 ## What's already built (slices 5a–5e, all on `feature/epc-prediction`)
 - **5a** `Property.predicted_epc` slot + a `"predicted"` `source_path` /
  `effective_epc` branch — used only when there's no lodged EPC and no Site Notes
  (a real source always wins).
 - **5b** `ComparablePropertiesRepository.candidates_for(postcode)` +
  `EpcComparablePropertiesRepository` adapter (postcode search → per-cert fetch →
  batched UPRN→coords). Composes with `EpcClientService` + `GeospatialS3Repository`.
 - **5c** EPC store `source` discriminator (`lodged` | `predicted`) so the two
  coexist per property; `get_predicted_for_property` / `_for_properties`;
  `PropertyPostgresRepository` hydrates `predicted_epc`. **Needs a DB migration —
  see `docs/MIGRATION_NOTE_predicted_epc_source.md`.**
 - **5d** `build_prediction_target(identity, coords, attributes)` + the eligibility
  **gate** (unknown `property_type` → not predicted). Override attributes come
  through the `PredictionTargetAttributesReader` port (the stub).
 - **5e** `IngestionOrchestrator` wiring: when the three prediction collaborators
  are injected, an EPC-less Property is predicted from its cohort and persisted to
  the predicted slot. The collaborators are **optional** — unwired, ingestion is
  unchanged.
 ## Your part — three things
 ### 1. Implement `PredictionTargetAttributesReader` (the stub)
 `repositories/property/prediction_target_attributes_reader.py` defines the port:
 `attributes_for(property_id) -> PredictionTargetAttributes` (property_type,
 built_form, wall_construction). Build the adapter as a read over the
 `property_overrides` fact layer (the finaliser writes it via
 `PropertyOverrideRepository.upsert_all`; you're adding the read side).
 **Code-space gotcha.** `select_comparables` filters
 `comparable.epc.property_type == target.property_type`, and the cohort EPCs carry
 gov **API codes** (e.g. `"0"`/`"2"`). Landlord Overrides resolve to enum *value*
 strings (e.g. `"House"`). Your adapter must map override value → the API-code
 space, or `property_type` will never match and every cohort comes back empty.
 Same for `built_form`. (`domain/epc/property_type.py`, `built_form_type.py` are
 the enums; `datatypes/epc/domain/epc_codes.csv` has the code table.)
 If `property_type` cannot be resolved, return
 `PredictionTargetAttributes(property_type=None)` so the gate skips the Property.
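A minimal sketch of the adapter shape described above, under stated assumptions. `OverridesAttributesReader`, the in-memory `resolved_overrides` dict, and the two code tables are hypothetical stand-ins: the real read goes through `property_overrides`, and the real value → code mapping lives in `epc_codes.csv`. The pairings used in the usage lines are placeholders, not verified codes.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class PredictionTargetAttributes:
    # Stand-in for the real class; field names follow the port description.
    property_type: Optional[str] = None
    built_form: Optional[str] = None
    wall_construction: Optional[str] = None


class OverridesAttributesReader:
    """Hypothetical adapter: resolved override values in, API-code attributes out."""

    def __init__(
        self,
        resolved_overrides: Mapping[int, Mapping[str, str]],
        property_type_codes: Mapping[str, str],
        built_form_codes: Mapping[str, str],
    ) -> None:
        # Assumption: overrides arrive already resolved as {property_id: {field: value}}.
        self._overrides = resolved_overrides
        self._property_type_codes = property_type_codes
        self._built_form_codes = built_form_codes

    def attributes_for(self, property_id: int) -> PredictionTargetAttributes:
        resolved = self._overrides.get(property_id, {})
        # Map the enum value string (e.g. "House") into the API-code space.
        property_type = self._property_type_codes.get(resolved.get("property_type"))
        if property_type is None:
            # Unresolved or unmapped: let the eligibility gate skip this Property.
            return PredictionTargetAttributes(property_type=None)
        return PredictionTargetAttributes(
            property_type=property_type,
            built_form=self._built_form_codes.get(resolved.get("built_form")),
            # Passed through unchanged; this note does not say whether it needs mapping.
            wall_construction=resolved.get("wall_construction"),
        )


# Usage with PLACEHOLDER tables (load the real ones from epc_codes.csv).
reader = OverridesAttributesReader(
    resolved_overrides={1: {"property_type": "House", "built_form": "Detached"}, 2: {}},
    property_type_codes={"House": "0"},
    built_form_codes={"Detached": "1"},
)
mapped = reader.attributes_for(1)
skipped = reader.attributes_for(2)
```

Injecting the code tables keeps the adapter free of hard-coded codes, so a test can pin the mapping against the CSV separately.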
 ### 2. Run the Drizzle migration
 `epc_property.source` column — full spec in
 `docs/MIGRATION_NOTE_predicted_epc_source.md` (column + default `'lodged'` +
 relax any `property_id` uniqueness to `(property_id, source)`).
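To illustrate the intended constraint semantics only (this is not the Drizzle migration, and the authoritative spec remains the migration note), here is a small `sqlite3` sketch. The `rating` column and the exact table shape are assumptions made up for the demo; only the `source` column, its `'lodged'` default, and the `(property_id, source)` uniqueness come from this note.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE epc_property (
        property_id INTEGER NOT NULL,
        source TEXT NOT NULL DEFAULT 'lodged',
        rating TEXT,
        UNIQUE (property_id, source)
    )
    """
)

# A write that does not mention `source` lands as 'lodged' via the default.
conn.execute("INSERT INTO epc_property (property_id, rating) VALUES (1, 'D')")
# A predicted row for the same property coexists with the lodged one.
conn.execute(
    "INSERT INTO epc_property (property_id, source, rating) VALUES (1, 'predicted', 'C')"
)

rows = conn.execute(
    "SELECT source, rating FROM epc_property WHERE property_id = 1 ORDER BY source"
).fetchall()

# A second predicted row for the same property violates the composite uniqueness.
try:
    conn.execute(
        "INSERT INTO epc_property (property_id, source, rating) VALUES (1, 'predicted', 'B')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```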
 ### 3. Wire the collaborators at the composition root
 Wherever `IngestionOrchestrator(...)` is constructed for the real run, pass the
 three optional kwargs:
 ```python
 IngestionOrchestrator(
    ...,
    comparables_repo=EpcComparablePropertiesRepository(epc_client, geospatial_repo),
    prediction_attributes_reader=property_overrides_reader,  # hypothetical name: your property_overrides adapter instance
    epc_prediction=EpcPrediction(),
 )
 ```
 That's the on-switch. Until all three are passed, ingestion ignores prediction.
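The all-or-nothing behaviour can be pictured with a sketch like the one below. `OrchestratorSketch` and `prediction_enabled` are hypothetical names for illustration; the real `IngestionOrchestrator` may gate differently internally. Only the three keyword names are taken from this note.

```python
from typing import Any, Optional


class OrchestratorSketch:
    """Hypothetical stand-in showing optional collaborators as an on-switch."""

    def __init__(
        self,
        comparables_repo: Optional[Any] = None,
        prediction_attributes_reader: Optional[Any] = None,
        epc_prediction: Optional[Any] = None,
    ) -> None:
        self._comparables_repo = comparables_repo
        self._prediction_attributes_reader = prediction_attributes_reader
        self._epc_prediction = epc_prediction

    @property
    def prediction_enabled(self) -> bool:
        # Prediction runs only when every collaborator was injected;
        # a partial wiring behaves exactly like no wiring.
        collaborators = (
            self._comparables_repo,
            self._prediction_attributes_reader,
            self._epc_prediction,
        )
        return all(c is not None for c in collaborators)


unwired = OrchestratorSketch()
partial = OrchestratorSketch(comparables_repo=object(), epc_prediction=object())
wired = OrchestratorSketch(
    comparables_repo=object(),
    prediction_attributes_reader=object(),
    epc_prediction=object(),
)
```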
 ## One open item — Validation Cohort exclusion
 A predicted-source Property has **no real lodged record**, so it must not be
 scored as if it did (CONTEXT: Validation Cohort; ADR-0031 dec-3). There is **no
 Validation-Cohort code path today** to exclude it from — when one is built (or in
 any QA that compares `calc(effective_epc)` vs lodged), exclude on the structural
 signal:
 ```python
 if prop.source_path == "predicted":
    continue  # predicted EPC — no ground truth to validate against
 ```
 Note too: `PropertyBaselinePerformance.lodged` is derived from `effective_epc`
 regardless of source (`property_baseline_orchestrator` → `lodged_performance`), so
 for a predicted Property that "lodged" is synthesised, not real. Decide whether
 baseline should null/flag it for predicted properties when this lands.
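As a runnable form of the exclusion, a future Validation-Cohort or QA step might filter like this. `PropertySketch` and `validation_candidates` are hypothetical names, and `"lodged"` is used here only as an example of a non-predicted `source_path` value; the only value confirmed by this note is `"predicted"`.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class PropertySketch:
    # Stand-in for the real Property; only the fields the filter needs.
    property_id: int
    source_path: str


def validation_candidates(properties: Iterable[PropertySketch]) -> List[PropertySketch]:
    """Drop predicted-source Properties: they have no ground truth to score against."""
    return [p for p in properties if p.source_path != "predicted"]


cohort = validation_candidates(
    [
        PropertySketch(1, "lodged"),     # example non-predicted value (assumed)
        PropertySketch(2, "predicted"),  # excluded
    ]
)
```

Filtering on `source_path` rather than on `predicted_epc is not None` matters once anomaly dual-use lands, since a Property could then hold a prediction while its effective EPC is still a lodged one.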
 ## Anomaly dual-use (later, not now)
 Slice-5 is gap-fill only (`epc is None`). The slot model already supports
 predicting for *every* Property to compare predicted vs lodged (**EPC Anomaly
 Flags**) — see ADR-0031 dec-4. Reuses the same `ComparableProperties` repo + the
 predicted slot.