From 02df38e207828c1a0730ef5357525597e95c0c2c Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar <kconnkowlessar@gmail.com>
Date: Wed, 13 May 2026 21:52:02 +0000
Subject: [PATCH] note kwh service not needing predictions

---
 CLAUDE.md              | 14 +++----
 CONTEXT.md             | 12 ++++--
 UBIQUITOUS_LANGUAGE.md | 77 ++-----------------------------------
 ara_backend_design.md  | 86 ++++++++++++++++++++++++------------------
 4 files changed, 66 insertions(+), 123 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index f88a59d5..faa857ce 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -36,28 +36,26 @@ Five Claude Code skills are installed in this repo's dev container. Each maps to
 |-------|--------|-------------|
 | **grill-me** | `/grill-me` | Before implementing — stress-tests a design through sequential questioning |
 | **to-prd** | `/to-prd` | After a planning conversation — formalises context into a GitHub issue PRD |
-| **ubiquitous-language** | `/ubiquitous-language` | When domain terms are drifting or ambiguous — builds/updates `UBIQUITOUS_LANGUAGE.md` |
+| **grill-with-docs** | `/grill-with-docs` | When domain terms are drifting or new concepts are landing — challenges plans against `CONTEXT.md`, sharpens terminology inline, and writes ADRs for load-bearing decisions in `docs/adr/`. Replaces the older `ubiquitous-language` skill. |
 | **tdd** | `/tdd` | During implementation — enforces vertical-slice TDD (one test → one impl → repeat) |
 | **improve-codebase-architecture** | `/improve-codebase-architecture` | During refactoring — surfaces shallow modules and proposes deepening opportunities |
 
+Domain glossary lives at [CONTEXT.md](./CONTEXT.md); load-bearing decisions live at [docs/adr/](./docs/adr/). The legacy [UBIQUITOUS_LANGUAGE.md](./UBIQUITOUS_LANGUAGE.md) is a redirect.
+
 ### Typical session chains
 
 **Feature planning:**
-`/grill-me` → `/to-prd` → `/ubiquitous-language`
+`/grill-me` → `/to-prd` → `/grill-with-docs`
 
 **Implementation:**
 `/tdd` (+ `/grill-me` if a design fork appears mid-session)
 
 **Refactoring:**
-`/improve-codebase-architecture` → `/grill-me` → `/tdd` → `/ubiquitous-language`
+`/improve-codebase-architecture` → `/grill-me` → `/tdd` → `/grill-with-docs`
 
 ### First time setting up?
 
-New containers install all skills automatically via the Dockerfile. If you're in an existing container, run:
-
-```bash
-bash .devcontainer/backend/install-claude-skills.sh
-```
+Skills are installed automatically when the dev container is built, via the postCreate step that pulls from `Hestia-Homes/agentic-toolkit` (see `.devcontainer/backend/Dockerfile`). If an existing container is missing skills, rebuild the dev container.
 
 ## Type Safety
 
diff --git a/CONTEXT.md b/CONTEXT.md
index bd71d6b5..69de3529 100644
--- a/CONTEXT.md
+++ b/CONTEXT.md
@@ -82,13 +82,17 @@ The EpcPropertyData scored by the modelling pipeline for a single Property, deri
 _Avoid_: modelling EPC, working EPC, resolved EPC, derived EPC
 
 **Rebaselining**:
-Recomputing a Property's Baseline Performance via ML when its Effective EPC diverges from the originally lodged public EPC, or when no previous baseline exists.
+Re-predicting a Property's SAP, carbon emissions, and heat demand via ML when its Effective EPC's physical state diverges from the originally lodged public EPC (because Site Notes or Landlord Overrides have changed walls / heating / windows / etc.). Does not include kWh — that is always derived deterministically.
 _Avoid_: re-scoring, re-prediction, performance recomputation
 
 **Baseline Performance**:
-The set of ML-predicted performance values for a single Property — SAP, carbon emissions, heat demand, annual kWh — produced by scoring the Effective EPC against the kWh model; distinct from the originally recorded performance fields on the Effective EPC.
+A Property's current performance values — SAP, carbon emissions, heat demand, annual kWh, fuel split, bills — held against the Effective EPC. SAP / carbon / heat come directly from the Effective EPC's recorded values when no override applies, or from Rebaselining when an override changes physical state. Annual kWh and the fuel split are always derived deterministically by the EPC Energy Derivation Service.
 _Avoid_: baseline predictions, predicted baseline, rebaselined values
 
+**EPC Energy Derivation**:
+The deterministic process that derives a Property's annual kWh, fuel split (gas / electric / other), and bills from the Effective EPC's energy fields — applying a UCL-style correction for known EPC over/under-prediction and deducing fuel type for heating + hot water from the SAP heating fields. No ML.
+_Avoid_: kWh prediction, baseline kWh, energy estimation
+
 **EPC Anomaly Flag**:
 A per-field indicator that a Property's value for an EPC field differs significantly from Comparable Properties; advisory only — surfaces in the UI to prompt user review, does not block modelling.
 _Avoid_: outlier, mismatch, divergence flag
@@ -171,7 +175,7 @@ _Avoid_: API key, auth token, secret
 - An **EPC** carries an **EPC Band** and is identifiable by its **Registration Date**; the most recent one is the current.
 - A **UPRN** identifies a physical dwelling permanently; it does not change when the property changes owner — but each portfolio gets its own **Property** keyed against it.
 - When a **Property** has both **Site Notes** and a public **EPC**, the newer of the two derives the **Effective EPC**. **Landlord Overrides** apply only when the **EPC** is the source — never when **Site Notes** are.
-- **Rebaselining** produces **Baseline Performance** for a Property; triggered when the **Effective EPC** diverges from the originally lodged EPC (because of **Site Notes**, **Landlord Overrides**, an expired EPC, or an estimated EPC).
+- **Rebaselining** contributes the SAP / carbon / heat parts of **Baseline Performance** when the **Effective EPC**'s physical state diverges from the originally lodged EPC. **EPC Energy Derivation** contributes the kWh / fuel split / bills parts unconditionally for every Property.
 - The **EPC Prediction Service** uses **Comparable Properties** for both gap-filling and producing **EPC Anomaly Flags**.
 - A **Scenario** contains many **Plans** (one per Property). A **Plan** carries many **Recommendations**; the **Optimised Package** is the subset selected for installation.
 - A **Recommendation** references one **Measure Type** and carries property-specific cost and impact.
@@ -181,7 +185,7 @@ _Avoid_: API key, auth token, secret
 
 > **Dev:** "A landlord uploads a corrected boiler for one of their properties. What happens?"
 >
-> **Domain expert:** "That's a **Landlord Override** on the heating fields. Save it against the **Property**, then trigger **Rebaselining** — the **Effective EPC** has changed, so we need fresh **Baseline Performance** before regenerating **Recommendations**."
+> **Domain expert:** "That's a **Landlord Override** on the heating fields. Save it against the **Property**. The **Effective EPC** has changed, so **Rebaselining** runs to re-predict SAP / carbon / heat, and **EPC Energy Derivation** re-runs to update kWh / bills based on the new fuel deduction. With fresh **Baseline Performance** we regenerate **Recommendations**."
 
 > **Dev:** "What if the same Property also has Site Notes?"
 >
diff --git a/UBIQUITOUS_LANGUAGE.md b/UBIQUITOUS_LANGUAGE.md
index 1765cbc8..66684925 100644
--- a/UBIQUITOUS_LANGUAGE.md
+++ b/UBIQUITOUS_LANGUAGE.md
@@ -1,78 +1,7 @@
 # Ubiquitous Language
 
-Domain terminology glossary for this project. Generated and maintained by the `/ubiquitous-language` Claude Code skill.
+This file has been **superseded by [CONTEXT.md](./CONTEXT.md)**.
 
-Invoke `/ubiquitous-language` in any session to extract new terms from the conversation, flag ambiguities, and update this file with canonical definitions.
+The project's domain glossary now lives at the repo root in `CONTEXT.md`, maintained by the `/grill-with-docs` skill (which replaced `/ubiquitous-language`).
 
----
-
-## Energy Performance Certificates
-
-| Term | Definition | Aliases to avoid |
-|------|------------|------------------|
-| **EPC** | An Energy Performance Certificate — a government-issued document rating a dwelling's energy efficiency from A (best) to G (worst). | "energy certificate", "energy report" |
-| **Certificate Number** | The unique identifier assigned to an EPC by the government registry. | "cert number", "EPC ID" |
-| **Registration Date** | The date an EPC was lodged with the government register; used to identify the most recent certificate for a property. | "assessment date", "submission date" |
-| **EPC Band** | A single letter A–G representing a property's current or potential energy efficiency rating. | "energy rating", "EPC grade", "EPC score" |
-| **Schema Type** | The versioned RdSAP or SAP schema that describes the structure of a certificate's raw data (e.g. `RdSAP-Schema-21.0.1`). | "schema version", "EPC format" |
-| **Domestic Certificate** | An EPC issued for a residential dwelling, as opposed to a commercial one. | "residential EPC", "home EPC" |
-
-## Properties and Addresses
-
-| Term | Definition | Aliases to avoid |
-|------|------------|------------------|
-| **UPRN** | Unique Property Reference Number — the government-issued permanent identifier for a physical address in the UK. | "property ID", "address ID", "code" |
-| **Postcode** | A UK postal code used to group nearby addresses; the primary search key for finding EPC records. | "zip code", "postal code" |
-| **User Address** | A free-text address string provided by a user or imported from a customer dataset, before any normalisation or matching. | "user input", "raw address", "user_inputed_address" |
-| **Dwelling** | A single residential unit that can hold an EPC — a house, flat, or maisonette. | "property", "unit", "home" |
-
-## Address Matching
-
-| Term | Definition | Aliases to avoid |
-|------|------------|------------------|
-| **Lexiscore** | A similarity score in [0, 1] between a user address and a candidate EPC address; combines token overlap and character-level similarity. | "score", "match score", "similarity" |
-| **Lexirank** | Dense rank of candidates sorted by lexiscore descending; rank 1 = best match. | "rank", "position" |
-| **UPRN Candidate** | An EPC search result that is a plausible match for a given user address, before scoring decides the winner. | "match candidate", "result" |
-| **Score Threshold** | The minimum lexiscore (currently 0.6) below which no match is returned even if a candidate exists. | "minimum score", "cutoff" |
-| **Ambiguous Match** | A matching outcome where two or more candidates share lexirank 1, making it impossible to select a unique winner. | "tie", "draw", "duplicate" |
-| **Best Match** | The single UPRN candidate with lexirank 1 that meets or exceeds the score threshold. | "winner", "top result" |
-
-## API and Integration
-
-| Term | Definition | Aliases to avoid |
-|------|------------|------------------|
-| **EPC Search Result** | A lightweight record returned by the government domestic search endpoint — contains address lines, postcode, UPRN, band, and certificate number but not the full certificate data. | "search row", "EPC row", "result" |
-| **EPC Property Data** | The fully mapped domain object produced after fetching and parsing a complete EPC certificate. | "EPC data", "certificate data", "parsed EPC" |
-| **Old EPC API** | The retired government API (`epc.opendatacommunities.org`) using HTTP Basic auth; decommissioned May 2026. | "legacy API" |
-| **New EPC API** | The replacement government API (`api.get-energy-performance-data.communities.gov.uk`) using Bearer token auth. | "new API", "current API" |
-| **Bearer Token** | The auth credential required by the new EPC API; stored in the `EPC_AUTH_TOKEN` environment variable. | "API key", "auth token", "secret" |
-
-## Relationships
-
-- An **EPC** belongs to exactly one **Dwelling** and has one **Certificate Number**.
-- A **Dwelling** may have multiple **EPCs** across time; the one with the most recent **Registration Date** is the current one.
-- A **UPRN** identifies a **Dwelling** permanently; it does not change when the property changes owner.
-- An **EPC Search Result** is a summary; it points to a full **EPC** via its **Certificate Number**.
-- **Address Matching** uses a **User Address** and **Postcode** to find a **UPRN** by scoring **UPRN Candidates** from an EPC search.
-- A **Lexirank** of 1 with no **Ambiguous Match** and a **Lexiscore** ≥ the **Score Threshold** produces a **Best Match**.
-
-## Example dialogue
-
-> **Dev:** "We have a user address and postcode. How do we find the UPRN?"
-
-> **Domain expert:** "Search the **New EPC API** by **Postcode** — you get back a list of **EPC Search Results** for that area. Each one has an address and a **UPRN**. Score each against the **User Address** using the **Lexiscore**. If the top **UPRN Candidate** scores above the **Score Threshold** and there's no **Ambiguous Match**, that's your **Best Match**."
-
-> **Dev:** "What if two results share the same address line 1?"
-
-> **Domain expert:** "That's an **Ambiguous Match** — two candidates at **Lexirank** 1. Fall back to scoring on the full address using all address lines joined together. If that still ties, return nothing."
-
-> **Dev:** "Once we have the best match, do we use the UPRN or fetch the full EPC?"
-
-> **Domain expert:** "Depends on what you need. The **EPC Search Result** gives you the **EPC Band** and **Certificate Number**. If you need energy efficiency detail, use the **Certificate Number** to fetch the full **EPC Property Data**."
-
-## Flagged ambiguities
-
-- **"address"** appears as both the raw **User Address** (free-text from customer data) and a structured field on an **EPC Search Result** (normalised address lines). Always qualify: "user address" vs "EPC address" or "address line 1".
-- **"score"** is used for the `AddressMatch.score()` function output, the `lexiscore` DataFrame column, and informally in conversation. Prefer **Lexiscore** in domain discussions; reserve "score" for method-level code comments.
-- **"user_inputed_address"** in `backend/address2UPRN/main.py` is a misspelling and a synonym for **User Address** — the canonical term. New code should use `user_address`.
-- **"EPC"** is overloaded as both the document (an Energy Performance Certificate) and the rating band letter. Use **EPC** for the document and **EPC Band** for the letter.
+If you arrived here from a link in `CLAUDE.md` or older docs, follow the link above. This file is kept only to preserve git history and may be removed once internal references are updated.
diff --git a/ara_backend_design.md b/ara_backend_design.md
index e235cf6d..4d50b3b6 100644
--- a/ara_backend_design.md
+++ b/ara_backend_design.md
@@ -48,7 +48,7 @@ The contracts this PRD defines are the inputs each sub-PRD consumes.
 3. **Make every service unit-testable against fakes** — no test needs a real DB, a real gov API, or a real ML lambda to verify business logic.
 4. **Establish a single `Property` aggregate root** as the domain centrepiece; all 9 modelling concerns are slices of one aggregate.
 5. **Versioned ML data contract** — the EPC-to-features transform is the single shared artifact between this repo and the autogluon repo.
-6. **Per-property UI surfaces** — fetched data can be shown to users for review and override **before** modelling runs; modelling is triggered separately. This will enable a landlord facing version of the product where we fetch the open data, present back to the user for review and then perform tbe modelling.
+6. **Per-property UI surfaces** — fetched data can be shown to users for review and override **before** modelling runs; modelling is triggered separately. This will enable a landlord-facing version of the product where we fetch the open data, present it back to the user for review, and then perform the modelling.
 
 ### 2.2 Non-goals
 
@@ -59,27 +59,30 @@ The contracts this PRD defines are the inputs each sub-PRD consumes.
 
 ---
 
-## 3. Cutover strategy
+## 3. Cutover plan
 
-Two-phase cutover, driven by the 30 May deadline.
+Forced cutover, driven by the 30 May deadline. There is no strangler period because the death of the Old EPC API takes `model_engine` with it.
 
-### 3.1 Phase 0 — Stopgap (now → end of May)
+### 3.1 Phase 0 — Status quo (now → 30 May)
 
-- The current `model_engine` keeps running. `SearchEpc` is rewired to delegate to `EpcClientService` (the new gov API client already built on this branch).
-- Old-schema EPCs persisted in the DB are read as-is; the EPC re-mapping service is not yet wired in.
-- Goal: no modelling outage at the API death date. Some degraded behaviour acceptable; clients are aware.
+- `model_engine` keeps running against the Old EPC API for as long as it works.
+- Build of the 9 new services starts **this week**, in parallel with the old engine, which continues to serve traffic.
+- The new `ara/` package lives alongside `backend/` but is not yet wired into any production endpoint.
+- Goal: keep the lights on until the API dies; start the build immediately so the dark period is short.
 
-### 3.2 Phase 1 — Strangler (June → ~Q4 2026)
+### 3.2 Phase 1 — Forced cutover (30 May onwards)
 
-- New `ara/` package built alongside the old code. New endpoints expose the new pipeline. The old `model_engine` keeps running.
-- Per-portfolio feature flag: when set, the trigger endpoint routes the portfolio through the new pipeline. Default is the old pipeline.
-- Each of the 9 services is built, tested, and ships independently. Adding a service to the new pipeline does not require deleting the old one.
-- When confidence is high (last portfolio migrated, no regressions seen for N weeks), the old engine is deleted.
+- On 30 May the Old EPC API dies; `model_engine` ceases to function for any new modelling run.
+- Some downtime is expected and accepted. Clients are aware.
+- Modelling resumes when the new pipeline is ready end-to-end. There is no per-portfolio feature flag, no parallel pipelines, no traffic split — the new pipeline is the only pipeline.
+- **Calico** and **Hyde** are the first live clients onto the new pipeline in June.
+- `model_engine`, `SearchEpc`, the legacy `Property`, and surrounding modules in `backend/` are deleted once the new pipeline is serving all traffic.
 
-### 3.3 What is **not** done
+### 3.3 What is *not* done
 
-- No parallel-shadow run (run both, diff outputs). Reason: doubles compute per plan, requires diff tooling we don't have, and the old engine is already known to return bad data — diffs would be noise.
-- No big-bang switch. Reason: 9 services is too much change to land in one PR.
+- No strangler — there is nothing to strangle once the Old EPC API dies on 30 May.
+- No parallel-shadow run — it would double compute and require diff tooling we don't have, and the old engine is already known to return bad data, so diffs would be noise.
+- No per-portfolio feature flag — the cutover is all-or-nothing.
 
 ---
 
@@ -126,7 +129,7 @@ Every class falls into exactly one of four roles:
 |------|-----|----------|
 | **Fetchers** | Call external APIs. Return raw response data. No DB. | `EpcClientService`, `GeospatialFetcher`, `SolarFetcher`, `SiteNotesIngester` |
 | **Repos** | Persist and load domain aggregates. SQL hidden inside. No external IO. | `PropertyRepo`, `EpcCacheRepo`, `SiteNotesRepo`, `LandlordOverridesRepo`, `RecommendationsRepo`, `GenericDataRepo`, `SubtaskRepo` |
-| **Services** | Business logic over domain objects. No external IO except via injected Fetchers / Repos. | `EpcRemappingService`, `EpcPredictionService`, `KwhPredictionService`, `ImpactPredictionService`, `RecommendationService`, `OptimiserService`, `FeatureBuilder`, `ResultsPersister` |
+| **Services** | Business logic over domain objects. No external IO except via injected Fetchers / Repos. | `EpcRemappingService`, `EpcPredictionService`, `EpcEnergyDerivationService`, `RebaseliningService`, `KwhImpactService`, `ImpactPredictionService`, `RecommendationService`, `OptimiserService`, `FeatureBuilder`, `ResultsPersister` |
 | **Orchestrators** | Compose Fetchers + Services + Repos to produce an end-to-end result. The only place where step order is encoded. | `IngestionPipeline`, `ModellingPipeline`, `RefreshOrchestrator` |
 
 This taxonomy is **strict**. A class that fetches *and* persists belongs in the Service layer and depends on a Fetcher + a Repo. No back-channels.
@@ -160,8 +163,8 @@ UPRN partitioning: the trigger endpoint groups UPRNs by **locality** (postcode p
 
 The team will decide at implementation time whether Ingestion and Modelling sit behind:
 
-- **(a) One unified API** with a single trigger endpoint that runs both phases.
-- **(b) Two APIs**, each with its own trigger, RefreshOrchestrator chains them.
+- **(a) One unified API** with a single trigger endpoint that runs both phases. Most closely mimics what's live today.
+- **(b) Two APIs**, each with its own trigger; `RefreshOrchestrator` chains them. Fetching and modelling are separate API calls.
 
 Either is workable if the class taxonomy is preserved. Deferred to implementation review.
 
@@ -197,7 +200,7 @@ class Property:
     epc_anomaly_flags: Optional[EpcAnomalyFlags]  # from EpcPredictionService vs neighbours
 
     # --- Modelling outputs ---
-    baseline_predictions: Optional[BaselinePredictions]   # SAP/carbon/heat after rebaselining
+    baseline_performance: Optional[BaselinePerformance]   # SAP/carbon/heat (from EPC or rebaselined ML) + kWh + fuel split (always EPC + UCL + fuel deduction)
     recommendations: list[Recommendation]
     impact_predictions: Optional[ImpactPredictions]
     optimised_package: Optional[OptimisedPackage]
@@ -383,8 +386,8 @@ Both ML services use the same transform:
 
 | Service | Lambda | Target |
 |---|---|---|
-| `KwhPredictionService` (service #5) | `kwh-models-*` | annual kWh + bills |
-| `ImpactPredictionService` (service #7) | `impact-models-*` | SAP, carbon, heat demand, post-retrofit kWh |
+| `KwhImpactService` (service #5) | `kwh-models-*` | per-measure annual kWh + bills delta (post-optimisation re-score only) |
+| `ImpactPredictionService` (service #7) | `impact-models-*` | SAP, carbon, heat demand per-measure impact |
 
 The two families are trained against the same input feature schema; only target columns differ. Sub-PRD (ii) handles training-time details.
 
@@ -410,9 +413,11 @@ Twelve classes implement the modelling pipeline end-to-end. Detailed signatures
 | S1 | `EpcRemappingService` | 4 | Re-map legacy / historical EPCs into new `EpcPropertyData` shape. | `EpcCacheRepo` | `EpcCacheRepo` (mapped column) |
 | S2 | `EpcPredictionService` | 3 | For every property: produce predicted EPC + per-field anomaly flags vs neighbours. Used both for gap-fill (Path 2 if EPC missing) and UI surfacing. | `EpcCacheRepo`, `GenericDataRepo` | — |
 | S3 | `FeatureBuilder` | (new) | Wraps `EpcMlTransform`. Converts `Properties` → scoring DataFrame. | — | — |
-| S4 | `KwhPredictionService` | 5 | Calls kWh + bills ML lambda; attaches results to `Property.baseline_predictions` / per-measure. | `FeatureBuilder` | — |
+| S4a | `EpcEnergyDerivationService` | (new) | Derives baseline kWh + fuel split + bills from the Effective EPC's energy fields (`energy_consumption_current`, `heating_cost_current`, `hot_water_cost_current`). Applies UCL-style correction for known EPC over/under-prediction, then deduces fuel type (gas/electric/other) for heating + hot water to split consumption. Deterministic, no ML. | — | — |
+| S4b | `RebaseliningService` | (new, partial overlap with old "rebaselining" logic) | When the Effective EPC's physical state differs from the originally lodged EPC (Site Notes or Landlord Overrides applied), calls SAP/carbon/heat ML lambdas to produce new baseline values. kWh under the new state is re-derived via `EpcEnergyDerivationService`, not ML. | `FeatureBuilder` | — |
 | S5 | `RecommendationService` | 6 | Generates per-property recommendations using `effective_epc`, materials, exclusions, etc. Replaces current `Recommendations` (1383 LOC). | `MaterialsRepo` | — |
-| S6 | `ImpactPredictionService` | 7 | Calls SAP / carbon / heat / bills impact lambda for each recommendation. | `FeatureBuilder` | — |
+| S6 | `ImpactPredictionService` | 7 | Calls SAP / carbon / heat impact lambda for each recommendation. | `FeatureBuilder` | — |
+| S6b | `KwhImpactService` | 5 (partial) | Calls kWh ML lambda to predict the kWh delta per recommendation; used to compute bill savings on the optimised package. | `FeatureBuilder` | — |
 | S7 | `OptimiserService` | 8 | Produces optimised retrofit packages. Wraps current `CostOptimiser` / `GainOptimiser` / `optimise_with_scenarios`. | — | — |
 | S8 | `ResultsPersister` | 9 | Final step: writes plans, recommendations, property updates via repos under one UoW. | — | All write repos |
 
@@ -429,19 +434,22 @@ Twelve classes implement the modelling pipeline end-to-end. Detailed signatures
 For each `Property` in the batch:
 
 ```
-1. PropertyRepo.get()  →  Property (epc, site_notes, overrides, geospatial, solar)
-2. EpcRemappingService — if epc is in legacy schema, upgrade to current
-3. EpcPredictionService — produce predicted EPC + anomaly flags (always runs)
-4. Compute Property.effective_epc (path-1 or path-2)
-5. KwhPredictionService — baseline kwh + bills
-6. RecommendationService — generate candidate measures
-7. ImpactPredictionService — predict per-measure impact
-8. OptimiserService — select optimal package
-9. KwhPredictionService — re-score on optimised package (tenant savings)
-10. ResultsPersister — write Plan + Recommendations under one UoW
+1.  PropertyRepo.get()  →  Property (epc, site_notes, overrides, geospatial, solar)
+2.  EpcRemappingService — if epc is in legacy schema, upgrade to current
+3.  EpcPredictionService — produce predicted EPC + anomaly flags (always runs)
+4.  Compute Property.effective_epc (path-1 or path-2)
+5.  RebaseliningService — IF effective_epc differs from lodged EPC, re-predict SAP/carbon/heat via ML
+6.  EpcEnergyDerivationService — derive baseline kWh + fuel split + bills from the (possibly rebaselined) Effective EPC. No ML.
+7.  RecommendationService — generate candidate measures
+8.  ImpactPredictionService — predict per-measure SAP/carbon/heat impact (ML)
+9.  OptimiserService — select optimal package
+10. KwhImpactService — predict kWh + bill delta for the optimised package (ML)
+11. ResultsPersister — write Plan + Recommendations under one UoW
 ```
 
-Steps 1–4 are per-property. Steps 5–9 batch the whole batch into one ML call where possible (the lambdas accept a DataFrame; today's code already batches).
+Steps 1–4 are per-property. Steps 5, 8, and 10 send the whole batch in one ML call where possible (the lambdas accept a DataFrame; today's code already batches). Steps 6 and 7 are deterministic and per-property.
+
+Note on the difference from the current `model_engine`: the **pre-recommendation** kWh ML call has been removed. Baseline kWh now comes from the Effective EPC directly (the new gov EPC API exposes `energy_consumption_current` and per-end-use cost fields). For kWh, ML is reserved for **post-recommendation impact prediction** only; the only pre-recommendation ML call left is `RebaseliningService` (SAP / carbon / heat).
 
 ### 9.5 Per-service contracts — deferred
 
@@ -530,7 +538,7 @@ ara/                                  # new top-level package, sibling of backen
     └── integration/                  # real DB + real SQS via localstack
 ```
 
-`backend/` continues to host the legacy code during phase 1. Once the last portfolio is migrated, `backend/engine/`, `backend/SearchEpc.py`, `backend/Property.py` are deleted.
+`backend/` continues to host the legacy code until the new pipeline is live. Once `model_engine` is no longer serving any traffic, `backend/engine/`, `backend/SearchEpc.py`, and the legacy `backend/Property.py` are deleted.
 
 Reused intact (no rewrite needed):
 
@@ -605,7 +613,8 @@ A landlord uploads a corrected heating system for UPRN 12345 via the UI.
 3. **ModellingPipeline** invoked on a batch of `[12345]`:
    - Reads `Property(uprn=12345)` from `PropertyRepo`.
    - `Property.effective_epc` = epc + landlord_overrides → heating system fields differ from baseline.
-   - Rebaselining triggered: `KwhPredictionService` re-predicts baseline SAP / carbon / heat / kwh.
+   - `RebaseliningService` triggered: ML re-predicts SAP / carbon / heat against the new effective EPC.
+   - `EpcEnergyDerivationService` re-runs over the new effective EPC to derive baseline kWh + fuel split + bills (no ML).
    - `RecommendationService` regenerates recommendations against the new baseline.
    - `OptimiserService` re-picks optimal package.
    - `ResultsPersister` writes new plan under one UoW (old plan is superseded; whether to soft-archive is a sub-PRD (iii) decision).
@@ -624,6 +633,9 @@ Total external calls: zero. The override write is the only thing that hit a netw
 6. **Soft-archive vs hard-overwrite** for superseded plans (§14) — affects audit / undo behaviour. Defer to sub-PRD (iii).
 7. **Building-level optimisation as a Phase 2 service** (§10) — agreed deferred; flag for roadmap discussion.
 8. **Transform versioning policy** (§8.3) — semver chosen; team to confirm bump conventions.
+9. **UCL EPC-correction model** (§9.2 S4a) — need the reference paper, the implementation we've used before, and a decision on whether to port directly or re-implement against the new EPC schema.
+10. **Fuel-price source for bill calculation** (§9.2 S4a) — Ofgem caps? Time-varying? Per-portfolio override? Decide alongside `EpcEnergyDerivationService` design.
+11. **kWh handling under Rebaselining** (§9.4 step 5) — confirmed: ML re-predicts SAP/carbon/heat only; `EpcEnergyDerivationService` re-runs for kWh. Validate that this is sufficient when overrides change heating fuel type (which would shift the fuel deduction).
 
 ---
 
@@ -645,4 +657,4 @@ Each sub-PRD owner: TBC. Each is independently reviewable but consumes the contr
 4. Stand up the empty `ara/` package skeleton + fakes + first integration-test scaffold as PR-1.
 5. Land services in dependency order: domain → repos → fetchers → services → orchestrators → API.
 
-Phase 1 milestone gate: first portfolio routed through new pipeline with parity against old engine (manual spot-check on 5 representative properties).
+Phase 1 milestone gate: first portfolio (Calico or Hyde) routed through the new pipeline end-to-end in June, with a manual spot-check on 5 representative properties to confirm outputs are reasonable. No parity-against-old-engine check — the old engine is dead by then.