From fcbaf58a4001dc4447459b6cdbee9951ae18fa9c Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar <kconnkowlessar@gmail.com>
Date: Wed, 13 May 2026 19:29:02 +0000
Subject: [PATCH 1/8] initial prd

---
 ara_backend_design.md | 648 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 648 insertions(+)
 create mode 100644 ara_backend_design.md

diff --git a/ara_backend_design.md b/ara_backend_design.md
new file mode 100644
index 00000000..f901936b
--- /dev/null
+++ b/ara_backend_design.md
@@ -0,0 +1,648 @@
+# ARA Backend Redesign — Design PRD
+
+**Status**: Draft for team review
+**Author**: Khalim Conn-Kowlessar (with Claude grill session)
+**Branch**: `ara-backend-design-prd`
+**Scope**: Service architecture + domain model + contracts for the new modelling backend. Linked sub-PRDs cover ML training pipeline, DB schema migration, and historical EPC re-mapping.
+
+---
+
+## 1. Context
+
+### 1.1 The forcing function
+
+The current modelling backend (`backend/engine/engine.py` — `model_engine`, 1331 LOC) was built as an MVP. It is:
+
+- **Tightly coupled** to a specific gov EPC API that is being **decommissioned on 30 May 2026** (~17 days from today).
+- **A monolith** — one async function reaches into DB modules, HTTP clients, ML lambdas, S3, and queue infrastructure directly.
+- **Bottlenecked on a single person** — Khalim is the only contributor able to safely modify the engine because no one else can predict the blast radius of a change.
+- **Already returning erroneous data** from the old API (clients are aware). The replacement API is partially built (`backend/epc_client/epc_client_service.py`) on the current feature branch.
+
+### 1.2 What needs to change
+
+Beyond just swapping API clients, this is the moment to **rebuild the backend into a production-grade, contribute-able codebase**, with:
+
+- A clear domain model rooted in the new EPC schema (`EpcPropertyData`).
+- Service boundaries that other team members can read, fix, and extend without needing the entire mental model.
+- Repository-mediated persistence so business logic can be tested without spinning up a database.
+- A separation between **data fetching** (slow, IO-heavy, external) and **modelling** (deterministic, fast, internal).
+
+### 1.3 Out of scope for this PRD
+
+These ship as **linked sub-PRDs**:
+
+- **Sub-PRD (ii) — ML training pipeline** (autogluon repo + parquet generation in this repo + scoring model retraining for the new EPC schema)
+- **Sub-PRD (iii) — DB schema migration** (new tables: `site_notes`, `landlord_overrides`, EPC cache, parallel write strategy)
+- **Sub-PRD (iv) — Historical EPC re-mapping** (one-off + ongoing batch job: legacy stored EPCs → new `EpcPropertyData` shape)
+
+The contracts this PRD defines are the inputs each sub-PRD consumes.
+
+---
+
+## 2. Goals and non-goals
+
+### 2.1 Goals
+
+1. **Survive the 30 May API shutdown** — even if it means a brief degraded window, modelling continues to function against the new gov EPC API.
+2. **Decouple data fetching from modelling** — modelling never makes external HTTP calls; it reads everything from repositories.
+3. **Make every service unit-testable against fakes** — no test needs a real DB, a real gov API, or a real ML lambda to verify business logic.
+4. **Establish a single `Property` aggregate root** as the domain centrepiece; all 9 modelling concerns are slices of one aggregate.
+5. **Versioned ML data contract** — the EPC-to-features transform is the single shared artifact between this repo and the autogluon repo.
+6. **Per-property UI surfaces** — fetched data is shown to users for review and override **before** modelling runs; modelling is triggered separately.
+
+### 2.2 Non-goals
+
+- Multi-region deploy, GDPR-class data minimisation work, or compliance reporting — separate workstreams.
+- Replacement of the front-end. The new APIs preserve enough of the existing response shape that the FE migrates incrementally.
+- Removing pandas. The ML transform output is a parquet-friendly DataFrame-like shape; that stays.
+- A workflow engine (Prefect / Temporal / Airflow). Coordinator-class orchestration plus the existing SQS-fanout pattern is sufficient at the scale we serve.
+
+---
+
+## 3. Cutover strategy
+
+Two-phase cutover, driven by the 30 May deadline.
+
+### 3.1 Phase 0 — Stopgap (now → end of May)
+
+- The current `model_engine` keeps running. `SearchEpc` is rewired to delegate to `EpcClientService` (the new gov API client already built on this branch).
+- Old-schema EPCs persisted in the DB are read as-is; the EPC re-mapping service is not yet wired in.
+- Goal: no modelling outage at the API death date. Some degraded behaviour acceptable; clients are aware.
+
+### 3.2 Phase 1 — Strangler (June → ~Q4 2026)
+
+- New `ara/` package built alongside the old code. New endpoints expose the new pipeline. The old `model_engine` keeps running.
+- Per-portfolio feature flag: when set, the trigger endpoint routes the portfolio through the new pipeline. Default is the old pipeline.
+- Each of the 9 services is built, tested, and ships independently. Adding a service to the new pipeline does not require deleting the old one.
+- When confidence is high (last portfolio migrated, no regressions seen for N weeks), the old engine is deleted.
+
+### 3.3 What is **not** done
+
+- No parallel-shadow run (run both, diff outputs). Reason: doubles compute per plan, requires diff tooling we don't have, and the old engine is already known to return bad data — diffs would be noise.
+- No big-bang switch. Reason: 9 services is too much change to land in one PR.
+
+---
+
+## 4. Architecture overview
+
+```
+┌─────────────────────────────────────────────────────────────────────┐
+│  Trigger endpoint(s)                                                │
+│  (one or two — see §4.5; deferred decision)                         │
+└───────────┬──────────────────────────────────────────┬──────────────┘
+            │                                          │
+            ▼                                          ▼
+   ┌─────────────────┐                       ┌─────────────────┐
+   │ IngestionPipe   │   SQS, batches of N   │ ModellingPipe   │
+   │ -----------     │ ◄─────────────────────│ -----------     │
+   │ Fetchers run    │                       │ Reads via Repos │
+   │ Persist via     │                       │ Calls Services  │
+   │ Repos           │                       │ ML predictions  │
+   └────────┬────────┘                       └────────┬────────┘
+            │                                         │
+            └───────────────► Repos ◄─────────────────┘
+                                  │
+                                  ▼
+                        ┌──────────────────┐
+                        │  Postgres tables │
+                        │  (property,      │
+                        │   epc_cache,     │
+                        │   site_notes,    │
+                        │   landlord_      │
+                        │   overrides,     │
+                        │   plans, etc.)   │
+                        └──────────────────┘
+
+  ┌──────────────────────────┐
+  │  RefreshOrchestrator     │  triggers Ingestion → diff → conditionally Modelling
+  └──────────────────────────┘
+```
+
+### 4.1 Class taxonomy
+
+Every class falls into exactly one of four roles:
+
+| Role | Job | Examples |
+|------|-----|----------|
+| **Fetchers** | Call external APIs. Return raw response data. No DB. | `EpcClientService`, `GeospatialFetcher`, `SolarFetcher`, `SiteNotesIngester` |
+| **Repos** | Persist and load domain aggregates. SQL hidden inside. No external IO. | `PropertyRepo`, `EpcCacheRepo`, `SiteNotesRepo`, `LandlordOverridesRepo`, `RecommendationsRepo`, `GenericDataRepo`, `SubtaskRepo` |
+| **Services** | Business logic over domain objects. No external IO except via injected Fetchers / Repos. | `EpcRemappingService`, `EpcPredictionService`, `KwhPredictionService`, `ImpactPredictionService`, `RecommendationService`, `OptimiserService`, `FeatureBuilder`, `ResultsPersister` |
+| **Orchestrators** | Compose Fetchers + Services + Repos to produce an end-to-end result. The only place where step order is encoded. | `IngestionPipeline`, `ModellingPipeline`, `RefreshOrchestrator` |
+
+This taxonomy is **strict**. A class that fetches *and* persists belongs in the Service layer and depends on a Fetcher + a Repo. No back-channels.
+
+### 4.2 Two pipelines, one direction
+
+Data flows one way only: **Ingestion → Repos → Modelling**.
+
+- **Ingestion** writes; never calls Modelling.
+- **Modelling** reads; never calls Fetchers.
+
+If Modelling needs fresh data, it returns "stale" and the caller decides whether to ingest first. This makes Modelling a pure function of repository state, which is the property that makes it reproducible, debuggable, and testable.
+
+### 4.3 RefreshOrchestrator
+
+Sits above both pipelines. Job:
+
+1. Trigger `IngestionPipeline` for a portfolio.
+2. After ingestion completes, ask repos: "did anything change vs the last modelled snapshot?"
+3. If yes, trigger `ModellingPipeline`. If no, return early.
+
+This avoids re-modelling 100k properties when only 200 had refreshed EPC data.
+
+### 4.4 SQS fanout (preserved from current architecture)
+
+The existing `trigger_plan_entrypoint` SQS-chunking pattern is kept. Both pipelines fan out per batch of ~30–100 properties (tuneable). Each consumer runs one batch end-to-end through the relevant pipeline.
+
+UPRN partitioning: the trigger endpoint groups UPRNs by **locality** (postcode prefix / UPRN range) before chunking, so each batch maximises shared upstream fetches (one geospatial-range pull serves every property in the batch).
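+
+A minimal sketch of the locality grouping described above. It assumes the postcode outward code is the locality key; `outward_code` and `chunk_by_locality` are hypothetical names, and the real trigger endpoint may key on UPRN range instead.
+
+```python
+from collections import defaultdict
+from itertools import islice
+from typing import Iterable, Iterator
+
+
+def outward_code(postcode: str) -> str:
+    """Return the outward part of a UK postcode, e.g. 'SW1A 1AA' -> 'SW1A'."""
+    compact = postcode.replace(" ", "").upper()
+    return compact[:-3]  # the inward code is always the last three characters
+
+
+def chunk_by_locality(
+    properties: Iterable[tuple[int, str]],  # (uprn, postcode) pairs
+    batch_size: int = 50,  # assumption: any value in the ~30-100 range
+) -> Iterator[list[int]]:
+    """Yield UPRN batches in which neighbours share a locality where possible."""
+    by_locality: dict[str, list[int]] = defaultdict(list)
+    for uprn, postcode in properties:
+        by_locality[outward_code(postcode)].append(uprn)
+    # Walk localities in sorted order so one area fills a batch before the next starts.
+    ordered = (uprn for key in sorted(by_locality) for uprn in by_locality[key])
+    while batch := list(islice(ordered, batch_size)):
+        yield batch
+```
+
+A locality larger than `batch_size` simply spills into the next batch, which still shares the same upstream range with its neighbour.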
+
+### 4.5 One API or two? (deferred)
+
+The team will decide at implementation time whether Ingestion and Modelling sit behind:
+
+- **(a) One unified API** with a single trigger endpoint that runs both phases.
+- **(b) Two APIs**, each with its own trigger; `RefreshOrchestrator` chains them.
+
+Either is workable if the class taxonomy is preserved. Deferred to implementation review.
+
+---
+
+## 5. Domain model
+
+### 5.1 Aggregate root: `Property`
+
+`Property` is the centrepiece. Every service operates on one or more `Property` instances. Every repo writes one slice of `Property`. The aggregate carries all state for a single property's modelling run.
+
+```python
+@dataclass
+class PropertyIdentity:
+    portfolio_id: UUID
+    uprn: Optional[int]
+    landlord_property_id: Optional[str]
+    address: AddressLines
+    postcode: str
+
+@dataclass
+class Property:
+    identity: PropertyIdentity
+
+    # --- Source data — modelling path is determined by which of these are set ---
+    epc: Optional[EpcPropertyData]              # from gov API (or remapped historical)
+    site_notes: Optional[SiteNotes]             # our own survey; supersedes EPC when present
+    landlord_overrides: Optional[LandlordOverrides]   # sparse, only meaningful when epc set
+
+    # --- Enrichments ---
+    geospatial: Optional[GeoSpatial]
+    solar: Optional[SolarPotential]
+    epc_anomaly_flags: Optional[EpcAnomalyFlags]  # from EpcPredictionService vs neighbours
+
+    # --- Modelling outputs ---
+    baseline_predictions: Optional[BaselinePredictions]   # SAP/carbon/heat after rebaselining
+    recommendations: list[Recommendation]
+    impact_predictions: Optional[ImpactPredictions]
+    optimised_package: Optional[OptimisedPackage]
+
+    # --- Derived ---
+    @property
+    def source_path(self) -> Literal["site_notes", "epc_with_overlay"]: ...
+
+    @property
+    def effective_epc(self) -> EpcPropertyData:
+        """The EPC the modelling pipeline actually scores against."""
+        ...
+```
+
+### 5.2 `Properties` collection
+
+A first-class iterable, so batch operations are obvious:
+
+```python
+@dataclass
+class Properties:
+    items: list[Property]
+
+    def __iter__(self) -> Iterator[Property]: ...
+    def __len__(self) -> int: ...
+    def filter(self, pred: Callable[[Property], bool]) -> "Properties": ...
+    def map(self, fn: Callable[[Property], Property]) -> "Properties": ...
+    def with_landlord_overrides(self) -> "Properties": ...
+```
+
+Services typically take and return `Properties`, not lists.
+
+### 5.3 Other aggregates
+
+| Aggregate | Owns | Repo |
+|---|---|---|
+| `Property` | property identity, epc, site_notes, landlord_overrides, enrichments, modelling results | `PropertyRepo` |
+| `Plan` | per-property modelling output, scenario membership, plan + recommendations + parts | `RecommendationsRepo` |
+| `Scenario` | portfolio-wide scenario metadata | `RecommendationsRepo` |
+| `Subtask` / `Task` | SQS fanout state | `SubtaskRepo` |
+| `EpcCache` | gov-API responses keyed by UPRN, with freshness/TTL | `EpcCacheRepo` |
+| `GenericData` | UPRN-range geospatial, postcode lookups, shared static data | `GenericDataRepo` |
+
+Aggregates are loaded **whole** — never half a `Property`. If a slice is too large to load eagerly (e.g. recommendation history), it lives in a separate aggregate.
+
+---
+
+## 6. Source-of-truth and overlay precedence
+
+There are exactly **two modelling paths**. The `Property.source_path` property selects.
+
+### 6.1 Path 1 — Site notes
+
+If a `Property` has `site_notes` and they are newer than any available EPC (or no EPC exists), site notes are the **complete** source of truth:
+
+- `effective_epc` = `site_notes.to_epc_property_data()`.
+- EPC fields not covered by site notes — **none expected**. Site notes are committed to being a full-coverage survey. Treat any gap as a survey-quality bug, not a fallback signal.
+- `LandlordOverrides` are not applicable in Path 1 (the survey supersedes).
+
+### 6.2 Path 2 — EPC with landlord overlay
+
+If a `Property` has no site notes (or the EPC is newer):
+
+- `effective_epc` = `epc` with `landlord_overrides` applied as a sparse field-level overlay (`landlord > epc`).
+- `LandlordOverrides` are sparse: each row represents one corrected field. Schema TBD at implementation time; assume flat input via Excel/CSV for v1, with a flag to revisit shape after first customer onboarding.
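+
+A minimal sketch of the sparse field-level overlay, assuming overrides arrive as a flat field-to-value mapping. `EpcStub` and `apply_overrides` are hypothetical stand-ins; the real `EpcPropertyData` and `LandlordOverrides` shapes are still TBD.
+
+```python
+from dataclasses import dataclass, fields, replace
+from typing import Any, Optional
+
+
+@dataclass(frozen=True)
+class EpcStub:
+    """Hypothetical stand-in for EpcPropertyData with two illustrative fields."""
+    main_heating: Optional[str] = None
+    wall_construction: Optional[str] = None
+
+
+def apply_overrides(epc: EpcStub, overrides: dict[str, Any]) -> EpcStub:
+    """Return a copy of `epc` with each overridden field replaced (landlord > epc)."""
+    known = {f.name for f in fields(epc)}
+    unknown = set(overrides) - known
+    if unknown:
+        raise ValueError(f"override targets unknown EPC fields: {sorted(unknown)}")
+    # Fields absent from `overrides` keep their EPC value; the input is not mutated.
+    return replace(epc, **overrides)
+```
+
+Rejecting unknown field names up front is one way to surface typos in an Excel/CSV upload before they silently fail to apply.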
+
+### 6.3 Recency tie-break
+
+When a property has **both** site notes and a public EPC, the newer of the two wins. Rationale: a recent EPC may reflect retrofit work done after our survey; conversely a recent survey reflects on-site observations the EPC cannot capture.
+
+This tie-break is implemented in `Property.source_path` and may be tuned later (e.g. always prefer surveys regardless of date, or per-portfolio policy).
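+
+The "newer wins" rule can be sketched as a pure function. This assumes each source carries a single comparable date; `select_source_path` is a hypothetical helper, and the handling of an exact date tie and of a property with neither source is an assumption the PRD leaves open.
+
+```python
+from datetime import date
+from typing import Literal, Optional
+
+SourcePath = Literal["site_notes", "epc_with_overlay"]
+
+
+def select_source_path(
+    site_notes_date: Optional[date],
+    epc_date: Optional[date],
+) -> SourcePath:
+    """Pick the modelling path: the newer of survey and EPC wins."""
+    if site_notes_date is None:
+        # Assumption: with no survey (even if the EPC is also missing) use Path 2.
+        return "epc_with_overlay"
+    if epc_date is None:
+        return "site_notes"
+    # Assumption: an exact date tie goes to the survey.
+    return "site_notes" if site_notes_date >= epc_date else "epc_with_overlay"
+```
+
+Keeping the rule in one function makes a later policy change (always prefer surveys, or a per-portfolio setting) a single-site edit.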
+
+### 6.4 Rebaselining trigger
+
+The modelling pipeline re-predicts SAP / carbon / heat / kwh whenever:
+
+- `effective_epc` differs from the canonical baseline (i.e. raw EPC with no overrides), **or**
+- The previous modelling snapshot is missing or stale.
+
+The exact diff mechanism (hash of effective EPC, dirty-flag on overrides, timestamp comparison) is an implementation detail; recommendation is to start with a content hash stored alongside the previous run.
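+
+A minimal sketch of the content-hash option, assuming the effective EPC is a dataclass whose fields serialise to JSON. `EpcSnapshot`, `epc_content_hash`, and `needs_rebaseline` are hypothetical names; only the comparison against the previous run's stored hash is shown.
+
+```python
+import hashlib
+import json
+from dataclasses import asdict, dataclass
+from typing import Optional
+
+
+@dataclass(frozen=True)
+class EpcSnapshot:
+    """Hypothetical stand-in for EpcPropertyData."""
+    main_heating: str
+    floor_area_m2: float
+
+
+def epc_content_hash(epc: EpcSnapshot) -> str:
+    """Hash the EPC's field values; keys are sorted so equal data hashes equally."""
+    payload = json.dumps(asdict(epc), sort_keys=True, default=str)
+    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
+
+
+def needs_rebaseline(effective_epc: EpcSnapshot, previous_hash: Optional[str]) -> bool:
+    """True when no previous run exists or the effective EPC changed since it."""
+    return previous_hash is None or epc_content_hash(effective_epc) != previous_hash
+```
+
+Because the hash is taken over `effective_epc`, a landlord override and a refreshed gov EPC both invalidate the previous run without separate dirty flags.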
+
+### 6.5 Deprecated concepts
+
+- **Patches** (`patch_epc`) — removed. Functionality subsumed by `LandlordOverrides`.
+- **Already-installed measures** — likely subsumed by `LandlordOverrides` ("we have a heat pump now" → override heating fields). Confirmed at implementation time.
+- **Non-invasive recommendations** — TBD whether this concept survives; not blocking.
+
+---
+
+## 7. Persistence: repositories and unit of work
+
+### 7.1 What a repository is
+
+A repository owns the SQL for one aggregate. Nothing else writes SQL for that aggregate. Callers see only domain objects.
+
+```python
+class PropertyRepo(Protocol):
+    def get(self, identity: PropertyIdentity) -> Optional[Property]: ...
+    def bulk_save(self, uow: UnitOfWork, properties: Properties) -> None: ...
+    def find_by_portfolio(self, portfolio_id: UUID) -> Properties: ...
+    def find_stale(self, portfolio_id: UUID, threshold: timedelta) -> Properties: ...
+```
+
+Initial implementations delegate to the current `db_funcs.*` modules, until the SQL is rewritten under sub-PRD (iii), to avoid a big-bang SQL rewrite; the interface is fixed regardless.
+
+### 7.2 Unit of Work
+
+Multi-table writes inside a single aggregate, or across aggregates that share a transaction (e.g. property + plan + recommendations), go through a `UnitOfWork`:
+
+```python
+with self.uow_factory() as uow:
+    self.property_repo.bulk_save(uow, properties)
+    self.recommendations_repo.bulk_save(uow, plans)
+    uow.commit()
+```
+
+UoW owns the SQLAlchemy session lifecycle. Repos use the session passed in via the UoW. Outside a UoW, repos use a short-lived read session.
+
+### 7.3 Repository inventory
+
+| Repo | Tables it owns |
+|------|----------------|
+| `PropertyRepo` | `properties`, `property_details_epc`, `property_spatial` |
+| `EpcCacheRepo` | new table: `epc_api_cache` (TTL, raw API response, mapped `EpcPropertyData`) |
+| `SiteNotesRepo` | new table: `site_notes` (replaces current `energy_assessments`) |
+| `LandlordOverridesRepo` | new table: `landlord_overrides` (sparse, per-field rows for audit) |
+| `RecommendationsRepo` | `plans`, `recommendations`, `recommendation_parts`, `scenarios` |
+| `GenericDataRepo` | new table or S3-backed: UPRN-range geospatial + postcode-keyed shared static data |
+| `SubtaskRepo` | `tasks`, `subtasks` (existing) |
+
+DDL migrations are scoped to sub-PRD (iii).
+
+### 7.4 Fakes
+
+For tests, each repo has a `FakeXRepo` companion backed by a dict. Service unit tests inject fakes. No DB required.
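+
+A dict-backed fake might look like the sketch below. `PropertyKey` and `PropertyStub` are hypothetical simplifications of `PropertyIdentity` and `Property`, and the fake accepts the `uow` argument only to mirror the `PropertyRepo` signature.
+
+```python
+from dataclasses import dataclass
+from typing import Any, Iterable, Optional
+from uuid import UUID
+
+
+@dataclass(frozen=True)
+class PropertyKey:
+    """Hypothetical hashable key; the real PropertyIdentity carries more fields."""
+    portfolio_id: UUID
+    uprn: int
+
+
+@dataclass
+class PropertyStub:
+    """Hypothetical stand-in for the Property aggregate."""
+    key: PropertyKey
+    postcode: str
+
+
+class FakePropertyRepo:
+    """In-memory stand-in for PropertyRepo; nothing here touches a database."""
+
+    def __init__(self) -> None:
+        self._rows: dict[PropertyKey, PropertyStub] = {}
+
+    def get(self, key: PropertyKey) -> Optional[PropertyStub]:
+        return self._rows.get(key)
+
+    def bulk_save(self, uow: Any, properties: Iterable[PropertyStub]) -> None:
+        # `uow` is ignored: a dict write needs no transaction.
+        for prop in properties:
+            self._rows[prop.key] = prop
+
+    def find_by_portfolio(self, portfolio_id: UUID) -> list[PropertyStub]:
+        return [p for p in self._rows.values() if p.key.portfolio_id == portfolio_id]
+```
+
+A service test can then construct the fake, seed it through `bulk_save`, and assert on what the service wrote back.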
+
+---
+
+## 8. ML contract
+
+### 8.1 Where ML lives
+
+| Concern | Owner |
+|---|---|
+| Defining the EPC → features transform | **This repo** (`ara.domain.ml.EpcMlTransform`) |
+| Loading data, applying transform, writing training parquet to S3 | **This repo** (sub-PRD (ii) batch job) |
+| Training, hyperparameter search, deployment | **Autogluon repo** |
+| Scoring at modelling time | **This repo** (`FeatureBuilder` calls `EpcMlTransform`, sends DataFrame to deployed lambda) |
+
+The autogluon repo is intentionally **dumb**: it consumes parquet, knows which column is the target, knows which columns to ignore. It has no EPC semantics.
+
+### 8.2 `EpcMlTransform`
+
+A separate class (not a method on `EpcPropertyData`), because:
+
+- The data class stays clean of training-infrastructure concerns.
+- Versioned transforms (`EpcMlTransformV1`, `EpcMlTransformV2`) swap easily.
+- Future need: injection of normalisation stats from the training set is straightforward on a class, awkward on a dataclass.
+
+```python
+class EpcMlTransform:
+    VERSION: str = "1.0.0"  # semver
+
+    def to_row(self, epc: EpcPropertyData) -> dict[str, Any]: ...
+    def to_rows(self, properties: Properties) -> pd.DataFrame: ...
+    def schema(self) -> dict[str, type]: ...  # for parquet emission + validation
+```
+
+The interesting work — flattening `List[SapWindow]`, `List[SapBuildingPart]` into fixed-width columns — lives inside this class. Domain decisions (top-N windows, aggregate roofs, etc.) are encoded here and reviewed by Khalim. Sub-PRD (ii) goes into detail.
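+
+To illustrate the flattening, here is a sketch of one possible top-N policy for windows. `WindowStub`, `flatten_windows`, the column names, and the choice of ranking by area are all assumptions; the actual domain decisions belong to sub-PRD (ii).
+
+```python
+from dataclasses import dataclass
+from typing import Any
+
+
+@dataclass(frozen=True)
+class WindowStub:
+    """Hypothetical stand-in for SapWindow."""
+    area_m2: float
+    u_value: float
+
+
+def flatten_windows(windows: list[WindowStub], top_n: int = 3) -> dict[str, Any]:
+    """Flatten a variable-length window list into 2 + 2 * top_n fixed columns."""
+    largest = sorted(windows, key=lambda w: w.area_m2, reverse=True)[:top_n]
+    row: dict[str, Any] = {
+        "window_count": len(windows),
+        "window_total_area_m2": sum(w.area_m2 for w in windows),
+    }
+    for i in range(top_n):
+        # Slots beyond the available windows are emitted as None so the
+        # column set is identical for every property.
+        window = largest[i] if i < len(largest) else None
+        row[f"window_{i + 1}_area_m2"] = window.area_m2 if window is not None else None
+        row[f"window_{i + 1}_u_value"] = window.u_value if window is not None else None
+    return row
+```
+
+The aggregate columns keep some signal from windows that fall outside the top N, while the fixed slots give the model per-window detail.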
+
+### 8.3 Versioning
+
+- Transform class is **semver-tagged** (`VERSION = "1.0.0"`).
+- S3 path for training parquet includes the version: `s3://.../training/v1.0.0/...`.
+- Deployed scoring lambda is tagged with the transform version it was trained against.
+- Modelling pipeline asserts at startup that its `EpcMlTransform.VERSION` matches the deployed lambda's tag; a mismatch is a hard failure at deploy time.
+
+Bump major when removing or renaming columns. Bump minor when adding optional columns (older models still scoreable; new models can be trained against new fields).
+
+### 8.4 Two model families, one transform
+
+Both ML services use the same transform:
+
+| Service | Lambda | Target |
+|---|---|---|
+| `KwhPredictionService` (service #5) | `kwh-models-*` | annual kWh + bills |
+| `ImpactPredictionService` (service #7) | `impact-models-*` | SAP, carbon, heat demand, post-retrofit kWh |
+
+The two families are trained against the same input feature schema; only target columns differ. Sub-PRD (ii) handles training-time details.
+
+---
+
+## 9. Service catalogue
+
+Fifteen classes implement the two pipelines end-to-end: four fetchers, eight domain services, and three orchestrators. Detailed signatures are deliberately left for implementers; this PRD documents purpose, dependencies, and rough shape.
+
+### 9.1 Fetchers (called by `IngestionPipeline`)
+
+| # | Class | Purpose | Dependencies |
+|---|---|---|---|
+| F1 | `EpcClientService` | Fetches EPCs from new gov API. Already exists at `backend/epc_client/`. | httpx |
+| F2 | `GeospatialFetcher` | Fetches UPRN-range geospatial data (replaces `OpenUprnClient` use in current engine). | S3 / Ordnance Survey API |
+| F3 | `SolarFetcher` | Wraps Google Solar API; building-level + unit-level scenes. | Google Solar API |
+| F4 | `SiteNotesIngester` | Loads site notes from Excel uploads / structured input. Persists via `SiteNotesRepo`. | S3, repo |
+
+### 9.2 Domain services (called by `ModellingPipeline`)
+
+| # | Class | Original-list # | Purpose | Reads | Writes |
+|---|---|---|---|---|---|
+| S1 | `EpcRemappingService` | 4 | Re-map legacy / historical EPCs into new `EpcPropertyData` shape. | `EpcCacheRepo` | `EpcCacheRepo` (mapped column) |
+| S2 | `EpcPredictionService` | 3 | For every property: produce predicted EPC + per-field anomaly flags vs neighbours. Used both for gap-fill (Path 2 if EPC missing) and UI surfacing. | `EpcCacheRepo`, `GenericDataRepo` | — |
+| S3 | `FeatureBuilder` | (new) | Wraps `EpcMlTransform`. Converts `Properties` → scoring DataFrame. | — | — |
+| S4 | `KwhPredictionService` | 5 | Calls kWh + bills ML lambda; attaches results to `Property.baseline_predictions` / per-measure. | `FeatureBuilder` | — |
+| S5 | `RecommendationService` | 6 | Generates per-property recommendations using `effective_epc`, materials, exclusions, etc. Replaces current `Recommendations` (1383 LOC). | `MaterialsRepo` | — |
+| S6 | `ImpactPredictionService` | 7 | Calls SAP / carbon / heat / bills impact lambda for each recommendation. | `FeatureBuilder` | — |
+| S7 | `OptimiserService` | 8 | Produces optimised retrofit packages. Wraps current `CostOptimiser` / `GainOptimiser` / `optimise_with_scenarios`. | — | — |
+| S8 | `ResultsPersister` | 9 | Final step: writes plans, recommendations, property updates via repos under one UoW. | — | All write repos |
+
+### 9.3 Orchestrators
+
+| # | Class | Purpose |
+|---|---|---|
+| O1 | `IngestionPipeline` | Per-batch SQS consumer. Calls F1–F4, persists via repos. |
+| O2 | `ModellingPipeline` | Per-batch SQS consumer. Reads from repos, runs S1→S8 in order, ends with persistence. |
+| O3 | `RefreshOrchestrator` | Top-level: triggers Ingestion → diff → optionally Modelling. |
+
+### 9.4 `ModellingPipeline` step order
+
+For each `Property` in the batch:
+
+```
+1. PropertyRepo.get()  →  Property (epc, site_notes, overrides, geospatial, solar)
+2. EpcRemappingService — if epc is in legacy schema, upgrade to current
+3. EpcPredictionService — produce predicted EPC + anomaly flags (always runs)
+4. Compute Property.effective_epc (path-1 or path-2)
+5. KwhPredictionService — baseline kwh + bills
+6. RecommendationService — generate candidate measures
+7. ImpactPredictionService — predict per-measure impact
+8. OptimiserService — select optimal package
+9. KwhPredictionService — re-score on optimised package (tenant savings)
+10. ResultsPersister — write Plan + Recommendations under one UoW
+```
+
+Steps 1–4 are per-property. Steps 5–9 operate on the whole batch at once, combining it into a single ML call per step where possible (the lambdas accept a DataFrame; today's code already batches).
+
+### 9.5 Per-service contracts — deferred
+
+Method signatures, return types, error semantics, and edge-case behaviour are **explicitly out of scope** for this PRD. The implementer of each service runs a `/grill-me` session against this document and produces a detailed sub-design before coding.
+
+---
+
+## 10. Cross-batch concerns
+
+| Concern | Status | Approach |
+|---|---|---|
+| Building-level solar adjustment | Deferred — future TODO, not implemented today. | The current `building_ids` block in `model_engine` is dead-ish; it operates on the in-process batch only. New design preserves that limitation. Future feature: a post-modelling consolidation pass that groups results by `building_id` across batches and re-optimises. |
+| Portfolio aggregation | Dropped. | Front-end computes aggregations dynamically from per-property plans. `extract_portfolio_aggregation_data` in current engine is dead code (defined, never called) and will be deleted. |
+| Shared upstream data | Handled by orchestrator partitioning + `GenericDataRepo`. | Trigger endpoint groups UPRNs by postcode / UPRN-range before SQS chunking so each batch maximises intra-batch sharing. `GenericDataRepo` caches across batches so first batch pays, subsequent batches hit cache. |
+
+---
+
+## 11. Directory layout
+
+Proposal — team to tweak.
+
+```
+ara/                                  # new top-level package, sibling of backend/
+├── domain/
+│   ├── __init__.py
+│   ├── property.py                   # Property aggregate
+│   ├── properties.py                 # Properties collection
+│   ├── identity.py                   # PropertyIdentity, AddressLines
+│   ├── site_notes.py                 # SiteNotes (replaces energy_assessment)
+│   ├── landlord_overrides.py
+│   ├── geospatial.py
+│   ├── solar.py
+│   ├── recommendations.py            # Recommendation, OptimisedPackage
+│   ├── predictions.py                # BaselinePredictions, ImpactPredictions
+│   ├── anomaly_flags.py              # EpcAnomalyFlags
+│   └── ml/
+│       ├── __init__.py
+│       ├── transform.py              # EpcMlTransform (versioned)
+│       └── schema.py                 # scoring DataFrame schema
+│
+├── fetchers/
+│   ├── __init__.py
+│   ├── epc_client.py                 # alias / re-export of backend/epc_client/
+│   ├── geospatial.py
+│   ├── solar.py
+│   └── site_notes_ingester.py
+│
+├── repos/
+│   ├── __init__.py
+│   ├── unit_of_work.py
+│   ├── property_repo.py
+│   ├── epc_cache_repo.py
+│   ├── site_notes_repo.py
+│   ├── landlord_overrides_repo.py
+│   ├── recommendations_repo.py
+│   ├── generic_data_repo.py
+│   └── subtask_repo.py
+│
+├── services/
+│   ├── __init__.py
+│   ├── epc_remapping.py
+│   ├── epc_prediction.py             # nearby-similar + anomaly flags
+│   ├── feature_builder.py            # uses domain.ml.EpcMlTransform
+│   ├── kwh_prediction.py
+│   ├── impact_prediction.py
+│   ├── recommendation.py
+│   ├── optimiser.py                  # wraps recommendations/optimiser/
+│   └── results_persister.py
+│
+├── orchestrators/
+│   ├── __init__.py
+│   ├── ingestion_pipeline.py
+│   ├── modelling_pipeline.py
+│   └── refresh_orchestrator.py
+│
+├── api/
+│   ├── __init__.py
+│   ├── routers/
+│   │   ├── ingestion.py              # if two APIs
+│   │   └── modelling.py
+│   └── schemas/                      # request/response Pydantic models
+│
+└── tests/
+    ├── fakes/                        # FakePropertyRepo, FakeEpcClient, etc.
+    ├── unit/                         # service tests using fakes only
+    └── integration/                  # real DB + real SQS via localstack
+```
+
+`backend/` continues to host the legacy code during phase 1. Once the last portfolio is migrated, `backend/engine/`, `backend/SearchEpc.py`, `backend/Property.py` are deleted.
+
+Reused intact (no rewrite needed):
+
+- `backend/epc_client/` — the new gov API client. Wrapped by `ara/fetchers/epc_client.py`.
+- `datatypes/epc/domain/` — the new EPC schema. `Property.epc: EpcPropertyData` references it directly.
+- `recommendations/optimiser/` — wrapped by `ara/services/optimiser.py`.
+- `backend/app/db/` — repos delegate into `db_funcs.*` until the SQL is rewritten under sub-PRD (iii).
+
+---
+
+## 12. Testing strategy
+
+### 12.1 Unit tests (the bulk)
+
+Every service test injects fake fetchers and fake repos. No DB, no network, no ML lambda. A service test verifies one slice of logic in 5–30 lines.
+
+Example:
+
+```python
+def test_epc_prediction_flags_anomalous_wall_type():
+    neighbours = [_make_epc(wall_construction="solid") for _ in range(5)]
+    target = _make_property(epc=_make_epc(wall_construction="cavity"))
+    repo = FakeGenericDataRepo(neighbours_by_postcode={target.identity.postcode: neighbours})
+
+    svc = EpcPredictionService(generic_repo=repo)
+    result = svc.run(Properties([target]))
+
+    assert result.items[0].epc_anomaly_flags.wall_construction == "differs_from_neighbours"
+```
+
+### 12.2 Integration tests
+
+One per pipeline (Ingestion, Modelling, Refresh). Real Postgres (e.g. via testcontainers), SQS via localstack, fake fetchers (replaying recorded fixtures), fake ML lambdas (returning canned predictions). Catches schema / SQL / transaction issues.
+
+### 12.3 Contract tests
+
+The transform (`EpcMlTransform`) has its own test suite:
+
+- Golden file: given a fixed `Property`, output matches an expected DataFrame row exactly.
+- Schema test: the output columns exactly match a checked-in CSV header (so the autogluon team sees breakage on the PR).
+
+### 12.4 What is NOT tested
+
+- The autogluon repo's training code — owned there.
+- The gov EPC API behaviour — assumed to match the official spec.
+- Front-end aggregation logic — owned there.
+
+---
+
+## 13. Observability
+
+Each pipeline step emits a **structured log line** at start and end with:
+
+```
+{step, property_id, uprn, portfolio_id, subtask_id, duration_ms, outcome, error?}
+```
+
+Errors propagate with the `Property.identity` attached, so a portfolio of 100k can be triaged by grep.
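+
+One way to emit these lines is a small context manager around each step. This is a sketch using only the standard library; the logger name `ara.pipeline`, the helper `log_step`, and the `started` outcome value are assumptions.
+
+```python
+import json
+import logging
+import time
+from contextlib import contextmanager
+from typing import Any, Iterator
+
+logger = logging.getLogger("ara.pipeline")  # hypothetical logger name
+
+
+@contextmanager
+def log_step(step: str, **context: Any) -> Iterator[None]:
+    """Emit one JSON log line when a step starts and one when it ends."""
+    logger.info(json.dumps({"step": step, "outcome": "started", **context}))
+    started = time.perf_counter()
+    try:
+        yield
+    except Exception as exc:
+        duration_ms = round((time.perf_counter() - started) * 1000)
+        logger.error(json.dumps({
+            "step": step, "outcome": "failed", "duration_ms": duration_ms,
+            "error": repr(exc), **context,
+        }))
+        raise  # the caller still sees the original exception
+    else:
+        duration_ms = round((time.perf_counter() - started) * 1000)
+        logger.info(json.dumps({
+            "step": step, "outcome": "complete", "duration_ms": duration_ms, **context,
+        }))
+```
+
+Usage would be along the lines of `with log_step("kwh_prediction", uprn=12345, portfolio_id="..."):` around the step body, with context values kept JSON-serialisable.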
+
+The existing task/subtask state machine is preserved — `IngestionPipeline` and `ModellingPipeline` update subtask status at start (`in progress`) and at end (`complete` / `failed`), with the CloudWatch log URL attached as today.
+
+CloudWatch alarms exist on subtask failure rate; thresholds remain unchanged.
+
+---
+
+## 14. Data flow: a worked example
+
+A landlord uploads a corrected heating system for UPRN 12345 via the UI.
+
+1. **UI** → `POST /properties/12345/overrides` → writes to `landlord_overrides` table via `LandlordOverridesRepo`.
+2. **RefreshOrchestrator** invoked (either automatically on override-write, or by a "re-model" button). Note: ingestion is *not* triggered because no external state changed.
+3. **ModellingPipeline** invoked on a batch of `[12345]`:
+   - Reads `Property(uprn=12345)` from `PropertyRepo`.
+   - `Property.effective_epc` = epc + landlord_overrides → heating system fields differ from baseline.
+   - Rebaselining triggered: `KwhPredictionService` re-predicts baseline SAP / carbon / heat / kwh.
+   - `RecommendationService` regenerates recommendations against the new baseline.
+   - `OptimiserService` re-picks optimal package.
+   - `ResultsPersister` writes new plan under one UoW (old plan is superseded; whether to soft-archive is a sub-PRD (iii) decision).
+
+Total external calls: zero. The override write is the only thing that hit a network boundary, and that was the inbound HTTP from the UI.
+
+---
+
+## 15. Open questions for team review
+
+1. **One API vs two** (§4.5) — clean interfaces allow either; pick at implementation.
+2. **`LandlordOverrides` shape** (§6.2) — flat-Excel-shape for v1, with a flag to revisit after first customer.
+3. **`already_installed` and `non_invasive_recommendations`** (§6.5) — both likely subsumed by overlay, but final call deferred.
+4. **Recency tie-break policy** (§6.3) — default "newer wins"; team to consider per-portfolio override.
+5. **`GenericDataRepo` storage backend** — Postgres table, S3, or DynamoDB. Postgres is the path of least infra change; recommend defaulting to that.
+6. **Soft-archive vs hard-overwrite** for superseded plans (§14) — affects audit / undo behaviour. Defer to sub-PRD (iii).
+7. **Building-level optimisation as a Phase 2 service** (§10) — agreed deferred; flag for roadmap discussion.
+8. **Transform versioning policy** (§8.3) — semver chosen; team to confirm bump conventions.
+
+---
+
+## 16. Linked sub-PRDs (placeholders)
+
+- **Sub-PRD (ii) — ML training pipeline** — `docs/sub-prds/ml-training-pipeline.md` (TBC)
+- **Sub-PRD (iii) — DB schema migration** — `docs/sub-prds/db-schema-migration.md` (TBC)
+- **Sub-PRD (iv) — Historical EPC re-mapping** — `docs/sub-prds/historical-epc-remap.md` (TBC)
+
+Each sub-PRD owner: TBC. Each is independently reviewable but consumes the contracts defined in §5 (`Property` aggregate), §7 (repos), §8 (ML transform).
+
+---
+
+## 17. Next steps
+
+1. Team review of this PRD (target: ~1 week).
+2. Open follow-up grill sessions per service (`/grill-me` on each of S1–S8 + F1–F4) before that service is implemented.
+3. Break into issues via `/to-issues` against the project tracker.
+4. Stand up the empty `ara/` package skeleton + fakes + first integration-test scaffold as PR-1.
+5. Land services in dependency order: domain → repos → fetchers → services → orchestrators → API.
+
+Phase 1 milestone gate: first portfolio routed through new pipeline with parity against old engine (manual spot-check on 5 representative properties).

From 3afeeac1b52100b07a7787a27e54b8475bedc31e Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar <kconnkowlessar@gmail.com>
Date: Wed, 13 May 2026 20:10:39 +0000
Subject: [PATCH 2/8] editing prep property ui surfaces logic

---
 ara_backend_design.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ara_backend_design.md b/ara_backend_design.md
index f901936b..e235cf6d 100644
--- a/ara_backend_design.md
+++ b/ara_backend_design.md
@@ -48,7 +48,7 @@ The contracts this PRD defines are the inputs each sub-PRD consumes.
 3. **Make every service unit-testable against fakes** — no test needs a real DB, a real gov API, or a real ML lambda to verify business logic.
 4. **Establish a single `Property` aggregate root** as the domain centrepiece; all 9 modelling concerns are slices of one aggregate.
 5. **Versioned ML data contract** — the EPC-to-features transform is the single shared artifact between this repo and the autogluon repo.
-6. **Per-property UI surfaces** — fetched data is shown to users for review and override **before** modelling runs; modelling is triggered separately.
+6. **Per-property UI surfaces** — fetched data can be shown to users for review and override **before** modelling runs; modelling is triggered separately. This enables a landlord-facing version of the product in which we fetch the open data, present it back to the user for review, and then perform the modelling.
 
 ### 2.2 Non-goals
 

From d9c169608528475e0ed33048b8e637521143b9ad Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar <kconnkowlessar@gmail.com>
Date: Wed, 13 May 2026 21:26:18 +0000
Subject: [PATCH 3/8] added architectural decisions, added to prd

---
 CONTEXT.md                                    | 208 ++++++++++++++++++
 docs/adr/0001-two-source-paths.md             |  10 +
 docs/adr/0002-property-aggregate-root.md      |  14 ++
 ...3-strict-ingestion-modelling-separation.md |  13 ++
 4 files changed, 245 insertions(+)
 create mode 100644 CONTEXT.md
 create mode 100644 docs/adr/0001-two-source-paths.md
 create mode 100644 docs/adr/0002-property-aggregate-root.md
 create mode 100644 docs/adr/0003-strict-ingestion-modelling-separation.md

diff --git a/CONTEXT.md b/CONTEXT.md
new file mode 100644
index 00000000..bd71d6b5
--- /dev/null
+++ b/CONTEXT.md
@@ -0,0 +1,208 @@
+# Ara
+
+The Domna product for domestic retrofit modelling: ingests open-source EPC data, lets users correct or supersede it with their own surveys, and produces optimised retrofit packages for each property in a portfolio.
+
+## Language
+
+### Product
+
+**Ara**:
+The Domna product. Latin for "the altar"; named under Domna's classical-naming convention. Covers both the modelling product and the backend that powers it.
+_Avoid_: ARA (acronym style), v2 backend, the new backend
+
+**Domna**:
+The company. Roman name; sibling to Ara in the same naming convention.
+
+### Energy Performance Certificates
+
+**EPC**:
+An Energy Performance Certificate — a government-issued document rating a dwelling's energy efficiency from A (best) to G (worst).
+_Avoid_: energy certificate, energy report
+
+**Certificate Number**:
+The unique identifier assigned to an EPC by the government registry.
+_Avoid_: cert number, EPC ID
+
+**Registration Date**:
+The date an EPC was lodged with the government register; used to identify the most recent certificate for a property.
+_Avoid_: assessment date, submission date
+
+**EPC Band**:
+A single letter A–G representing a property's current or potential energy efficiency rating.
+_Avoid_: energy rating, EPC grade, EPC score
+
+**Schema Type**:
+The versioned RdSAP or SAP schema that describes the structure of an EPC's raw data (e.g. `RdSAP-Schema-21.0.1`).
+_Avoid_: schema version, EPC format
+
+**Domestic Certificate**:
+An EPC issued for a residential dwelling, as opposed to a commercial one.
+_Avoid_: residential EPC, home EPC
+
+### Properties and addresses
+
+**Property**:
+The Ara domain aggregate representing a single dwelling under modelling: its identity, source data, enrichments, and modelling outputs.
+_Avoid_: dwelling, unit, home, asset
+
+**Properties**:
+A first-class collection of Property objects; the unit of bulk operation in services.
+_Avoid_: property list, batch (used for SQS chunks)
+
+**UPRN**:
+Unique Property Reference Number — the government-issued permanent identifier for a physical address in the UK.
+_Avoid_: property ID, address ID, code
+
+**Postcode**:
+A UK postal code used to group nearby addresses; the primary search key for finding EPC records.
+_Avoid_: zip code, postal code
+
+**User Address**:
+A free-text address string provided by a user or imported from a customer dataset, before any normalisation or matching.
+_Avoid_: user input, raw address, user_inputed_address
+
+**Comparable Properties**:
+The reference cohort matched to a target Property by both geographic proximity (postcode prefix / UPRN range) and physical similarity (property type, built form, age band); used by the EPC Prediction Service for gap-filling and anomaly detection.
+_Avoid_: neighbours, similar properties, peer set
+
+### Source data
+
+**Site Notes**:
+The full-coverage record produced by a Domna survey of a single Property; carries every EPC field the modelling pipeline requires, and when present supersedes the public EPC for that Property — except when the public EPC is newer.
+_Avoid_: energy assessment, site survey, field survey, Domna survey, Hestia survey
+
+**Landlord Overrides**:
+Property data supplied by a landlord that may correct or supplement the public EPC for a single Property; triggers Rebaselining when applied; not applicable when Site Notes are present.
+_Avoid_: patches (deprecated), corrections, manual EPC, edits
+
+### Modelling
+
+**Effective EPC**:
+The EpcPropertyData scored by the modelling pipeline for a single Property, derived from either Site Notes alone or the public EPC with Landlord Overrides applied; carries source-derived physical fields and originally recorded performance values, with model-rebaselined performance held separately in Baseline Performance.
+_Avoid_: modelling EPC, working EPC, resolved EPC, derived EPC
+
+**Rebaselining**:
+Recomputing a Property's Baseline Performance via ML when its Effective EPC diverges from the originally lodged public EPC, or when no previous baseline exists.
+_Avoid_: re-scoring, re-prediction, performance recomputation
+
+**Baseline Performance**:
+The set of ML-predicted performance values for a single Property — SAP, carbon emissions, heat demand, annual kWh — produced by scoring the Effective EPC against the kWh model; distinct from the originally recorded performance fields on the Effective EPC.
+_Avoid_: baseline predictions, predicted baseline, rebaselined values
+
+**EPC Anomaly Flag**:
+A per-field indicator that a Property's value for an EPC field differs significantly from Comparable Properties; advisory only — surfaces in the UI to prompt user review, does not block modelling.
+_Avoid_: outlier, mismatch, divergence flag
+
+### Outputs
+
+**Scenario**:
+A named portfolio-level container for a single modelling run, capturing the goal (e.g. Increasing EPC), budget, exclusions, and housing type; holds many Plans.
+_Avoid_: project, batch, run-set
+
+**Plan**:
+The per-Property output of a single modelling run; belongs to one Scenario and carries the Property's full Recommendation list, Optimised Package, and post-retrofit predictions.
+_Avoid_: recommendation set, output, result
+
+**Recommendation**:
+A single proposed retrofit measure for a Property, with its cost, SAP impact, kWh savings, carbon savings, and parts list.
+_Avoid_: suggestion, option
+
+**Optimised Package**:
+The subset of a Property's Recommendations selected by the Optimiser Service for installation, chosen to satisfy the Scenario's goal subject to budget.
+_Avoid_: selected measures, default measures, optimal solution, recommended bundle
+
+**Measure Type**:
+The catalogue classification of a retrofit measure (e.g. `solar_pv`, `loft_insulation`, `ashp`); one or more Recommendations reference the same Measure Type with property-specific cost and impact.
+_Avoid_: measure (ambiguous), category
+
+### Address matching
+
+**Lexiscore**:
+A similarity score in [0, 1] between a User Address and a candidate EPC address; combines token overlap and character-level similarity.
+_Avoid_: score, match score, similarity
+
+**Lexirank**:
+Dense rank of candidates sorted by Lexiscore descending; rank 1 = best match.
+_Avoid_: rank, position
+
+**UPRN Candidate**:
+An EPC Search Result that is a plausible match for a given User Address, before scoring decides the winner.
+_Avoid_: match candidate, result
+
+**Score Threshold**:
+The minimum Lexiscore (currently 0.6) below which no match is returned even if a candidate exists.
+_Avoid_: minimum score, cutoff
+
+**Ambiguous Match**:
+A matching outcome where two or more candidates share Lexirank 1, making it impossible to select a unique winner.
+_Avoid_: tie, draw, duplicate
+
+**Best Match**:
+The single UPRN Candidate with Lexirank 1 that meets or exceeds the Score Threshold.
+_Avoid_: winner, top result
+
+### API and integration
+
+**EPC Search Result**:
+A lightweight record returned by the government domestic search endpoint — address lines, postcode, UPRN, band, and certificate number, but not full certificate data.
+_Avoid_: search row, EPC row, result
+
+**EPC Property Data**:
+The fully mapped domain object produced after fetching and parsing a complete EPC certificate; the schema the modelling pipeline operates against.
+_Avoid_: EPC data, certificate data, parsed EPC
+
+**Old EPC API**:
+The retired government API (`epc.opendatacommunities.org`) using HTTP Basic auth; decommissioned 30 May 2026.
+_Avoid_: legacy API
+
+**New EPC API**:
+The replacement government API (`api.get-energy-performance-data.communities.gov.uk`) using Bearer Token auth.
+_Avoid_: new API, current API
+
+**Bearer Token**:
+The auth credential required by the New EPC API; stored in the `EPC_AUTH_TOKEN` environment variable.
+_Avoid_: API key, auth token, secret
+
+## Relationships
+
+- A **Property** represents a single physical dwelling for modelling; identified by `(portfolio_id, UPRN)` or `(portfolio_id, landlord_property_id)`.
+- A **Property** has zero or more **EPCs** across time, exactly one **Effective EPC**, zero or one set of **Site Notes**, and zero or one set of **Landlord Overrides**.
+- An **EPC** belongs to exactly one **Property** and has one **Certificate Number**.
+- An **EPC** carries an **EPC Band** and is identifiable by its **Registration Date**; the most recent one is the current.
+- A **UPRN** identifies a physical dwelling permanently; it does not change when the property changes owner — but each portfolio gets its own **Property** keyed against it.
+- When a **Property** has both **Site Notes** and a public **EPC**, the newer of the two derives the **Effective EPC**. **Landlord Overrides** apply only when the **EPC** is the source — never when **Site Notes** are.
+- **Rebaselining** produces **Baseline Performance** for a Property; triggered when the **Effective EPC** diverges from the originally lodged EPC (because of **Site Notes**, **Landlord Overrides**, an expired EPC, or an estimated EPC).
+- The **EPC Prediction Service** uses **Comparable Properties** for both gap-filling and producing **EPC Anomaly Flags**.
+- A **Scenario** contains many **Plans** (one per Property). A **Plan** carries many **Recommendations**; the **Optimised Package** is the subset selected for installation.
+- A **Recommendation** references one **Measure Type** and carries property-specific cost and impact.
+- **Address Matching** uses a **User Address** and **Postcode** to find a **UPRN** by scoring **UPRN Candidates** from an EPC search. A **Lexirank** of 1 with no **Ambiguous Match** and a **Lexiscore** ≥ the **Score Threshold** produces a **Best Match**.
+
+## Example dialogue
+
+> **Dev:** "A landlord uploads a corrected boiler for one of their properties. What happens?"
+>
+> **Domain expert:** "That's a **Landlord Override** on the heating fields. Save it against the **Property**, then trigger **Rebaselining** — the **Effective EPC** has changed, so we need fresh **Baseline Performance** before regenerating **Recommendations**."
+
+> **Dev:** "What if the same Property also has Site Notes?"
+>
+> **Domain expert:** "**Site Notes** supersede the public **EPC**, so **Landlord Overrides** don't apply. We model from the **Site Notes** version of the **Effective EPC**. If the public **EPC** is newer than the **Site Notes**, that's the one exception — we use the newer one."
+
+> **Dev:** "After modelling we end up with a list of measures. Which ones get installed?"
+>
+> **Domain expert:** "The **Optimiser Service** picks the **Optimised Package** — a subset of **Recommendations** that hits the **Scenario** goal within budget. The rest stay in the **Plan** as alternatives the user can swap in."
+
+> **Dev:** "I'm looking at a property where the EPC says cavity walls but every other house on the street has solid. Is that a bug?"
+>
+> **Domain expert:** "That's an **EPC Anomaly Flag**. We compute it against the **Comparable Properties** for that postcode. It's advisory — the UI surfaces it and the landlord can apply a **Landlord Override** if it's wrong."
+
+## Flagged ambiguities
+
+- **"property"** was historically warned against in favour of "dwelling"; that has been inverted. **Property** is now canonical for the Ara domain aggregate. Legacy code still uses "dwelling" in places — treat as alias.
+- **"energy assessment"** in the existing codebase (`energy_assessment_functions`, `energy_assessments_by_uprn`) refers to what is now canonically called **Site Notes**. New code uses **Site Notes**.
+- **"patch"** / `patch_epc` in the existing codebase has been merged into **Landlord Overrides**; the original concept is deprecated.
+- **"already_installed measures"** in the existing codebase is likely subsumed by **Landlord Overrides** ("we have a heat pump now" → override the heating fields). Final call deferred to implementation.
+- **"address"** appears as both the raw **User Address** (free-text) and a structured field on an **EPC Search Result** (normalised lines). Always qualify: "user address" vs "EPC address" or "address line 1".
+- **"score"** is used for `AddressMatch.score()` output, the `lexiscore` column, and informally. Prefer **Lexiscore** in domain discussions; reserve "score" for method-level code comments.
+- **"user_inputed_address"** in `backend/address2UPRN/main.py` is a misspelling and a synonym for **User Address** — the canonical term. New code should use `user_address`.
+- **"EPC"** is overloaded as both the document and the rating band letter. Use **EPC** for the document, **EPC Band** for the letter.
+- **"re-scoring"** has two meanings in the codebase — **Rebaselining** (re-predicting baseline performance after an EPC change) and post-optimisation measure re-prediction. Prefer **Rebaselining** for the former; for the latter, the **Optimiser Service** step does its own scoring without a special name.
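
The address-matching terms in this glossary (Lexiscore, Lexirank, Score Threshold, Ambiguous Match, Best Match) compose into a small selection rule. The following is a minimal sketch of that rule only, assuming a hypothetical `UprnCandidate` type that already carries its Lexiscore; how the Lexiscore itself is computed is out of scope here, and the function names are illustrative rather than taken from the codebase.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

SCORE_THRESHOLD = 0.6  # "currently 0.6" per the glossary


@dataclass(frozen=True)
class UprnCandidate:
    """Hypothetical stand-in: one EPC Search Result plus its Lexiscore."""
    uprn: str
    lexiscore: float  # similarity in [0, 1]


def lexiranks(candidates: List[UprnCandidate]) -> Dict[str, int]:
    """Dense rank by Lexiscore descending: ties share a rank, ranks have no gaps."""
    distinct_scores = sorted({c.lexiscore for c in candidates}, reverse=True)
    rank_of = {score: i + 1 for i, score in enumerate(distinct_scores)}
    return {c.uprn: rank_of[c.lexiscore] for c in candidates}


def best_match(candidates: List[UprnCandidate]) -> Optional[UprnCandidate]:
    """Return the Best Match, or None when there is no unique rank-1 candidate
    (no candidates, or an Ambiguous Match) or it falls below the threshold."""
    ranks = lexiranks(candidates)
    top = [c for c in candidates if ranks[c.uprn] == 1]
    if len(top) != 1:
        return None
    winner = top[0]
    return winner if winner.lexiscore >= SCORE_THRESHOLD else None
```

Returning `None` for both the ambiguous and the sub-threshold case keeps the sketch short; a real implementation would probably distinguish the two outcomes so the UI can explain why no match was made.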
diff --git a/docs/adr/0001-two-source-paths.md b/docs/adr/0001-two-source-paths.md
new file mode 100644
index 00000000..82615810
--- /dev/null
+++ b/docs/adr/0001-two-source-paths.md
@@ -0,0 +1,10 @@
+# Two source paths for a Property, not layered precedence
+
+For modelling a Property we considered a strict layered precedence stack — `patches > site_notes > energy_assessment > epc > predicted` — with per-field provenance tracking. We rejected that in favour of **two strictly disjoint source paths**: a Property is modelled either from its Site Notes alone, or from the public EPC with Landlord Overrides applied on top. Site Notes are committed to being full-coverage by the domain ([CONTEXT.md](../../CONTEXT.md): _Site Notes_), so once we have them the EPC is irrelevant; conversely, Landlord Overrides are only meaningful when the EPC is the source of physical state.
+
+The trade-off: layered precedence is more flexible (it tolerates a partial Site Notes survey by falling through to EPC for missing fields), but mixed-source data muddles the audit trail and undermines the "if we surveyed it, trust the survey" promise. The two-path model gives a cleaner derivation rule and an unambiguous source-of-truth per Property, at the cost of treating survey gaps as a survey-quality bug rather than a fallback signal. A Recency Tie-Break covers the one case where both exist: the newer of the two wins.
+
+## Consequences
+
+- Reversing this means rewriting `Property.effective_epc` and every service that reads it. Hard to roll back once 12 services depend on the two-path shape.
+- Future addition of a third path (e.g. partial-survey) is a real change, not just a config tweak — flag it as an ADR if proposed.
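
The two-path derivation rule in this ADR can be sketched as a single function. This is an illustration under stated assumptions, not the actual `Property.effective_epc`: the reduced `EpcPropertyData` below, the `recorded_on` field, the dict shape of the overrides, and the choice that Site Notes win when both sources carry the same date are all hypothetical.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Dict, Optional


@dataclass(frozen=True)
class EpcPropertyData:
    """Heavily reduced stand-in for the real schema (assumption)."""
    recorded_on: date
    wall_type: str
    heating_system: str


def effective_epc(
    public_epc: Optional[EpcPropertyData],
    site_notes: Optional[EpcPropertyData],
    overrides: Optional[Dict[str, str]],
) -> EpcPropertyData:
    """Pick exactly one source path; never mix fields across sources."""
    if public_epc is None and site_notes is None:
        raise ValueError("Property has no source data")

    # Recency tie-break: the newer of the two wins.
    # Equal dates resolving to Site Notes is an assumption of this sketch.
    use_site_notes = site_notes is not None and (
        public_epc is None or site_notes.recorded_on >= public_epc.recorded_on
    )
    if use_site_notes:
        return site_notes  # Landlord Overrides never apply on this path

    # EPC path: Landlord Overrides are applied on top of the public EPC.
    return replace(public_epc, **(overrides or {}))
```

Because the function returns one source wholesale (plus overrides on the EPC path only), there is no per-field provenance to track, which is the simplification the ADR trades flexibility for.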
diff --git a/docs/adr/0002-property-aggregate-root.md b/docs/adr/0002-property-aggregate-root.md
new file mode 100644
index 00000000..1114bc15
--- /dev/null
+++ b/docs/adr/0002-property-aggregate-root.md
@@ -0,0 +1,14 @@
+# `Property` is the aggregate root, not `EpcPropertyData`
+
+The Ara modelling pipeline produces nine slices of per-property data (EPC, geospatial, solar, baseline performance, recommendations, optimised package, etc.). We considered making `EpcPropertyData` — the rich RdSAP-21-style EPC schema — the centrepiece, with other data hanging off it. We rejected that and introduced a new **`Property` aggregate root** that holds identity, all source data (EPC, Site Notes, Landlord Overrides), enrichments, and modelling outputs as named fields. Services take `Property` (or `Properties`) and return them with one slice populated.
+
+Two reasons drove this:
+1. **Geospatial, solar, recommendations, and overrides are peers to the EPC**, not properties of it. Putting them on `EpcPropertyData` conflates physical-state schema with modelling-run state.
+2. **A typed `ModellingContext` dict-bag (the obvious alternative)** is exactly what the current legacy `Property` class became — 1259 lines of accumulated stuff, hard to read, hard to test, hard to extend. Named fields on a dataclass force the type system to keep us honest.
+
+The cost is more domain types up front (`Property`, `Properties`, `PropertyIdentity`, `BaselinePerformance`, `OptimisedPackage`, etc.) and the discipline of one service writing one slice. The benefit is that every service has a single job and every test injects fake repos against a small, named structure.
+
+## Consequences
+
+- Every service signature accepts or returns `Property` / `Properties`. Refactoring later means touching all of them.
+- `EpcPropertyData` stays a pure physical-state schema (defined in [datatypes/epc/domain/epc_property_data.py](../../datatypes/epc/domain/epc_property_data.py)) — no modelling outputs or run state on it.
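
To make the "one service writes one slice" discipline concrete, here is a minimal sketch of what a `Property` aggregate with named fields might look like. The field names, the dict stand-in for `EpcPropertyData`, and `BaselineSliceService` are assumptions for illustration; only the type names `Property`, `PropertyIdentity`, and `BaselinePerformance` come from the ADR.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class PropertyIdentity:
    portfolio_id: int
    uprn: int


@dataclass(frozen=True)
class BaselinePerformance:
    sap: float


@dataclass(frozen=True)
class Property:
    """Aggregate root: identity plus named, independently populated slices."""
    identity: PropertyIdentity
    epc: Optional[dict] = None  # stand-in for EpcPropertyData (assumption)
    baseline: Optional[BaselinePerformance] = None
    recommendations: Tuple[str, ...] = ()


class BaselineSliceService:
    """Hypothetical service: populates the baseline slice and nothing else."""

    def run(self, prop: Property) -> Property:
        if prop.epc is None:
            raise ValueError("baseline needs the EPC slice to be populated")
        baseline = BaselinePerformance(sap=float(prop.epc["sap"]))
        # Return a new Property; every other slice is carried over unchanged.
        return replace(prop, baseline=baseline)
```

Using a frozen dataclass means a service cannot mutate slices it does not own by accident, so a test only has to assert on the one slice the service is responsible for.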
diff --git a/docs/adr/0003-strict-ingestion-modelling-separation.md b/docs/adr/0003-strict-ingestion-modelling-separation.md
new file mode 100644
index 00000000..68361ba9
--- /dev/null
+++ b/docs/adr/0003-strict-ingestion-modelling-separation.md
@@ -0,0 +1,13 @@
+# Strict separation between Ingestion and Modelling
+
+Data flows one way only: **Ingestion → Repos → Modelling**. Modelling services never make external HTTP calls; Ingestion services never run business logic. If Modelling needs fresh data, it sees a stale record in a repo and returns; the caller (a refresh orchestrator or the FE) decides whether to ingest first. We considered allowing modelling services to call fetchers directly on cache miss — convenient — and rejected it.
+
+The trade-off is that modelling cannot "self-heal" by going to the gov EPC API when it finds stale data. The benefit is that modelling becomes a deterministic function of repository state: same Property in the repos, same modelling output. That is the property that makes modelling unit-testable against fakes (no DB, no network, no ML lambda), reproducible, and debuggable. It also enables a per-property UI flow where fetched data is shown to the user for review and possible override **before** modelling runs.
+
+Under the rushed timeline this constraint is more valuable, not less. Mixing fetchers into services is the easy thing to do when shipping fast; once it's done it's hard to extract.
+
+## Consequences
+
+- Every modelling service depends only on Repos (and other Services / domain logic). No HTTP libraries in the modelling import graph.
+- A `RefreshOrchestrator` is the only thing that calls Ingestion then Modelling in sequence; nothing else may.
+- "Modelling is stale, refetch in-line" is a forbidden pattern — surface staleness, do not silently repair it.
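
The "surface staleness, do not silently repair it" rule can be sketched as a modelling function that depends only on a repo interface and is tested against a fake. All names here (`EpcRepo`, `StoredEpc`, `ModellingResult`, `MAX_AGE`, `model_property`) and the one-year staleness policy are assumptions made for the example, not part of the design.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, List, Optional, Protocol


@dataclass(frozen=True)
class StoredEpc:
    uprn: int
    sap: int
    fetched_on: date


class EpcRepo(Protocol):
    def get(self, uprn: int) -> Optional[StoredEpc]: ...


@dataclass(frozen=True)
class ModellingResult:
    uprn: int
    status: str  # "modelled", "stale", or "missing"
    sap: Optional[int] = None


MAX_AGE = timedelta(days=365)  # hypothetical staleness policy


def model_property(repo: EpcRepo, uprn: int, today: date) -> ModellingResult:
    """Deterministic function of repo state: no HTTP, no refetch on stale data."""
    record = repo.get(uprn)
    if record is None:
        return ModellingResult(uprn, "missing")
    if today - record.fetched_on > MAX_AGE:
        # Report staleness to the caller, who decides whether to ingest first.
        return ModellingResult(uprn, "stale")
    return ModellingResult(uprn, "modelled", sap=record.sap)


class FakeEpcRepo:
    """In-memory fake used in unit tests instead of a real database."""

    def __init__(self, records: List[StoredEpc]) -> None:
        self._records: Dict[int, StoredEpc] = {r.uprn: r for r in records}

    def get(self, uprn: int) -> Optional[StoredEpc]:
        return self._records.get(uprn)
```

Passing `today` in as an argument, rather than reading the clock inside the function, keeps the output reproducible for a given repo state, which is the property this ADR is protecting.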

From 02df38e207828c1a0730ef5357525597e95c0c2c Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar <kconnkowlessar@gmail.com>
Date: Wed, 13 May 2026 21:52:02 +0000
Subject: [PATCH 4/8] note kwh service not needing predictions

---
 CLAUDE.md              | 14 +++----
 CONTEXT.md             | 12 ++++--
 UBIQUITOUS_LANGUAGE.md | 77 ++-----------------------------------
 ara_backend_design.md  | 86 ++++++++++++++++++++++++------------------
 4 files changed, 66 insertions(+), 123 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index f88a59d5..faa857ce 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -36,28 +36,26 @@ Five Claude Code skills are installed in this repo's dev container. Each maps to
 |-------|--------|-------------|
 | **grill-me** | `/grill-me` | Before implementing — stress-tests a design through sequential questioning |
 | **to-prd** | `/to-prd` | After a planning conversation — formalises context into a GitHub issue PRD |
-| **ubiquitous-language** | `/ubiquitous-language` | When domain terms are drifting or ambiguous — builds/updates `UBIQUITOUS_LANGUAGE.md` |
+| **grill-with-docs** | `/grill-with-docs` | When domain terms are drifting or new concepts are landing — challenges plans against `CONTEXT.md`, sharpens terminology inline, and writes ADRs for load-bearing decisions in `docs/adr/`. Replaces the older `ubiquitous-language` skill. |
 | **tdd** | `/tdd` | During implementation — enforces vertical-slice TDD (one test → one impl → repeat) |
 | **improve-codebase-architecture** | `/improve-codebase-architecture` | During refactoring — surfaces shallow modules and proposes deepening opportunities |
 
+Domain glossary lives at [CONTEXT.md](./CONTEXT.md); load-bearing decisions live at [docs/adr/](./docs/adr/). The legacy [UBIQUITOUS_LANGUAGE.md](./UBIQUITOUS_LANGUAGE.md) is a redirect.
+
 ### Typical session chains
 
 **Feature planning:**
-`/grill-me` → `/to-prd` → `/ubiquitous-language`
+`/grill-me` → `/to-prd` → `/grill-with-docs`
 
 **Implementation:**
 `/tdd` (+ `/grill-me` if a design fork appears mid-session)
 
 **Refactoring:**
-`/improve-codebase-architecture` → `/grill-me` → `/tdd` → `/ubiquitous-language`
+`/improve-codebase-architecture` → `/grill-me` → `/tdd` → `/grill-with-docs`
 
 ### First time setting up?
 
-New containers install all skills automatically via the Dockerfile. If you're in an existing container, run:
-
-```bash
-bash .devcontainer/backend/install-claude-skills.sh
-```
+Skills are installed automatically when the dev container is built, via the postCreate step that pulls from `Hestia-Homes/agentic-toolkit` (see `.devcontainer/backend/Dockerfile`). If an existing container is missing skills, rebuild the dev container.
 
 ## Type Safety
 
diff --git a/CONTEXT.md b/CONTEXT.md
index bd71d6b5..69de3529 100644
--- a/CONTEXT.md
+++ b/CONTEXT.md
@@ -82,13 +82,17 @@ The EpcPropertyData scored by the modelling pipeline for a single Property, deri
 _Avoid_: modelling EPC, working EPC, resolved EPC, derived EPC
 
 **Rebaselining**:
-Recomputing a Property's Baseline Performance via ML when its Effective EPC diverges from the originally lodged public EPC, or when no previous baseline exists.
+Re-predicting a Property's SAP, carbon emissions, and heat demand via ML when its Effective EPC's physical state diverges from the originally lodged public EPC (because Site Notes or Landlord Overrides have changed walls / heating / windows / etc.). Does not include kWh — that is always derived deterministically.
 _Avoid_: re-scoring, re-prediction, performance recomputation
 
 **Baseline Performance**:
-The set of ML-predicted performance values for a single Property — SAP, carbon emissions, heat demand, annual kWh — produced by scoring the Effective EPC against the kWh model; distinct from the originally recorded performance fields on the Effective EPC.
+A Property's current performance values — SAP, carbon emissions, heat demand, annual kWh, fuel split, bills — held against the Effective EPC. SAP / carbon / heat come directly from the Effective EPC's recorded values when no override applies, or from Rebaselining when an override changes physical state. Annual kWh and the fuel split are always derived deterministically by the EPC Energy Derivation Service.
 _Avoid_: baseline predictions, predicted baseline, rebaselined values
 
+**EPC Energy Derivation**:
+The deterministic process that derives a Property's annual kWh, fuel split (gas / electric / other), and bills from the Effective EPC's energy fields — applying a UCL-style correction for known EPC over/under-prediction and deducing fuel type for heating + hot water from the SAP heating fields. No ML.
+_Avoid_: kWh prediction, baseline kWh, energy estimation
+
 **EPC Anomaly Flag**:
 A per-field indicator that a Property's value for an EPC field differs significantly from Comparable Properties; advisory only — surfaces in the UI to prompt user review, does not block modelling.
 _Avoid_: outlier, mismatch, divergence flag
@@ -171,7 +175,7 @@ _Avoid_: API key, auth token, secret
 - An **EPC** carries an **EPC Band** and is identifiable by its **Registration Date**; the most recent one is the current.
 - A **UPRN** identifies a physical dwelling permanently; it does not change when the property changes owner — but each portfolio gets its own **Property** keyed against it.
 - When a **Property** has both **Site Notes** and a public **EPC**, the newer of the two derives the **Effective EPC**. **Landlord Overrides** apply only when the **EPC** is the source — never when **Site Notes** are.
-- **Rebaselining** produces **Baseline Performance** for a Property; triggered when the **Effective EPC** diverges from the originally lodged EPC (because of **Site Notes**, **Landlord Overrides**, an expired EPC, or an estimated EPC).
+- **Rebaselining** contributes the SAP / carbon / heat parts of **Baseline Performance** when the **Effective EPC** physical state diverges from the originally lodged EPC. **EPC Energy Derivation** contributes the kWh / fuel split / bills parts unconditionally for every Property.
 - The **EPC Prediction Service** uses **Comparable Properties** for both gap-filling and producing **EPC Anomaly Flags**.
 - A **Scenario** contains many **Plans** (one per Property). A **Plan** carries many **Recommendations**; the **Optimised Package** is the subset selected for installation.
 - A **Recommendation** references one **Measure Type** and carries property-specific cost and impact.
@@ -181,7 +185,7 @@ _Avoid_: API key, auth token, secret
 
 > **Dev:** "A landlord uploads a corrected boiler for one of their properties. What happens?"
 >
-> **Domain expert:** "That's a **Landlord Override** on the heating fields. Save it against the **Property**, then trigger **Rebaselining** — the **Effective EPC** has changed, so we need fresh **Baseline Performance** before regenerating **Recommendations**."
+> **Domain expert:** "That's a **Landlord Override** on the heating fields. Save it against the **Property**. The **Effective EPC** has changed, so **Rebaselining** runs to re-predict SAP / carbon / heat, and **EPC Energy Derivation** re-runs to update kWh / bills based on the new fuel deduction. With fresh **Baseline Performance** we regenerate **Recommendations**."
 
 > **Dev:** "What if the same Property also has Site Notes?"
 >
diff --git a/UBIQUITOUS_LANGUAGE.md b/UBIQUITOUS_LANGUAGE.md
index 1765cbc8..66684925 100644
--- a/UBIQUITOUS_LANGUAGE.md
+++ b/UBIQUITOUS_LANGUAGE.md
@@ -1,78 +1,7 @@
 # Ubiquitous Language
 
-Domain terminology glossary for this project. Generated and maintained by the `/ubiquitous-language` Claude Code skill.
+This file has been **superseded by [CONTEXT.md](./CONTEXT.md)**.
 
-Invoke `/ubiquitous-language` in any session to extract new terms from the conversation, flag ambiguities, and update this file with canonical definitions.
+The project's domain glossary now lives at the repo root in `CONTEXT.md`, maintained by the `/grill-with-docs` skill (which replaced `/ubiquitous-language`).
 
----
-
-## Energy Performance Certificates
-
-| Term | Definition | Aliases to avoid |
-|------|------------|------------------|
-| **EPC** | An Energy Performance Certificate — a government-issued document rating a dwelling's energy efficiency from A (best) to G (worst). | "energy certificate", "energy report" |
-| **Certificate Number** | The unique identifier assigned to an EPC by the government registry. | "cert number", "EPC ID" |
-| **Registration Date** | The date an EPC was lodged with the government register; used to identify the most recent certificate for a property. | "assessment date", "submission date" |
-| **EPC Band** | A single letter A–G representing a property's current or potential energy efficiency rating. | "energy rating", "EPC grade", "EPC score" |
-| **Schema Type** | The versioned RdSAP or SAP schema that describes the structure of a certificate's raw data (e.g. `RdSAP-Schema-21.0.1`). | "schema version", "EPC format" |
-| **Domestic Certificate** | An EPC issued for a residential dwelling, as opposed to a commercial one. | "residential EPC", "home EPC" |
-
-## Properties and Addresses
-
-| Term | Definition | Aliases to avoid |
-|------|------------|------------------|
-| **UPRN** | Unique Property Reference Number — the government-issued permanent identifier for a physical address in the UK. | "property ID", "address ID", "code" |
-| **Postcode** | A UK postal code used to group nearby addresses; the primary search key for finding EPC records. | "zip code", "postal code" |
-| **User Address** | A free-text address string provided by a user or imported from a customer dataset, before any normalisation or matching. | "user input", "raw address", "user_inputed_address" |
-| **Dwelling** | A single residential unit that can hold an EPC — a house, flat, or maisonette. | "property", "unit", "home" |
-
-## Address Matching
-
-| Term | Definition | Aliases to avoid |
-|------|------------|------------------|
-| **Lexiscore** | A similarity score in [0, 1] between a user address and a candidate EPC address; combines token overlap and character-level similarity. | "score", "match score", "similarity" |
-| **Lexirank** | Dense rank of candidates sorted by lexiscore descending; rank 1 = best match. | "rank", "position" |
-| **UPRN Candidate** | An EPC search result that is a plausible match for a given user address, before scoring decides the winner. | "match candidate", "result" |
-| **Score Threshold** | The minimum lexiscore (currently 0.6) below which no match is returned even if a candidate exists. | "minimum score", "cutoff" |
-| **Ambiguous Match** | A matching outcome where two or more candidates share lexirank 1, making it impossible to select a unique winner. | "tie", "draw", "duplicate" |
-| **Best Match** | The single UPRN candidate with lexirank 1 that meets or exceeds the score threshold. | "winner", "top result" |
-
-## API and Integration
-
-| Term | Definition | Aliases to avoid |
-|------|------------|------------------|
-| **EPC Search Result** | A lightweight record returned by the government domestic search endpoint — contains address lines, postcode, UPRN, band, and certificate number but not the full certificate data. | "search row", "EPC row", "result" |
-| **EPC Property Data** | The fully mapped domain object produced after fetching and parsing a complete EPC certificate. | "EPC data", "certificate data", "parsed EPC" |
-| **Old EPC API** | The retired government API (`epc.opendatacommunities.org`) using HTTP Basic auth; decommissioned May 2026. | "legacy API" |
-| **New EPC API** | The replacement government API (`api.get-energy-performance-data.communities.gov.uk`) using Bearer token auth. | "new API", "current API" |
-| **Bearer Token** | The auth credential required by the new EPC API; stored in the `EPC_AUTH_TOKEN` environment variable. | "API key", "auth token", "secret" |
-
-## Relationships
-
-- An **EPC** belongs to exactly one **Dwelling** and has one **Certificate Number**.
-- A **Dwelling** may have multiple **EPCs** across time; the one with the most recent **Registration Date** is the current one.
-- A **UPRN** identifies a **Dwelling** permanently; it does not change when the property changes owner.
-- An **EPC Search Result** is a summary; it points to a full **EPC** via its **Certificate Number**.
-- **Address Matching** uses a **User Address** and **Postcode** to find a **UPRN** by scoring **UPRN Candidates** from an EPC search.
-- A **Lexirank** of 1 with no **Ambiguous Match** and a **Lexiscore** ≥ the **Score Threshold** produces a **Best Match**.
-
-## Example dialogue
-
-> **Dev:** "We have a user address and postcode. How do we find the UPRN?"
-
-> **Domain expert:** "Search the **New EPC API** by **Postcode** — you get back a list of **EPC Search Results** for that area. Each one has an address and a **UPRN**. Score each against the **User Address** using the **Lexiscore**. If the top **UPRN Candidate** scores above the **Score Threshold** and there's no **Ambiguous Match**, that's your **Best Match**."
-
-> **Dev:** "What if two results share the same address line 1?"
-
-> **Domain expert:** "That's an **Ambiguous Match** — two candidates at **Lexirank** 1. Fall back to scoring on the full address using all address lines joined together. If that still ties, return nothing."
-
-> **Dev:** "Once we have the best match, do we use the UPRN or fetch the full EPC?"
-
-> **Domain expert:** "Depends on what you need. The **EPC Search Result** gives you the **EPC Band** and **Certificate Number**. If you need energy efficiency detail, use the **Certificate Number** to fetch the full **EPC Property Data**."
-
-## Flagged ambiguities
-
-- **"address"** appears as both the raw **User Address** (free-text from customer data) and a structured field on an **EPC Search Result** (normalised address lines). Always qualify: "user address" vs "EPC address" or "address line 1".
-- **"score"** is used for the `AddressMatch.score()` function output, the `lexiscore` DataFrame column, and informally in conversation. Prefer **Lexiscore** in domain discussions; reserve "score" for method-level code comments.
-- **"user_inputed_address"** in `backend/address2UPRN/main.py` is a misspelling and a synonym for **User Address** — the canonical term. New code should use `user_address`.
-- **"EPC"** is overloaded as both the document (an Energy Performance Certificate) and the rating band letter. Use **EPC** for the document and **EPC Band** for the letter.
+If you arrived here from a link in `CLAUDE.md` or older docs, follow the link above. This file is kept only to preserve git history and may be removed once internal references are updated.
diff --git a/ara_backend_design.md b/ara_backend_design.md
index e235cf6d..4d50b3b6 100644
--- a/ara_backend_design.md
+++ b/ara_backend_design.md
@@ -48,7 +48,7 @@ The contracts this PRD defines are the inputs each sub-PRD consumes.
 3. **Make every service unit-testable against fakes** — no test needs a real DB, a real gov API, or a real ML lambda to verify business logic.
 4. **Establish a single `Property` aggregate root** as the domain centrepiece; all 9 modelling concerns are slices of one aggregate.
 5. **Versioned ML data contract** — the EPC-to-features transform is the single shared artifact between this repo and the autogluon repo.
-6. **Per-property UI surfaces** — fetched data can be shown to users for review and override **before** modelling runs; modelling is triggered separately. This will enable a landlord facing version of the product where we fetch the open data, present back to the user for review and then perform tbe modelling.
+6. **Per-property UI surfaces** — fetched data can be shown to users for review and override **before** modelling runs; modelling is triggered separately. This will enable a landlord facing version of the product where we fetch the open data, present back to the user for review and then perform the modelling.
 
 ### 2.2 Non-goals
 
@@ -59,27 +59,30 @@ The contracts this PRD defines are the inputs each sub-PRD consumes.
 
 ---
 
-## 3. Cutover strategy
+## 3. Cutover plan
 
-Two-phase cutover, driven by the 30 May deadline.
+Forced cut-over, driven by the 30 May deadline. There is no strangler period because the Old EPC API death takes `model_engine` with it.
 
-### 3.1 Phase 0 — Stopgap (now → end of May)
+### 3.1 Phase 0 — Status quo (now → 30 May)
 
-- The current `model_engine` keeps running. `SearchEpc` is rewired to delegate to `EpcClientService` (the new gov API client already built on this branch).
-- Old-schema EPCs persisted in the DB are read as-is; the EPC re-mapping service is not yet wired in.
-- Goal: no modelling outage at the API death date. Some degraded behaviour acceptable; clients are aware.
+- `model_engine` keeps running against the Old EPC API for as long as it works.
+- Build of the 9 new services starts **this week**, in parallel to the old engine continuing to serve traffic.
+- The new `ara/` package lives alongside `backend/` but is not yet wired into any production endpoint.
+- Goal: keep the lights on until the API dies; start the build immediately so the dark period is short.
 
-### 3.2 Phase 1 — Strangler (June → ~Q4 2026)
+### 3.2 Phase 1 — Forced cut-over (30 May onwards)
 
-- New `ara/` package built alongside the old code. New endpoints expose the new pipeline. The old `model_engine` keeps running.
-- Per-portfolio feature flag: when set, the trigger endpoint routes the portfolio through the new pipeline. Default is the old pipeline.
-- Each of the 9 services is built, tested, and ships independently. Adding a service to the new pipeline does not require deleting the old one.
-- When confidence is high (last portfolio migrated, no regressions seen for N weeks), the old engine is deleted.
+- On 30 May the Old EPC API dies; `model_engine` ceases to function for any new modelling run.
+- Some downtime is expected and accepted. Clients are aware.
+- Modelling resumes when the new pipeline is ready end-to-end. There is no per-portfolio feature flag, no parallel pipelines, no traffic split — the new pipeline is the only pipeline.
+- **Calico** and **Hyde** are the first live clients onto the new pipeline in June.
+- `model_engine`, `SearchEpc`, the legacy `Property`, and surrounding modules in `backend/` are deleted once the new pipeline is serving all traffic.
 
-### 3.3 What is **not** done
+### 3.3 What is *not* done
 
-- No parallel-shadow run (run both, diff outputs). Reason: doubles compute per plan, requires diff tooling we don't have, and the old engine is already known to return bad data — diffs would be noise.
-- No big-bang switch. Reason: 9 services is too much change to land in one PR.
+- No strangler — there is nothing to strangle once the Old EPC API dies on 30 May.
+- No parallel-shadow run — would double compute and require diff tooling we don't have, while the old engine is already known to return bad data so diffs would be noise.
+- No per-portfolio feature flag — the cut-over is all-or-nothing.
 
 ---
 
@@ -126,7 +129,7 @@ Every class falls into exactly one of four roles:
 |------|-----|----------|
 | **Fetchers** | Call external APIs. Return raw response data. No DB. | `EpcClientService`, `GeospatialFetcher`, `SolarFetcher`, `SiteNotesIngester` |
 | **Repos** | Persist and load domain aggregates. SQL hidden inside. No external IO. | `PropertyRepo`, `EpcCacheRepo`, `SiteNotesRepo`, `LandlordOverridesRepo`, `RecommendationsRepo`, `GenericDataRepo`, `SubtaskRepo` |
-| **Services** | Business logic over domain objects. No external IO except via injected Fetchers / Repos. | `EpcRemappingService`, `EpcPredictionService`, `KwhPredictionService`, `ImpactPredictionService`, `RecommendationService`, `OptimiserService`, `FeatureBuilder`, `ResultsPersister` |
+| **Services** | Business logic over domain objects. No external IO except via injected Fetchers / Repos. | `EpcRemappingService`, `EpcPredictionService`, `EpcEnergyDerivationService`, `KwhImpactService`, `ImpactPredictionService`, `RecommendationService`, `OptimiserService`, `FeatureBuilder`, `ResultsPersister` |
 | **Orchestrators** | Compose Fetchers + Services + Repos to produce an end-to-end result. The only place where step order is encoded. | `IngestionPipeline`, `ModellingPipeline`, `RefreshOrchestrator` |
 
 This taxonomy is **strict**. A class that fetches *and* persists belongs in the Service layer and depends on a Fetcher + a Repo. No back-channels.
@@ -160,8 +163,8 @@ UPRN partitioning: the trigger endpoint groups UPRNs by **locality** (postcode p
 
 The team will decide at implementation time whether Ingestion and Modelling sit behind:
 
-- **(a) One unified API** with a single trigger endpoint that runs both phases.
-- **(b) Two APIs**, each with its own trigger, RefreshOrchestrator chains them.
+- **(a) One unified API** with a single trigger endpoint that runs both phases. Most closely mimics what's live today.
+- **(b) Two APIs**, each with its own trigger, RefreshOrchestrator chains them. Separate API call for fetching and modelling.
 
 Either is workable if the class taxonomy is preserved. Deferred to implementation review.
 
@@ -197,7 +200,7 @@ class Property:
     epc_anomaly_flags: Optional[EpcAnomalyFlags]  # from EpcPredictionService vs neighbours
 
     # --- Modelling outputs ---
-    baseline_predictions: Optional[BaselinePredictions]   # SAP/carbon/heat after rebaselining
+    baseline_performance: Optional[BaselinePerformance]   # SAP/carbon/heat (from EPC or rebaselined ML) + kWh + fuel split (always EPC + UCL + fuel deduction)
     recommendations: list[Recommendation]
     impact_predictions: Optional[ImpactPredictions]
     optimised_package: Optional[OptimisedPackage]
@@ -383,8 +386,8 @@ Both ML services use the same transform:
 
 | Service | Lambda | Target |
 |---|---|---|
-| `KwhPredictionService` (service #5) | `kwh-models-*` | annual kWh + bills |
-| `ImpactPredictionService` (service #7) | `impact-models-*` | SAP, carbon, heat demand, post-retrofit kWh |
+| `KwhImpactService` (service #5) | `kwh-models-*` | per-measure annual kWh + bills delta (post-optimisation re-score only) |
+| `ImpactPredictionService` (service #7) | `impact-models-*` | SAP, carbon, heat demand per-measure impact |
 
 The two families are trained against the same input feature schema; only target columns differ. Sub-PRD (ii) handles training-time details.
 
@@ -410,9 +413,11 @@ Twelve classes implement the modelling pipeline end-to-end. Detailed signatures
 | S1 | `EpcRemappingService` | 4 | Re-map legacy / historical EPCs into new `EpcPropertyData` shape. | `EpcCacheRepo` | `EpcCacheRepo` (mapped column) |
 | S2 | `EpcPredictionService` | 3 | For every property: produce predicted EPC + per-field anomaly flags vs neighbours. Used both for gap-fill (Path 2 if EPC missing) and UI surfacing. | `EpcCacheRepo`, `GenericDataRepo` | — |
 | S3 | `FeatureBuilder` | (new) | Wraps `EpcMlTransform`. Converts `Properties` → scoring DataFrame. | — | — |
-| S4 | `KwhPredictionService` | 5 | Calls kWh + bills ML lambda; attaches results to `Property.baseline_predictions` / per-measure. | `FeatureBuilder` | — |
+| S4a | `EpcEnergyDerivationService` | (new) | Derives baseline kWh + fuel split + bills from the Effective EPC's energy fields (`energy_consumption_current`, `heating_cost_current`, `hot_water_cost_current`). Applies UCL-style correction for known EPC over/under-prediction, then deduces fuel type (gas/electric/other) for heating + hot water to split consumption. Deterministic, no ML. | — | — |
+| S4b | `RebaseliningService` | (new, partial overlap with old "rebaselining" logic) | When the Effective EPC's physical state differs from the originally lodged EPC (Site Notes or Landlord Overrides applied), calls SAP/carbon/heat ML lambdas to produce new baseline values. kWh under the new state is re-derived via `EpcEnergyDerivationService`, not ML. | `FeatureBuilder` | — |
 | S5 | `RecommendationService` | 6 | Generates per-property recommendations using `effective_epc`, materials, exclusions, etc. Replaces current `Recommendations` (1383 LOC). | `MaterialsRepo` | — |
-| S6 | `ImpactPredictionService` | 7 | Calls SAP / carbon / heat / bills impact lambda for each recommendation. | `FeatureBuilder` | — |
+| S6 | `ImpactPredictionService` | 7 | Calls SAP / carbon / heat impact lambda for each recommendation. | `FeatureBuilder` | — |
+| S6b | `KwhImpactService` | 5 (partial) | Calls kWh ML lambda to predict the kWh delta per recommendation; used to compute bill savings on the optimised package. | `FeatureBuilder` | — |
 | S7 | `OptimiserService` | 8 | Produces optimised retrofit packages. Wraps current `CostOptimiser` / `GainOptimiser` / `optimise_with_scenarios`. | — | — |
 | S8 | `ResultsPersister` | 9 | Final step: writes plans, recommendations, property updates via repos under one UoW. | — | All write repos |
 
@@ -429,19 +434,22 @@ Twelve classes implement the modelling pipeline end-to-end. Detailed signatures
 For each `Property` in the batch:
 
 ```
-1. PropertyRepo.get()  →  Property (epc, site_notes, overrides, geospatial, solar)
-2. EpcRemappingService — if epc is in legacy schema, upgrade to current
-3. EpcPredictionService — produce predicted EPC + anomaly flags (always runs)
-4. Compute Property.effective_epc (path-1 or path-2)
-5. KwhPredictionService — baseline kwh + bills
-6. RecommendationService — generate candidate measures
-7. ImpactPredictionService — predict per-measure impact
-8. OptimiserService — select optimal package
-9. KwhPredictionService — re-score on optimised package (tenant savings)
-10. ResultsPersister — write Plan + Recommendations under one UoW
+1.  PropertyRepo.get()  →  Property (epc, site_notes, overrides, geospatial, solar)
+2.  EpcRemappingService — if epc is in legacy schema, upgrade to current
+3.  EpcPredictionService — produce predicted EPC + anomaly flags (always runs)
+4.  Compute Property.effective_epc (path-1 or path-2)
+5.  RebaseliningService — IF effective_epc differs from lodged EPC, re-predict SAP/carbon/heat via ML
+6.  EpcEnergyDerivationService — derive baseline kWh + fuel split + bills from the (possibly rebaselined) Effective EPC. No ML.
+7.  RecommendationService — generate candidate measures
+8.  ImpactPredictionService — predict per-measure SAP/carbon/heat impact (ML)
+9.  OptimiserService — select optimal package
+10. KwhImpactService — predict kWh + bill delta for the optimised package (ML)
+11. ResultsPersister — write Plan + Recommendations under one UoW
 ```
 
-Steps 1–4 are per-property. Steps 5–9 batch the whole batch into one ML call where possible (the lambdas accept a DataFrame; today's code already batches).
+Steps 1–4 are per-property. Steps 5, 8, and 10 send the whole batch to the ML lambda in one call where possible (the lambdas accept a DataFrame; today's code already batches). Steps 6 and 7 are deterministic and run per-property.
+
+Compared with the current `model_engine`, the **pre-recommendation** kWh ML call has been removed. Baseline kWh now comes from the Effective EPC directly (the new gov EPC API exposes `energy_consumption_current` and per-end-use cost fields). ML is reserved for **post-recommendation impact prediction** only.
 
 ### 9.5 Per-service contracts — deferred
 
@@ -530,7 +538,7 @@ ara/                                  # new top-level package, sibling of backen
     └── integration/                  # real DB + real SQS via localstack
 ```
 
-`backend/` continues to host the legacy code during phase 1. Once the last portfolio is migrated, `backend/engine/`, `backend/SearchEpc.py`, `backend/Property.py` are deleted.
+`backend/` continues to host the legacy code until the new pipeline is live. Once `model_engine` is no longer serving any traffic, `backend/engine/`, `backend/SearchEpc.py`, and the legacy `backend/Property.py` are deleted.
 
 Reused intact (no rewrite needed):
 
@@ -605,7 +613,8 @@ A landlord uploads a corrected heating system for UPRN 12345 via the UI.
 3. **ModellingPipeline** invoked on a batch of `[12345]`:
    - Reads `Property(uprn=12345)` from `PropertyRepo`.
    - `Property.effective_epc` = epc + landlord_overrides → heating system fields differ from baseline.
-   - Rebaselining triggered: `KwhPredictionService` re-predicts baseline SAP / carbon / heat / kwh.
+   - `RebaseliningService` triggered: ML re-predicts SAP / carbon / heat against the new effective EPC.
+   - `EpcEnergyDerivationService` re-runs over the new effective EPC to derive baseline kWh + fuel split + bills (no ML).
    - `RecommendationService` regenerates recommendations against the new baseline.
    - `OptimiserService` re-picks optimal package.
    - `ResultsPersister` writes new plan under one UoW (old plan is superseded; whether to soft-archive is a sub-PRD (iii) decision).
@@ -624,6 +633,9 @@ Total external calls: zero. The override write is the only thing that hit a netw
 6. **Soft-archive vs hard-overwrite** for superseded plans (§14) — affects audit / undo behaviour. Defer to sub-PRD (iii).
 7. **Building-level optimisation as a Phase 2 service** (§10) — agreed deferred; flag for roadmap discussion.
 8. **Transform versioning policy** (§8.3) — semver chosen; team to confirm bump conventions.
+9. **UCL EPC-correction model** (§9.2 S4a) — need the reference paper, the implementation we've used before, and a decision on whether to port directly or re-implement against the new EPC schema.
+10. **Fuel-price source for bill calculation** (§9.2 S4a) — Ofgem caps? Time-varying? Per-portfolio override? Decide alongside `EpcEnergyDerivationService` design.
+11. **kWh handling under Rebaselining** (§9.4 step 5) — confirmed: ML re-predicts SAP/carbon/heat only; `EpcEnergyDerivationService` re-runs for kWh. Validate that this is sufficient when overrides change heating fuel type (which would shift the fuel deduction).
 
 ---
 
@@ -645,4 +657,4 @@ Each sub-PRD owner: TBC. Each is independently reviewable but consumes the contr
 4. Stand up the empty `ara/` package skeleton + fakes + first integration-test scaffold as PR-1.
 5. Land services in dependency order: domain → repos → fetchers → services → orchestrators → API.
 
-Phase 1 milestone gate: first portfolio routed through new pipeline with parity against old engine (manual spot-check on 5 representative properties).
+Phase 1 milestone gate: first portfolio (Calico or Hyde) routed through the new pipeline end-to-end in June, with a manual spot-check on 5 representative properties to confirm outputs are reasonable. No parity-against-old-engine check — the old engine is dead by then.
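Illustrative note on §9.4 (not part of the diff). The batching described for steps 5, 8, and 10 could look roughly like the sketch below: collect one feature row per property, make a single call, and re-key the predictions by UPRN. Everything here is hypothetical: `score_batch`, `FakeLambda`, and the `floor_area` feature are invented for illustration, and plain lists of dicts stand in for the DataFrame the real lambdas accept.

```python
from typing import Callable

Row = dict[str, float]                            # one property's feature row (assumed shape)
BatchInvoke = Callable[[list[Row]], list[float]]  # hypothetical lambda client signature


def score_batch(features_by_uprn: dict[int, Row], invoke: BatchInvoke) -> dict[int, float]:
    """Send every row in one call and re-key the predictions by UPRN."""
    uprns = list(features_by_uprn)
    predictions = invoke([features_by_uprn[uprn] for uprn in uprns])
    if len(predictions) != len(uprns):
        raise ValueError("expected one prediction per row sent")
    return dict(zip(uprns, predictions))


class FakeLambda:
    """Test double: counts calls and returns a made-up value per row."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, rows: list[Row]) -> list[float]:
        self.calls += 1
        return [row["floor_area"] * 100.0 for row in rows]


fake = FakeLambda()
result = score_batch({12345: {"floor_area": 80.0}, 67890: {"floor_area": 50.0}}, fake)
```

Injecting the client as a callable keeps the service testable against a fake, in line with goal 3; `fake.calls` lets a unit test confirm that a whole batch costs one call.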

From f8bd13cb63623680a6856f9dffe4f87b950fe4b1 Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar <kconnkowlessar@gmail.com>
Date: Thu, 14 May 2026 07:39:18 +0000
Subject: [PATCH 5/8] editing per portfolio feature flag

---
 ara_backend_design.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ara_backend_design.md b/ara_backend_design.md
index 4d50b3b6..109dd1fa 100644
--- a/ara_backend_design.md
+++ b/ara_backend_design.md
@@ -74,7 +74,7 @@ Forced cut-over, driven by the 30 May deadline. There is no strangler period bec
 
 - On 30 May the Old EPC API dies; `model_engine` ceases to function for any new modelling run.
 - Some downtime is expected and accepted. Clients are aware.
-- Modelling resumes when the new pipeline is ready end-to-end. There is no per-portfolio feature flag, no parallel pipelines, no traffic split — the new pipeline is the only pipeline.
+- Modelling resumes when the new pipeline is ready end-to-end. Whether we keep a per-portfolio flag, purely so the front end can reference old tables where necessary, remains to be decided. No parallel pipelines, no traffic split: the new pipeline is the only pipeline.
 - **Calico** and **Hyde** are the first live clients onto the new pipeline in June.
 - `model_engine`, `SearchEpc`, the legacy `Property`, and surrounding modules in `backend/` are deleted once the new pipeline is serving all traffic.
 
@@ -82,7 +82,7 @@ Forced cut-over, driven by the 30 May deadline. There is no strangler period bec
 
 - No strangler — there is nothing to strangle once the Old EPC API dies on 30 May.
 - No parallel-shadow run — would double compute and require diff tooling we don't have, while the old engine is already known to return bad data so diffs would be noise.
-- No per-portfolio feature flag — the cut-over is all-or-nothing.
+- Per-portfolio feature flag: TBC. Without one, the cut-over is all-or-nothing and all old portfolios are broken.
 
 ---
 

From 8d6c770da8d4215f88c0b5b037d86b3181fdc237 Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar <kconnkowlessar@gmail.com>
Date: Thu, 14 May 2026 16:36:22 +0000
Subject: [PATCH 6/8] grilling session updates to prd

---
 ara_backend_design.md | 350 ++++++++++++++++++++++++++++--------------
 1 file changed, 236 insertions(+), 114 deletions(-)
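
Illustrative note on the §6.4 rebaselining trigger in the diff below (not part of the patch). A minimal sketch of the two-leg decision, assuming plain dicts stand in for `EpcPropertyData` and that `PHYSICAL_FIELDS` is a placeholder for whatever physical-state subset the team settles on. It compares the effective EPC against the lodged EPC directly rather than against a hash stored from a previous run, and it omits the `schema_type` / `lodgement_date` fallbacks.

```python
from __future__ import annotations

import hashlib
import json
from typing import Literal, Optional

RebaselineReason = Literal["pre_sap10", "physical_state_changed", "both"]

# Hypothetical: which fields count as "physical state" is not fixed by the PRD.
PHYSICAL_FIELDS = ("walls", "heating", "windows")


def physical_state_hash(epc: dict) -> str:
    """Content hash over the physical-state subset only."""
    subset = {name: epc.get(name) for name in PHYSICAL_FIELDS}
    payload = json.dumps(subset, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def rebaseline_reason(effective_epc: dict, lodged_epc: dict) -> Optional[RebaselineReason]:
    """Return why ML should re-predict SAP/carbon/heat, or None if it should not."""
    sap_version = effective_epc.get("sap_version")
    pre_sap10 = sap_version is not None and sap_version < 10.0
    changed = physical_state_hash(effective_epc) != physical_state_hash(lodged_epc)
    if pre_sap10 and changed:
        return "both"
    if pre_sap10:
        return "pre_sap10"
    if changed:
        return "physical_state_changed"
    return None
```

The return value maps onto `BaselinePerformance.rebaseline_reason`, with `None` corresponding to `rebaselined = False`. Under these assumptions, either leg on its own or both together still lead to a single ML call.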

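Illustrative note on the scenario snapshotting introduced in §4.6 of the diff below (not part of the patch). A minimal sketch of pinning a `Scenario` by `(task_id, scenario_id)` so that later edits to the live object do not reach an in-flight job. `InMemorySnapshotStore` and its method names are hypothetical stand-ins for the snapshot side of `RecommendationsRepo`, and the `Scenario` fields shown are a reduced, assumed subset.

```python
from __future__ import annotations

import copy
from dataclasses import dataclass, field
from uuid import UUID, uuid4


@dataclass
class Scenario:
    scenario_id: UUID
    goal: str
    budget: float
    exclusions: list[str] = field(default_factory=list)


class InMemorySnapshotStore:
    """Hypothetical in-memory stand-in for snapshot persistence."""

    def __init__(self) -> None:
        self._snapshots: dict[tuple[UUID, UUID], Scenario] = {}

    def pin(self, task_id: UUID, scenario: Scenario) -> tuple[UUID, UUID]:
        """Freeze a deep copy at fan-out time and return its key."""
        key = (task_id, scenario.scenario_id)
        self._snapshots[key] = copy.deepcopy(scenario)
        return key

    def get(self, key: tuple[UUID, UUID]) -> Scenario:
        """Hand out a copy so callers cannot mutate the pinned snapshot."""
        return copy.deepcopy(self._snapshots[key])

    def collect(self, task_id: UUID) -> None:
        """Garbage-collect every snapshot belonging to a finished task."""
        for key in [k for k in self._snapshots if k[0] == task_id]:
            del self._snapshots[key]


task_id = uuid4()
live = Scenario(scenario_id=uuid4(), goal="reach band C", budget=15000.0)
store = InMemorySnapshotStore()
key = store.pin(task_id, live)
live.budget = 1.0  # a mid-run edit to the live Scenario
pinned = store.get(key)
```

Copying on both `pin` and `get` is one way to keep snapshots read-only in practice; a real repo would more likely serialise the scenario to a row in `scenario_snapshots`.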
diff --git a/ara_backend_design.md b/ara_backend_design.md
index 109dd1fa..b6aa8f22 100644
--- a/ara_backend_design.md
+++ b/ara_backend_design.md
@@ -26,6 +26,7 @@ Beyond just swapping API clients, this is the moment to **rebuild the backend in
 - Service boundaries that other team members can read, fix, and extend without needing the entire mental model.
 - Repository-mediated persistence so business logic can be tested without spinning up a database.
 - A separation between **data fetching** (slow, IO-heavy, external) and **modelling** (deterministic, fast, internal).
+- Baseline kWh and bills derived deterministically from the Effective EPC (SAP physics + UCL correction + per-fuel rates from a refreshable repo) rather than from the EPC's stale cost fields or from an ML kWh prediction.
 
 ### 1.3 Out of scope for this PRD
 
@@ -159,14 +160,35 @@ The existing `trigger_plan_entrypoint` SQS-chunking pattern is kept. Both pipeli
 
 UPRN partitioning: the trigger endpoint groups UPRNs by **locality** (postcode prefix / UPRN range) before chunking, so each batch maximises shared upstream fetches (one geospatial-range pull serves all 30 properties in the batch).
 
-### 4.5 One API or two? (deferred)
+### 4.5 One endpoint for v1
 
-The team will decide at implementation time whether Ingestion and Modelling sit behind:
+For Phase 1 we ship **one trigger endpoint** that internally chains Ingestion → Modelling via `RefreshOrchestrator`. This matches the current FastAPI-fronted Lambda pattern (the FastAPI app in `services/<svc>/` is a thin entrypoint that invokes the modelling Lambda).
 
-- **(a) One unified API** with a single trigger endpoint that runs both phases. Most closely mimics what's live today.
-- **(b) Two APIs**, each with its own trigger, RefreshOrchestrator chains them. Separate API call for fetching and modelling.
+We can split into two endpoints later (refresh-only vs model-only) once a real workflow demands it — e.g. a Landlord-Override edit that should re-model without re-fetching open data. The class taxonomy and `RefreshOrchestrator` boundary allow this split without re-architecting.
 
-Either is workable if the class taxonomy is preserved. Deferred to implementation review.
+### 4.6 Trigger contract
+
+The trigger payload is reduced compared to today's `PlanTriggerRequest` ([backend/app/plan/schemas.py:98](../../backend/app/plan/schemas.py#L98)) — most of what's currently in the request body moves into the persisted `Scenario` aggregate.
+
+```python
+class ModelTriggerRequest(BaseModel):
+    portfolio_id: UUID
+    property_ids: list[UUID] | S3Ref           # inline up to ~10k, S3 ref above
+    scenario_ids: list[UUID]                   # 1+; resolved + pinned to ScenarioSnapshot at fan-out
+    task_id: UUID
+    subtask_id: UUID                           # SQS state machine, preserved from today
+```
+
+Everything that used to ride at the top level dies or moves:
+
+- `goal`, `budget`, `goal_value`, `inclusions`, `exclusions`, `required_measures`, `enforce_fabric_first`, `scenario_name`, `housing_type` → into `Scenario` / `ScenarioPhase`.
+- `patches_file_path`, `already_installed_file_path`, `non_invasive_recommendations_file_path` → gone; Landlord Overrides covers all three.
+- `valuation_file_path` → gone; `ValuationService` derives it.
+- `ashp_cop`, `default_u_values` → `HeatingSystemAssumptionsRepo` / global config; not per-trigger.
+- `multi_plan` → gone; `scenario_ids: list[...]` handles N runs natively (one Plan per scenario per property).
+- `event_type`, `epc_certificate_number`, `lmk_key`, `file_format`, `sheet_name`, `index_start`/`index_end`, `file_type` → ingestion-side concerns; if needed, ride on a separate ingestion-trigger payload.
+
+**Scenario snapshotting**: at fan-out time `RefreshOrchestrator` reads each requested `Scenario`, writes a `ScenarioSnapshot` keyed by `(task_id, scenario_id)`, and per-batch SQS messages reference the snapshot. Mid-run edits to the live `Scenario` do not affect an in-flight modelling job. Snapshots are read-only and can be garbage-collected after the task completes.
 
 ---
 
@@ -200,10 +222,10 @@ class Property:
     epc_anomaly_flags: Optional[EpcAnomalyFlags]  # from EpcPredictionService vs neighbours
 
     # --- Modelling outputs ---
-    baseline_performance: Optional[BaselinePerformance]   # SAP/carbon/heat (from EPC or rebaselined ML) + kWh + fuel split (always EPC + UCL + fuel deduction)
+    baseline_performance: Optional[BaselinePerformance]   # carries lodged + effective pair; see §5.4
     recommendations: list[Recommendation]
     impact_predictions: Optional[ImpactPredictions]
-    optimised_package: Optional[OptimisedPackage]
+    plans: list[Plan]                                     # one per Scenario the property was modelled against
 
     # --- Derived ---
     @property
@@ -238,14 +260,51 @@ Services typically take and return `Properties`, not lists.
 | Aggregate | Owns | Repo |
 |---|---|---|
 | `Property` | property identity, epc, site_notes, landlord_overrides, enrichments, modelling results | `PropertyRepo` |
-| `Plan` | per-property modelling output, scenario membership, plan + recommendations + parts | `RecommendationsRepo` |
-| `Scenario` | portfolio-wide scenario metadata | `RecommendationsRepo` |
+| `Plan` | per-property modelling output for one Scenario: ordered `phases: list[PlanPhase]`, each carrying its `OptimisedPackage`, ending state snapshot, and rolled-over options | `RecommendationsRepo` |
+| `Scenario` | portfolio-wide scenario metadata (goal, budget, exclusions, housing type) plus ordered `phases: list[ScenarioPhase]`; each phase carries `measure_types_allowed`, phase budget, phase target | `RecommendationsRepo` |
+| `ScenarioSnapshot` | frozen copy of a `Scenario` pinned at trigger time, keyed by `(task_id, scenario_id)`, so mid-run scenario edits don't affect an in-flight modelling job | `RecommendationsRepo` |
 | `Subtask` / `Task` | SQS fanout state | `SubtaskRepo` |
 | `EpcCache` | gov-API responses keyed by UPRN, with freshness/TTL | `EpcCacheRepo` |
 | `GenericData` | UPRN-range geospatial, postcode lookups, shared static data | `GenericDataRepo` |
+| `FuelRates` | time-versioned, region-aware per-fuel rates (pence/kWh), standing charges, SEG export rate, calorific values | `FuelRatesRepo` |
+| `CarbonFactors` | time-versioned per-fuel CO2 emission factors (kgCO2e/kWh); Defra publishes annually | `CarbonFactorsRepo` |
+| `HeatingSystemAssumptions` | boiler efficiency tables, ASHP/GSHP COPs, solar-thermal coverage proportion; per-property physical assumptions, not fuel-market data | `HeatingSystemAssumptionsRepo` |
 
 Aggregates are loaded **whole** — never half a `Property`. If a slice is too large to load eagerly (e.g. recommendation history), it lives in a separate aggregate.
 
+A single-phase Scenario is `phases: [<one ScenarioPhase>]` with all measure types allowed and the full budget on it — no special-case path through the pipeline.
+
+### 5.4 `BaselinePerformance` carries lodged + effective
+
+```python
+@dataclass
+class BaselinePerformance:
+    # As-lodged: unmodified EPC fields (or Site Notes' recorded values where Site Notes are the source).
+    lodged_sap: int
+    lodged_band: Epc
+    lodged_carbon: float
+    lodged_heat_demand: float
+
+    # Effective: what the modelling pipeline actually scored against.
+    # Equals lodged when neither rebaselining trigger fires; equals ML output when rebaselined.
+    effective_sap: int
+    effective_band: Epc
+    effective_carbon: float
+    effective_heat_demand: float
+
+    # kWh / fuel split / bills — always derived deterministically from the Effective EPC by
+    # EpcEnergyDerivationService (SAP physics + UCL correction + FuelRates lookup).
+    # Lodged kWh / bills are not stored separately — the EPC's cost fields are stale by design.
+    annual_kwh: float
+    fuel_split: dict[Fuel, float]
+    annual_bills: dict[Fuel, float]
+
+    rebaselined: bool
+    rebaseline_reason: Optional[Literal["pre_sap10", "physical_state_changed", "both"]]
+```
+
+The pair lets the FE show "lodged rating vs SAP10-equivalent rebaselined rating" side by side without a separate query. Both sets of fields are always populated; when no rebaselining trigger fires, `effective_*` equals `lodged_*`.
+
 ---
 
 ## 6. Source-of-truth and overlay precedence
@@ -275,12 +334,16 @@ This tie-break is implemented in `Property.source_path` and may be tuned later (
 
 ### 6.4 Rebaselining trigger
 
-The modelling pipeline re-predicts SAP / carbon / heat / kwh whenever:
+ML re-predicts SAP / carbon / heat when **either** of these holds:
 
-- `effective_epc` differs from the canonical baseline (i.e. raw EPC with no overrides), **or**
-- The previous modelling snapshot is missing or stale.
+1. **Pre-SAP10 schema** — `effective_epc.sap_version < 10.0`. The EPC was rated under SAP 2012 (or earlier) and we want a SAP10-equivalent baseline so all properties are scored against the same model version. Canonical signal is the `sap_version: float` field; fall back to `schema_type` string, then to `lodgement_date` if both are absent. Site Notes are assumed SAP10 by construction (PasHub / ECMK produce them now) — Path 1 typically doesn't trigger this leg.
+2. **Physical state changed** — the physical fields of `effective_epc` (walls / heating / windows / etc.) differ from those on the lodged EPC. Triggered by Landlord Overrides changing physical state, or by Site Notes that contradict the lodged EPC.
 
-The exact diff mechanism (hash of effective EPC, dirty-flag on overrides, timestamp comparison) is an implementation detail; recommendation is to start with a content hash stored alongside the previous run.
+When triggered, a single ML call re-predicts SAP/carbon/heat with the current Effective EPC state as input. Both reasons can fire together; the prediction is still one call.
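+
+The two triggers can be combined into a single reason value, sketched below. The function name is hypothetical, and it assumes `sap_version` has already been resolved (the `schema_type` / `lodgement_date` fallbacks are left out).
+
+```python
+from typing import Optional
+
+
+def rebaseline_reason(sap_version: float, physical_state_changed: bool) -> Optional[str]:
+    """Return the rebaseline reason, or None when no ML rebaseline is needed."""
+    pre_sap10 = sap_version < 10.0
+    if pre_sap10 and physical_state_changed:
+        return "both"
+    if pre_sap10:
+        return "pre_sap10"
+    if physical_state_changed:
+        return "physical_state_changed"
+    return None
+```
+
+A non-`None` result maps to exactly one ML call, whichever reason fired.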
+
+kWh is **always** re-derived via `EpcEnergyDerivationService` — even when no ML rebaseline runs, because fuel rates change over time and the EPC's cost fields are stale by design.
+
+The diff mechanism for "physical state changed" (content hash, dirty flag, etc.) is an implementation detail; start with a content hash of the physical-state subset of `EpcPropertyData` stored alongside the previous run.
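+
+One way the content hash could look, as a sketch only. The field names in `PHYSICAL_FIELDS` are placeholders (the real physical-state subset of `EpcPropertyData` is not defined in this section), and the EPC is modelled as a plain mapping for brevity.
+
+```python
+import hashlib
+import json
+from typing import Any, Mapping
+
+# Placeholder field names; the real physical-state subset is an open design detail.
+PHYSICAL_FIELDS = ("walls", "heating", "windows")
+
+
+def physical_state_hash(epc: Mapping[str, Any]) -> str:
+    """Hash only the physical-state fields, using a canonical JSON encoding."""
+    subset = {name: epc.get(name) for name in PHYSICAL_FIELDS}
+    # sort_keys + fixed separators keep the encoding stable across runs.
+    canonical = json.dumps(subset, sort_keys=True, separators=(",", ":"))
+    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
+
+
+def physical_state_changed(epc: Mapping[str, Any], previous_hash: str) -> bool:
+    """True when the current hash differs from the one stored with the previous run."""
+    return physical_state_hash(epc) != previous_hash
+```
+
+Hashing a named subset means non-physical edits (a new lodgement date, say) leave the hash unchanged and do not trigger a rebaseline on their own.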
 
 ### 6.5 Deprecated concepts
 
@@ -327,8 +390,11 @@ UoW owns the SQLAlchemy session lifecycle. Repos use the session passed in via t
 | `EpcCacheRepo` | new table: `epc_api_cache` (TTL, raw API response, mapped `EpcPropertyData`) |
 | `SiteNotesRepo` | new table: `site_notes` (replaces current `energy_assessments`) |
 | `LandlordOverridesRepo` | new table: `landlord_overrides` (sparse, per-field rows for audit) |
-| `RecommendationsRepo` | `plans`, `recommendations`, `recommendation_parts`, `scenarios` |
+| `RecommendationsRepo` | `plans`, `plan_phases`, `recommendations`, `recommendation_parts`, `scenarios`, `scenario_phases`, `scenario_snapshots` |
 | `GenericDataRepo` | new table or S3-backed: UPRN-range geospatial + postcode-keyed shared static data |
+| `FuelRatesRepo` | new table: `fuel_rates` — `(fuel_type, rate_pence_per_kwh, standing_charge_pence_per_day, calorific_value_kwh_per_unit, unit, effective_from, effective_to, region_code Optional, source)`. SEG export rate is a row with `fuel_type = 'electricity_export'`. |
+| `CarbonFactorsRepo` | new table: `carbon_factors` — `(fuel_type, kgco2e_per_kwh, effective_from, effective_to, source)`. Defra publishes annually. |
+| `HeatingSystemAssumptionsRepo` | new table(s): boiler efficiency, ASHP/GSHP COP, solar-thermal coverage proportion. Static-ish, manual refresh. |
 | `SubtaskRepo` | `tasks`, `subtasks` (existing) |
 
 DDL migrations are scoped to sub-PRD (iii).
@@ -380,14 +446,16 @@ The interesting work — flattening `List[SapWindow]`, `List[SapBuildingPart]` i
 
 Bump major when removing or renaming columns. Bump minor when adding optional columns (older models still scoreable; new models can be trained against new fields).
 
-### 8.4 Two model families, one transform
+### 8.4 ML model families
 
-Both ML services use the same transform:
+Both ML calls (rebaselining + per-measure impact) use the same `EpcMlTransform`:
 
 | Service | Lambda | Target |
 |---|---|---|
-| `KwhImpactService` (service #5) | `kwh-models-*` | per-measure annual kWh + bills delta (post-optimisation re-score only) |
-| `ImpactPredictionService` (service #7) | `impact-models-*` | SAP, carbon, heat demand per-measure impact |
+| `RebaseliningService` (S4b) | `baseline-models-*` | SAP / carbon / heat demand under the current Effective EPC state (SAP10-equivalent) |
+| `ImpactPredictionService` (S6) | `impact-models-*` | SAP / carbon / heat demand impact per measure (and per battery option, using new EPC battery fields) |
+
+Annual kWh and bills are never an ML target; they are derived deterministically by `EpcEnergyDerivationService` (S4a). Recommendation kWh delta is derived from the SAP delta predicted by S6 plus heating-system fuel + COP, not via a separate ML call.
 
 The two families are trained against the same input feature schema; only target columns differ. Sub-PRD (ii) handles training-time details.
 
@@ -395,16 +463,20 @@ The two families are trained against the same input feature schema; only target
 
 ## 9. Service catalogue
 
-Twelve classes implement the modelling pipeline end-to-end. Detailed signatures are deliberately left for implementers — this PRD documents purpose, dependencies, and rough shape.
+The classes below implement the pipeline end-to-end. Detailed signatures are deliberately left for implementers — this PRD documents purpose, dependencies, and rough shape; per-service grill sessions produce the contracts.
+
+**Out of the legacy engine** (deleted, not migrated):
+- `PredictionMatrix` (debug-only, moves to test fixtures)
+- `extract_portfolio_aggregation_data` (dead code, FE aggregates dynamically per §10)
+- inspections plumbing (`inspections_map` is initialised but never populated in the current engine)
+- patches / `already_installed` / `non_invasive_recommendations` (subsumed by Landlord Overrides)
+- ECO4 / WHLG funding integration (`get_funding_data` and `optimise_with_scenarios`' funding paths)
+- the pre-recommendation kWh ML lambda (`KWH_MODEL_PREFIXES`)
+- floor-count / heat-loss-perimeter estimation from geospatial (now on `EpcPropertyData`)
+
+Address matching (`address2UPRN`) lives as a separate service, not inside `EpcClientService`.
 
 ### 9.1 Fetchers (called by `IngestionPipeline`)
 
 | # | Class | Purpose | Dependencies |
 |---|---|---|---|
-| F1 | `EpcClientService` | Fetches EPCs from new gov API. Already exists at `backend/epc_client/`. | httpx |
-| F2 | `GeospatialFetcher` | Fetches UPRN-range geospatial data (replaces `OpenUprnClient` use in current engine). | S3 / Ordnance Survey API |
+| F1 | `EpcClientService` | Fetches EPCs from new gov API. Already exists at `backend/epc_client/`. Scope narrows compared to current `SearchEpc` — address matching (`address2uprn`) and OS API estimation are not its concern. | httpx |
+| F2 | `GeospatialFetcher` | Fetches UPRN-range geospatial data. Replaces `OpenUprnClient`. **Floor count and heat-loss perimeter estimation are no longer needed** — both are now on `EpcPropertyData` directly (`number_of_storeys`, `SapFloorDimension.heat_loss_perimeter_m`). Scope reduces to building geometry and postcode-area context. | S3 / Ordnance Survey API |
 | F3 | `SolarFetcher` | Wraps Google Solar API; building-level + unit-level scenes. | Google Solar API |
 | F4 | `SiteNotesIngester` | Loads site notes from Excel uploads / structured input. Persists via `SiteNotesRepo`. | S3, repo |
+| F5 | `FuelRatesFetcher` | Scheduled ETL — scrapes Ofgem regional caps and per-fuel rates, writes timeseries rows to `FuelRatesRepo`. Manual CSV upload fallback for off-cycle corrections. | Ofgem feed, repo |
+| F6 | `CarbonFactorsFetcher` | Same shape as F5 against Defra's annual CO2 factor publication. | Defra feed, repo |
 
 ### 9.2 Domain services (called by `ModellingPipeline`)
 
@@ -413,13 +485,13 @@ Twelve classes implement the modelling pipeline end-to-end. Detailed signatures
 | S1 | `EpcRemappingService` | 4 | Re-map legacy / historical EPCs into new `EpcPropertyData` shape. | `EpcCacheRepo` | `EpcCacheRepo` (mapped column) |
 | S2 | `EpcPredictionService` | 3 | For every property: produce predicted EPC + per-field anomaly flags vs neighbours. Used both for gap-fill (Path 2 if EPC missing) and UI surfacing. | `EpcCacheRepo`, `GenericDataRepo` | — |
 | S3 | `FeatureBuilder` | (new) | Wraps `EpcMlTransform`. Converts `Properties` → scoring DataFrame. | — | — |
-| S4a | `EpcEnergyDerivationService` | (new) | Derives baseline kWh + fuel split + bills from the Effective EPC's energy fields (`energy_consumption_current`, `heating_cost_current`, `hot_water_cost_current`). Applies UCL-style correction for known EPC over/under-prediction, then deduces fuel type (gas/electric/other) for heating + hot water to split consumption. Deterministic, no ML. | — | — |
-| S4b | `RebaseliningService` | (new, partial overlap with old "rebaselining" logic) | When the Effective EPC's physical state differs from the originally lodged EPC (Site Notes or Landlord Overrides applied), calls SAP/carbon/heat ML lambdas to produce new baseline values. kWh under the new state is re-derived via `EpcEnergyDerivationService`, not ML. | `FeatureBuilder` | — |
-| S5 | `RecommendationService` | 6 | Generates per-property recommendations using `effective_epc`, materials, exclusions, etc. Replaces current `Recommendations` (1383 LOC). | `MaterialsRepo` | — |
-| S6 | `ImpactPredictionService` | 7 | Calls SAP / carbon / heat impact lambda for each recommendation. | `FeatureBuilder` | — |
-| S6b | `KwhImpactService` | 5 (partial) | Calls kWh ML lambda to predict the kWh delta per recommendation; used to compute bill savings on the optimised package. | `FeatureBuilder` | — |
-| S7 | `OptimiserService` | 8 | Produces optimised retrofit packages. Wraps current `CostOptimiser` / `GainOptimiser` / `optimise_with_scenarios`. | — | — |
-| S8 | `ResultsPersister` | 9 | Final step: writes plans, recommendations, property updates via repos under one UoW. | — | All write repos |
+| S4a | `EpcEnergyDerivationService` | (new) | Derives annual kWh + fuel split + bills from the Effective EPC. Deterministic, no ML. Pipeline: (1) source regulated PEUI — either from `energy_consumption_current × floor_area` when EPC field present and no physical override, or from SAP physics (heat demand × area + SAP hot-water + SAP lighting) for Site Notes / overridden cases; (2) add appliance + cooking via SAP Appendix L formulas (port of [`AnnualBillSavings.estimate_appliances_energy_use`](../../backend/ml_models/AnnualBillSavings.py)); (3) apply UCL per-band correction (Few et al. 2023, Table 3), keyed on the **post-state Effective EPC's band** — not the lodged band; (4) decompose total PEUI into end-use shares via SAP-physics proportions; (5) primary→delivered per fuel using SAP primary factors; (6) bills = delivered kWh per fuel × current rate from `FuelRatesRepo` + standing charges + SEG credits. CO2 emissions from `CarbonFactorsRepo`. | `FuelRatesRepo`, `CarbonFactorsRepo`, `HeatingSystemAssumptionsRepo` | — |
+| S4b | `RebaseliningService` | (new, partial overlap with old "rebaselining" logic) | Triggered by §6.4 conditions (pre-SAP10 schema **or** physical state changed). Calls SAP/carbon/heat ML lambdas to produce SAP10-equivalent baseline against the current Effective EPC state. Both `BaselinePerformance.lodged_*` and `effective_*` are populated downstream — pair is always stored, equal when not rebaselined. kWh is re-derived via S4a, not ML. | `FeatureBuilder` | — |
+| S5 | `RecommendationService` | 6 | Generates per-property recommendations against the current rolling Effective EPC. Invoked **once per (scenario × phase)** — filters candidates to the phase's `measure_types_allowed`, returns candidates eligible against the post-prior-phase state. Replaces current `Recommendations` (1383 LOC). | `MaterialsRepo` | — |
+| S6 | `ImpactPredictionService` | 7 | Calls SAP / carbon / heat impact ML lambda for **every** candidate recommendation (FE displays all options to user). Invoked per (scenario × phase) with the rolling state's feature vector. Recommendation kWh delta is derived deterministically from SAP delta + heating-system fuel/COP, not from a separate ML call. Battery impact uses the new EPC battery fields (`energy_pv_battery_count`, `energy_pv_battery_capacity`) as ML inputs — the deterministic `BatterySAPScorer` from the legacy engine is replaced by ML prediction. | `FeatureBuilder` | — |
+| S7 | `OptimiserService` | 8 | Per-phase optimisation against rolling state. Reads `PlanPhase.state_at_end[n-1]` to honour cross-phase constraints (fabric-first, heat-pump-needs-insulation, ventilation). Wraps current `CostOptimiser` / `GainOptimiser` / `optimise_with_scenarios` minus the dead ECO-funding paths. Unselected candidates roll into phase n+1's candidate pool (auto vs user-marked TBD, §15). | — | — |
+| S8 | `ValuationService` | — | Estimates per-property valuation (current + post-retrofit) using a regression, drawn from published academic papers, on EPC change, property type and region. Improves on the existing `PropertyValuation.estimate` code; exact shape deferred to per-service grill. | — | — |
+| S9 | `ResultsPersister` | 9 | Final step: writes Plan (with `phases[]`) + Recommendations + Property updates via repos under one UoW, per scenario. | — | All write repos |
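+
+Step (6) of S4a (bills from delivered kWh and the current rate) might be sketched as follows. `FuelRate` mirrors a subset of the `fuel_rates` columns; `rate_on` and `annual_bill_gbp` are hypothetical helpers, an open-ended `effective_to` of `None` and an inclusive end date are assumptions, and SEG credits are omitted.
+
+```python
+from dataclasses import dataclass
+from datetime import date
+from typing import Optional, Sequence
+
+
+@dataclass(frozen=True)
+class FuelRate:
+    """Subset of the `fuel_rates` columns needed for a bill."""
+    fuel_type: str
+    rate_pence_per_kwh: float
+    standing_charge_pence_per_day: float
+    effective_from: date
+    effective_to: Optional[date]  # assumption: None marks the currently open row
+
+
+def rate_on(rates: Sequence[FuelRate], fuel_type: str, day: date) -> FuelRate:
+    """Pick the row whose validity window contains `day`."""
+    for r in rates:
+        if (r.fuel_type == fuel_type and r.effective_from <= day
+                and (r.effective_to is None or day <= r.effective_to)):
+            return r
+    raise LookupError(f"no {fuel_type} rate effective on {day}")
+
+
+def annual_bill_gbp(delivered_kwh: float, rate: FuelRate, days: int = 365) -> float:
+    """Unit cost plus standing charge, converted from pence to pounds."""
+    pence = (delivered_kwh * rate.rate_pence_per_kwh
+             + days * rate.standing_charge_pence_per_day)
+    return pence / 100.0
+```
+
+Looking the rate up by date rather than storing it on the Property is what allows kWh and bills to be re-derived on every run as rates change.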
 
 ### 9.3 Orchestrators
 
@@ -431,25 +503,42 @@ Twelve classes implement the modelling pipeline end-to-end. Detailed signatures
 
 ### 9.4 `ModellingPipeline` step order
 
-For each `Property` in the batch:
+For each `Property` in the batch, against each pinned `ScenarioSnapshot` from the trigger payload:
 
 ```
-1.  PropertyRepo.get()  →  Property (epc, site_notes, overrides, geospatial, solar)
-2.  EpcRemappingService — if epc is in legacy schema, upgrade to current
-3.  EpcPredictionService — produce predicted EPC + anomaly flags (always runs)
-4.  Compute Property.effective_epc (path-1 or path-2)
-5.  RebaseliningService — IF effective_epc differs from lodged EPC, re-predict SAP/carbon/heat via ML
-6.  EpcEnergyDerivationService — derive baseline kWh + fuel split + bills from the (possibly rebaselined) Effective EPC. No ML.
-7.  RecommendationService — generate candidate measures
-8.  ImpactPredictionService — predict per-measure SAP/carbon/heat impact (ML)
-9.  OptimiserService — select optimal package
-10. KwhImpactService — predict kWh + bill delta for the optimised package (ML)
-11. ResultsPersister — write Plan + Recommendations under one UoW
+Per-property setup (runs once regardless of scenario count):
+  1.  PropertyRepo.get()  →  Property (epc, site_notes, overrides, geospatial, solar)
+  2.  EpcRemappingService — if epc is in legacy schema, upgrade to current
+  3.  EpcPredictionService — predicted EPC + per-field anomaly flags (always runs)
+  4.  Compute Property.effective_epc (path-1 or path-2)
+  5.  RebaseliningService — IF §6.4 conditions hold (pre-SAP10 OR physical state changed),
+                            re-predict SAP/carbon/heat via ML against the Effective EPC state.
+                            Populate BaselinePerformance.lodged_* + effective_*.
+  6.  EpcEnergyDerivationService — SAP-physics + UCL (post-state band) + FuelRates → kWh, fuel split, bills.
+
+Per-scenario loop:
+  Per-phase loop (in scenario phase order):
+    7.  RecommendationService — generate candidate measures, restricted to phase's measure_types_allowed,
+                                 against the rolling Effective EPC state (baseline for phase 1; updated for phase 2+).
+    8.  ImpactPredictionService — predict SAP/carbon/heat impact for those candidates, ML scored against
+                                   the rolling state's feature vector. All candidates scored (FE shows options).
+    9.  OptimiserService — select package within phase budget + phase goal. Reads earlier-phase state to honour
+                            cross-phase constraints (fabric-first, heat-pump-needs-insulation, ventilation).
+    10. Apply package → roll state forward (simulate post-package SAP / kWh / bills via S4a + impact predictions
+                        from step 8). Record `PlanPhase.state_at_end`. Unselected options become
+                        `PlanPhase.rolled_over_options` and are eligible candidates next phase.
+  11. ResultsPersister — write Plan (phases[]) + Recommendations under one UoW for this scenario.
 ```
 
-Steps 1–4 are per-property. Steps 5, 8, 10 batch the whole batch into one ML call where possible (the lambdas accept a DataFrame; today's code already batches). Steps 6 and 7 are deterministic per-property.
+Steps 1–6 run **once per property** regardless of scenario count.
+Steps 7–10 run **once per (scenario × phase)** per property.
+Step 11 runs once per scenario per property.
 
-Note vs the current `model_engine`: the **pre-recommendation** kWh ML call has been removed. Baseline kWh now comes from the Effective EPC directly (the new gov EPC API exposes `energy_consumption_current` and per-end-use cost fields). ML is reserved for **post-recommendation impact prediction** only.
+Batching: steps 5 and 8 each send the whole batch in one ML call where possible. Step 8's cost scales with `N_phases × N_scenarios × N_candidate_measures`, so a multi-phase scenario pays for its extra ML calls while a single-phase scenario costs the same as today.
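+
+The loop nesting above can be sketched as a trace of which step runs when. `run_property` is a hypothetical stand-in that records step names instead of calling real services.
+
+```python
+from typing import Dict, List
+
+
+def run_property(scenarios: Dict[str, List[str]]) -> List[str]:
+    """Return the ordered list of steps executed for one property.
+
+    `scenarios` maps a scenario name to its ordered phase names.
+    """
+    log = ["setup(1-6)"]                          # steps 1-6: once per property
+    for scenario, phases in scenarios.items():
+        for phase in phases:                      # steps 7-10: once per (scenario x phase)
+            log.append(f"{scenario}/{phase}: recommend(7)")
+            log.append(f"{scenario}/{phase}: predict_impact(8)")
+            log.append(f"{scenario}/{phase}: optimise(9)")
+            log.append(f"{scenario}/{phase}: roll_state(10)")
+        log.append(f"{scenario}: persist(11)")    # step 11: once per scenario
+    return log
+```
+
+With one two-phase scenario and one single-phase scenario, the trace contains one setup entry, three impact-prediction entries and two persist entries, which is the scaling described above.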
+
+Note vs the current `model_engine`: the **pre-recommendation** kWh ML call has been removed. Baseline kWh now comes from `EpcEnergyDerivationService` (SAP physics + UCL + FuelRates). ML is reserved for SAP/carbon/heat (rebaselining + impact prediction). Recommendation-level kWh delta is derived deterministically from the impact-predicted SAP delta plus heating-system fuel + COP from `HeatingSystemAssumptionsRepo`; no separate kWh ML lambda.
+
+**Open future change** (flagged §15): SAP-impact-of-a-measure is not strictly additive — installing measure A changes the SAP impact of measure B. The current per-measure ML scoring + linear optimisation approximates this. A future iteration may pre-define candidate packages and ML-score whole packages, accepting the combinatorial cost in return for accuracy. Defer until implementation reveals where the approximation hurts.
 
 ### 9.5 Per-service contracts — deferred
 
@@ -467,85 +556,113 @@ Method signatures, return types, error semantics, and edge-case behaviour are **
 
 ---
 
-## 11. Directory layout
+## 11. Repository layout — monorepo via uv workspaces
 
-Proposal — team to tweak.
+The repo is restructured as a Python monorepo using **uv workspaces**. Shared types and shared infra live as workspace packages under `packages/`; each deployable Lambda or microservice lives as its own package under `services/`. Each `services/<svc>/` has its own `pyproject.toml`, `Dockerfile`, and Lambda image — the bundle contains only that service's deps + its workspace deps, keeping cold-start size and package weight contained.
 
 ```
-ara/                                  # new top-level package, sibling of backend/
-├── domain/
-│   ├── __init__.py
-│   ├── property.py                   # Property aggregate
-│   ├── properties.py                 # Properties collection
-│   ├── identity.py                   # PropertyIdentity, AddressLines
-│   ├── site_notes.py                 # SiteNotes (replaces energy_assessment)
-│   ├── landlord_overrides.py
-│   ├── geospatial.py
-│   ├── solar.py
-│   ├── recommendations.py            # Recommendation, OptimisedPackage
-│   ├── predictions.py                # BaselinePredictions, ImpactPredictions
-│   ├── anomaly_flags.py              # EpcAnomalyFlags
-│   └── ml/
-│       ├── __init__.py
-│       ├── transform.py              # EpcMlTransform (versioned)
-│       └── schema.py                 # scoring DataFrame schema
+/
+├── pyproject.toml                      # workspace root
+├── uv.lock
 │
-├── fetchers/
-│   ├── __init__.py
-│   ├── epc_client.py                 # alias / re-export of backend/epc_client/
-│   ├── geospatial.py
-│   ├── solar.py
-│   └── site_notes_ingester.py
+├── packages/                           # shared workspace packages — imported by services/
+│   ├── domain/                         # "domna-domain"
+│   │   ├── pyproject.toml
+│   │   └── src/domain/
+│   │       ├── property.py             # Property, Properties, PropertyIdentity
+│   │       ├── site_notes.py
+│   │       ├── landlord_overrides.py
+│   │       ├── baseline_performance.py # lodged + effective pair
+│   │       ├── plan.py                 # Plan, PlanPhase, OptimisedPackage
+│   │       ├── scenario.py             # Scenario, ScenarioPhase, ScenarioSnapshot
+│   │       ├── recommendation.py
+│   │       ├── geospatial.py
+│   │       ├── solar.py
+│   │       ├── anomaly_flags.py
+│   │       └── ml/
+│   │           ├── transform.py        # EpcMlTransform (versioned)
+│   │           └── schema.py
+│   │
+│   ├── repos/                          # "domna-repos" — persistence, no business logic
+│   │   ├── pyproject.toml
+│   │   └── src/repos/
+│   │       ├── unit_of_work.py
+│   │       ├── property_repo.py
+│   │       ├── epc_cache_repo.py
+│   │       ├── site_notes_repo.py
+│   │       ├── landlord_overrides_repo.py
+│   │       ├── recommendations_repo.py
+│   │       ├── generic_data_repo.py
+│   │       ├── fuel_rates_repo.py
+│   │       ├── carbon_factors_repo.py
+│   │       ├── heating_system_assumptions_repo.py
+│   │       └── subtask_repo.py
+│   │
+│   ├── fetchers/                       # "domna-fetchers" — external API clients
+│   │   ├── pyproject.toml
+│   │   └── src/fetchers/
+│   │       ├── epc_client.py           # wraps backend/epc_client/
+│   │       ├── geospatial.py
+│   │       ├── solar.py
+│   │       ├── fuel_rates_fetcher.py
+│   │       └── carbon_factors_fetcher.py
+│   │
+│   └── utils/                          # "domna-utils" — logging, AWS, S3, cloudwatch, subtasks
+│       ├── pyproject.toml
+│       └── src/utils/
 │
-├── repos/
-│   ├── __init__.py
-│   ├── unit_of_work.py
-│   ├── property_repo.py
-│   ├── epc_cache_repo.py
-│   ├── site_notes_repo.py
-│   ├── landlord_overrides_repo.py
-│   ├── recommendations_repo.py
-│   ├── generic_data_repo.py
-│   └── subtask_repo.py
+├── services/                           # deployable units, one Lambda image each
+│   ├── ara/                            # the modelling backend
+│   │   ├── pyproject.toml              # deps: domna-domain, domna-repos, domna-fetchers, domna-utils, ML libs
+│   │   ├── Dockerfile
+│   │   ├── src/ara/
+│   │   │   ├── services/               # EpcRemappingService, EpcPredictionService,
+│   │   │   │                           # EpcEnergyDerivationService, RebaseliningService,
+│   │   │   │                           # FeatureBuilder, RecommendationService,
+│   │   │   │                           # ImpactPredictionService, OptimiserService,
+│   │   │   │                           # ValuationService, ResultsPersister
+│   │   │   ├── orchestrators/          # IngestionPipeline, ModellingPipeline, RefreshOrchestrator
+│   │   │   └── lambdas/                # handler.py per Lambda + event-shape contracts
+│   │   └── tests/
+│   │       ├── fakes/                  # FakePropertyRepo, FakeEpcClient, etc.
+│   │       ├── unit/                   # service tests using fakes only
+│   │       └── integration/            # real DB + real SQS via localstack
+│   │
+│   ├── address2uprn/                   # messy-address → UPRN matching, pre-modelling step
+│   │   ├── pyproject.toml
+│   │   ├── Dockerfile
+│   │   └── src/address2uprn/
+│   ├── hubspot/                        # existing Hubspot ETL
+│   ├── pashub/                         # PasHub survey ingestion
+│   ├── ecmk/                           # ECMK assessment ingestion
+│   └── magicplan/                      # MagicPlan integration
 │
-├── services/
-│   ├── __init__.py
-│   ├── epc_remapping.py
-│   ├── epc_prediction.py             # nearby-similar + anomaly flags
-│   ├── feature_builder.py            # uses domain.ml.EpcMlTransform
-│   ├── kwh_prediction.py
-│   ├── impact_prediction.py
-│   ├── recommendation.py
-│   ├── optimiser.py                  # wraps recommendations/optimiser/
-│   └── results_persister.py
+├── backend/                            # legacy FastAPI app + microservices, kept until cut-over
+│   ├── app/                            # FastAPI; thin entrypoints that invoke service Lambdas
+│   └── ...                             # legacy engine, SearchEpc, etc.; deleted after cut-over
 │
-├── orchestrators/
-│   ├── __init__.py
-│   ├── ingestion_pipeline.py
-│   ├── modelling_pipeline.py
-│   └── refresh_orchestrator.py
-│
-├── api/
-│   ├── __init__.py
-│   ├── routers/
-│   │   ├── ingestion.py              # if two APIs
-│   │   └── modelling.py
-│   └── schemas/                      # request/response Pydantic models
-│
-└── tests/
-    ├── fakes/                        # FakePropertyRepo, FakeEpcClient, etc.
-    ├── unit/                         # service tests using fakes only
-    └── integration/                  # real DB + real SQS via localstack
+├── datatypes/                          # existing — EPC schemas; eventually folds into packages/domain/
+└── docs/
+    └── adr/                            # architectural decision records
 ```
 
-`backend/` continues to host the legacy code until the new pipeline is live. Once `model_engine` is no longer serving any traffic, `backend/engine/`, `backend/SearchEpc.py`, and the legacy `backend/Property.py` are deleted.
+**Boundary properties** (enforced by package structure, not convention):
+- A `services/<svc>/` package can `import domain.*`, `import repos.*`, `import fetchers.*`, `import utils.*`. It **cannot** import another service's modules — they're separate distributions with no cross-import path.
+- ADR-0003 (Ingestion / Modelling separation) is preserved: modelling services in `services/ara/src/ara/services/` depend only on `repos.*` + `domain.*`, never on fetchers. Orchestrators are the only place fetchers and services meet.
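+
+Because `services/ara/` lists `domna-fetchers` among its dependencies, the ADR-0003 rule for `ara/services/` could additionally be checked in a unit test. A minimal sketch using the standard-library `ast` module follows; `FORBIDDEN` and `forbidden_imports` are hypothetical names, and the check only covers absolute imports.
+
+```python
+import ast
+from typing import List
+
+FORBIDDEN = {"fetchers"}  # top-level packages a modelling service must not import
+
+
+def forbidden_imports(source: str) -> List[str]:
+    """Return the forbidden modules imported by `source`, sorted and de-duplicated."""
+    found = []
+    for node in ast.walk(ast.parse(source)):
+        if isinstance(node, ast.Import):
+            names = [alias.name for alias in node.names]
+        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
+            names = [node.module]
+        else:
+            continue
+        # Compare on the top-level package so `fetchers.solar` is caught too.
+        found += [n for n in names if n.split(".")[0] in FORBIDDEN]
+    return sorted(set(found))
+```
+
+A test could read each file under `services/ara/src/ara/services/` and assert that the returned list is empty.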
 
-Reused intact (no rewrite needed):
+**Migration** (incremental, not big-bang):
+1. Carve out `packages/domain/` first — fold `datatypes/epc/domain/` + the new aggregate types into it.
+2. Carve out `packages/utils/` from current `utils/` + `backend/utils/`.
+3. Carve out `packages/repos/` and `packages/fetchers/` once `services/ara/` is being built and needs them.
+4. `services/ara/` is greenfield — no legacy code lives in it.
+5. `services/address2uprn/`, `services/pashub/`, etc. are split out as their owners pick them up.
+6. `backend/` shrinks to the FastAPI entrypoint layer once everything else has moved.
 
-- `backend/epc_client/` — the new gov API client. Wrapped by `ara/fetchers/epc_client.py`.
-- `datatypes/epc/domain/` — the new EPC schema. `Property.epc: EpcPropertyData` references it directly.
-- `recommendations/optimiser/` — wrapped by `ara/services/optimiser.py`.
-- `backend/app/db/` — repos delegate into `db_funcs.*` until the SQL is rewritten under sub-PRD (iii).
+**Reused intact** (no rewrite needed at carve-out time):
+- `backend/epc_client/` → folds into `packages/fetchers/src/fetchers/epc_client.py`.
+- `datatypes/epc/domain/` → folds into `packages/domain/src/domain/epc/`.
+- `recommendations/optimiser/` → wrapped by `services/ara/src/ara/services/optimiser.py`.
+- `backend/app/db/` → repos delegate into `db_funcs.*` until SQL is rewritten under sub-PRD (iii).
 
 ---
 
@@ -625,7 +742,7 @@ Total external calls: zero. The override write is the only thing that hit a netw
 
 ## 15. Open questions for team review
 
-1. **One API vs two** (§4.5) — clean interfaces allow either; pick at implementation.
+1. **One endpoint vs two** (§4.5) — **resolved**: single endpoint for Phase 1; split later when a real workflow demands it.
 2. **`LandlordOverrides` shape** (§6.2) — flat-Excel-shape for v1, with a flag to revisit after first customer.
 3. **`already_installed` and `non_invasive_recommendations`** (§6.5) — both likely subsumed by overlay, but final call deferred.
 4. **Recency tie-break policy** (§6.3) — default "newer wins"; team to consider per-portfolio override.
@@ -633,9 +750,14 @@ Total external calls: zero. The override write is the only thing that hit a netw
 6. **Soft-archive vs hard-overwrite** for superseded plans (§14) — affects audit / undo behaviour. Defer to sub-PRD (iii).
 7. **Building-level optimisation as a Phase 2 service** (§10) — agreed deferred; flag for roadmap discussion.
 8. **Transform versioning policy** (§8.3) — semver chosen; team to confirm bump conventions.
-9. **UCL EPC-correction model** (§9.2 S4a) — need the reference paper, the implementation we've used before, and a decision on whether to port directly or re-implement against the new EPC schema.
-10. **Fuel-price source for bill calculation** (§9.2 S4a) — Ofgem caps? Time-varying? Per-portfolio override? Decide alongside `EpcEnergyDerivationService` design.
-11. **kWh handling under Rebaselining** (§9.4 step 5) — confirmed: ML re-predicts SAP/carbon/heat only; `EpcEnergyDerivationService` re-runs for kWh. Validate that this is sufficient when overrides change heating fuel type (which would shift the fuel deduction).
+9. **UCL EPC-correction model** (§9.2 S4a) — **resolved**: Few et al. 2023 (Energy & Buildings 288, 113024). Implementation pattern already in [`AnnualBillSavings.adjust_energy_to_metered`](../../backend/ml_models/AnnualBillSavings.py) — port the per-band gradients/intercepts (Table 3) into `EpcEnergyDerivationService`, keyed on the post-state Effective EPC band.
+10. **Fuel-price source for bill calculation** (§9.2 S4a) — **resolved**: `FuelRatesRepo` is a time-versioned, region-aware table; ETL by `FuelRatesFetcher` (Ofgem feed + manual upload fallback). Per-portfolio override deferred to v2 — confirm whether Calico / Hyde have bulk-buy contracts before first onboarding.
+11. **kWh handling under Rebaselining** (§9.4) — **resolved**: ML re-predicts SAP/carbon/heat only; `EpcEnergyDerivationService` re-derives kWh from the rebaselined Effective EPC. Heating-fuel-type change is handled naturally because S4a re-reads heating fields from the Effective EPC.
+12. **Phase rollover semantics** (§9.2 S7) — when a candidate measure isn't selected in phase n, does it auto-roll into phase n+1's candidate pool, or does the user mark which measure types can roll? Auto is simpler; user-marked is more flexible. Decide at scenario-builder UX time.
+13. **Package-level vs per-measure ML scoring** (§9.4) — SAP impact of a measure is not strictly additive; the current per-measure scoring + linear optimisation approximates this. A future iteration may pre-define candidate packages and ML-score whole packages. Defer until per-service grill on `OptimiserService`.
+14. **UCL extrapolation scope** (§9.2 S4a) — the Few et al. sample covers gas-heated homes without PV in England + Wales only. The legacy code applies the correction to all properties regardless. Keep silent extrapolation for v1, or stratify (no correction for non-gas / PV) and surface uncertainty to FE? Defer to per-service grill.
+15. **`ValuationService` rebuild** (§9.2 S8) — existing `PropertyValuation.estimate` cites several papers; the rebuild should improve the regression. Shape deferred to per-service grill.
+16. **Battery-via-ML cutover** (§9.2 S6) — confirm the new ML model is trained against `energy_pv_battery_count` + `energy_pv_battery_capacity` and the legacy `BatterySAPScorer` can be retired without regression for battery-equipped properties.
 
 ---
 

From acb25182359061fdcfaad99abbcf5c719ffa6ae6 Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar <kconnkowlessar@gmail.com>
Date: Fri, 15 May 2026 10:41:47 +0000
Subject: [PATCH 7/8] second grill session updating prd + context

---
 CONTEXT.md            | 68 ++++++++++++++++++++++++++++++++++++++-----
 ara_backend_design.md |  7 +++--
 2 files changed, 64 insertions(+), 11 deletions(-)

diff --git a/CONTEXT.md b/CONTEXT.md
index 69de3529..1e8411ed 100644
--- a/CONTEXT.md
+++ b/CONTEXT.md
@@ -82,31 +82,69 @@ The EpcPropertyData scored by the modelling pipeline for a single Property, deri
 _Avoid_: modelling EPC, working EPC, resolved EPC, derived EPC
 
 **Rebaselining**:
-Re-predicting a Property's SAP, carbon emissions, and heat demand via ML when its Effective EPC's physical state diverges from the originally lodged public EPC (because Site Notes or Landlord Overrides have changed walls / heating / windows / etc.). Does not include kWh — that is always derived deterministically.
-_Avoid_: re-scoring, re-prediction, performance recomputation
+Re-predicting a Property's SAP, carbon emissions, and heat demand via ML so the modelling pipeline scores it against the current SAP10 methodology. Triggered when either (a) the Effective EPC was lodged under a pre-SAP10 schema (`sap_version < 10.0`), so the recorded scores reflect a superseded methodology, or (b) Site Notes / Landlord Overrides changed the physical state of the Property (walls / heating / windows / etc.) so the lodged scores no longer reflect what's installed. Both triggers may fire together. Produces Effective Performance; Lodged Performance is preserved unchanged. Does not include kWh — that is always derived deterministically by EPC Energy Derivation.
+_Avoid_: re-scoring, re-prediction, performance recomputation, refresh (for cache-freshness)
 
 **Baseline Performance**:
-A Property's current performance values — SAP, carbon emissions, heat demand, annual kWh, fuel split, bills — held against the Effective EPC. SAP / carbon / heat come directly from the Effective EPC's recorded values when no override applies, or from Rebaselining when an override changes physical state. Annual kWh and the fuel split are always derived deterministically by the EPC Energy Derivation Service.
+A Property's current performance aggregate, holding both Lodged Performance and Effective Performance plus annual kWh / fuel split / bills derived from the Effective EPC. Persisted as one row; surfaced as one block in the UI.
 _Avoid_: baseline predictions, predicted baseline, rebaselined values
 
+**Lodged Performance**:
+The SAP / EPC Band / carbon emissions / heat demand recorded on the public EPC (or the Site Notes' as-surveyed values when Site Notes are the source) — unmodified by modelling. The half of Baseline Performance that says "what the government register says about this Property".
+_Avoid_: original performance, raw EPC values, recorded baseline
+
+**Effective Performance**:
+The SAP / EPC Band / carbon emissions / heat demand the modelling pipeline actually scored against — equal to Lodged Performance when no Rebaselining trigger fires, replaced by ML output when triggered. The half of Baseline Performance that says "what we modelled".
+_Avoid_: modelled performance, rebaselined performance (only correct when rebaselining ran), scored values
+
 **EPC Energy Derivation**:
-The deterministic process that derives a Property's annual kWh, fuel split (gas / electric / other), and bills from the Effective EPC's energy fields — applying a UCL-style correction for known EPC over/under-prediction and deducing fuel type for heating + hot water from the SAP heating fields. No ML.
+The deterministic process that derives a Property's annual kWh, fuel split across heating, hot water, lighting, appliances and cooking, and bills from the Effective EPC — applying a UCL Correction for known EPC over/under-prediction and deducing fuel type from the SAP heating fields. No ML.
 _Avoid_: kWh prediction, baseline kWh, energy estimation
 
+**UCL Correction**:
+The per-band linear correction (Few et al. 2023, _Energy & Buildings_ 288 113024) applied to EPC-modelled total primary energy use intensity to align it with metered consumption. Calibrated against gas-heated, non-PV homes in England and Wales rated under SAP 2012; the current implementation extrapolates it to all properties (open question §15.14).
+_Avoid_: UCL adjustment, energy correction, metered correction
+
 **EPC Anomaly Flag**:
 A per-field indicator that a Property's value for an EPC field differs significantly from Comparable Properties; advisory only — surfaces in the UI to prompt user review, does not block modelling.
 _Avoid_: outlier, mismatch, divergence flag
 
+### Reference data
+
+**Fuel Rates**:
+The current per-fuel rate (pence/kWh) and standing charge used to compute a Property's bills; time-versioned and regional, refreshed from Ofgem's published caps via an ETL. The Smart Export Guarantee rate sits in the same set as `electricity_export`. Consumed by EPC Energy Derivation.
+_Avoid_: fuel prices (commodity prices, different concept), tariff, energy cost
+
+**Carbon Factors**:
+The per-fuel CO2 emission factor (kgCO2e/kWh) used to compute a Property's carbon emissions; time-versioned, refreshed from Defra's annual publication. Consumed by EPC Energy Derivation.
+_Avoid_: emission factors (ambiguous), CO2 rates
+
 ### Outputs
 
 **Scenario**:
-A named portfolio-level container for a single modelling run, capturing the goal (e.g. Increasing EPC), budget, exclusions, and housing type; holds many Plans.
+A named portfolio-level retrofit plan, built by a user in the scenario-builder UI and persisted before any modelling fires; carries the overall goal (e.g. Increasing EPC), budget, exclusions, housing type, and an ordered list of Scenario Phases. The model is triggered against one or more Scenarios at once; each Scenario yields one Plan per Property.
 _Avoid_: project, batch, run-set
 
+**Scenario Phase**:
+One ordered step inside a Scenario, carrying a measure-type allowlist (e.g. "loft insulation and walls in phase 1; ASHP in phase 2"), an optional phase budget, and an optional phase target. A single-phase Scenario is one Scenario Phase with all measure types allowed and the full budget on it — there is no special-case path.
+_Avoid_: scenario stage, scenario step, tranche
+
+**Scenario Snapshot**:
+A frozen copy of a Scenario pinned at trigger time, keyed by (task, scenario); used by the modelling pipeline so mid-run edits to the live Scenario do not affect an in-flight job. Snapshots are read-only and may be garbage-collected after the task completes.
+_Avoid_: scenario version, frozen scenario, pinned scenario
+
 **Plan**:
-The per-Property output of a single modelling run; belongs to one Scenario and carries the Property's full Recommendation list, Optimised Package, and post-retrofit predictions.
+The per-Property output of one Scenario's modelling run; carries an ordered list of Plan Phases matching the Scenario's Phase shape. A Property modelled against N Scenarios in one trigger ends up with N Plans.
 _Avoid_: recommendation set, output, result
 
+**Plan Phase**:
+The per-Property output of one Scenario Phase: the Optimised Package selected for that phase, the ending state snapshot (the Property's SAP / kWh / bills after the package is applied), and any Rolled-over Options that flow as candidates into the next Plan Phase.
+_Avoid_: plan stage, plan step
+
+**Rolled-over Options**:
+Recommendations generated but not selected by the Optimiser in a given Plan Phase that remain eligible as candidates in subsequent Plan Phases. The exact roll-over rule (automatic vs user-marked) is under design.
+_Avoid_: deferred measures, leftover recommendations
+
 **Recommendation**:
 A single proposed retrofit measure for a Property, with its cost, SAP impact, kWh savings, carbon savings, and parts list.
 _Avoid_: suggestion, option
@@ -175,9 +213,13 @@ _Avoid_: API key, auth token, secret
 - An **EPC** carries an **EPC Band** and is identifiable by its **Registration Date**; the most recent one is the current.
 - A **UPRN** identifies a physical dwelling permanently; it does not change when the property changes owner — but each portfolio gets its own **Property** keyed against it.
 - When a **Property** has both **Site Notes** and a public **EPC**, the newer of the two derives the **Effective EPC**. **Landlord Overrides** apply only when the **EPC** is the source — never when **Site Notes** are.
-- **Rebaselining** contributes the SAP / carbon / heat parts of **Baseline Performance** when the **Effective EPC** physical state diverges from the originally lodged EPC. **EPC Energy Derivation** contributes the kWh / fuel split / bills parts unconditionally for every Property.
+- A Property's **Baseline Performance** holds two halves: **Lodged Performance** (the gov register's SAP / band / carbon / heat) and **Effective Performance** (what the modelling pipeline scored against). The two are equal unless **Rebaselining** fires.
+- **Rebaselining** produces **Effective Performance** by ML re-prediction when either (a) the Effective EPC was lodged under a pre-SAP10 schema, or (b) the Effective EPC's physical state diverges from the lodged EPC. **Lodged Performance** is never overwritten.
+- **EPC Energy Derivation** contributes the annual kWh, fuel split, and bills on every Property unconditionally, reading current **Fuel Rates** and **Carbon Factors** from their respective repos.
 - The **EPC Prediction Service** uses **Comparable Properties** for both gap-filling and producing **EPC Anomaly Flags**.
-- A **Scenario** contains many **Plans** (one per Property). A **Plan** carries many **Recommendations**; the **Optimised Package** is the subset selected for installation.
+- A **Scenario** carries one or more ordered **Scenario Phases**. Triggering the model against N Scenarios produces N **Plans** per Property; each Plan carries an ordered list of **Plan Phases** matching the Scenario's shape.
+- Each **Plan Phase** holds its **Optimised Package**, the ending state snapshot, and any **Rolled-over Options** that flow as candidates into the next Plan Phase. A single-phase Scenario is one Scenario Phase with all measure types allowed; the same machinery handles it.
+- A **Scenario Snapshot** is pinned at trigger time per (task, scenario) so mid-run edits to the live Scenario do not affect an in-flight modelling job.
 - A **Recommendation** references one **Measure Type** and carries property-specific cost and impact.
 - **Address Matching** uses a **User Address** and **Postcode** to find a **UPRN** by scoring **UPRN Candidates** from an EPC search. A **Lexirank** of 1 with no **Ambiguous Match** and a **Lexiscore** ≥ the **Score Threshold** produces a **Best Match**.
 
@@ -199,6 +241,14 @@ _Avoid_: API key, auth token, secret
 >
 > **Domain expert:** "That's an **EPC Anomaly Flag**. We compute it against the **Comparable Properties** for that postcode. It's advisory — the UI surfaces it and the landlord can apply a **Landlord Override** if it's wrong."
 
+> **Dev:** "The property card shows two SAP scores side by side. Why?"
+>
+> **Domain expert:** "Those are **Lodged Performance** and **Effective Performance**. **Lodged** is what the gov register says — the EPC was rated under SAP 2012. **Effective** is what we scored against — we ran **Rebaselining** to predict the SAP10-equivalent rating because the methodology changed. Both stay on the **Baseline Performance** so users can see what's on record and what we're modelling against."
+
+> **Dev:** "A landlord wants a 3-year retrofit plan — fabric work this year, heat pump next, solar after. How do we model that?"
+>
+> **Domain expert:** "Three **Scenario Phases** in one **Scenario**. Phase 1 allows fabric measures with this year's budget, phase 2 allows the heat pump with next year's budget, phase 3 allows solar. When we model, the **Optimiser Service** runs per phase against the rolling state — the heat pump is scored against the post-insulation property, not the original one. Each **Plan Phase** captures the **Optimised Package** plus the ending SAP / bills, and any **Rolled-over Options** that didn't make this phase's budget become candidates next phase."
+
 ## Flagged ambiguities
 
 - **"property"** was historically warned against in favour of "dwelling"; that has been inverted. **Property** is now canonical for the Ara domain aggregate. Legacy code still uses "dwelling" in places — treat as alias.
@@ -210,3 +260,5 @@ _Avoid_: API key, auth token, secret
 - **"user_inputed_address"** in `backend/address2UPRN/main.py` is a misspelling and a synonym for **User Address** — the canonical term. New code should use `user_address`.
 - **"EPC"** is overloaded as both the document and the rating band letter. Use **EPC** for the document, **EPC Band** for the letter.
 - **"re-scoring"** has two meanings in the codebase — **Rebaselining** (re-predicting baseline performance after an EPC change) and post-optimisation measure re-prediction. Prefer **Rebaselining** for the former; for the latter, the **Optimiser Service** step does its own scoring without a special name.
+- **"phase"** appears in two unrelated contexts: as cut-over timeline language in the PRD ("Phase 0 — Status quo", "Phase 1 — Forced cut-over") and as a domain concept in **Scenario Phase** / **Plan Phase**. Only the latter is a glossary term; cut-over phases are project-management vocabulary that does not enter code.
+- **"stale"** appears in two senses: cache-freshness ("a Repo record is stale and the orchestrator should refetch") — a legitimate operational concept; and as loose shorthand for the EPC's recorded cost fields being unusable. The cost fields are not stale — they are pinned to the inspection-date fuel rates by design. Use "pinned to inspection date" or "pre-SAP10 schema" (whichever applies) instead.
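Reviewer sketch (sits between file diffs, not part of CONTEXT.md): the two Rebaselining triggers and the Lodged / Effective split defined in the glossary above can be read as one small pure function. Everything named here (`Performance`, `EffectiveEpc`, `physical_state_changed`, `resolve_effective_performance`) is hypothetical, and the ML predictor is stood in for by a plain callable; only the `sap_version < 10.0` threshold comes from the glossary text.

```python
from dataclasses import dataclass
from typing import Callable

# Threshold quoted in the glossary entry for Rebaselining.
SAP10_VERSION = 10.0


@dataclass(frozen=True)
class Performance:
    """SAP / carbon / heat demand; field names are illustrative."""
    sap: float
    carbon_kg: float
    heat_demand_kwh: float


@dataclass(frozen=True)
class EffectiveEpc:
    """Hypothetical, heavily reduced stand-in for the Effective EPC."""
    sap_version: float
    lodged: Performance
    physical_state_changed: bool  # set by Site Notes / Landlord Overrides


def rebaselining_triggers(epc: EffectiveEpc) -> list[str]:
    """Return the names of every trigger that fires; both may fire together."""
    triggers: list[str] = []
    if epc.sap_version < SAP10_VERSION:
        triggers.append("pre_sap10_schema")
    if epc.physical_state_changed:
        triggers.append("physical_state_changed")
    return triggers


def resolve_effective_performance(
    epc: EffectiveEpc,
    predict: Callable[[EffectiveEpc], Performance],
) -> Performance:
    """Effective Performance equals Lodged Performance unless a trigger fires.

    The predictor is called at most once, even when both triggers fire.
    Lodged Performance is never modified.
    """
    if rebaselining_triggers(epc):
        return predict(epc)
    return epc.lodged
```

Returning the list of fired triggers rather than a bare boolean is a design guess: it would let the UI explain why the two SAP scores differ.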
diff --git a/ara_backend_design.md b/ara_backend_design.md
index b6aa8f22..de64cc1d 100644
--- a/ara_backend_design.md
+++ b/ara_backend_design.md
@@ -26,7 +26,7 @@ Beyond just swapping API clients, this is the moment to **rebuild the backend in
 - Service boundaries that other team members can read, fix, and extend without needing the entire mental model.
 - Repository-mediated persistence so business logic can be tested without spinning up a database.
 - A separation between **data fetching** (slow, IO-heavy, external) and **modelling** (deterministic, fast, internal).
-- Baseline kWh and bills derived deterministically from the Effective EPC (SAP physics + UCL correction + per-fuel rates from a refreshable repo) rather than from the EPC's stale cost fields or from an ML kWh prediction.
+- Baseline kWh and bills derived deterministically from the Effective EPC (SAP physics + UCL correction + per-fuel rates from a refreshable repo) rather than from the EPC's recorded cost fields (which use fuel rates pinned to the inspection date) or from an ML kWh prediction.
 
 ### 1.3 Out of scope for this PRD
 
@@ -294,7 +294,8 @@ class BaselinePerformance:
 
     # kWh / fuel split / bills — always derived deterministically from the Effective EPC by
     # EpcEnergyDerivationService (SAP physics + UCL correction + FuelRates lookup).
-    # Lodged kWh / bills are not stored separately — the EPC's cost fields are stale by design.
+    # Lodged kWh / bills are not stored separately — the EPC's recorded cost fields are pinned to
+    # inspection-date fuel rates, so we always re-derive bills from current FuelRates regardless.
     annual_kwh: float
     fuel_split: dict[Fuel, float]
     annual_bills: dict[Fuel, float]
@@ -341,7 +342,7 @@ ML re-predicts SAP / carbon / heat when **either** of these holds:
 
 When triggered, a single ML call re-predicts SAP/carbon/heat with the current Effective EPC state as input. Both reasons can fire together; the prediction is still one call.
 
-kWh is **always** re-derived via `EpcEnergyDerivationService` — even when no ML rebaseline runs, because fuel rates change over time and the EPC's cost fields are stale by design.
+kWh is **always** re-derived via `EpcEnergyDerivationService` — even when no ML rebaseline runs — because the EPC's recorded cost fields use fuel rates pinned to the inspection date, and current rates from `FuelRatesRepo` are what we want to surface to users.
 
 The diff mechanism for "physical state changed" (content hash, dirty flag, etc.) is an implementation detail; start with a content hash of the physical-state subset of `EpcPropertyData` stored alongside the previous run.
 
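Reviewer sketch (not part of the PRD diff): the content-hash approach suggested above for detecting a changed physical state could look roughly like this. `PHYSICAL_STATE_FIELDS` is a made-up three-field subset and the flat dict stands in for `EpcPropertyData`, whose real shape is not shown in this patch.

```python
import hashlib
import json
from typing import Any, Mapping

# Hypothetical subset; the real list of physical-state fields is not in this patch.
PHYSICAL_STATE_FIELDS = ("walls", "heating", "windows")


def physical_state_hash(epc_fields: Mapping[str, Any]) -> str:
    """Hash only the physical-state subset, independent of key order."""
    subset = {name: epc_fields.get(name) for name in PHYSICAL_STATE_FIELDS}
    # sort_keys + fixed separators give a canonical byte string to hash.
    canonical = json.dumps(subset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def physical_state_changed(epc_fields: Mapping[str, Any], previous_hash: str) -> bool:
    """Compare against the hash stored alongside the previous run."""
    return physical_state_hash(epc_fields) != previous_hash
```

Fields outside the subset (lodgement date, recorded costs, and so on) do not affect the hash, so a refresh that changes only those would not trigger an ML call under this scheme.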

From dfe9e3ddbebbb886c8c1fd927e29dcb3680de036 Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar <kconnkowlessar@gmail.com>
Date: Fri, 15 May 2026 10:56:53 +0000
Subject: [PATCH 8/8] added potential file scaffolding

---
 packages/README.md                            | 16 ++++++++++
 packages/domain/README.md                     | 30 +++++++++++++++++++
 packages/domain/pyproject.toml                | 13 ++++++++
 packages/domain/src/domain/__init__.py        |  4 +++
 packages/fetchers/README.md                   | 19 ++++++++++++
 packages/fetchers/pyproject.toml              | 19 ++++++++++++
 packages/fetchers/src/fetchers/__init__.py    |  4 +++
 packages/repos/README.md                      | 27 +++++++++++++++++
 packages/repos/pyproject.toml                 | 19 ++++++++++++
 packages/repos/src/repos/__init__.py          |  4 +++
 packages/utils/README.md                      | 15 ++++++++++
 packages/utils/pyproject.toml                 | 15 ++++++++++
 packages/utils/src/utils/__init__.py          |  4 +++
 pyproject.toml                                | 12 ++++++++
 services/README.md                            | 13 ++++++++
 services/ara/Dockerfile                       | 12 ++++++++
 services/ara/README.md                        | 30 +++++++++++++++++++
 services/ara/pyproject.toml                   | 28 +++++++++++++++++
 services/ara/src/ara/__init__.py              |  4 +++
 services/ara/src/ara/lambdas/__init__.py      |  5 ++++
 .../ara/src/ara/orchestrators/__init__.py     |  5 ++++
 services/ara/src/ara/services/__init__.py     |  9 ++++++
 services/ara/tests/__init__.py                |  0
 services/ara/tests/fakes/__init__.py          |  4 +++
 services/ara/tests/integration/__init__.py    |  0
 services/ara/tests/unit/__init__.py           |  0
 26 files changed, 311 insertions(+)
 create mode 100644 packages/README.md
 create mode 100644 packages/domain/README.md
 create mode 100644 packages/domain/pyproject.toml
 create mode 100644 packages/domain/src/domain/__init__.py
 create mode 100644 packages/fetchers/README.md
 create mode 100644 packages/fetchers/pyproject.toml
 create mode 100644 packages/fetchers/src/fetchers/__init__.py
 create mode 100644 packages/repos/README.md
 create mode 100644 packages/repos/pyproject.toml
 create mode 100644 packages/repos/src/repos/__init__.py
 create mode 100644 packages/utils/README.md
 create mode 100644 packages/utils/pyproject.toml
 create mode 100644 packages/utils/src/utils/__init__.py
 create mode 100644 services/README.md
 create mode 100644 services/ara/Dockerfile
 create mode 100644 services/ara/README.md
 create mode 100644 services/ara/pyproject.toml
 create mode 100644 services/ara/src/ara/__init__.py
 create mode 100644 services/ara/src/ara/lambdas/__init__.py
 create mode 100644 services/ara/src/ara/orchestrators/__init__.py
 create mode 100644 services/ara/src/ara/services/__init__.py
 create mode 100644 services/ara/tests/__init__.py
 create mode 100644 services/ara/tests/fakes/__init__.py
 create mode 100644 services/ara/tests/integration/__init__.py
 create mode 100644 services/ara/tests/unit/__init__.py

diff --git a/packages/README.md b/packages/README.md
new file mode 100644
index 00000000..0911a1d3
--- /dev/null
+++ b/packages/README.md
@@ -0,0 +1,16 @@
+# Shared packages
+
+Workspace packages consumed by `services/*`. Each package is its own Python distribution with its own `pyproject.toml`; services declare them as dependencies and resolve them through uv's workspace sources (`{ workspace = true }` under `[tool.uv.sources]`).
+
+| Package | Purpose |
+|---------|---------|
+| [`domain/`](./domain/) | Shared domain types — `Property`, `BaselinePerformance`, `Plan`, `Scenario`, `EpcPropertyData`, etc. No persistence, no IO, no business logic. |
+| [`repos/`](./repos/) | Persistence layer — one repo per aggregate. Owns the SQL. Depends on `domain`. |
+| [`fetchers/`](./fetchers/) | External API clients (gov EPC, Ofgem, Google Solar, etc.). Depend on `domain` for response shapes. |
+| [`utils/`](./utils/) | Cross-cutting infra — logging, S3, CloudWatch URL builders, SQS task helpers. |
+
+## Adding a new shared package
+
+Only when a real second consumer materialises. Don't pre-shatter (`repos-epc`, `repos-property`, ...) — split when a deployment needs to drop a dep, not before.
+
+See [`../ara_backend_design.md`](../ara_backend_design.md) §11 for the broader monorepo layout and [`../CONTEXT.md`](../CONTEXT.md) for the domain glossary that names the types living in `domain/`.
diff --git a/packages/domain/README.md b/packages/domain/README.md
new file mode 100644
index 00000000..6dc69d41
--- /dev/null
+++ b/packages/domain/README.md
@@ -0,0 +1,30 @@
+# domna-domain
+
+Shared domain types — `Property`, `Properties`, `BaselinePerformance`, `Plan`, `PlanPhase`, `Scenario`, `ScenarioPhase`, `ScenarioSnapshot`, `Recommendation`, `OptimisedPackage`, `EpcPropertyData`, etc.
+
+**Boundary**: types only. No persistence, no IO, no business logic. Other packages and services depend on `domna-domain`; this package depends on nothing internal.
+
+Domain definitions live in [`../../CONTEXT.md`](../../CONTEXT.md). New types added here must match the glossary terms.
+
+## Layout
+
+```
+src/domain/
+├── __init__.py
+├── property.py             # Property, Properties, PropertyIdentity
+├── site_notes.py
+├── landlord_overrides.py
+├── baseline_performance.py # lodged + effective pair (ADR-0004)
+├── plan.py                 # Plan, PlanPhase, OptimisedPackage
+├── scenario.py             # Scenario, ScenarioPhase, ScenarioSnapshot (ADR-0005)
+├── recommendation.py
+├── geospatial.py
+├── solar.py
+├── anomaly_flags.py
+└── ml/
+    ├── __init__.py
+    ├── transform.py        # EpcMlTransform (versioned per §8.3)
+    └── schema.py
+```
+
+When `datatypes/epc/domain/` folds in, the EPC schema types move under `src/domain/epc/`.
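Reviewer sketch (not part of the README diff): a minimal guess at how the types-only `scenario.py` listed above might express Scenario, Scenario Phase and Scenario Snapshot as frozen dataclasses. Field names, the `ALL_MEASURE_TYPES` set and both helper functions are assumptions for illustration; the glossary fixes only the concepts, not the attributes.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Hypothetical measure-type identifiers.
ALL_MEASURE_TYPES = frozenset({"loft_insulation", "wall_insulation", "ashp", "solar_pv"})


@dataclass(frozen=True)
class ScenarioPhase:
    allowed_measure_types: frozenset[str]
    budget: Optional[float] = None   # optional phase budget
    target: Optional[str] = None     # optional phase target


@dataclass(frozen=True)
class Scenario:
    scenario_id: str
    goal: str
    budget: float
    phases: tuple[ScenarioPhase, ...]  # ordered


@dataclass(frozen=True)
class ScenarioSnapshot:
    """Pinned at trigger time, keyed by (task, scenario)."""
    task_id: str
    scenario: Scenario  # immutable value, so later edits cannot leak in


def single_phase_scenario(scenario_id: str, goal: str, budget: float) -> Scenario:
    """One phase, all measure types allowed, full budget: no special-case path."""
    phase = ScenarioPhase(allowed_measure_types=ALL_MEASURE_TYPES, budget=budget)
    return Scenario(scenario_id, goal, budget, phases=(phase,))


def with_budget(scenario: Scenario, budget: float) -> Scenario:
    """Editing the live Scenario yields a new value; snapshots keep the old one."""
    return replace(scenario, budget=budget)
```

The two helpers are there only to make the immutability point concrete; under the "types only" boundary they would more plausibly live in a service.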
diff --git a/packages/domain/pyproject.toml b/packages/domain/pyproject.toml
new file mode 100644
index 00000000..5e820371
--- /dev/null
+++ b/packages/domain/pyproject.toml
@@ -0,0 +1,13 @@
+[project]
+name = "domna-domain"
+version = "0.1.0"
+description = "Shared domain types for the Ara modelling pipeline and sibling Domna services."
+requires-python = ">=3.11"
+dependencies = []
+
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+
+[tool.hatch.build.targets.wheel]
+packages = ["src/domain"]
diff --git a/packages/domain/src/domain/__init__.py b/packages/domain/src/domain/__init__.py
new file mode 100644
index 00000000..1d52198c
--- /dev/null
+++ b/packages/domain/src/domain/__init__.py
@@ -0,0 +1,4 @@
+"""Shared domain types for the Ara modelling pipeline and sibling Domna services.
+
+No persistence, no IO, no business logic. See README.md for layout.
+"""
diff --git a/packages/fetchers/README.md b/packages/fetchers/README.md
new file mode 100644
index 00000000..ebe47f74
--- /dev/null
+++ b/packages/fetchers/README.md
@@ -0,0 +1,19 @@
+# domna-fetchers
+
+External API clients. Each fetcher is responsible for one external system — `EpcClientService` for the gov EPC API, `GeospatialFetcher` for Ordnance Survey, `SolarFetcher` for Google Solar, `FuelRatesFetcher` for Ofgem, `CarbonFactorsFetcher` for Defra.
+
+**Boundary**: makes HTTP calls + returns raw or lightly-mapped responses. No DB, no business logic. Modelling services never depend on fetchers — only orchestrators do (per [ADR-0003](../../docs/adr/0003-strict-ingestion-modelling-separation.md)).
+
+## Layout
+
+```
+src/fetchers/
+├── __init__.py
+├── epc_client.py          # wraps backend/epc_client/
+├── geospatial.py
+├── solar.py
+├── fuel_rates_fetcher.py
+└── carbon_factors_fetcher.py
+```
+
+`backend/epc_client/` will fold into `epc_client.py` during the migration; until then this module re-exports from the legacy location.
diff --git a/packages/fetchers/pyproject.toml b/packages/fetchers/pyproject.toml
new file mode 100644
index 00000000..69404681
--- /dev/null
+++ b/packages/fetchers/pyproject.toml
@@ -0,0 +1,19 @@
+[project]
+name = "domna-fetchers"
+version = "0.1.0"
+description = "External API clients — gov EPC, Ofgem, Google Solar, Defra, etc."
+requires-python = ">=3.11"
+dependencies = [
+    "domna-domain",
+    "httpx>=0.27",
+]
+
+[tool.uv.sources]
+domna-domain = { workspace = true }
+
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+
+[tool.hatch.build.targets.wheel]
+packages = ["src/fetchers"]
diff --git a/packages/fetchers/src/fetchers/__init__.py b/packages/fetchers/src/fetchers/__init__.py
new file mode 100644
index 00000000..74646907
--- /dev/null
+++ b/packages/fetchers/src/fetchers/__init__.py
@@ -0,0 +1,4 @@
+"""External API clients for Ara and sibling services.
+
+One fetcher per external system. No DB, no business logic. See README.md.
+"""
diff --git a/packages/repos/README.md b/packages/repos/README.md
new file mode 100644
index 00000000..990b66db
--- /dev/null
+++ b/packages/repos/README.md
@@ -0,0 +1,27 @@
+# domna-repos
+
+Persistence layer. One repo per aggregate; owns the SQL for its tables. Callers see only domain objects from `domna-domain`.
+
+**Boundary**: depends on `domna-domain` for types. No external IO except the DB. No business logic — services do that.
+
+## Repos (per [PRD §7.3](../../ara_backend_design.md))
+
+```
+src/repos/
+├── __init__.py
+├── unit_of_work.py
+├── property_repo.py
+├── epc_cache_repo.py
+├── site_notes_repo.py
+├── landlord_overrides_repo.py
+├── recommendations_repo.py
+├── generic_data_repo.py
+├── fuel_rates_repo.py
+├── carbon_factors_repo.py
+├── heating_system_assumptions_repo.py
+└── subtask_repo.py
+```
+
+Each repo has a dict-backed `Fake*Repo` companion (no DB) in the test tree of the service that consumes it, typically `services/ara/tests/fakes/`.
+
+DDL migrations are scoped to sub-PRD (iii); during Phase 0 repos may delegate into the legacy `backend/app/db/db_funcs.*` modules.
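Reviewer sketch (not part of the README diff): one way the repo / `Fake*Repo` pairing described above could be wired, using a structural `Protocol` so a service accepts either the SQL-backed repo or the dict-backed fake. The `Property` fields, the `get` / `save` method names and `ensure_property` are hypothetical; the patch names the repos but does not define their interfaces.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class Property:
    """Illustrative stand-in for the domain aggregate."""
    property_id: str
    uprn: str


class PropertyRepo(Protocol):
    """The interface a service depends on; the real repo would own the SQL."""

    def get(self, property_id: str) -> Optional[Property]: ...

    def save(self, prop: Property) -> None: ...


class FakePropertyRepo:
    """Dict-backed fake for unit tests: no DB, no network."""

    def __init__(self) -> None:
        self._rows: dict[str, Property] = {}

    def get(self, property_id: str) -> Optional[Property]:
        return self._rows.get(property_id)

    def save(self, prop: Property) -> None:
        self._rows[prop.property_id] = prop


def ensure_property(repo: PropertyRepo, property_id: str, uprn: str) -> Property:
    """Example business logic that touches persistence only through the repo."""
    existing = repo.get(property_id)
    if existing is not None:
        return existing
    created = Property(property_id=property_id, uprn=uprn)
    repo.save(created)
    return created
```

Because `PropertyRepo` is a `Protocol`, the fake needs no inheritance from the real repo, which keeps `tests/fakes/` free of any SQLAlchemy import.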
diff --git a/packages/repos/pyproject.toml b/packages/repos/pyproject.toml
new file mode 100644
index 00000000..53689812
--- /dev/null
+++ b/packages/repos/pyproject.toml
@@ -0,0 +1,19 @@
+[project]
+name = "domna-repos"
+version = "0.1.0"
+description = "Persistence layer — one repo per aggregate. Owns the SQL."
+requires-python = ">=3.11"
+dependencies = [
+    "domna-domain",
+    "sqlalchemy>=2.0",
+]
+
+[tool.uv.sources]
+domna-domain = { workspace = true }
+
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+
+[tool.hatch.build.targets.wheel]
+packages = ["src/repos"]
diff --git a/packages/repos/src/repos/__init__.py b/packages/repos/src/repos/__init__.py
new file mode 100644
index 00000000..a981395a
--- /dev/null
+++ b/packages/repos/src/repos/__init__.py
@@ -0,0 +1,4 @@
+"""Persistence layer for the Ara domain aggregates.
+
+One repo per aggregate. Owns SQL; exposes domain objects. See README.md.
+"""
diff --git a/packages/utils/README.md b/packages/utils/README.md
new file mode 100644
index 00000000..1fba6457
--- /dev/null
+++ b/packages/utils/README.md
@@ -0,0 +1,15 @@
+# domna-utils
+
+Cross-cutting infrastructure helpers. Nothing domain-specific — anything in here should be portable across services.
+
+## Will live here (migrating from `utils/` and `backend/utils/`)
+
+- Logging — `logger.py`
+- S3 — `s3.py`
+- Pandas helpers — `pandas_utils.py`
+- CloudWatch URL builder — `cloudwatch.py`
+- SQS subtask helpers — `subtasks.py`
+
+## Will NOT live here
+
+Service-specific parsers (Osmosis condition report, full-SAP parser, SharePoint integration) move into the service that owns them, not here.
diff --git a/packages/utils/pyproject.toml b/packages/utils/pyproject.toml
new file mode 100644
index 00000000..cf739bbd
--- /dev/null
+++ b/packages/utils/pyproject.toml
@@ -0,0 +1,15 @@
+[project]
+name = "domna-utils"
+version = "0.1.0"
+description = "Cross-cutting infrastructure helpers — logging, S3, CloudWatch, SQS tasks."
+requires-python = ">=3.11"
+dependencies = [
+    "boto3>=1.34",
+]
+
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+
+[tool.hatch.build.targets.wheel]
+packages = ["src/utils"]
diff --git a/packages/utils/src/utils/__init__.py b/packages/utils/src/utils/__init__.py
new file mode 100644
index 00000000..d010a3be
--- /dev/null
+++ b/packages/utils/src/utils/__init__.py
@@ -0,0 +1,4 @@
+"""Cross-cutting infrastructure helpers — logging, S3, CloudWatch, SQS tasks.
+
+Nothing domain-specific belongs here. See README.md.
+"""
diff --git a/pyproject.toml b/pyproject.toml
index 49108861..75aabc82 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1 +1,13 @@
 [tool.pyright]
+
+# uv workspace root.
+# Each workspace member has its own pyproject.toml under packages/<name>/ or services/<name>/.
+# Run `uv sync` at the root to install everything; `uv sync --package <name>` for one.
+[tool.uv.workspace]
+members = [
+    "packages/domain",
+    "packages/repos",
+    "packages/fetchers",
+    "packages/utils",
+    "services/ara",
+]
diff --git a/services/README.md b/services/README.md
new file mode 100644
index 00000000..c82ef6a4
--- /dev/null
+++ b/services/README.md
@@ -0,0 +1,13 @@
+# Services
+
+Each subdirectory is a deployable unit — typically a Lambda image. Own `pyproject.toml`, own `Dockerfile`, own deps. Lambda bundle contains only that service's deps + its workspace deps.
+
+| Service | Purpose |
+|---------|---------|
+| [`ara/`](./ara/) | The Domna retrofit modelling backend: ingestion + modelling pipelines and the domain services in [PRD §9.2](../ara_backend_design.md). |
+
+Other Domna services (address2uprn, hubspot, pashub, ecmk, magicplan) live in the legacy `backend/` and `etl/` trees for now; they are slated to migrate here as their owners pick them up — see [PRD §11](../ara_backend_design.md). When that work starts, scaffold the service under `services/<name>/` and add it to the workspace members in the root `pyproject.toml`.
+
+## Service boundary
+
+A service can `import domain.*`, `import repos.*`, `import fetchers.*`, `import utils.*` (workspace deps). It **cannot** import another service's modules — they are separate distributions with no cross-import path. This is the structural enforcement of the modelling/ingestion separation ([ADR-0003](../docs/adr/0003-strict-ingestion-modelling-separation.md)).
diff --git a/services/ara/Dockerfile b/services/ara/Dockerfile
new file mode 100644
index 00000000..c45d6bc1
--- /dev/null
+++ b/services/ara/Dockerfile
@@ -0,0 +1,12 @@
+# Lambda image for the Ara modelling backend.
+#
+# This is a scaffold — final image will install only ara + its workspace deps
+# (domna-domain, domna-repos, domna-fetchers, domna-utils) plus ML/data libraries.
+# Build via uv to keep cold-start size contained.
+
+FROM public.ecr.aws/lambda/python:3.11
+
+# TODO: install uv, sync this service's deps from the workspace lock file,
+# copy src/ara/ into ${LAMBDA_TASK_ROOT}/, set CMD to the Lambda handler.
+
+CMD ["ara.lambdas.handler.handler"]
diff --git a/services/ara/README.md b/services/ara/README.md
new file mode 100644
index 00000000..71e71a5d
--- /dev/null
+++ b/services/ara/README.md
@@ -0,0 +1,30 @@
+# ara
+
+The Domna retrofit modelling backend. Replaces the legacy `backend/engine/engine.py` monolith with a service-oriented pipeline that survives the 30 May 2026 gov EPC API cut-over and that other team members can read, fix, and extend.
+
+Design document: [`../../ara_backend_design.md`](../../ara_backend_design.md).
+Domain glossary: [`../../CONTEXT.md`](../../CONTEXT.md).
+
+## Layout
+
+```
+src/ara/
+├── services/        # the domain services from PRD §9.2:
+│                    #   EpcRemappingService, EpcPredictionService,
+│                    #   FeatureBuilder, EpcEnergyDerivationService,
+│                    #   RebaseliningService, RecommendationService,
+│                    #   ImpactPredictionService, OptimiserService,
+│                    #   ValuationService, ResultsPersister
+├── orchestrators/   # IngestionPipeline, ModellingPipeline, RefreshOrchestrator
+└── lambdas/         # one handler.py per Lambda + the event-shape contracts
+```
+
+## Pipeline
+
+See [PRD §9.4](../../ara_backend_design.md) for the per-batch step order. Briefly: per-property setup (steps 1–6) runs once per Property; the per-scenario × per-phase loop (steps 7–10) re-derives candidates and impact predictions against the rolling Effective EPC state; results are persisted under one Unit of Work per (Plan, Scenario).
+
+## Testing
+
+- `tests/unit/` — service tests against fakes from `tests/fakes/`. No DB, no network, no ML lambda.
+- `tests/integration/` — real Postgres (testcontainers / localstack), fake fetchers + fake ML lambdas.
+- ML transform contract tests live with `domain.ml.transform` in `packages/domain/`.
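Reviewer sketch (not part of the README diff): the per-phase loop summarised under "Pipeline" above, reduced to a toy. It assumes fixed, additive SAP gains and a greedy gain-per-pound selection in place of the real Optimiser and ImpactPredictionService, and it assumes automatic roll-over, which the glossary marks as still under design. All class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    name: str
    measure_type: str
    cost: float       # must be > 0 for the ratio below
    sap_gain: float   # simplification: fixed, not re-predicted per phase


@dataclass(frozen=True)
class Phase:
    allowed_measure_types: frozenset[str]
    budget: float


@dataclass(frozen=True)
class PlanPhase:
    package: tuple[Measure, ...]      # the Optimised Package for this phase
    ending_sap: float                 # ending state snapshot (SAP only here)
    rolled_over: tuple[Measure, ...]  # eligible but unselected this phase


def run_phases(starting_sap: float, candidates: list[Measure], phases: list[Phase]) -> list[PlanPhase]:
    sap = starting_sap
    remaining = list(candidates)
    plan: list[PlanPhase] = []
    for phase in phases:
        eligible = [m for m in remaining if m.measure_type in phase.allowed_measure_types]
        # Greedy stand-in for the Optimiser: best SAP gain per pound first.
        eligible.sort(key=lambda m: m.sap_gain / m.cost, reverse=True)
        package: list[Measure] = []
        spent = 0.0
        for measure in eligible:
            if spent + measure.cost <= phase.budget:
                package.append(measure)
                spent += measure.cost
        rolled_over = tuple(m for m in eligible if m not in package)
        # Rolling state: the next phase starts from this phase's ending SAP.
        sap += sum(m.sap_gain for m in package)
        remaining = [m for m in remaining if m not in package]
        plan.append(PlanPhase(tuple(package), sap, rolled_over))
    return plan
```

In the real pipeline the candidates and their impacts would be re-derived against the rolling Effective EPC state at each phase, so a heat pump's gain after insulation would differ from its gain before; the toy keeps gains constant to stay short.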
diff --git a/services/ara/pyproject.toml b/services/ara/pyproject.toml
new file mode 100644
index 00000000..3556a15f
--- /dev/null
+++ b/services/ara/pyproject.toml
@@ -0,0 +1,28 @@
+[project]
+name = "ara"
+version = "0.1.0"
+description = "The Domna retrofit modelling backend. Ingestion + modelling pipelines."
+requires-python = ">=3.11"
+dependencies = [
+    "domna-domain",
+    "domna-repos",
+    "domna-fetchers",
+    "domna-utils",
+    "pandas>=2.0",
+    "pandas-stubs",
+    "numpy>=1.26",
+    "pydantic>=2.0",
+]
+
+[tool.uv.sources]
+domna-domain = { workspace = true }
+domna-repos = { workspace = true }
+domna-fetchers = { workspace = true }
+domna-utils = { workspace = true }
+
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+
+[tool.hatch.build.targets.wheel]
+packages = ["src/ara"]
diff --git a/services/ara/src/ara/__init__.py b/services/ara/src/ara/__init__.py
new file mode 100644
index 00000000..26856c73
--- /dev/null
+++ b/services/ara/src/ara/__init__.py
@@ -0,0 +1,4 @@
+"""The Domna retrofit modelling backend.
+
+See README.md and ara_backend_design.md (repo root) for the architecture.
+"""
diff --git a/services/ara/src/ara/lambdas/__init__.py b/services/ara/src/ara/lambdas/__init__.py
new file mode 100644
index 00000000..93b08582
--- /dev/null
+++ b/services/ara/src/ara/lambdas/__init__.py
@@ -0,0 +1,5 @@
+"""Lambda handlers + event-shape contracts.
+
+One handler per deployable Lambda. See PRD §4.6 for the ModelTriggerRequest
+shape.
+"""
diff --git a/services/ara/src/ara/orchestrators/__init__.py b/services/ara/src/ara/orchestrators/__init__.py
new file mode 100644
index 00000000..4d2c9a60
--- /dev/null
+++ b/services/ara/src/ara/orchestrators/__init__.py
@@ -0,0 +1,5 @@
+"""Orchestrators for the Ara pipeline.
+
+IngestionPipeline, ModellingPipeline, RefreshOrchestrator. This is the only
+place where step order is encoded and where fetchers, services, and repos meet.
+"""
diff --git a/services/ara/src/ara/services/__init__.py b/services/ara/src/ara/services/__init__.py
new file mode 100644
index 00000000..b561f336
--- /dev/null
+++ b/services/ara/src/ara/services/__init__.py
@@ -0,0 +1,9 @@
+"""Domain services for the Ara modelling pipeline (PRD §9.2).
+
+EpcRemappingService, EpcPredictionService, FeatureBuilder,
+EpcEnergyDerivationService, RebaseliningService, RecommendationService,
+ImpactPredictionService, OptimiserService, ValuationService, ResultsPersister.
+
+Each service operates on `Properties` and depends only on repos, other services,
+and domain objects. No external IO (per ADR-0003).
+"""
diff --git a/services/ara/tests/__init__.py b/services/ara/tests/__init__.py
new file mode 100644
index 00000000..e69de29b
diff --git a/services/ara/tests/fakes/__init__.py b/services/ara/tests/fakes/__init__.py
new file mode 100644
index 00000000..cc032044
--- /dev/null
+++ b/services/ara/tests/fakes/__init__.py
@@ -0,0 +1,4 @@
+"""Fake repos and fetchers for unit tests.
+
+One Fake<Name>Repo per real repo; dict-backed; no DB. Same for fetchers.
+"""
diff --git a/services/ara/tests/integration/__init__.py b/services/ara/tests/integration/__init__.py
new file mode 100644
index 00000000..e69de29b
diff --git a/services/ara/tests/unit/__init__.py b/services/ara/tests/unit/__init__.py
new file mode 100644
index 00000000..e69de29b