diff --git a/CLAUDE.md b/CLAUDE.md
index f88a59d5..faa857ce 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -36,28 +36,26 @@ Five Claude Code skills are installed in this repo's dev container. Each maps to
 |-------|--------|-------------|
 | **grill-me** | `/grill-me` | Before implementing — stress-tests a design through sequential questioning |
 | **to-prd** | `/to-prd` | After a planning conversation — formalises context into a GitHub issue PRD |
-| **ubiquitous-language** | `/ubiquitous-language` | When domain terms are drifting or ambiguous — builds/updates `UBIQUITOUS_LANGUAGE.md` |
+| **grill-with-docs** | `/grill-with-docs` | When domain terms are drifting or new concepts are landing — challenges plans against `CONTEXT.md`, sharpens terminology inline, and writes ADRs for load-bearing decisions in `docs/adr/`. Replaces the older `ubiquitous-language` skill. |
 | **tdd** | `/tdd` | During implementation — enforces vertical-slice TDD (one test → one impl → repeat) |
 | **improve-codebase-architecture** | `/improve-codebase-architecture` | During refactoring — surfaces shallow modules and proposes deepening opportunities |
 
+Domain glossary lives at [CONTEXT.md](./CONTEXT.md); load-bearing decisions live at [docs/adr/](./docs/adr/). The legacy [UBIQUITOUS_LANGUAGE.md](./UBIQUITOUS_LANGUAGE.md) is a redirect.
+
 ### Typical session chains
 
 **Feature planning:**
-`/grill-me` → `/to-prd` → `/ubiquitous-language`
+`/grill-me` → `/to-prd` → `/grill-with-docs`
 
 **Implementation:**
 `/tdd` (+ `/grill-me` if a design fork appears mid-session)
 
 **Refactoring:**
-`/improve-codebase-architecture` → `/grill-me` → `/tdd` → `/ubiquitous-language`
+`/improve-codebase-architecture` → `/grill-me` → `/tdd` → `/grill-with-docs`
 
 ### First time setting up?
 
-New containers install all skills automatically via the Dockerfile. If you're in an existing container, run:
-
-```bash
-bash .devcontainer/backend/install-claude-skills.sh
-```
+Skills are installed automatically when the dev container is built, via the postCreate step that pulls from `Hestia-Homes/agentic-toolkit` (see `.devcontainer/backend/Dockerfile`). If an existing container is missing skills, rebuild the dev container.
 
 ## Type Safety
 
diff --git a/CONTEXT.md b/CONTEXT.md
new file mode 100644
index 00000000..1e8411ed
--- /dev/null
+++ b/CONTEXT.md
@@ -0,0 +1,264 @@
+# Ara
+
+The Domna product for domestic retrofit modelling: ingests open-source EPC data, lets users correct or supersede it with their own surveys, and produces optimised retrofit packages for each property in a portfolio.
+
+## Language
+
+### Product
+
+**Ara**:
+The Domna product. Latin for "the altar"; named under Domna's classical-naming convention. Covers both the modelling product and the backend that powers it.
+_Avoid_: ARA (acronym style), v2 backend, the new backend
+
+**Domna**:
+The company. Roman name; sibling to Ara in the same naming convention.
+
+### Energy Performance Certificates
+
+**EPC**:
+An Energy Performance Certificate — a government-issued document rating a dwelling's energy efficiency from A (best) to G (worst).
+_Avoid_: energy certificate, energy report
+
+**Certificate Number**:
+The unique identifier assigned to an EPC by the government registry.
+_Avoid_: cert number, EPC ID
+
+**Registration Date**:
+The date an EPC was lodged with the government register; used to identify the most recent certificate for a property.
+_Avoid_: assessment date, submission date
+
+**EPC Band**:
+A single letter A–G representing a property's current or potential energy efficiency rating.
+_Avoid_: energy rating, EPC grade, EPC score
+
+**Schema Type**:
+The versioned RdSAP or SAP schema that describes the structure of an EPC's raw data (e.g. `RdSAP-Schema-21.0.1`).
+_Avoid_: schema version, EPC format
+
+**Domestic Certificate**:
+An EPC issued for a residential dwelling, as opposed to a commercial one.
+_Avoid_: residential EPC, home EPC
+
+### Properties and addresses
+
+**Property**:
+The Ara domain aggregate representing a single dwelling under modelling: its identity, source data, enrichments, and modelling outputs.
+_Avoid_: dwelling, unit, home, asset
+
+**Properties**:
+A first-class collection of Property objects; the unit of bulk operation in services.
+_Avoid_: property list, batch (used for SQS chunks)
+
+**UPRN**:
+Unique Property Reference Number — the government-issued permanent identifier for a physical address in the UK.
+_Avoid_: property ID, address ID, code
+
+**Postcode**:
+A UK postal code used to group nearby addresses; the primary search key for finding EPC records.
+_Avoid_: zip code, postal code
+
+**User Address**:
+A free-text address string provided by a user or imported from a customer dataset, before any normalisation or matching.
+_Avoid_: user input, raw address, user_inputed_address
+
+**Comparable Properties**:
+The reference cohort matched to a target Property by both geographic proximity (postcode prefix / UPRN range) and physical similarity (property type, built form, age band); used by the EPC Prediction Service for gap-filling and anomaly detection.
+_Avoid_: neighbours, similar properties, peer set
+
+### Source data
+
+**Site Notes**:
+The full-coverage record produced by a Domna survey of a single Property; carries every EPC field the modelling pipeline requires, and when present supersedes the public EPC for that Property — except when the public EPC is newer.
+_Avoid_: energy assessment, site survey, field survey, Domna survey, Hestia survey
+
+**Landlord Overrides**:
+Property data supplied by a landlord that may correct or supplement the public EPC for a single Property; triggers Rebaselining when applied; not applicable when Site Notes are present.
+_Avoid_: patches (deprecated), corrections, manual EPC, edits
+
+### Modelling
+
+**Effective EPC**:
+The EpcPropertyData scored by the modelling pipeline for a single Property, derived from either Site Notes alone or the public EPC with Landlord Overrides applied; carries source-derived physical fields and originally recorded performance values, with model-rebaselined performance held separately in Baseline Performance.
+_Avoid_: modelling EPC, working EPC, resolved EPC, derived EPC
+
+**Rebaselining**:
+Re-predicting a Property's SAP, carbon emissions, and heat demand via ML so the modelling pipeline scores it against the current SAP10 methodology. Triggered when either (a) the Effective EPC was lodged under a pre-SAP10 schema (`sap_version < 10.0`), so the recorded scores reflect a superseded methodology, or (b) Site Notes / Landlord Overrides changed the physical state of the Property (walls / heating / windows / etc.) so the lodged scores no longer reflect what's installed. Both triggers may fire together. Produces Effective Performance; Lodged Performance is preserved unchanged. Does not include kWh — that is always derived deterministically by EPC Energy Derivation.
+_Avoid_: re-scoring, re-prediction, performance recomputation, refresh (for cache-freshness)
+
+**Baseline Performance**:
+A Property's current performance aggregate, holding both Lodged Performance and Effective Performance plus annual kWh / fuel split / bills derived from the Effective EPC. Persisted as one row; surfaced as one block in the UI.
+_Avoid_: baseline predictions, predicted baseline, rebaselined values
+
+**Lodged Performance**:
+The SAP / EPC Band / carbon emissions / heat demand recorded on the public EPC (or the Site Notes' as-surveyed values when Site Notes are the source) — unmodified by modelling. The half of Baseline Performance that says "what the government register says about this Property".
+_Avoid_: original performance, raw EPC values, recorded baseline
+
+**Effective Performance**:
+The SAP / EPC Band / carbon emissions / heat demand the modelling pipeline actually scored against — equal to Lodged Performance when no Rebaselining trigger fires, replaced by ML output when triggered. The half of Baseline Performance that says "what we modelled".
+_Avoid_: modelled performance, rebaselined performance (only correct when rebaselining ran), scored values
+
+**EPC Energy Derivation**:
+The deterministic process that derives a Property's annual kWh, fuel split across heating, hot water, lighting, appliances and cooking, and bills from the Effective EPC — applying a UCL Correction for known EPC over/under-prediction and deducing fuel type from the SAP heating fields. No ML.
+_Avoid_: kWh prediction, baseline kWh, energy estimation
+
+**UCL Correction**:
+The per-band linear correction (Few et al. 2023, _Energy & Buildings_ 288 113024) applied to EPC-modelled total primary energy use intensity to align it with metered consumption. Calibrated against gas-heated, non-PV homes in England and Wales rated under SAP 2012; the current implementation extrapolates it to all properties (open question §15.14).
+_Avoid_: UCL adjustment, energy correction, metered correction
+
+**EPC Anomaly Flag**:
+A per-field indicator that a Property's value for an EPC field differs significantly from Comparable Properties; advisory only — surfaces in the UI to prompt user review, does not block modelling.
+_Avoid_: outlier, mismatch, divergence flag
+
+### Reference data
+
+**Fuel Rates**:
+The current per-fuel rate (pence/kWh) and standing charge used to compute a Property's bills; time-versioned and regional, refreshed from Ofgem's published caps via an ETL. The Smart Export Guarantee rate sits in the same set as `electricity_export`. Consumed by EPC Energy Derivation.
+_Avoid_: fuel prices (commodity prices, different concept), tariff, energy cost
+
+**Carbon Factors**:
+The per-fuel CO2 emission factor (kgCO2e/kWh) used to compute a Property's carbon emissions; time-versioned, refreshed from Defra's annual publication. Consumed by EPC Energy Derivation.
+_Avoid_: emission factors (ambiguous), CO2 rates
+
+### Outputs
+
+**Scenario**:
+A named portfolio-level retrofit plan, built by a user in the scenario-builder UI and persisted before any modelling fires; carries the overall goal (e.g. Increasing EPC), budget, exclusions, housing type, and an ordered list of Scenario Phases. The model is triggered against one or more Scenarios at once; each Scenario yields one Plan per Property.
+_Avoid_: project, batch, run-set
+
+**Scenario Phase**:
+One ordered step inside a Scenario, carrying a measure-type allowlist (e.g. "loft insulation and walls in phase 1; ASHP in phase 2"), an optional phase budget, and an optional phase target. A single-phase Scenario is one Scenario Phase with all measure types allowed and the full budget on it — there is no special-case path.
+_Avoid_: scenario stage, scenario step, tranche
+
+**Scenario Snapshot**:
+A frozen copy of a Scenario pinned at trigger time, keyed by (task, scenario); used by the modelling pipeline so mid-run edits to the live Scenario do not affect an in-flight job. Snapshots are read-only and may be garbage-collected after the task completes.
+_Avoid_: scenario version, frozen scenario, pinned scenario
+
+**Plan**:
+The per-Property output of one Scenario's modelling run; carries an ordered list of Plan Phases matching the Scenario's Phase shape. A Property modelled against N Scenarios in one trigger ends up with N Plans.
+_Avoid_: recommendation set, output, result
+
+**Plan Phase**:
+The per-Property output of one Scenario Phase: the Optimised Package selected for that phase, the ending state snapshot (the Property's SAP / kWh / bills after the package is applied), and any Rolled-over Options that flow as candidates into the next Plan Phase.
+_Avoid_: plan stage, plan step
+
+**Rolled-over Options**:
+Recommendations generated but not selected by the Optimiser in a given Plan Phase, that remain eligible as candidates in subsequent Plan Phases. Exact roll-over rule (automatic vs user-marked) is under design.
+_Avoid_: deferred measures, leftover recommendations
+
+**Recommendation**:
+A single proposed retrofit measure for a Property, with its cost, SAP impact, kWh savings, carbon savings, and parts list.
+_Avoid_: suggestion, option
+
+**Optimised Package**:
+The subset of a Property's Recommendations selected by the Optimiser Service for installation, chosen to satisfy the Scenario's goal subject to budget.
+_Avoid_: selected measures, default measures, optimal solution, recommended bundle
+
+**Measure Type**:
+The catalogue classification of a retrofit measure (e.g. `solar_pv`, `loft_insulation`, `ashp`); one or more Recommendations reference the same Measure Type with property-specific cost and impact.
+_Avoid_: measure (ambiguous), category
+
+### Address matching
+
+**Lexiscore**:
+A similarity score in [0, 1] between a User Address and a candidate EPC address; combines token overlap and character-level similarity.
+_Avoid_: score, match score, similarity
+
+**Lexirank**:
+Dense rank of candidates sorted by Lexiscore descending; rank 1 = best match.
+_Avoid_: rank, position
+
+**UPRN Candidate**:
+An EPC Search Result that is a plausible match for a given User Address, before scoring decides the winner.
+_Avoid_: match candidate, result
+
+**Score Threshold**:
+The minimum Lexiscore (currently 0.6) below which no match is returned even if a candidate exists.
+_Avoid_: minimum score, cutoff
+
+**Ambiguous Match**:
+A matching outcome where two or more candidates share Lexirank 1, making it impossible to select a unique winner.
+_Avoid_: tie, draw, duplicate
+
+**Best Match**:
+The single UPRN Candidate with Lexirank 1 that meets or exceeds the Score Threshold.
+_Avoid_: winner, top result
+
+### API and integration
+
+**EPC Search Result**:
+A lightweight record returned by the government domestic search endpoint — address lines, postcode, UPRN, band, and certificate number, but not full certificate data.
+_Avoid_: search row, EPC row, result
+
+**EPC Property Data**:
+The fully mapped domain object produced after fetching and parsing a complete EPC certificate; the schema the modelling pipeline operates against.
+_Avoid_: EPC data, certificate data, parsed EPC
+
+**Old EPC API**:
+The retired government API (`epc.opendatacommunities.org`) using HTTP Basic auth; decommissioned 30 May 2026.
+_Avoid_: legacy API
+
+**New EPC API**:
+The replacement government API (`api.get-energy-performance-data.communities.gov.uk`) using Bearer Token auth.
+_Avoid_: new API, current API
+
+**Bearer Token**:
+The auth credential required by the New EPC API; stored in the `EPC_AUTH_TOKEN` environment variable.
+_Avoid_: API key, auth token, secret
+
+## Relationships
+
+- A **Property** represents a single physical dwelling for modelling; identified by `(portfolio_id, UPRN)` or `(portfolio_id, landlord_property_id)`.
+- A **Property** has zero or more **EPCs** across time, exactly one **Effective EPC**, zero or one set of **Site Notes**, and zero or one set of **Landlord Overrides**.
+- An **EPC** belongs to exactly one **Property** and has one **Certificate Number**.
+- An **EPC** carries an **EPC Band** and is identifiable by its **Registration Date**; the most recent one is the current.
+- A **UPRN** identifies a physical dwelling permanently; it does not change when the property changes owner — but each portfolio gets its own **Property** keyed against it.
+- When a **Property** has both **Site Notes** and a public **EPC**, the newer of the two derives the **Effective EPC**. **Landlord Overrides** apply only when the **EPC** is the source — never when **Site Notes** are.
+- A Property's **Baseline Performance** holds two halves: **Lodged Performance** (the gov register's SAP / band / carbon / heat) and **Effective Performance** (what the modelling pipeline scored against). The two are equal unless **Rebaselining** fires.
+- **Rebaselining** produces **Effective Performance** by ML re-prediction when either (a) the Effective EPC was lodged under a pre-SAP10 schema, or (b) the Effective EPC's physical state diverges from the lodged EPC. **Lodged Performance** is never overwritten.
+- **EPC Energy Derivation** contributes the annual kWh, fuel split, and bills on every Property unconditionally, reading current **Fuel Rates** and **Carbon Factors** from their respective repos.
+- The **EPC Prediction Service** uses **Comparable Properties** for both gap-filling and producing **EPC Anomaly Flags**.
+- A **Scenario** carries one or more ordered **Scenario Phases**. Triggering the model against N Scenarios produces N **Plans** per Property; each Plan carries an ordered list of **Plan Phases** matching the Scenario's shape.
+- Each **Plan Phase** holds its **Optimised Package**, the ending state snapshot, and any **Rolled-over Options** that flow as candidates into the next Plan Phase. A single-phase Scenario is one Scenario Phase with all measure types allowed; the same machinery handles it.
+- A **Scenario Snapshot** is pinned at trigger time per (task, scenario) so mid-run edits to the live Scenario do not affect an in-flight modelling job.
+- A **Recommendation** references one **Measure Type** and carries property-specific cost and impact.
+- **Address Matching** uses a **User Address** and **Postcode** to find a **UPRN** by scoring **UPRN Candidates** from an EPC search. A **Lexirank** of 1 with no **Ambiguous Match** and a **Lexiscore** ≥ the **Score Threshold** produces a **Best Match**.
+
+## Example dialogue
+
+> **Dev:** "A landlord uploads a corrected boiler for one of their properties. What happens?"
+>
+> **Domain expert:** "That's a **Landlord Override** on the heating fields. Save it against the **Property**. The **Effective EPC** has changed, so **Rebaselining** runs to re-predict SAP / carbon / heat, and **EPC Energy Derivation** re-runs to update kWh / bills based on the new fuel deduction. With fresh **Baseline Performance** we regenerate **Recommendations**."
+
+> **Dev:** "What if the same Property also has Site Notes?"
+>
+> **Domain expert:** "**Site Notes** supersede the public **EPC**, so **Landlord Overrides** don't apply. We model from the **Site Notes** version of the **Effective EPC**. If the public **EPC** is newer than the **Site Notes**, that's the one exception — we use the newer one."
+
+> **Dev:** "After modelling we end up with a list of measures. Which ones get installed?"
+>
+> **Domain expert:** "The **Optimiser Service** picks the **Optimised Package** — a subset of **Recommendations** that hits the **Scenario** goal within budget. The rest stay in the **Plan** as alternatives the user can swap in."
+
+> **Dev:** "I'm looking at a property where the EPC says cavity walls but every other house on the street has solid. Is that a bug?"
+>
+> **Domain expert:** "That's an **EPC Anomaly Flag**. We compute it against the **Comparable Properties** for that postcode. It's advisory — the UI surfaces it and the landlord can apply a **Landlord Override** if it's wrong."
+
+> **Dev:** "The property card shows two SAP scores side by side. Why?"
+>
+> **Domain expert:** "Those are **Lodged Performance** and **Effective Performance**. **Lodged** is what the gov register says — the EPC was rated under SAP 2012. **Effective** is what we scored against — we ran **Rebaselining** to predict the SAP10-equivalent rating because the methodology changed. Both stay on the **Baseline Performance** so users can see what's on record and what we're modelling against."
+
+> **Dev:** "A landlord wants a 3-year retrofit plan — fabric work this year, heat pump next, solar after. How do we model that?"
+>
+> **Domain expert:** "Three **Scenario Phases** in one **Scenario**. Phase 1 allows fabric measures with this year's budget, phase 2 allows the heat pump with next year's budget, phase 3 allows solar. When we model, the **Optimiser Service** runs per phase against the rolling state — the heat pump is scored against the post-insulation property, not the original one. Each **Plan Phase** captures the **Optimised Package** plus the ending SAP / bills, and any **Rolled-over Options** that didn't make this phase's budget become candidates next phase."
+
+## Flagged ambiguities
+
+- **"property"** was historically warned against in favour of "dwelling"; that has been inverted. **Property** is now canonical for the Ara domain aggregate. Legacy code still uses "dwelling" in places — treat as alias.
+- **"energy assessment"** in the existing codebase (`energy_assessment_functions`, `energy_assessments_by_uprn`) refers to what is now canonically called **Site Notes**. New code uses **Site Notes**.
+- **"patch"** / `patch_epc` in the existing codebase has been merged into **Landlord Overrides**; the original concept is deprecated.
+- **"already_installed measures"** in the existing codebase is likely subsumed by **Landlord Overrides** ("we have a heat pump now" → override the heating fields). Final call deferred to implementation.
+- **"address"** appears as both the raw **User Address** (free-text) and a structured field on an **EPC Search Result** (normalised lines). Always qualify: "user address" vs "EPC address" or "address line 1".
+- **"score"** is used for `AddressMatch.score()` output, the `lexiscore` column, and informally. Prefer **Lexiscore** in domain discussions; reserve "score" for method-level code comments.
+- **"user_inputed_address"** in `backend/address2UPRN/main.py` is a misspelling and a synonym for **User Address** — the canonical term. New code should use `user_address`.
+- **"EPC"** is overloaded as both the document and the rating band letter. Use **EPC** for the document, **EPC Band** for the letter.
+- **"re-scoring"** has two meanings in the codebase — **Rebaselining** (re-predicting baseline performance after an EPC change) and post-optimisation measure re-prediction. Prefer **Rebaselining** for the former; for the latter, the **Optimiser Service** step does its own scoring without a special name.
+- **"phase"** appears in two unrelated contexts: as cut-over timeline language in the PRD ("Phase 0 — Status quo", "Phase 1 — Forced cut-over") and as a domain concept in **Scenario Phase** / **Plan Phase**. Only the latter is a glossary term; cut-over phases are project-management vocabulary that does not enter code.
+- **"stale"** appears in two senses: cache-freshness ("a Repo record is stale and the orchestrator should refetch") — a legitimate operational concept; and as loose shorthand for the EPC's recorded cost fields being unusable. The cost fields are not stale — they are pinned to the inspection-date fuel rates by design. Use "pinned to inspection date" or "pre-SAP10 schema" (whichever applies) instead.
diff --git a/UBIQUITOUS_LANGUAGE.md b/UBIQUITOUS_LANGUAGE.md
index 1765cbc8..66684925 100644
--- a/UBIQUITOUS_LANGUAGE.md
+++ b/UBIQUITOUS_LANGUAGE.md
@@ -1,78 +1,7 @@
 # Ubiquitous Language
 
-Domain terminology glossary for this project. Generated and maintained by the `/ubiquitous-language` Claude Code skill.
+This file has been **superseded by [CONTEXT.md](./CONTEXT.md)**.
 
-Invoke `/ubiquitous-language` in any session to extract new terms from the conversation, flag ambiguities, and update this file with canonical definitions.
+The project's domain glossary now lives at the repo root in `CONTEXT.md`, maintained by the `/grill-with-docs` skill (which replaced `/ubiquitous-language`).
 
----
-
-## Energy Performance Certificates
-
-| Term | Definition | Aliases to avoid |
-|------|------------|------------------|
-| **EPC** | An Energy Performance Certificate — a government-issued document rating a dwelling's energy efficiency from A (best) to G (worst). | "energy certificate", "energy report" |
-| **Certificate Number** | The unique identifier assigned to an EPC by the government registry. | "cert number", "EPC ID" |
-| **Registration Date** | The date an EPC was lodged with the government register; used to identify the most recent certificate for a property. | "assessment date", "submission date" |
-| **EPC Band** | A single letter A–G representing a property's current or potential energy efficiency rating. | "energy rating", "EPC grade", "EPC score" |
-| **Schema Type** | The versioned RdSAP or SAP schema that describes the structure of a certificate's raw data (e.g. `RdSAP-Schema-21.0.1`). | "schema version", "EPC format" |
-| **Domestic Certificate** | An EPC issued for a residential dwelling, as opposed to a commercial one. | "residential EPC", "home EPC" |
-
-## Properties and Addresses
-
-| Term | Definition | Aliases to avoid |
-|------|------------|------------------|
-| **UPRN** | Unique Property Reference Number — the government-issued permanent identifier for a physical address in the UK. | "property ID", "address ID", "code" |
-| **Postcode** | A UK postal code used to group nearby addresses; the primary search key for finding EPC records. | "zip code", "postal code" |
-| **User Address** | A free-text address string provided by a user or imported from a customer dataset, before any normalisation or matching. | "user input", "raw address", "user_inputed_address" |
-| **Dwelling** | A single residential unit that can hold an EPC — a house, flat, or maisonette. | "property", "unit", "home" |
-
-## Address Matching
-
-| Term | Definition | Aliases to avoid |
-|------|------------|------------------|
-| **Lexiscore** | A similarity score in [0, 1] between a user address and a candidate EPC address; combines token overlap and character-level similarity. | "score", "match score", "similarity" |
-| **Lexirank** | Dense rank of candidates sorted by lexiscore descending; rank 1 = best match. | "rank", "position" |
-| **UPRN Candidate** | An EPC search result that is a plausible match for a given user address, before scoring decides the winner. | "match candidate", "result" |
-| **Score Threshold** | The minimum lexiscore (currently 0.6) below which no match is returned even if a candidate exists. | "minimum score", "cutoff" |
-| **Ambiguous Match** | A matching outcome where two or more candidates share lexirank 1, making it impossible to select a unique winner. | "tie", "draw", "duplicate" |
-| **Best Match** | The single UPRN candidate with lexirank 1 that meets or exceeds the score threshold. | "winner", "top result" |
-
-## API and Integration
-
-| Term | Definition | Aliases to avoid |
-|------|------------|------------------|
-| **EPC Search Result** | A lightweight record returned by the government domestic search endpoint — contains address lines, postcode, UPRN, band, and certificate number but not the full certificate data. | "search row", "EPC row", "result" |
-| **EPC Property Data** | The fully mapped domain object produced after fetching and parsing a complete EPC certificate. | "EPC data", "certificate data", "parsed EPC" |
-| **Old EPC API** | The retired government API (`epc.opendatacommunities.org`) using HTTP Basic auth; decommissioned May 2026. | "legacy API" |
-| **New EPC API** | The replacement government API (`api.get-energy-performance-data.communities.gov.uk`) using Bearer token auth. | "new API", "current API" |
-| **Bearer Token** | The auth credential required by the new EPC API; stored in the `EPC_AUTH_TOKEN` environment variable. | "API key", "auth token", "secret" |
-
-## Relationships
-
-- An **EPC** belongs to exactly one **Dwelling** and has one **Certificate Number**.
-- A **Dwelling** may have multiple **EPCs** across time; the one with the most recent **Registration Date** is the current one.
-- A **UPRN** identifies a **Dwelling** permanently; it does not change when the property changes owner.
-- An **EPC Search Result** is a summary; it points to a full **EPC** via its **Certificate Number**.
-- **Address Matching** uses a **User Address** and **Postcode** to find a **UPRN** by scoring **UPRN Candidates** from an EPC search.
-- A **Lexirank** of 1 with no **Ambiguous Match** and a **Lexiscore** ≥ the **Score Threshold** produces a **Best Match**.
-
-## Example dialogue
-
-> **Dev:** "We have a user address and postcode. How do we find the UPRN?"
-
-> **Domain expert:** "Search the **New EPC API** by **Postcode** — you get back a list of **EPC Search Results** for that area. Each one has an address and a **UPRN**. Score each against the **User Address** using the **Lexiscore**. If the top **UPRN Candidate** scores above the **Score Threshold** and there's no **Ambiguous Match**, that's your **Best Match**."
-
-> **Dev:** "What if two results share the same address line 1?"
-
-> **Domain expert:** "That's an **Ambiguous Match** — two candidates at **Lexirank** 1. Fall back to scoring on the full address using all address lines joined together. If that still ties, return nothing."
-
-> **Dev:** "Once we have the best match, do we use the UPRN or fetch the full EPC?"
-
-> **Domain expert:** "Depends on what you need. The **EPC Search Result** gives you the **EPC Band** and **Certificate Number**. If you need energy efficiency detail, use the **Certificate Number** to fetch the full **EPC Property Data**."
-
-## Flagged ambiguities
-
-- **"address"** appears as both the raw **User Address** (free-text from customer data) and a structured field on an **EPC Search Result** (normalised address lines). Always qualify: "user address" vs "EPC address" or "address line 1".
-- **"score"** is used for the `AddressMatch.score()` function output, the `lexiscore` DataFrame column, and informally in conversation. Prefer **Lexiscore** in domain discussions; reserve "score" for method-level code comments.
-- **"user_inputed_address"** in `backend/address2UPRN/main.py` is a misspelling and a synonym for **User Address** — the canonical term. New code should use `user_address`.
-- **"EPC"** is overloaded as both the document (an Energy Performance Certificate) and the rating band letter. Use **EPC** for the document and **EPC Band** for the letter.
+If you arrived here from a link in `CLAUDE.md` or older docs, follow the link above. This file is kept only to preserve git history and may be removed once internal references are updated.
diff --git a/ara_backend_design.md b/ara_backend_design.md
new file mode 100644
index 00000000..de64cc1d
--- /dev/null
+++ b/ara_backend_design.md
@@ -0,0 +1,783 @@
+# ARA Backend Redesign — Design PRD
+
+**Status**: Draft for team review
+**Author**: Khalim Conn-Kowlessar (with Claude grill session)
+**Branch**: `ara-backend-design-prd`
+**Scope**: Service architecture + domain model + contracts for the new modelling backend. Linked sub-PRDs cover ML training pipeline, DB schema migration, and historical EPC re-mapping.
+
+---
+
+## 1. Context
+
+### 1.1 The forcing function
+
+The current modelling backend (`backend/engine/engine.py` — `model_engine`, 1331 LOC) was built as an MVP. It is:
+
+- **Tightly coupled** to a specific gov EPC API that is being **decommissioned on 30 May 2026** (~17 days from today).
+- **A monolith** — one async function reaches into DB modules, HTTP clients, ML lambdas, S3, and queue infrastructure directly.
+- **Bottlenecked on a single person** — Khalim is the only contributor able to safely modify the engine because no one else can predict the blast radius of a change.
+- **Already returning erroneous data** from the old API (clients are aware). The replacement API is partially built (`backend/epc_client/epc_client_service.py`) on the current feature branch.
+
+### 1.2 What needs to change
+
+Beyond just swapping API clients, this is the moment to **rebuild the backend into a production-grade, contribute-able codebase**, with:
+
+- A clear domain model rooted in the new EPC schema (`EpcPropertyData`).
+- Service boundaries that other team members can read, fix, and extend without needing the entire mental model.
+- Repository-mediated persistence so business logic can be tested without spinning up a database.
+- A separation between **data fetching** (slow, IO-heavy, external) and **modelling** (deterministic, fast, internal).
+- Baseline kWh and bills derived deterministically from the Effective EPC (SAP physics + UCL correction + per-fuel rates from a refreshable repo) rather than from the EPC's recorded cost fields (which use fuel rates pinned to the inspection date) or from an ML kWh prediction.
+
+### 1.3 Out of scope for this PRD
+
+These ship as **linked sub-PRDs**:
+
+- **Sub-PRD (ii) — ML training pipeline** (autogluon repo + parquet generation in this repo + scoring model retraining for the new EPC schema)
+- **Sub-PRD (iii) — DB schema migration** (new tables: `site_notes`, `landlord_overrides`, EPC cache, parallel write strategy)
+- **Sub-PRD (iv) — Historical EPC re-mapping** (one-off + ongoing batch job: legacy stored EPCs → new `EpcPropertyData` shape)
+
+The contracts this PRD defines are the inputs each sub-PRD consumes.
+
+---
+
+## 2. Goals and non-goals
+
+### 2.1 Goals
+
+1. **Survive the 30 May API shutdown** — even if it means a brief degraded window, modelling continues to function against the new gov EPC API.
+2. **Decouple data fetching from modelling** — modelling never makes external HTTP calls; it reads everything from repositories.
+3. **Make every service unit-testable against fakes** — no test needs a real DB, a real gov API, or a real ML lambda to verify business logic.
+4. **Establish a single `Property` aggregate root** as the domain centrepiece; all 9 modelling concerns are slices of one aggregate.
+5. **Versioned ML data contract** — the EPC-to-features transform is the single shared artifact between this repo and the autogluon repo.
+6. **Per-property UI surfaces** — fetched data can be shown to users for review and override **before** modelling runs; modelling is triggered separately. This will enable a landlord facing version of the product where we fetch the open data, present back to the user for review and then perform the modelling.
+
+### 2.2 Non-goals
+
+- Multi-region deploy, GDPR-class data minimisation work, or compliance reporting — separate workstreams.
+- Replacement of the front-end. The new APIs preserve enough of the existing response shape that the FE migrates incrementally.
+- Removing pandas. The ML transform output is a parquet-friendly DataFrame-like shape; that stays.
+- A workflow engine (Prefect / Temporal / Airflow). Coordinator-class orchestration plus the existing SQS-fanout pattern is sufficient at the scale we serve.
+
+---
+
+## 3. Cutover plan
+
+Forced cut-over, driven by the 30 May deadline. There is no strangler period because the Old EPC API death takes `model_engine` with it.
+
+### 3.1 Phase 0 — Status quo (now → 30 May)
+
+- `model_engine` keeps running against the Old EPC API for as long as it works.
+- Build of the 9 new services starts **this week**, in parallel to the old engine continuing to serve traffic.
+- The new `ara/` package lives alongside `backend/` but is not yet wired into any production endpoint.
+- Goal: keep the lights on until the API dies; start the build immediately so the dark period is short.
+
+### 3.2 Phase 1 — Forced cut-over (30 May onwards)
+
+- On 30 May the Old EPC API dies; `model_engine` ceases to function for any new modelling run.
+- Some downtime is expected and accepted. Clients are aware.
+- Modelling resumes when the new pipeline is ready end-to-end. Remains to be decided if we have a per-portfolio flag, purely for the front end to reference old tables where necessary. No parallel pipelines, no traffic split — the new pipeline is the only pipeline.
+- **Calico** and **Hyde** are the first live clients onto the new pipeline in June.
+- `model_engine`, `SearchEpc`, the legacy `Property`, and surrounding modules in `backend/` are deleted once the new pipeline is serving all traffic.
+
+### 3.3 What is *not* done
+
+- No strangler — there is nothing to strangle once the Old EPC API dies on 30 May.
+- No parallel-shadow run — would double compute and require diff tooling we don't have, while the old engine is already known to return bad data so diffs would be noise.
+- TBC per-portfolio feature flag. Without this, the cut-over is all-or-nothing. All old portfolios are broken.
+
+---
+
+## 4. Architecture overview
+
+```
+┌─────────────────────────────────────────────────────────────────────┐
+│  Trigger endpoint(s)                                                │
+│  (one or two — see §4.5; deferred decision)                         │
+└───────────┬──────────────────────────────────────────┬──────────────┘
+            │                                          │
+            ▼                                          ▼
+   ┌─────────────────┐                       ┌─────────────────┐
+   │ IngestionPipe   │   SQS, batches of N   │ ModellingPipe   │
+   │ -----------     │ ◄─────────────────────│ -----------     │
+   │ Fetchers run    │                       │ Reads via Repos │
+   │ Persist via     │                       │ Calls Services  │
+   │ Repos           │                       │ ML predictions  │
+   └────────┬────────┘                       └────────┬────────┘
+            │                                         │
+            └───────────────► Repos ◄─────────────────┘
+                                  │
+                                  ▼
+                        ┌──────────────────┐
+                        │  Postgres tables │
+                        │  (property,      │
+                        │   epc_cache,     │
+                        │   site_notes,    │
+                        │   landlord_      │
+                        │   overrides,     │
+                        │   plans, etc.)   │
+                        └──────────────────┘
+
+  ┌──────────────────────────┐
+  │  RefreshOrchestrator     │  triggers Ingestion → diff → conditionally Modelling
+  └──────────────────────────┘
+```
+
+### 4.1 Class taxonomy
+
+Every class falls into exactly one of four roles:
+
+| Role | Job | Examples |
+|------|-----|----------|
+| **Fetchers** | Call external APIs. Return raw response data. No DB. | `EpcClientService`, `GeospatialFetcher`, `SolarFetcher`, `SiteNotesIngester` |
+| **Repos** | Persist and load domain aggregates. SQL hidden inside. No external IO. | `PropertyRepo`, `EpcCacheRepo`, `SiteNotesRepo`, `LandlordOverridesRepo`, `RecommendationsRepo`, `GenericDataRepo`, `SubtaskRepo` |
+| **Services** | Business logic over domain objects. No external IO except via injected Fetchers / Repos. | `EpcRemappingService`, `EpcPredictionService`, `EpcEnergyDerivationService`, `KwhImpactService`, `ImpactPredictionService`, `RecommendationService`, `OptimiserService`, `FeatureBuilder`, `ResultsPersister` |
+| **Orchestrators** | Compose Fetchers + Services + Repos to produce an end-to-end result. The only place where step order is encoded. | `IngestionPipeline`, `ModellingPipeline`, `RefreshOrchestrator` |
+
+This taxonomy is **strict**. A class that fetches *and* persists belongs in the Service layer and depends on a Fetcher + a Repo. No back-channels.
+
+### 4.2 Two pipelines, one direction
+
+Data flows one way only: **Ingestion → Repos → Modelling**.
+
+- **Ingestion** writes; never calls Modelling.
+- **Modelling** reads; never calls Fetchers.
+
+If Modelling needs fresh data, it returns "stale" and the caller decides whether to ingest first. This makes Modelling a pure function of repository state, which is the property that makes it reproducible, debuggable, and testable.
+
+### 4.3 RefreshOrchestrator
+
+Sits above both pipelines. Job:
+
+1. Trigger `IngestionPipeline` for a portfolio.
+2. After ingestion completes, ask repos: "did anything change vs the last modelled snapshot?"
+3. If yes, trigger `ModellingPipeline`. If no, return early.
+
+This avoids re-modelling 100k properties when only 200 had refreshed EPC data.
+
+### 4.4 SQS fanout (preserved from current architecture)
+
+The existing `trigger_plan_entrypoint` SQS-chunking pattern is kept. Both pipelines fan out per batch of ~30–100 properties (tuneable). Each consumer runs one batch end-to-end through the relevant pipeline.
+
+UPRN partitioning: the trigger endpoint groups UPRNs by **locality** (postcode prefix / UPRN range) before chunking, so each batch maximises shared upstream fetches (one geospatial-range pull serves all 30 properties in the batch).
+
+### 4.5 One endpoint for v1
+
+For Phase 1 we ship **one trigger endpoint** that internally chains Ingestion → Modelling via `RefreshOrchestrator`. This matches the current FastAPI-fronted Lambda pattern (the FastAPI app in `services/<svc>/` is a thin entrypoint that invokes the modelling Lambda).
+
+We can split into two endpoints later (refresh-only vs model-only) once a real workflow demands it — e.g. a Landlord-Override edit that should re-model without re-fetching open data. The class taxonomy and `RefreshOrchestrator` boundary allow this split without re-architecting.
+
+### 4.6 Trigger contract
+
+The trigger payload is reduced compared to today's `PlanTriggerRequest` ([backend/app/plan/schemas.py:98](../../backend/app/plan/schemas.py#L98)) — most of what's currently in the request body moves into the persisted `Scenario` aggregate.
+
+```python
+class ModelTriggerRequest(BaseModel):
+    portfolio_id: UUID
+    property_ids: list[UUID] | S3Ref           # inline up to ~10k, S3 ref above
+    scenario_ids: list[UUID]                   # 1+; resolved + pinned to ScenarioSnapshot at fan-out
+    task_id: UUID
+    subtask_id: UUID                           # SQS state machine, preserved from today
+```
+
+Everything that used to ride at the top level dies or moves:
+
+- `goal`, `budget`, `goal_value`, `inclusions`, `exclusions`, `required_measures`, `enforce_fabric_first`, `scenario_name`, `housing_type` → into `Scenario` / `ScenarioPhase`.
+- `patches_file_path`, `already_installed_file_path`, `non_invasive_recommendations_file_path` → gone; Landlord Overrides covers all three.
+- `valuation_file_path` → gone; `ValuationService` derives it.
+- `ashp_cop`, `default_u_values` → `HeatingSystemAssumptionsRepo` / global config; not per-trigger.
+- `multi_plan` → gone; `scenario_ids: list[...]` handles N runs natively (one Plan per scenario per property).
+- `event_type`, `epc_certificate_number`, `lmk_key`, `file_format`, `sheet_name`, `index_start`/`index_end`, `file_type` → ingestion-side concerns; if needed, ride on a separate ingestion-trigger payload.
+
+**Scenario snapshotting**: at fan-out time `RefreshOrchestrator` reads each requested `Scenario`, writes a `ScenarioSnapshot` keyed by `(task_id, scenario_id)`, and per-batch SQS messages reference the snapshot. Mid-run edits to the live `Scenario` do not affect an in-flight modelling job. Snapshots are read-only and can be garbage-collected after the task completes.
+
+---
+
+## 5. Domain model
+
+### 5.1 Aggregate root: `Property`
+
+`Property` is the centrepiece. Every service operates on one or more `Property` instances. Every repo writes one slice of `Property`. The aggregate carries all state for a single property's modelling run.
+
+```python
+@dataclass
+class PropertyIdentity:
+    portfolio_id: UUID
+    uprn: Optional[int]
+    landlord_property_id: Optional[str]
+    address: AddressLines
+    postcode: str
+
+@dataclass
+class Property:
+    identity: PropertyIdentity
+
+    # --- Source data — modelling path is determined by which of these are set ---
+    epc: Optional[EpcPropertyData]              # from gov API (or remapped historical)
+    site_notes: Optional[SiteNotes]             # our own survey; supersedes EPC when present
+    landlord_overrides: Optional[LandlordOverrides]   # sparse, only meaningful when epc set
+
+    # --- Enrichments ---
+    geospatial: Optional[GeoSpatial]
+    solar: Optional[SolarPotential]
+    epc_anomaly_flags: Optional[EpcAnomalyFlags]  # from EpcPredictionService vs neighbours
+
+    # --- Modelling outputs ---
+    baseline_performance: Optional[BaselinePerformance]   # carries lodged + effective pair; see §5.4
+    recommendations: list[Recommendation]
+    impact_predictions: Optional[ImpactPredictions]
+    plans: list[Plan]                                     # one per Scenario the property was modelled against
+
+    # --- Derived ---
+    @property
+    def source_path(self) -> Literal["site_notes", "epc_with_overlay"]: ...
+
+    @property
+    def effective_epc(self) -> EpcPropertyData:
+        """The EPC the modelling pipeline actually scores against."""
+        ...
+```
+
+### 5.2 `Properties` collection
+
+A first-class iterable, so batch operations are obvious:
+
+```python
+@dataclass
+class Properties:
+    items: list[Property]
+
+    def __iter__(self) -> Iterator[Property]: ...
+    def __len__(self) -> int: ...
+    def filter(self, pred: Callable[[Property], bool]) -> "Properties": ...
+    def map(self, fn: Callable[[Property], Property]) -> "Properties": ...
+    def with_landlord_overrides(self) -> "Properties": ...
+```
+
+Services typically take and return `Properties`, not lists.
+
+### 5.3 Other aggregates
+
+| Aggregate | Owns | Repo |
+|---|---|---|
+| `Property` | property identity, epc, site_notes, landlord_overrides, enrichments, modelling results | `PropertyRepo` |
+| `Plan` | per-property modelling output for one Scenario: ordered `phases: list[PlanPhase]`, each carrying its `OptimisedPackage`, ending state snapshot, and rolled-over options | `RecommendationsRepo` |
+| `Scenario` | portfolio-wide scenario metadata (goal, budget, exclusions, housing type) plus ordered `phases: list[ScenarioPhase]`; each phase carries `measure_types_allowed`, phase budget, phase target | `RecommendationsRepo` |
+| `ScenarioSnapshot` | frozen copy of a `Scenario` pinned at trigger time, keyed by `(task_id, scenario_id)`, so mid-run scenario edits don't affect an in-flight modelling job | `RecommendationsRepo` |
+| `Subtask` / `Task` | SQS fanout state | `SubtaskRepo` |
+| `EpcCache` | gov-API responses keyed by UPRN, with freshness/TTL | `EpcCacheRepo` |
+| `GenericData` | UPRN-range geospatial, postcode lookups, shared static data | `GenericDataRepo` |
+| `FuelRates` | time-versioned, region-aware per-fuel rates (pence/kWh), standing charges, SEG export rate, calorific values | `FuelRatesRepo` |
+| `CarbonFactors` | time-versioned per-fuel CO2 emission factors (kgCO2e/kWh); Defra publishes annually | `CarbonFactorsRepo` |
+| `HeatingSystemAssumptions` | boiler efficiency tables, ASHP/GSHP COPs, solar-thermal coverage proportion; per-property physical assumptions, not fuel-market data | `HeatingSystemAssumptionsRepo` |
+
+Aggregates are loaded **whole** — never half a `Property`. If a slice is too large to load eagerly (e.g. recommendation history), it lives in a separate aggregate.
+
+A single-phase Scenario is `phases: [<one ScenarioPhase>]` with all measure types allowed and the full budget on it — no special-case path through the pipeline.
+
+### 5.4 `BaselinePerformance` carries lodged + effective
+
+```python
+@dataclass
+class BaselinePerformance:
+    # As-lodged: unmodified EPC fields (or Site Notes' recorded values where Site Notes are the source).
+    lodged_sap: int
+    lodged_band: Epc
+    lodged_carbon: float
+    lodged_heat_demand: float
+
+    # Effective: what the modelling pipeline actually scored against.
+    # Equals lodged when neither rebaselining trigger fires; equals ML output when rebaselined.
+    effective_sap: int
+    effective_band: Epc
+    effective_carbon: float
+    effective_heat_demand: float
+
+    # kWh / fuel split / bills — always derived deterministically from the Effective EPC by
+    # EpcEnergyDerivationService (SAP physics + UCL correction + FuelRates lookup).
+    # Lodged kWh / bills are not stored separately — the EPC's recorded cost fields are pinned to
+    # inspection-date fuel rates, so we always re-derive bills from current FuelRates regardless.
+    annual_kwh: float
+    fuel_split: dict[Fuel, float]
+    annual_bills: dict[Fuel, float]
+
+    rebaselined: bool
+    rebaseline_reason: Optional[Literal["pre_sap10", "physical_state_changed", "both"]]
+```
+
+The pair lets the FE show "lodged rating vs SAP10-equivalent rebaselined rating" side by side without a separate query. Both fields are always populated; when no rebaselining trigger fires, `effective_*` equals `lodged_*`.
+
+---
+
+## 6. Source-of-truth and overlay precedence
+
+There are exactly **two modelling paths**. The `Property.source_path` property selects.
+
+### 6.1 Path 1 — Site notes
+
+If a `Property` has `site_notes` and they are newer than any available EPC (or no EPC exists), site notes are the **complete** source of truth:
+
+- `effective_epc` = `site_notes.to_epc_property_data()`.
+- EPC fields not covered by site notes — **none expected**. Site notes are committed to being a full-coverage survey. Treat any gap as a survey-quality bug, not a fallback signal.
+- `LandlordOverrides` are not applicable in Path 1 (the survey supersedes).
+
+### 6.2 Path 2 — EPC with landlord overlay
+
+If a `Property` has no site notes (or the EPC is newer):
+
+- `effective_epc` = `epc` with `landlord_overrides` applied as a sparse field-level overlay (`landlord > epc`).
+- `LandlordOverrides` are sparse: each row represents one corrected field. Schema TBD at implementation time; assume flat input via Excel/CSV for v1, with a flag to revisit shape after first customer onboarding.
+
+### 6.3 Recency tie-break
+
+When a property has **both** site notes and a public EPC, the newer of the two wins. Rationale: a recent EPC may reflect retrofit work done after our survey; conversely a recent survey reflects on-site observations the EPC cannot capture.
+
+This tie-break is implemented in `Property.source_path` and may be tuned later (e.g. always prefer surveys regardless of date, or per-portfolio policy).
+
+### 6.4 Rebaselining trigger
+
+ML re-predicts SAP / carbon / heat when **either** of these holds:
+
+1. **Pre-SAP10 schema** — `effective_epc.sap_version < 10.0`. The EPC was rated under SAP 2012 (or earlier) and we want a SAP10-equivalent baseline so all properties are scored against the same model version. Canonical signal is the `sap_version: float` field; fall back to `schema_type` string, then to `lodgement_date` if both are absent. Site Notes are assumed SAP10 by construction (PasHub / ECMK produce them now) — Path 1 typically doesn't trigger this leg.
+2. **Physical state changed** — `effective_epc` differs from the lodged EPC's physical fields (walls / heating / windows / etc.). Triggered by Landlord Overrides changing physical state, or by Site Notes that contradict the lodged EPC.
+
+When triggered, a single ML call re-predicts SAP/carbon/heat with the current Effective EPC state as input. Both reasons can fire together; the prediction is still one call.
+
+kWh is **always** re-derived via `EpcEnergyDerivationService` — even when no ML rebaseline runs — because the EPC's recorded cost fields use fuel rates pinned to the inspection date, and current rates from `FuelRatesRepo` are what we want to surface to users.
+
+The diff mechanism for "physical state changed" (content hash, dirty flag, etc.) is an implementation detail; start with a content hash of the physical-state subset of `EpcPropertyData` stored alongside the previous run.
+
+### 6.5 Deprecated concepts
+
+- **Patches** (`patch_epc`) — removed. Functionality subsumed by `LandlordOverrides`.
+- **Already-installed measures** — likely subsumed by `LandlordOverrides` ("we have a heat pump now" → override heating fields). Confirmed at implementation time.
+- **Non-invasive recommendations** — TBD whether this concept survives; not blocking.
+
+---
+
+## 7. Persistence: repositories and unit of work
+
+### 7.1 What a repository is
+
+A repository owns the SQL for one aggregate. Nothing else writes SQL for that aggregate. Callers see only domain objects.
+
+```python
+class PropertyRepo(Protocol):
+    def get(self, identity: PropertyIdentity) -> Optional[Property]: ...
+    def bulk_save(self, uow: UnitOfWork, properties: Properties) -> None: ...
+    def find_by_portfolio(self, portfolio_id: UUID) -> Properties: ...
+    def find_stale(self, portfolio_id: UUID, threshold: timedelta) -> Properties: ...
+```
+
+Implementation references current `db_funcs.*` modules during phase 0 to avoid a big-bang SQL rewrite, but the interface is fixed.
+
+### 7.2 Unit of Work
+
+Multi-table writes inside a single aggregate, or across aggregates that share a transaction (e.g. property + plan + recommendations) go through a `UnitOfWork`:
+
+```python
+with self.uow_factory() as uow:
+    self.property_repo.bulk_save(uow, properties)
+    self.recommendations_repo.bulk_save(uow, plans)
+    uow.commit()
+```
+
+UoW owns the SQLAlchemy session lifecycle. Repos use the session passed in via the UoW. Outside a UoW, repos use a short-lived read session.
+
+### 7.3 Repository inventory
+
+| Repo | Tables it owns |
+|------|----------------|
+| `PropertyRepo` | `properties`, `property_details_epc`, `property_spatial` |
+| `EpcCacheRepo` | new table: `epc_api_cache` (TTL, raw API response, mapped `EpcPropertyData`) |
+| `SiteNotesRepo` | new table: `site_notes` (replaces current `energy_assessments`) |
+| `LandlordOverridesRepo` | new table: `landlord_overrides` (sparse, per-field rows for audit) |
+| `RecommendationsRepo` | `plans`, `plan_phases`, `recommendations`, `recommendation_parts`, `scenarios`, `scenario_phases`, `scenario_snapshots` |
+| `GenericDataRepo` | new table or S3-backed: UPRN-range geospatial + postcode-keyed shared static data |
+| `FuelRatesRepo` | new table: `fuel_rates` — `(fuel_type, rate_pence_per_kwh, standing_charge_pence_per_day, calorific_value_kwh_per_unit, unit, effective_from, effective_to, region_code Optional, source)`. SEG export rate is a row with `fuel_type = 'electricity_export'`. |
+| `CarbonFactorsRepo` | new table: `carbon_factors` — `(fuel_type, kgco2e_per_kwh, effective_from, effective_to, source)`. Defra publishes annually. |
+| `HeatingSystemAssumptionsRepo` | new table(s): boiler efficiency, ASHP/GSHP COP, solar-thermal coverage proportion. Static-ish, manual refresh. |
+| `SubtaskRepo` | `tasks`, `subtasks` (existing) |
+
+DDL migrations are scoped to sub-PRD (iii).
+
+### 7.4 Fakes
+
+For tests, each repo has a `FakeXRepo` companion backed by a dict. Service unit tests inject fakes. No DB required.
+
+---
+
+## 8. ML contract
+
+### 8.1 Where ML lives
+
+| Concern | Owner |
+|---|---|
+| Defining the EPC → features transform | **This repo** (`ara.domain.ml.EpcMlTransform`) |
+| Loading data, applying transform, writing training parquet to S3 | **This repo** (sub-PRD (ii) batch job) |
+| Training, hyperparameter search, deployment | **Autogluon repo** |
+| Scoring at modelling time | **This repo** (`FeatureBuilder` calls `EpcMlTransform`, sends DataFrame to deployed lambda) |
+
+The autogluon repo is intentionally **dumb**: it consumes parquet, knows which column is the target, knows which columns to ignore. It has no EPC semantics.
+
+### 8.2 `EpcMlTransform`
+
+A separate class (not a method on `EpcPropertyData`), because:
+
+- The data class stays clean of training-infrastructure concerns.
+- Versioned transforms (`EpcMlTransformV1`, `EpcMlTransformV2`) swap easily.
+- Future need: injection of normalisation stats from the training set is straightforward on a class, awkward on a dataclass.
+
+```python
+class EpcMlTransform:
+    VERSION: str = "1.0.0"  # semver
+
+    def to_row(self, epc: EpcPropertyData) -> dict[str, Any]: ...
+    def to_rows(self, properties: Properties) -> pd.DataFrame: ...
+    def schema(self) -> dict[str, type]: ...  # for parquet emission + validation
+```
+
+The interesting work — flattening `List[SapWindow]`, `List[SapBuildingPart]` into fixed-width columns — lives inside this class. Domain decisions (top-N windows, aggregate roofs, etc.) are encoded here and reviewed by Khalim. Sub-PRD (ii) goes into detail.
+
+### 8.3 Versioning
+
+- Transform class is **semver-tagged** (`VERSION = "1.0.0"`).
+- S3 path for training parquet includes the version: `s3://.../training/v1.0.0/...`.
+- Deployed scoring lambda is tagged with the transform version it was trained against.
+- Modelling pipeline asserts at startup that its `EpcMlTransform.VERSION` matches the deployed lambda's tag; mismatch = hard fail at deploy time.
+
+Bump major when removing or renaming columns. Bump minor when adding optional columns (older models still scoreable; new models can be trained against new fields).
+
+### 8.4 ML model families
+
+Both ML calls (rebaselining + per-measure impact) use the same `EpcMlTransform`:
+
+| Service | Lambda | Target |
+|---|---|---|
+| `RebaseliningService` (S4b) | `baseline-models-*` | SAP / carbon / heat demand under the current Effective EPC state (SAP10-equivalent) |
+| `ImpactPredictionService` (S6) | `impact-models-*` | SAP / carbon / heat demand impact per measure (and per battery option, using new EPC battery fields) |
+
+Annual kWh and bills are never an ML target — derived deterministically by `EpcEnergyDerivationService` (S4a). Recommendation kWh delta is derived from the SAP delta predicted by S6 plus heating-system fuel + COP, not via a separate ML call.
+
+The two families are trained against the same input feature schema; only target columns differ. Sub-PRD (ii) handles training-time details.
+
+---
+
+## 9. Service catalogue
+
+The classes below implement the pipeline end-to-end. Detailed signatures are deliberately left for implementers — this PRD documents purpose, dependencies, and rough shape; per-service grill sessions produce the contracts.
+
+**Out of the legacy engine** (deleted, not migrated): `PredictionMatrix` (debug-only, moves to test fixtures), `extract_portfolio_aggregation_data` (dead code, FE aggregates dynamically per §10), inspections plumbing (`inspections_map` is initialised but never populated in the current engine), patches / `already_installed` / `non_invasive_recommendations` (subsumed by Landlord Overrides), ECO4 / WHLG funding integration (`get_funding_data` and `optimise_with_scenarios`' funding paths), the pre-recommendation kWh ML lambda (`KWH_MODEL_PREFIXES`), and floor-count / heat-loss-perimeter estimation from geospatial (now on `EpcPropertyData`). Address matching (`address2UPRN`) lives as a separate service, not inside `EpcClientService`.
+
+### 9.1 Fetchers (called by `IngestionPipeline`)
+
+| # | Class | Purpose | Dependencies |
+|---|---|---|---|
+| F1 | `EpcClientService` | Fetches EPCs from new gov API. Already exists at `backend/epc_client/`. Scope narrows compared to current `SearchEpc` — address matching (`address2uprn`) and OS API estimation are not its concern. | httpx |
+| F2 | `GeospatialFetcher` | Fetches UPRN-range geospatial data. Replaces `OpenUprnClient`. **Floor count and heat-loss perimeter estimation are no longer needed** — both are now on `EpcPropertyData` directly (`number_of_storeys`, `SapFloorDimension.heat_loss_perimeter_m`). Scope reduces to building geometry and postcode-area context. | S3 / Ordnance Survey API |
+| F3 | `SolarFetcher` | Wraps Google Solar API; building-level + unit-level scenes. | Google Solar API |
+| F4 | `SiteNotesIngester` | Loads site notes from Excel uploads / structured input. Persists via `SiteNotesRepo`. | S3, repo |
+| F5 | `FuelRatesFetcher` | Scheduled ETL — scrapes Ofgem regional caps and per-fuel rates, writes timeseries rows to `FuelRatesRepo`. Manual CSV upload fallback for off-cycle corrections. | Ofgem feed, repo |
+| F6 | `CarbonFactorsFetcher` | Same shape as F5 against Defra's annual CO2 factor publication. | Defra feed, repo |
+
+### 9.2 Domain services (called by `ModellingPipeline`)
+
+| # | Class | Original-list # | Purpose | Reads | Writes |
+|---|---|---|---|---|---|
+| S1 | `EpcRemappingService` | 4 | Re-map legacy / historical EPCs into new `EpcPropertyData` shape. | `EpcCacheRepo` | `EpcCacheRepo` (mapped column) |
+| S2 | `EpcPredictionService` | 3 | For every property: produce predicted EPC + per-field anomaly flags vs neighbours. Used both for gap-fill (Path 2 if EPC missing) and UI surfacing. | `EpcCacheRepo`, `GenericDataRepo` | — |
+| S3 | `FeatureBuilder` | (new) | Wraps `EpcMlTransform`. Converts `Properties` → scoring DataFrame. | — | — |
+| S4a | `EpcEnergyDerivationService` | (new) | Derives annual kWh + fuel split + bills from the Effective EPC. Deterministic, no ML. Pipeline: (1) source regulated PEUI — either from `energy_consumption_current × floor_area` when EPC field present and no physical override, or from SAP physics (heat demand × area + SAP hot-water + SAP lighting) for Site Notes / overridden cases; (2) add appliance + cooking via SAP Appendix L formulas (port of [`AnnualBillSavings.estimate_appliances_energy_use`](../../backend/ml_models/AnnualBillSavings.py)); (3) apply UCL per-band correction (Few et al. 2023, Table 3), keyed on the **post-state Effective EPC's band** — not the lodged band; (4) decompose total PEUI into end-use shares via SAP-physics proportions; (5) primary→delivered per fuel using SAP primary factors; (6) bills = delivered kWh per fuel × current rate from `FuelRatesRepo` + standing charges + SEG credits. CO2 emissions from `CarbonFactorsRepo`. | `FuelRatesRepo`, `CarbonFactorsRepo`, `HeatingSystemAssumptionsRepo` | — |
+| S4b | `RebaseliningService` | (new, partial overlap with old "rebaselining" logic) | Triggered by §6.4 conditions (pre-SAP10 schema **or** physical state changed). Calls SAP/carbon/heat ML lambdas to produce SAP10-equivalent baseline against the current Effective EPC state. Both `BaselinePerformance.lodged_*` and `effective_*` are populated downstream — pair is always stored, equal when not rebaselined. kWh is re-derived via S4a, not ML. | `FeatureBuilder` | — |
+| S5 | `RecommendationService` | 6 | Generates per-property recommendations against the current rolling Effective EPC. Invoked **once per (scenario × phase)** — filters candidates to the phase's `measure_types_allowed`, returns candidates eligible against the post-prior-phase state. Replaces current `Recommendations` (1383 LOC). | `MaterialsRepo` | — |
+| S6 | `ImpactPredictionService` | 7 | Calls SAP / carbon / heat impact ML lambda for **every** candidate recommendation (FE displays all options to user). Invoked per (scenario × phase) with the rolling state's feature vector. Recommendation kWh delta is derived deterministically from SAP delta + heating-system fuel/COP, not from a separate ML call. Battery impact uses the new EPC battery fields (`energy_pv_battery_count`, `energy_pv_battery_capacity`) as ML inputs — the deterministic `BatterySAPScorer` from the legacy engine is replaced by ML prediction. | `FeatureBuilder` | — |
+| S7 | `OptimiserService` | 8 | Per-phase optimisation against rolling state. Reads `PlanPhase.state_at_end[n-1]` to honour cross-phase constraints (fabric-first, heat-pump-needs-insulation, ventilation). Wraps current `CostOptimiser` / `GainOptimiser` / `optimise_with_scenarios` minus the dead ECO-funding paths. Unselected candidates roll into phase n+1's candidate pool (auto vs user-marked TBD, §15). | — | — |
+| S8 | `ValuationService` | — | Estimates per-property valuation (current + post-retrofit) from academic-paper-based regression on EPC change, property type, region. Improvement on the existing `PropertyValuation.estimate` code — exact shape deferred to per-service grill. | — | — |
+| S9 | `ResultsPersister` | 9 | Final step: writes Plan (with `phases[]`) + Recommendations + Property updates via repos under one UoW, per scenario. | — | All write repos |
+
+### 9.3 Orchestrators
+
+| # | Class | Purpose |
+|---|---|---|
+| O1 | `IngestionPipeline` | Per-batch SQS consumer. Calls F1–F4, persists via repos. |
+| O2 | `ModellingPipeline` | Per-batch SQS consumer. Reads from repos, runs S1→S8 in order, ends with persistence. |
+| O3 | `RefreshOrchestrator` | Top-level: triggers Ingestion → diff → optionally Modelling. |
+
+### 9.4 `ModellingPipeline` step order
+
+For each `Property` in the batch, against each pinned `ScenarioSnapshot` from the trigger payload:
+
+```
+Per-property setup (runs once regardless of scenario count):
+  1.  PropertyRepo.get()  →  Property (epc, site_notes, overrides, geospatial, solar)
+  2.  EpcRemappingService — if epc is in legacy schema, upgrade to current
+  3.  EpcPredictionService — predicted EPC + per-field anomaly flags (always runs)
+  4.  Compute Property.effective_epc (path-1 or path-2)
+  5.  RebaseliningService — IF §6.4 conditions hold (pre-SAP10 OR physical state changed),
+                            re-predict SAP/carbon/heat via ML against the Effective EPC state.
+                            Populate BaselinePerformance.lodged_* + effective_*.
+  6.  EpcEnergyDerivationService — SAP-physics + UCL (post-state band) + FuelRates → kWh, fuel split, bills.
+
+Per-scenario loop:
+  Per-phase loop (in scenario phase order):
+    7.  RecommendationService — generate candidate measures, restricted to phase's measure_types_allowed,
+                                 against the rolling Effective EPC state (baseline for phase 1; updated for phase 2+).
+    8.  ImpactPredictionService — predict SAP/carbon/heat impact for those candidates, ML scored against
+                                   the rolling state's feature vector. All candidates scored (FE shows options).
+    9.  OptimiserService — select package within phase budget + phase goal. Reads earlier-phase state to honour
+                            cross-phase constraints (fabric-first, heat-pump-needs-insulation, ventilation).
+    10. Apply package → roll state forward (simulate post-package SAP / kWh / bills via S4a + impact predictions
+                        from step 8). Record `PlanPhase.state_at_end`. Unselected options become
+                        `PlanPhase.rolled_over_options` and are eligible candidates next phase.
+  11. ResultsPersister — write Plan (phases[]) + Recommendations under one UoW for this scenario.
+```
+
+Steps 1–6 run **once per property** regardless of scenario count.
+Steps 7–10 run **once per (scenario × phase)** per property.
+Step 11 runs once per scenario per property.
+
+Batching: steps 5, 8 batch the whole batch into one ML call where possible. Step 8's cost scales with `N_phases × N_scenarios × N_candidate_measures`; multi-phase pays its own ML bill, single-phase scenarios cost the same as today.
+
+Note vs the current `model_engine`: the **pre-recommendation** kWh ML call has been removed. Baseline kWh now comes from `EpcEnergyDerivationService` (SAP physics + UCL + FuelRates). ML is reserved for SAP/carbon/heat (rebaselining + impact prediction). Recommendation-level kWh delta is derived deterministically from the impact-predicted SAP delta plus heating-system fuel + COP from `HeatingSystemAssumptionsRepo`; no separate kWh ML lambda.
+
+**Open future change** (flagged §15): SAP-impact-of-a-measure is not strictly additive — installing measure A changes the SAP impact of measure B. The current per-measure ML scoring + linear optimisation approximates this. A future iteration may pre-define candidate packages and ML-score whole packages, accepting the combinatorial cost in return for accuracy. Defer until implementation reveals where the approximation hurts.
+
+### 9.5 Per-service contracts — deferred
+
+Method signatures, return types, error semantics, and edge-case behaviour are **explicitly out of scope** for this PRD. The implementer of each service runs a `/grill-me` session against this document and produces a detailed sub-design before coding.
+
+---
+
+## 10. Cross-batch concerns
+
+| Concern | Status | Approach |
+|---|---|---|
+| Building-level solar adjustment | Deferred — future TODO, not implemented today. | The current `building_ids` block in `model_engine` is dead-ish; it operates on the in-process batch only. New design preserves that limitation. Future feature: a post-modelling consolidation pass that groups results by `building_id` across batches and re-optimises. |
+| Portfolio aggregation | Dropped. | Front-end computes aggregations dynamically from per-property plans. `extract_portfolio_aggregation_data` in current engine is dead code (defined, never called) — deleting. |
+| Shared upstream data | Handled by orchestrator partitioning + `GenericDataRepo`. | Trigger endpoint groups UPRNs by postcode / UPRN-range before SQS chunking so each batch maximises intra-batch sharing. `GenericDataRepo` caches across batches so first batch pays, subsequent batches hit cache. |
+
+---
+
+## 11. Repository layout — monorepo via uv workspaces
+
+The repo is restructured as a Python monorepo using **uv workspaces**. Shared types and shared infra live as workspace packages under `packages/`; each deployable Lambda or microservice lives as its own package under `services/`. Each `services/<svc>/` has its own `pyproject.toml`, `Dockerfile`, and Lambda image — the bundle contains only that service's deps + its workspace deps, keeping cold-start size and package weight contained.
+
+```
+/
+├── pyproject.toml                      # workspace root
+├── uv.lock
+│
+├── packages/                           # shared workspace packages — imported by services/
+│   ├── domain/                         # "domna-domain"
+│   │   ├── pyproject.toml
+│   │   └── src/domain/
+│   │       ├── property.py             # Property, Properties, PropertyIdentity
+│   │       ├── site_notes.py
+│   │       ├── landlord_overrides.py
+│   │       ├── baseline_performance.py # lodged + effective pair
+│   │       ├── plan.py                 # Plan, PlanPhase, OptimisedPackage
+│   │       ├── scenario.py             # Scenario, ScenarioPhase, ScenarioSnapshot
+│   │       ├── recommendation.py
+│   │       ├── geospatial.py
+│   │       ├── solar.py
+│   │       ├── anomaly_flags.py
+│   │       └── ml/
+│   │           ├── transform.py        # EpcMlTransform (versioned)
+│   │           └── schema.py
+│   │
+│   ├── repos/                          # "domna-repos" — persistence, no business logic
+│   │   ├── pyproject.toml
+│   │   └── src/repos/
+│   │       ├── unit_of_work.py
+│   │       ├── property_repo.py
+│   │       ├── epc_cache_repo.py
+│   │       ├── site_notes_repo.py
+│   │       ├── landlord_overrides_repo.py
+│   │       ├── recommendations_repo.py
+│   │       ├── generic_data_repo.py
+│   │       ├── fuel_rates_repo.py
+│   │       ├── carbon_factors_repo.py
+│   │       ├── heating_system_assumptions_repo.py
+│   │       └── subtask_repo.py
+│   │
+│   ├── fetchers/                       # "domna-fetchers" — external API clients
+│   │   ├── pyproject.toml
+│   │   └── src/fetchers/
+│   │       ├── epc_client.py           # wraps backend/epc_client/
+│   │       ├── geospatial.py
+│   │       ├── solar.py
+│   │       ├── fuel_rates_fetcher.py
+│   │       └── carbon_factors_fetcher.py
+│   │
+│   └── utils/                          # "domna-utils" — logging, AWS, S3, cloudwatch, subtasks
+│       ├── pyproject.toml
+│       └── src/utils/
+│
+├── services/                           # deployable units, one Lambda image each
+│   ├── ara/                            # the modelling backend
+│   │   ├── pyproject.toml              # deps: domna-domain, domna-repos, domna-fetchers, domna-utils, ML libs
+│   │   ├── Dockerfile
+│   │   ├── src/ara/
+│   │   │   ├── services/               # EpcRemappingService, EpcPredictionService,
+│   │   │   │                           # EpcEnergyDerivationService, RebaseliningService,
+│   │   │   │                           # FeatureBuilder, RecommendationService,
+│   │   │   │                           # ImpactPredictionService, OptimiserService,
+│   │   │   │                           # ValuationService, ResultsPersister
+│   │   │   ├── orchestrators/          # IngestionPipeline, ModellingPipeline, RefreshOrchestrator
+│   │   │   └── lambdas/                # handler.py per Lambda + event-shape contracts
+│   │   └── tests/
+│   │       ├── fakes/                  # FakePropertyRepo, FakeEpcClient, etc.
+│   │       ├── unit/                   # service tests using fakes only
+│   │       └── integration/            # real DB + real SQS via localstack
+│   │
+│   ├── address2uprn/                   # messy-address → UPRN matching, pre-modelling step
+│   │   ├── pyproject.toml
+│   │   ├── Dockerfile
+│   │   └── src/address2uprn/
+│   ├── hubspot/                        # existing Hubspot ETL
+│   ├── pashub/                         # PasHub survey ingestion
+│   ├── ecmk/                           # ECMK assessment ingestion
+│   └── magicplan/                      # MagicPlan integration
+│
+├── backend/                            # legacy FastAPI app + microservices, kept until cut-over
+│   ├── app/                            # FastAPI; thin entrypoints that invoke service Lambdas
+│   └── ...                             # legacy engine, SearchEpc, etc.; deleted after cut-over
+│
+├── datatypes/                          # existing — EPC schemas; eventually folds into packages/domain/
+└── docs/
+    └── adr/                            # architectural decision records
+```
+
+**Boundary properties** (enforced by package structure, not convention):
+- A `services/<svc>/` package can `import domain.*`, `import repos.*`, `import fetchers.*`, `import utils.*`. It **cannot** import another service's modules — they're separate distributions with no cross-import path.
+- ADR-0003 (Ingestion / Modelling separation) is preserved: modelling services in `services/ara/src/ara/services/` depend only on `repos.*` + `domain.*`, never on fetchers. Orchestrators are the only place fetchers and services meet.
+
+**Migration** (incremental, not big-bang):
+1. Carve out `packages/domain/` first — fold `datatypes/epc/domain/` + the new aggregate types into it.
+2. Carve out `packages/utils/` from current `utils/` + `backend/utils/`.
+3. Carve out `packages/repos/` and `packages/fetchers/` once `services/ara/` is being built and needs them.
+4. `services/ara/` is greenfield — no legacy code lives in it.
+5. `services/address2uprn/`, `services/pashub/`, etc. are split out as their owners pick them up.
+6. `backend/` shrinks to the FastAPI entrypoint layer once everything else has moved.
+
+**Reused intact** (no rewrite needed at carve-out time):
+- `backend/epc_client/` → folds into `packages/fetchers/src/fetchers/epc_client.py`.
+- `datatypes/epc/domain/` → folds into `packages/domain/src/domain/epc/`.
+- `recommendations/optimiser/` → wrapped by `services/ara/src/ara/services/optimiser.py`.
+- `backend/app/db/` → repos delegate into `db_funcs.*` until SQL is rewritten under sub-PRD (iii).
+
+---
+
+## 12. Testing strategy
+
+### 12.1 Unit tests (the bulk)
+
+Every service test injects fake fetchers and fake repos. No DB, no network, no ML lambda. A service test verifies one slice of logic in 5–30 lines.
+
+Example:
+
+```python
+def test_epc_prediction_flags_anomalous_wall_type():
+    neighbours = [_make_epc(wall_construction="solid") for _ in range(5)]
+    target = _make_property(epc=_make_epc(wall_construction="cavity"))
+    repo = FakeGenericDataRepo(neighbours_by_postcode={target.identity.postcode: neighbours})
+
+    svc = EpcPredictionService(generic_repo=repo)
+    result = svc.run(Properties([target]))
+
+    assert result[0].epc_anomaly_flags.wall_construction == "differs_from_neighbours"
+```
+
+### 12.2 Integration tests
+
+One per pipeline (Ingestion, Modelling, Refresh). Real Postgres (testcontainers or localstack), fake fetchers (hitting recorded fixtures), fake ML lambdas (returning canned predictions). Catches schema / SQL / transaction issues.
+
+### 12.3 Contract tests
+
+The transform (`EpcMlTransform`) has its own test suite:
+
+- Golden file: given a fixed `Property`, output matches an expected DataFrame row exactly.
+- Schema test: the output columns exactly match a checked-in CSV header (so autogluon team sees breakage on PR).
+
+### 12.4 What is NOT tested
+
+- The autogluon repo's training code — owned there.
+- The gov EPC API behaviour — assumed via the official spec.
+- Front-end aggregation logic — owned there.
+
+---
+
+## 13. Observability
+
+Each pipeline step emits a **structured log line** at start and end with:
+
+```
+{step, property_id, uprn, portfolio_id, subtask_id, duration_ms, outcome, error?}
+```
+
+Errors propagate with the `Property.identity` attached, so a portfolio of 100k can be triaged by grep.
+
+The existing task/subtask state machine is preserved — `IngestionPipeline` and `ModellingPipeline` update subtask status at start (`in progress`), end (`complete` / `failed`), with the CloudWatch log URL attached as today.
+
+CloudWatch alarms exist on subtask failure rate; thresholds remain unchanged.
+
+---
+
+## 14. Data flow: a worked example
+
+A landlord uploads a corrected heating system for UPRN 12345 via the UI.
+
+1. **UI** → `POST /properties/12345/overrides` → writes to `landlord_overrides` table via `LandlordOverridesRepo`.
+2. **RefreshOrchestrator** invoked (either automatically on override-write, or by a "re-model" button). Notes: ingestion is *not* triggered because no external state changed.
+3. **ModellingPipeline** invoked on a batch of `[12345]`:
+   - Reads `Property(uprn=12345)` from `PropertyRepo`.
+   - `Property.effective_epc` = epc + landlord_overrides → heating system fields differ from baseline.
+   - `RebaseliningService` triggered: ML re-predicts SAP / carbon / heat against the new effective EPC.
+   - `EpcEnergyDerivationService` re-runs over the new effective EPC to derive baseline kWh + fuel split + bills (no ML).
+   - `RecommendationService` regenerates recommendations against the new baseline.
+   - `OptimiserService` re-picks optimal package.
+   - `ResultsPersister` writes new plan under one UoW (old plan is superseded; whether to soft-archive is a sub-PRD (iii) decision).
+
+Total external calls: zero. The override write is the only thing that hit a network boundary, and that was the inbound HTTP from the UI.
+
+---
+
+## 15. Open questions for team review
+
+1. **One endpoint vs two** (§4.5) — **resolved**: single endpoint for Phase 1; split later when a real workflow demands it.
+2. **`LandlordOverrides` shape** (§6.2) — flat-Excel-shape for v1, with a flag to revisit after first customer.
+3. **`already_installed` and `non_invasive_recommendations`** (§6.5) — both likely subsumed by overlay, but final call deferred.
+4. **Recency tie-break policy** (§6.3) — default "newer wins"; team to consider per-portfolio override.
+5. **`GenericDataRepo` storage backend** — Postgres table, S3, or DynamoDB. Postgres is the path of least infra change; recommend defaulting to that.
+6. **Soft-archive vs hard-overwrite** for superseded plans (§14) — affects audit / undo behaviour. Defer to sub-PRD (iii).
+7. **Building-level optimisation as a Phase 2 service** (§10) — agreed deferred; flag for roadmap discussion.
+8. **Transform versioning policy** (§8.3) — semver chosen; team to confirm bump conventions.
+9. **UCL EPC-correction model** (§9.2 S4a) — **resolved**: Few et al. 2023 (Energy & Buildings 288, 113024). Implementation pattern already in [`AnnualBillSavings.adjust_energy_to_metered`](../../backend/ml_models/AnnualBillSavings.py) — port the per-band gradients/intercepts (Table 3) into `EpcEnergyDerivationService`, keyed on the post-state Effective EPC band.
+10. **Fuel-price source for bill calculation** (§9.2 S4a) — **resolved**: `FuelRatesRepo` is a time-versioned, region-aware table; ETL by `FuelRatesFetcher` (Ofgem feed + manual upload fallback). Per-portfolio override deferred to v2 — confirm whether Calico / Hyde have bulk-buy contracts before first onboarding.
+11. **kWh handling under Rebaselining** (§9.4) — **resolved**: ML re-predicts SAP/carbon/heat only; `EpcEnergyDerivationService` re-derives kWh from the rebaselined Effective EPC. Heating-fuel-type change is handled naturally because S4a re-reads heating fields from the Effective EPC.
+12. **Phase rollover semantics** (§9.2 S7) — when a candidate measure isn't selected in phase n, does it auto-roll into phase n+1's candidate pool, or does the user mark which measure types can roll? Auto is simpler; user-marked is more flexible. Decide at scenario-builder UX time.
+13. **Package-level vs per-measure ML scoring** (§9.4) — SAP impact of a measure is not strictly additive; the current per-measure scoring + linear optimisation approximates this. A future iteration may pre-define candidate packages and ML-score whole packages. Defer until per-service grill on `OptimiserService`.
+14. **UCL extrapolation scope** (§9.2 S4a) — the Few et al. paper is gas-heated, no PV, England + Wales only. Current legacy code applies the correction to all properties regardless. Keep silent extrapolation for v1, or stratify (no correction for non-gas / PV) and surface uncertainty to FE? Defer to per-service grill.
+15. **`ValuationService` rebuild** (§9.2 S8) — existing `PropertyValuation.estimate` cites several papers; the rebuild should improve the regression. Shape deferred to per-service grill.
+16. **Battery-via-ML cutover** (§9.2 S6) — confirm the new ML model is trained against `energy_pv_battery_count` + `energy_pv_battery_capacity` and the legacy `BatterySAPScorer` can be retired without regression for battery-equipped properties.
+
+---
+
+## 16. Linked sub-PRDs (placeholders)
+
+- **Sub-PRD (ii) — ML training pipeline** — `docs/sub-prds/ml-training-pipeline.md` (TBC)
+- **Sub-PRD (iii) — DB schema migration** — `docs/sub-prds/db-schema-migration.md` (TBC)
+- **Sub-PRD (iv) — Historical EPC re-mapping** — `docs/sub-prds/historical-epc-remap.md` (TBC)
+
+Each sub-PRD owner: TBC. Each is independently reviewable but consumes the contracts defined in §5 (`Property` aggregate), §7 (repos), §8 (ML transform).
+
+---
+
+## 17. Next steps
+
+1. Team review of this PRD (target: ~1 week).
+2. Open follow-up grill sessions per service (`/grill-me` on each of S1–S8 + F1–F4) before that service is implemented.
+3. Break into issues via `/to-issues` against the project tracker.
+4. Stand up the empty `ara/` package skeleton + fakes + first integration-test scaffold as PR-1.
+5. Land services in dependency order: domain → repos → fetchers → services → orchestrators → API.
+
+Phase 1 milestone gate: first portfolio (Calico or Hyde) routed through the new pipeline end-to-end in June, with a manual spot-check on 5 representative properties to confirm outputs are reasonable. No parity-against-old-engine check — the old engine is dead by then.
diff --git a/docs/adr/0001-two-source-paths.md b/docs/adr/0001-two-source-paths.md
new file mode 100644
index 00000000..82615810
--- /dev/null
+++ b/docs/adr/0001-two-source-paths.md
@@ -0,0 +1,10 @@
+# Two source paths for a Property, not layered precedence
+
+For modelling a Property we considered a strict layered precedence stack — `patches > site_notes > energy_assessment > epc > predicted` — with per-field provenance tracking. We rejected that in favour of **two strictly disjoint source paths**: a Property is modelled either from its Site Notes alone, or from the public EPC with Landlord Overrides applied on top. Site Notes are committed to being full-coverage by the domain ([CONTEXT.md](../../CONTEXT.md): _Site Notes_), so once we have them the EPC is irrelevant; conversely, Landlord Overrides are only meaningful when the EPC is the source of physical state.
+
+The trade-off: layered precedence is more flexible (it tolerates a partial Site Notes survey by falling through to EPC for missing fields), but mixed-source data muddles the audit trail and undermines the "if we surveyed it, trust the survey" promise. The two-path model gives a cleaner derivation rule and an unambiguous source-of-truth per Property, at the cost of treating survey gaps as a survey-quality bug rather than a fallback signal. A Recency Tie-Break covers the one case where both exist: the newer of the two wins.
+
+## Consequences
+
+- Reversing this means rewriting `Property.effective_epc` and every service that reads it. Hard to roll back once 12 services depend on the two-path shape.
+- Future addition of a third path (e.g. partial-survey) is a real change, not just a config tweak — flag it as an ADR if proposed.
diff --git a/docs/adr/0002-property-aggregate-root.md b/docs/adr/0002-property-aggregate-root.md
new file mode 100644
index 00000000..1114bc15
--- /dev/null
+++ b/docs/adr/0002-property-aggregate-root.md
@@ -0,0 +1,14 @@
+# `Property` is the aggregate root, not `EpcPropertyData`
+
+The Ara modelling pipeline produces nine slices of per-property data (EPC, geospatial, solar, baseline performance, recommendations, optimised package, etc.). We considered making `EpcPropertyData` — the rich RdSAP-21-style EPC schema — the centrepiece, with other data hanging off it. We rejected that and introduced a new **`Property` aggregate root** that holds identity, all source data (EPC, Site Notes, Landlord Overrides), enrichments, and modelling outputs as named fields. Services take `Property` (or `Properties`) and return them with one slice populated.
+
+Two reasons drove this:
+1. **Geospatial, solar, recommendations, and overrides are peers to the EPC**, not properties of it. Putting them on `EpcPropertyData` conflates physical-state schema with modelling-run state.
+2. **A typed `ModellingContext` dict-bag (the obvious alternative)** is exactly what the current legacy `Property` class became — 1259 lines of accumulated stuff, hard to read, hard to test, hard to extend. Named fields on a dataclass force the type system to keep us honest.
+
+The cost is more domain types up front (`Property`, `Properties`, `PropertyIdentity`, `BaselinePerformance`, `OptimisedPackage`, etc.) and the discipline of one service writing one slice. The benefit is that every service has a single job and every test injects fake repos against a small, named structure.
+
+## Consequences
+
+- Every service signature accepts or returns `Property` / `Properties`. Refactoring later means touching all of them.
+- `EpcPropertyData` stays a pure physical-state schema (defined in [datatypes/epc/domain/epc_property_data.py](../../datatypes/epc/domain/epc_property_data.py)) — no modelling outputs or run state on it.
diff --git a/docs/adr/0003-strict-ingestion-modelling-separation.md b/docs/adr/0003-strict-ingestion-modelling-separation.md
new file mode 100644
index 00000000..68361ba9
--- /dev/null
+++ b/docs/adr/0003-strict-ingestion-modelling-separation.md
@@ -0,0 +1,13 @@
+# Strict separation between Ingestion and Modelling
+
+Data flows one way only: **Ingestion → Repos → Modelling**. Modelling services never make external HTTP calls; Ingestion services never run business logic. If Modelling needs fresh data, it sees a stale record in a repo and returns; the caller (a refresh orchestrator or the FE) decides whether to ingest first. We considered allowing modelling services to call fetchers directly on cache miss — convenient — and rejected it.
+
+The trade-off is that modelling cannot "self-heal" by going to the gov EPC API when it finds stale data. The benefit is that modelling becomes a deterministic function of repository state: same Property in the repos, same modelling output. That is the property that makes modelling unit-testable against fakes (no DB, no network, no ML lambda), reproducible, and debuggable. It also enables a per-property UI flow where fetched data is shown to the user for review and possible override **before** modelling runs.
+
+Under the rushed timeline this constraint is more valuable, not less. Mixing fetchers into services is the easy thing to do when shipping fast; once it's done it's hard to extract.
+
+## Consequences
+
+- Every modelling service depends only on Repos (and other Services / domain logic). No HTTP libraries in the modelling import graph.
+- A `RefreshOrchestrator` is the only thing that calls Ingestion then Modelling in sequence; nothing else may.
+- "Modelling is stale, refetch in-line" is a forbidden pattern — surface staleness, do not silently repair it.
diff --git a/packages/README.md b/packages/README.md
new file mode 100644
index 00000000..0911a1d3
--- /dev/null
+++ b/packages/README.md
@@ -0,0 +1,16 @@
+# Shared packages
+
+Workspace packages consumed by `services/*`. Each package is its own Python distribution with its own `pyproject.toml`; services import via the workspace dependency mechanism (`{ workspace = true }`).
+
+| Package | Purpose |
+|---------|---------|
+| [`domain/`](./domain/) | Shared domain types — `Property`, `BaselinePerformance`, `Plan`, `Scenario`, `EpcPropertyData`, etc. No persistence, no IO, no business logic. |
+| [`repos/`](./repos/) | Persistence layer — one repo per aggregate. Owns the SQL. Depends on `domain`. |
+| [`fetchers/`](./fetchers/) | External API clients (gov EPC, Ofgem, Google Solar, etc.). Depend on `domain` for response shapes. |
+| [`utils/`](./utils/) | Cross-cutting infra — logging, S3, CloudWatch URL builders, SQS task helpers. |
+
+## Adding a new shared package
+
+Only when a real second consumer materialises. Don't pre-shatter (`repos-epc`, `repos-property`, ...) — split when a deployment needs to drop a dep, not before.
+
+See [`../ara_backend_design.md`](../ara_backend_design.md) §11 for the broader monorepo layout and [`../CONTEXT.md`](../CONTEXT.md) for the domain glossary that names the types living in `domain/`.
diff --git a/packages/domain/README.md b/packages/domain/README.md
new file mode 100644
index 00000000..6dc69d41
--- /dev/null
+++ b/packages/domain/README.md
@@ -0,0 +1,30 @@
+# domna-domain
+
+Shared domain types — `Property`, `Properties`, `BaselinePerformance`, `Plan`, `PlanPhase`, `Scenario`, `ScenarioPhase`, `ScenarioSnapshot`, `Recommendation`, `OptimisedPackage`, `EpcPropertyData`, etc.
+
+**Boundary**: types only. No persistence, no IO, no business logic. Other packages and services depend on `domna-domain`; this package depends on nothing internal.
+
+Domain definitions live in [`../../CONTEXT.md`](../../CONTEXT.md). New types added here must match the glossary terms.
+
+## Layout
+
+```
+src/domain/
+├── __init__.py
+├── property.py             # Property, Properties, PropertyIdentity
+├── site_notes.py
+├── landlord_overrides.py
+├── baseline_performance.py # lodged + effective pair (ADR-0004)
+├── plan.py                 # Plan, PlanPhase, OptimisedPackage
+├── scenario.py             # Scenario, ScenarioPhase, ScenarioSnapshot (ADR-0005)
+├── recommendation.py
+├── geospatial.py
+├── solar.py
+├── anomaly_flags.py
+└── ml/
+    ├── __init__.py
+    ├── transform.py        # EpcMlTransform (versioned per §8.3)
+    └── schema.py
+```
+
+When `datatypes/epc/domain/` folds in, the EPC schema types move under `src/domain/epc/`.
diff --git a/packages/domain/pyproject.toml b/packages/domain/pyproject.toml
new file mode 100644
index 00000000..5e820371
--- /dev/null
+++ b/packages/domain/pyproject.toml
@@ -0,0 +1,13 @@
+[project]
+name = "domna-domain"
+version = "0.1.0"
+description = "Shared domain types for the Ara modelling pipeline and sibling Domna services."
+requires-python = ">=3.11"
+dependencies = []
+
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+
+[tool.hatch.build.targets.wheel]
+packages = ["src/domain"]
diff --git a/packages/domain/src/domain/__init__.py b/packages/domain/src/domain/__init__.py
new file mode 100644
index 00000000..1d52198c
--- /dev/null
+++ b/packages/domain/src/domain/__init__.py
@@ -0,0 +1,4 @@
+"""Shared domain types for the Ara modelling pipeline and sibling Domna services.
+
+No persistence, no IO, no business logic. See README.md for layout.
+"""
diff --git a/packages/fetchers/README.md b/packages/fetchers/README.md
new file mode 100644
index 00000000..ebe47f74
--- /dev/null
+++ b/packages/fetchers/README.md
@@ -0,0 +1,19 @@
+# domna-fetchers
+
+External API clients. Each fetcher is responsible for one external system — `EpcClientService` for the gov EPC API, `GeospatialFetcher` for Ordnance Survey, `SolarFetcher` for Google Solar, `FuelRatesFetcher` for Ofgem, `CarbonFactorsFetcher` for Defra.
+
+**Boundary**: makes HTTP calls + returns raw or lightly-mapped responses. No DB, no business logic. Modelling services never depend on fetchers — only orchestrators do (per [ADR-0003](../../docs/adr/0003-strict-ingestion-modelling-separation.md)).
+
+## Layout
+
+```
+src/fetchers/
+├── __init__.py
+├── epc_client.py          # wraps backend/epc_client/
+├── geospatial.py
+├── solar.py
+├── fuel_rates_fetcher.py
+└── carbon_factors_fetcher.py
+```
+
+`backend/epc_client/` will fold into `epc_client.py` during the migration; until then this module re-exports from the legacy location.
diff --git a/packages/fetchers/pyproject.toml b/packages/fetchers/pyproject.toml
new file mode 100644
index 00000000..69404681
--- /dev/null
+++ b/packages/fetchers/pyproject.toml
@@ -0,0 +1,19 @@
+[project]
+name = "domna-fetchers"
+version = "0.1.0"
+description = "External API clients — gov EPC, Ofgem, Google Solar, Defra, etc."
+requires-python = ">=3.11"
+dependencies = [
+    "domna-domain",
+    "httpx>=0.27",
+]
+
+[tool.uv.sources]
+domna-domain = { workspace = true }
+
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+
+[tool.hatch.build.targets.wheel]
+packages = ["src/fetchers"]
diff --git a/packages/fetchers/src/fetchers/__init__.py b/packages/fetchers/src/fetchers/__init__.py
new file mode 100644
index 00000000..74646907
--- /dev/null
+++ b/packages/fetchers/src/fetchers/__init__.py
@@ -0,0 +1,4 @@
+"""External API clients for Ara and sibling services.
+
+One fetcher per external system. No DB, no business logic. See README.md.
+"""
diff --git a/packages/repos/README.md b/packages/repos/README.md
new file mode 100644
index 00000000..990b66db
--- /dev/null
+++ b/packages/repos/README.md
@@ -0,0 +1,27 @@
+# domna-repos
+
+Persistence layer. One repo per aggregate; owns the SQL for its tables. Callers see only domain objects from `domna-domain`.
+
+**Boundary**: depends on `domna-domain` for types. No external IO except the DB. No business logic — services do that.
+
+## Repos (per [PRD §7.3](../../ara_backend_design.md))
+
+```
+src/repos/
+├── __init__.py
+├── unit_of_work.py
+├── property_repo.py
+├── epc_cache_repo.py
+├── site_notes_repo.py
+├── landlord_overrides_repo.py
+├── recommendations_repo.py
+├── generic_data_repo.py
+├── fuel_rates_repo.py
+├── carbon_factors_repo.py
+├── heating_system_assumptions_repo.py
+└── subtask_repo.py
+```
+
+Each repo has a `Fake*Repo` companion in its service's test tree (typically `services/ara/tests/fakes/`) — dict-backed, no DB.
+
+DDL migrations are scoped to sub-PRD (iii); during Phase 0 repos may delegate into the legacy `backend/app/db/db_funcs.*` modules.
diff --git a/packages/repos/pyproject.toml b/packages/repos/pyproject.toml
new file mode 100644
index 00000000..53689812
--- /dev/null
+++ b/packages/repos/pyproject.toml
@@ -0,0 +1,19 @@
+[project]
+name = "domna-repos"
+version = "0.1.0"
+description = "Persistence layer — one repo per aggregate. Owns the SQL."
+requires-python = ">=3.11"
+dependencies = [
+    "domna-domain",
+    "sqlalchemy>=2.0",
+]
+
+[tool.uv.sources]
+domna-domain = { workspace = true }
+
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+
+[tool.hatch.build.targets.wheel]
+packages = ["src/repos"]
diff --git a/packages/repos/src/repos/__init__.py b/packages/repos/src/repos/__init__.py
new file mode 100644
index 00000000..a981395a
--- /dev/null
+++ b/packages/repos/src/repos/__init__.py
@@ -0,0 +1,4 @@
+"""Persistence layer for the Ara domain aggregates.
+
+One repo per aggregate. Owns SQL; exposes domain objects. See README.md.
+"""
diff --git a/packages/utils/README.md b/packages/utils/README.md
new file mode 100644
index 00000000..1fba6457
--- /dev/null
+++ b/packages/utils/README.md
@@ -0,0 +1,15 @@
+# domna-utils
+
+Cross-cutting infrastructure helpers. Nothing domain-specific — anything in here should be portable across services.
+
+## Will live here (migrating from `utils/` and `backend/utils/`)
+
+- Logging — `logger.py`
+- S3 — `s3.py`
+- Pandas helpers — `pandas_utils.py`
+- CloudWatch URL builder — `cloudwatch.py`
+- SQS subtask helpers — `subtasks.py`
+
+## Will NOT live here
+
+Service-specific parsers (Osmosis condition report, full-SAP parser, SharePoint integration) move into the service that owns them, not here.
diff --git a/packages/utils/pyproject.toml b/packages/utils/pyproject.toml
new file mode 100644
index 00000000..cf739bbd
--- /dev/null
+++ b/packages/utils/pyproject.toml
@@ -0,0 +1,15 @@
+[project]
+name = "domna-utils"
+version = "0.1.0"
+description = "Cross-cutting infrastructure helpers — logging, S3, CloudWatch, SQS tasks."
+requires-python = ">=3.11"
+dependencies = [
+    "boto3>=1.34",
+]
+
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+
+[tool.hatch.build.targets.wheel]
+packages = ["src/utils"]
diff --git a/packages/utils/src/utils/__init__.py b/packages/utils/src/utils/__init__.py
new file mode 100644
index 00000000..d010a3be
--- /dev/null
+++ b/packages/utils/src/utils/__init__.py
@@ -0,0 +1,4 @@
+"""Cross-cutting infrastructure helpers — logging, S3, CloudWatch, SQS tasks.
+
+Nothing domain-specific belongs here. See README.md.
+"""
diff --git a/pyproject.toml b/pyproject.toml
index 49108861..75aabc82 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1 +1,13 @@
 [tool.pyright]
+
+# uv workspace root.
+# Each workspace member has its own pyproject.toml under packages/<name>/ or services/<name>/.
+# Run `uv sync` at the root to install everything; `uv sync --package <name>` for one.
+[tool.uv.workspace]
+members = [
+    "packages/domain",
+    "packages/repos",
+    "packages/fetchers",
+    "packages/utils",
+    "services/ara",
+]
diff --git a/services/README.md b/services/README.md
new file mode 100644
index 00000000..c82ef6a4
--- /dev/null
+++ b/services/README.md
@@ -0,0 +1,13 @@
+# Services
+
+Each subdirectory is a deployable unit — typically a Lambda image. Own `pyproject.toml`, own `Dockerfile`, own deps. Lambda bundle contains only that service's deps + its workspace deps.
+
+| Service | Purpose |
+|---------|---------|
+| [`ara/`](./ara/) | The Domna retrofit modelling backend — ingestion + modelling pipelines, all 9 services in [PRD §9.2](../ara_backend_design.md). |
+
+Other Domna services (address2uprn, hubspot, pashub, ecmk, magicplan) live in the legacy `backend/` and `etl/` trees for now; they are slated to migrate here as their owners pick them up — see [PRD §11](../ara_backend_design.md). When that work starts, scaffold the service under `services/<name>/` and add it to the workspace members in the root `pyproject.toml`.
+
+## Service boundary
+
+A service can `import domain.*`, `import repos.*`, `import fetchers.*`, `import utils.*` (workspace deps). It **cannot** import another service's modules — they are separate distributions with no cross-import path. This is the structural enforcement of the modelling/ingestion separation ([ADR-0003](../docs/adr/0003-strict-ingestion-modelling-separation.md)).
diff --git a/services/ara/Dockerfile b/services/ara/Dockerfile
new file mode 100644
index 00000000..c45d6bc1
--- /dev/null
+++ b/services/ara/Dockerfile
@@ -0,0 +1,12 @@
+# Lambda image for the Ara modelling backend.
+#
+# This is a scaffold — final image will install only ara + its workspace deps
+# (domna-domain, domna-repos, domna-fetchers, domna-utils) plus ML/data libraries.
+# Build via uv to keep cold-start size contained.
+
+FROM public.ecr.aws/lambda/python:3.11
+
+# TODO: install uv, sync this service's deps from the workspace lock file,
+# copy src/ara/ into ${LAMBDA_TASK_ROOT}/, set CMD to the Lambda handler.
+
+CMD ["ara.lambdas.handler.handler"]
diff --git a/services/ara/README.md b/services/ara/README.md
new file mode 100644
index 00000000..71e71a5d
--- /dev/null
+++ b/services/ara/README.md
@@ -0,0 +1,30 @@
+# ara
+
+The Domna retrofit modelling backend. Replaces the legacy `backend/engine/engine.py` monolith with a service-oriented pipeline that survives the 30 May 2026 gov EPC API cut-over and that other team members can read, fix, and extend.
+
+Design document: [`../../ara_backend_design.md`](../../ara_backend_design.md).
+Domain glossary: [`../../CONTEXT.md`](../../CONTEXT.md).
+
+## Layout
+
+```
+src/ara/
+├── services/        # the 9 domain services from PRD §9.2:
+│                    #   EpcRemappingService, EpcPredictionService,
+│                    #   FeatureBuilder, EpcEnergyDerivationService,
+│                    #   RebaseliningService, RecommendationService,
+│                    #   ImpactPredictionService, OptimiserService,
+│                    #   ValuationService, ResultsPersister
+├── orchestrators/   # IngestionPipeline, ModellingPipeline, RefreshOrchestrator
+└── lambdas/         # one handler.py per Lambda + the event-shape contracts
+```
+
+## Pipeline
+
+See [PRD §9.4](../../ara_backend_design.md) for the per-batch step order. Briefly: per-property setup (steps 1–6) runs once per Property; the per-scenario × per-phase loop (steps 7–10) re-derives candidates and impact predictions against the rolling Effective EPC state; results are persisted under one Unit of Work per (Plan, Scenario).
+
+## Testing
+
+- `tests/unit/` — service tests against fakes from `tests/fakes/`. No DB, no network, no ML lambda.
+- `tests/integration/` — real Postgres (testcontainers / localstack), fake fetchers + fake ML lambdas.
+- ML transform contract tests live with `domain.ml.transform` in `packages/domain/`.
diff --git a/services/ara/pyproject.toml b/services/ara/pyproject.toml
new file mode 100644
index 00000000..3556a15f
--- /dev/null
+++ b/services/ara/pyproject.toml
@@ -0,0 +1,28 @@
+[project]
+name = "ara"
+version = "0.1.0"
+description = "The Domna retrofit modelling backend. Ingestion + modelling pipelines."
+requires-python = ">=3.11"
+dependencies = [
+    "domna-domain",
+    "domna-repos",
+    "domna-fetchers",
+    "domna-utils",
+    "pandas>=2.0",
+    "pandas-stubs",
+    "numpy>=1.26",
+    "pydantic>=2.0",
+]
+
+[tool.uv.sources]
+domna-domain = { workspace = true }
+domna-repos = { workspace = true }
+domna-fetchers = { workspace = true }
+domna-utils = { workspace = true }
+
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+
+[tool.hatch.build.targets.wheel]
+packages = ["src/ara"]
diff --git a/services/ara/src/ara/__init__.py b/services/ara/src/ara/__init__.py
new file mode 100644
index 00000000..26856c73
--- /dev/null
+++ b/services/ara/src/ara/__init__.py
@@ -0,0 +1,4 @@
+"""The Domna retrofit modelling backend.
+
+See README.md and ara_backend_design.md (repo root) for the architecture.
+"""
diff --git a/services/ara/src/ara/lambdas/__init__.py b/services/ara/src/ara/lambdas/__init__.py
new file mode 100644
index 00000000..93b08582
--- /dev/null
+++ b/services/ara/src/ara/lambdas/__init__.py
@@ -0,0 +1,5 @@
+"""Lambda handlers + event-shape contracts.
+
+One handler per deployable Lambda. See PRD §4.6 for the ModelTriggerRequest
+shape.
+"""
diff --git a/services/ara/src/ara/orchestrators/__init__.py b/services/ara/src/ara/orchestrators/__init__.py
new file mode 100644
index 00000000..4d2c9a60
--- /dev/null
+++ b/services/ara/src/ara/orchestrators/__init__.py
@@ -0,0 +1,5 @@
+"""Orchestrators for the Ara pipeline.
+
+IngestionPipeline, ModellingPipeline, RefreshOrchestrator. The only place
+where step order is encoded and where fetchers + services + repos meet.
+"""
diff --git a/services/ara/src/ara/services/__init__.py b/services/ara/src/ara/services/__init__.py
new file mode 100644
index 00000000..b561f336
--- /dev/null
+++ b/services/ara/src/ara/services/__init__.py
@@ -0,0 +1,9 @@
+"""Domain services for the Ara modelling pipeline (PRD §9.2).
+
+EpcRemappingService, EpcPredictionService, FeatureBuilder,
+EpcEnergyDerivationService, RebaseliningService, RecommendationService,
+ImpactPredictionService, OptimiserService, ValuationService, ResultsPersister.
+
+Each service operates on `Properties` and depends only on repos + other services
++ domain objects. No external IO (per ADR-0003).
+"""
diff --git a/services/ara/tests/__init__.py b/services/ara/tests/__init__.py
new file mode 100644
index 00000000..e69de29b
diff --git a/services/ara/tests/fakes/__init__.py b/services/ara/tests/fakes/__init__.py
new file mode 100644
index 00000000..cc032044
--- /dev/null
+++ b/services/ara/tests/fakes/__init__.py
@@ -0,0 +1,4 @@
+"""Fake repos and fetchers for unit tests.
+
+One Fake<Name>Repo per real repo; dict-backed; no DB. Same for fetchers.
+"""
diff --git a/services/ara/tests/integration/__init__.py b/services/ara/tests/integration/__init__.py
new file mode 100644
index 00000000..e69de29b
diff --git a/services/ara/tests/unit/__init__.py b/services/ara/tests/unit/__init__.py
new file mode 100644
index 00000000..e69de29b