From ce12b114c7289302cfecd28efcdb04381e0de375 Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar <kconnkowlessar@gmail.com>
Date: Mon, 1 Jun 2026 16:20:06 +0000
Subject: [PATCH 01/12] docs(ara): next-agent handover for Property Baseline
 (SAP calc) + Modelling
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Orientation for the next chat picking up the two open fronts after the
ara_first_run rebuild shipped:
- where things stand (merged to main via per-cert; branch/worktree layout;
  PRs into per-cert), authoritative ADRs/CONTEXT to read,
- current architecture + key files (post baseline→property_baseline /
  FirstRun→AraFirstRun rename),
- conventions + gotchas (TDD, ephemeral PG, FakeUnitOfWork, pyright noise to
  ignore, gh-credential push workaround),
- Task 1: wire Sap10Calculator into PropertyBaselineOrchestrator (Calculated
  SAP10 Performance as a third value-set; failure-posture decision),
- Task 2: Modelling (stubs to build out; MaterialsRepository naming open;
  needs a UoW when writing Plans),
- the raising/no-op seams not to mistake for done,
- known doc drift flagged (CONTEXT term vs PropertyBaselinePerformance class;
  stale domain/sap/ path → domain/sap10_calculator).

Also adds a banner to ara_backend_design.md marking its architecture sections
as superseded by ADR-0011/0012.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
 ara_backend_design.md     |   6 ++
 docs/HANDOVER_ARA_NEXT.md | 155 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 161 insertions(+)
 create mode 100644 docs/HANDOVER_ARA_NEXT.md
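
Reviewer note (below the three-dash line, so not part of the commit): the
per-stage Unit-of-Work rule the handover describes (ADR-0012: one batch, one
commit, all-or-nothing, idempotent re-runs) can be sketched roughly as below.
This is a minimal illustration under stated assumptions, not the repo's code:
`SketchUnitOfWork`, `SketchBaselineStage` and the `score` callable are
hypothetical names, and the in-memory dicts stand in for the Postgres-backed
repositories.

```python
from __future__ import annotations

from typing import Callable, Iterable


class SketchUnitOfWork:
    """Hypothetical in-memory stand-in for a per-stage Unit of Work."""

    def __init__(self) -> None:
        self.committed: dict[int, float] = {}  # durable state
        self._staged: dict[int, float] = {}    # pending writes for this batch
        self.commit_count = 0

    def __enter__(self) -> "SketchUnitOfWork":
        self._staged = {}
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Anything still staged is discarded (nothing, after a commit).
        # Returning False lets an exception propagate, so the batch fails noisily.
        self._staged = {}
        return False

    def save(self, property_id: int, value: float) -> None:
        self._staged[property_id] = value

    def commit(self) -> None:
        # Replace-by-property_id keeps re-runs idempotent.
        self.committed.update(self._staged)
        self._staged = {}
        self.commit_count += 1


class SketchBaselineStage:
    """Hypothetical stage: threads only property_ids, commits once per batch."""

    def __init__(self, unit_of_work: SketchUnitOfWork, score: Callable[[int], float]) -> None:
        self._unit_of_work = unit_of_work
        self._score = score

    def run(self, property_ids: Iterable[int]) -> None:
        with self._unit_of_work as uow:
            for property_id in property_ids:
                uow.save(property_id, self._score(property_id))
            uow.commit()  # single commit; a raise above skips it entirely
```

If `score` raises for any property, `commit()` is never reached and the staged
writes for the whole batch are dropped, which is the all-or-nothing posture
the handover asks the next agent to weigh for the calculator.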

diff --git a/ara_backend_design.md b/ara_backend_design.md
index f3d11696..c2f3f542 100644
--- a/ara_backend_design.md
+++ b/ara_backend_design.md
@@ -1,5 +1,11 @@
 # ARA Backend Redesign — Design PRD
 
+> ⚠️ **SUPERSEDED (architecture sections).** This is an early draft PRD. The actual
+> architecture as built differs — see the ADRs in `docs/adr/` (especially 0011
+> composable stage orchestrators, 0012 Unit-of-Work per-stage batch) and
+> `docs/HANDOVER_ARA_NEXT.md` for current state. Treat this doc as historical context,
+> not the source of truth for layout/contracts.
+
 **Status**: Draft for team review
 **Author**: Khalim Conn-Kowlessar (with Claude grill session)
 **Branch**: `ara-backend-design-prd`
diff --git a/docs/HANDOVER_ARA_NEXT.md b/docs/HANDOVER_ARA_NEXT.md
new file mode 100644
index 00000000..61eac61a
--- /dev/null
+++ b/docs/HANDOVER_ARA_NEXT.md
@@ -0,0 +1,155 @@
+# Handover — Ara backend: Property Baseline (SAP calculator) + Modelling
+
+You are picking up a clean, merged baseline. The `ara_first_run` backend rebuild is
+**done and shipped**; the next two fronts are (1) wiring the SAP calculator into
+Property Baseline, and (2) starting Modelling. This doc is the orientation — the ADRs
+and CONTEXT.md are authoritative for decisions; don't re-derive them.
+
+## Where things stand
+
+- The **`ara_first_run` rebuild is complete and merged to `main`** (via
+  `feature/per-cert-mapper-validation`): the full pipeline spine
+  **Ingestion → Baseline → Modelling(stub)** on a flat-hexagonal layout with a
+  per-stage Unit-of-Work. Issues #1129–#1138 (parent PRD #1128) are all done.
+- **Branch + worktree:** you are on `feature/property-baseline-sap10`, cut from the
+  up-to-date `feature/per-cert-mapper-validation` (which contains `main` + the merged
+  ara work + the ongoing per-cert SAP-calculator validation slices). Worktree:
+  `/workspaces/home/hestia-worktrees/model-assemble-new-backend`. The
+  `/workspaces/model` worktree holds `feature/per-cert-mapper-validation` itself.
+- **PRs go into `feature/per-cert-mapper-validation`, NOT `main` directly** — one PR
+  per slice, the rhythm used for #1129–#1138.
+
+## Read first (authoritative — don't re-derive)
+
+- **ADRs** `docs/adr/`: 0002 (Property aggregate root), 0003 (strict Ingestion→Modelling
+  separation, amended), 0004 (BaselinePerformance = Lodged+Effective pair, amended for
+  the standalone table), 0005 (multi-phase Scenarios, per-phase recompute — **governs
+  Modelling**), 0006/0007 (deterministic kWh / kWh-as-ML-target), 0009+0010
+  (deterministic SAP calculator + its spec target & validation cohort), 0011 (composable
+  stage orchestrators, one lambda per use case, stages talk through repos), 0012
+  (Unit-of-Work per-stage batch transaction).
+- **CONTEXT.md** — the glossary; use this vocabulary in code + commits.
+- **`ara_backend_design.md`** is a **stale draft PRD** — its architecture sections are
+  superseded by ADR-0011/0012 (a banner now says so). Trust the ADRs, not it.
+
+## Architecture (current — flat hexagonal at repo root)
+
+```
+applications/<lambda>/   thin handler + trigger body + Dockerfile + local_handler
+orchestration/           stage orchestrators + AraFirstRunPipeline (deps injected)
+domain/                  pure aggregates + services
+repositories/<agg>/      port (ABC) + adapter (*_postgres_repository / *_s3_repository)
+infrastructure/          clients + SQLModel rows (*_table.py) + engine/config
+```
+
+Stages communicate **only through repos**, threading just `property_ids` — never an
+in-memory hand-off (ADR-0011/0003). Each stage runs its batch in **one Unit of Work and
+commits once** (ADR-0012); all-or-nothing per batch, fail noisily → subtask FAILED →
+debug & re-run; re-runs are idempotent (replace-by-`property_id`). Ingestion is
+fetch-then-write so a DB connection is never held during external IO.
+
+## Key files (note the recent rename: baseline → property_baseline; FirstRun → AraFirstRun)
+
+- `orchestration/ara_first_run_pipeline.py` — `AraFirstRunPipeline`, `AraFirstRunCommand`,
+  the `IngestionStage`/`PropertyBaselineStage`/`ModellingStage` Protocols.
+- `orchestration/property_baseline_orchestrator.py` — `PropertyBaselineOrchestrator`
+  (**this is where the SAP calculator gets wired**).
+- `orchestration/ingestion_orchestrator.py`, `orchestration/modelling_orchestrator.py` (stub).
+- `domain/property_baseline/` — `PropertyBaselinePerformance`, `Performance`,
+  `lodged_performance()`, `Rebaseliner`/`StubRebaseliner`.
+- `repositories/property_baseline/` (port + postgres adapter),
+  `repositories/unit_of_work.py` + `repositories/postgres_unit_of_work.py`.
+- `repositories/scenario/`, `repositories/materials/` — **empty seam ports** for Modelling.
+- `infrastructure/postgres/property_baseline_performance_table.py` — flat-column row.
+- `applications/ara_first_run/handler.py` — `build_first_run_pipeline` wiring +
+  `_source_clients_from_env` (a seam that **raises** — see Stubs below).
+- **SAP calculator (for task 1):** `domain/sap10_calculator/calculator.py`, class
+  `Sap10Calculator`, returns a `SapResult` (5 quantities + monthly + worksheet audit).
+  It is mature and heavily validated by the per-cert work on this branch.
+
+## Conventions + gotchas
+
+- **TDD**, one test → one impl; `# Arrange / # Act / # Assert` headers; **commit per
+  slice** with a spec/ADR citation and the
+  `Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>` trailer.
+- Tests: real ephemeral PostgreSQL via the `db_engine` fixture (JSONB needs real PG).
+  **Orchestrator/repo unit tests use fakes** — `tests/orchestration/fakes.py`
+  (`FakeUnitOfWork` exposing `property`/`epc`/`solar`/`property_baseline` repos + commit
+  count). Run with `-p no:cacheprovider`; ignore coverage spam.
+- **pyright strict, zero errors.** Known noise to ignore: a `venvPath` warning; the
+  `moto`-not-installed import errors in `test_postcode_splitter_orchestrator.py` +
+  `test_user_address_csv_s3_repository.py` (those modules don't collect — `--ignore`
+  them); and 4 pre-existing failures outside `tests/` (summary_pdf_mapper_chain ×3 +
+  from_rdsap_schema total_floor_area).
+- **Pushing from this worktree:** the VS Code git credential helpers are broken
+  (missing node binaries), so use a one-shot gh override:
+  `git -c credential.helper= -c credential.helper='!gh auth git-credential' push`.
+
+## Next task 1 — SAP calculator on Property Baseline (the user expects this to be simple)
+
+Wire `Sap10Calculator` into `PropertyBaselineOrchestrator` to produce **Calculated SAP10
+Performance** per property. Per CONTEXT (≈line 100), this is a quantity **distinct from**
+Lodged/Effective Performance — surfaced *alongside* them during the validation phase; it
+may supersede Effective Performance in a later ADR once parity is confirmed (ADR-0009/0010).
+
+**Grill these two before coding (`/grill-with-docs`):**
+1. **Where it sits.** Recommended: a *third* value-set on `PropertyBaselinePerformance`
+   (`calculated: Performance` + its space/water kWh), persisted as `calculated_*` columns
+   on `property_baseline_performance` — **not** an overwrite of `effective`. Pin the
+   aggregate shape + table migration in one pass (the table migration is FE-owned/Drizzle —
+   see `docs/migrations/property-baseline-performance-table.md`).
+2. **Failure posture.** The calculator strict-raises (`UnmappedSapCode`, etc.) on certs it
+   can't yet handle. Running it over a real cohort *surfaces those gaps* — which is the
+   validation work `feature/per-cert-mapper-validation` exists for. Decide: let the raise
+   abort the batch (ADR-0012 all-or-nothing), or collect/skip-and-report. This is the main
+   judgment call: the wiring is simple, but it lights up the validation surface.
+
+Then TDD: inject the calculator into `PropertyBaselineOrchestrator`, call it on the
+Effective EPC, and persist the calculated set in the same Unit of Work.
+
+## Next task 2 — Modelling (Recommendations / Optimiser / Plans)
+
+`ModellingOrchestrator.run(property_ids, scenario_ids)` is a **no-op stub**;
+`ScenarioRepository` and `MaterialsRepository` are **empty seam ports**. Building this out
+is the third stage. ADR-0005 (multi-phase Scenarios, per-phase recompute) governs it.
+Relevant CONTEXT terms: Modelling (stage), Scenario, Scenario Phase, Scenario Snapshot,
+Optimised Package, Plans, Recommendations, Optimiser Service.
+
+Before coding, grill the port shapes + the Scenario/Materials domain aggregates. Two
+known open points:
+- **`MaterialsRepository` naming.** A PR reviewer suggested `BuildingMaterialsRepository`;
+  this was **deliberately deferred to this grill** because "building materials" may
+  under-describe retrofit measures (a heat pump / ASHP is a *measure/product*, not a
+  building material). Settle the term (Materials / Measures / Products / BuildingMaterials)
+  here.
+- **Modelling will need a Unit of Work** when it writes Plans — the stub currently takes
+  no `unit_of_work`; it gains one (ADR-0012) when its body is built.
+
+## Stubs / seams that raise or no-op (do NOT mistake for "done")
+
+- `applications/ara_first_run/handler.py::_source_clients_from_env` — **raises**
+  `NotImplementedError`. EPC-API / Google-Solar / geospatial-S3 client config + env-var
+  names + pandas/s3fs deps + Terraform wiring are a separate deploy piece (out of scope so
+  far). The lambda is not end-to-end runnable until this is filled in.
+- `ModellingOrchestrator.run` — no-op.
+- `ScenarioRepository` / `MaterialsRepository` — empty ABC ports.
+- `StubRebaseliner` — raises `RebaselineNotImplemented` on pre-SAP10 certs (`sap_version
+  < 10`); ML Rebaselining is not implemented.
+- **EPC Energy Derivation** (fuel split + bills + the Ofgem-cap Fuel Rates ETL) is
+  deferred — kWh is carried on `PropertyBaselinePerformance`, the rest is not.
+
+## Known doc drift to be aware of (flagged, intentionally not auto-fixed)
+
+- **CONTEXT.md term vs code class.** The glossary term is **"Baseline Performance"**; the
+  code class is **`PropertyBaselinePerformance`** (renamed on PR review). The glossary was
+  *deliberately* left un-renamed — treat "Baseline Performance" as the spoken concept and
+  `PropertyBaselinePerformance` as its class. If you want them aligned, rename the term to
+  "Property Baseline Performance" across CONTEXT + ADR prose (a quick, mechanical change).
+- **CONTEXT.md ≈line 105** says the calculator lives in `domain/sap/` — that's **stale**;
+  it's `domain/sap10_calculator/calculator.py`. Safe to correct.
+
+## Issues / process
+
+Parent PRD: `gh issue view 1128 --repo Hestia-Homes/Model`. #1129–#1138 done (each with a
+"Done." comment). New work → new issues (use `/to-issues` or `/triage`), `ready-for-agent`
+labelled, parented to #1128.

From ce33cd94ef3cc59cf8d5406f6346d4f74a232465 Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar <kconnkowlessar@gmail.com>
Date: Mon, 1 Jun 2026 18:56:41 +0000
Subject: [PATCH 02/12] =?UTF-8?q?docs:=20correct=20SAP=20calculator=20path?=
 =?UTF-8?q?=20in=20CONTEXT=20(domain/sap=20=E2=86=92=20domain/sap10=5Fcalc?=
 =?UTF-8?q?ulator)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Factual staleness fix flagged in the handover; the calculator lives in
domain/sap10_calculator/calculator.py. Glossary term 'Baseline Performance'
deliberately left unchanged (concept vs PropertyBaselinePerformance class).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
 CONTEXT.md                | 2 +-
 docs/HANDOVER_ARA_NEXT.md | 2 --
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/CONTEXT.md b/CONTEXT.md
index 345e5ce1..4e31c0a9 100644
--- a/CONTEXT.md
+++ b/CONTEXT.md
@@ -102,7 +102,7 @@ The SAP score, EPC Band, CO2 emissions, Primary Energy Intensity, space heating
 _Avoid_: calculator output, computed performance, worksheet performance, SAP10 output
 
 **SAP10 Calculation**:
-The process that runs the deterministic SAP 10.2 (14-03-2025 amendment) worksheet over a Property's EpcPropertyData and emits **Calculated SAP10 Performance**. Implemented by the `Sap10Calculator` service class in `domain/sap/`. Reads cert fabric/heating/geometry fields, applies the RdSAP 10 (10-06-2025) cert→input mapping, executes the 12-month heat balance per SAP 10.2 §§1-14, looks up boiler/heat-pump performance in the **PCDB** when the cert lodges a product index, and returns a `SapResult` carrying the five Calculated SAP10 Performance quantities plus a monthly breakdown and worksheet-line audit trail. Distinct from **Rebaselining**, which is ML-based. ADR-0009 originally targeted SAP 10.3 (13-01-2026); ADR-0010 retargets to SAP 10.2 (14-03-2025) until the cert corpus migrates.
+The process that runs the deterministic SAP 10.2 (14-03-2025 amendment) worksheet over a Property's EpcPropertyData and emits **Calculated SAP10 Performance**. Implemented by the `Sap10Calculator` service class in `domain/sap10_calculator/` (`calculator.py`). Reads cert fabric/heating/geometry fields, applies the RdSAP 10 (10-06-2025) cert→input mapping, executes the 12-month heat balance per SAP 10.2 §§1-14, looks up boiler/heat-pump performance in the **PCDB** when the cert lodges a product index, and returns a `SapResult` carrying the five Calculated SAP10 Performance quantities plus a monthly breakdown and worksheet-line audit trail. Distinct from **Rebaselining**, which is ML-based. ADR-0009 originally targeted SAP 10.3 (13-01-2026); ADR-0010 retargets to SAP 10.2 (14-03-2025) until the cert corpus migrates.
 _Avoid_: SAP calculation (ambiguous with the gov calculator), SAP scoring, calculator run, SAP 10.3 calculation (active target is 10.2 — see [[sap-spec-version]])
 
 **SAP Spec Version**:
diff --git a/docs/HANDOVER_ARA_NEXT.md b/docs/HANDOVER_ARA_NEXT.md
index 61eac61a..4f61d9ff 100644
--- a/docs/HANDOVER_ARA_NEXT.md
+++ b/docs/HANDOVER_ARA_NEXT.md
@@ -145,8 +145,6 @@ known open points:
   *deliberately* left un-renamed — treat "Baseline Performance" as the spoken concept and
   `PropertyBaselinePerformance` as its class. If you want them aligned, rename the term to
   "Property Baseline Performance" across CONTEXT + ADR prose (a quick, mechanical change).
-- **CONTEXT.md ≈line 105** says the calculator lives in `domain/sap/` — that's **stale**;
-  it's `domain/sap10_calculator/calculator.py`. Safe to correct.
 
 ## Issues / process
 

From 561e1b8b497ac1acc31bc811acf1a2eb6ae7b71a Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar <kconnkowlessar@gmail.com>
Date: Tue, 2 Jun 2026 08:01:47 +0000
Subject: [PATCH 03/12] feat(baseline): run Sap10Calculator in shadow on
 Property Baseline (ADR-0013)

Wire Sap10Calculator into PropertyBaselineOrchestrator as a non-load-bearing
shadow runner. For each property it scores the Effective EPC beside the
load-bearing Lodged/Effective write. Any strict-raise is caught and logged at
error level (it never aborts the batch). On success, divergence from Lodged is
logged at warning level when |continuous SAP - lodged SAP| > 0.5, or when PEUI
or CO2 differs by more than 1% relative (CO2 compared after kg->tonnes).
Every line is tagged with sap_version so SAP-10.2 signal separates from
older-spec drift (ADR-0010 Validation Cohort).

Per ADR-0013, Calculated SAP10 Performance is not a persisted third value-set:
effective = calculated in every baselining scenario, so the calculator IS the
mechanism that produces Effective Performance (the Rebaseliner). It runs in
shadow only while being hardened; when overrides/estimation land it is promoted
to drive Effective and the failure posture flips to abort (ADR-0012, calculator
now load-bearing). No table change.

- ADR-0013 + CONTEXT (Calculated SAP10 Performance / Effective Performance /
  Rebaselining) record the decision.
- CalculatorShadow port + LoggingCalculatorShadow + Calculator protocol.
- FakeCalculatorShadow for orchestrator unit tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
 CONTEXT.md                                    |   8 +-
 applications/ara_first_run/handler.py         |   5 +
 ...uces-effective-performance-shadow-first.md |  88 ++++++++++
 domain/property_baseline/calculator_shadow.py | 141 +++++++++++++++
 .../property_baseline_orchestrator.py         |  11 ++
 .../test_calculator_shadow.py                 | 166 ++++++++++++++++++
 tests/orchestration/fakes.py                  |  19 ++
 ...test_ara_first_run_pipeline_integration.py |   5 +-
 .../test_property_baseline_orchestrator.py    |  37 +++-
 9 files changed, 473 insertions(+), 7 deletions(-)
 create mode 100644 docs/adr/0013-calculator-produces-effective-performance-shadow-first.md
 create mode 100644 domain/property_baseline/calculator_shadow.py
 create mode 100644 tests/domain/property_baseline/test_calculator_shadow.py
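
Reviewer note (below the three-dash line, so not part of the commit): a worked
sketch of the divergence rule and the never-raise posture described in the
commit message. The thresholds (0.5 absolute on SAP, 1% relative on PEUI and
CO2) come from the message; everything else is an assumption for illustration:
`Lodged`, `Calculated`, `divergent_fields` and `observe_in_shadow` are
hypothetical names, and the sketch assumes the calculator reports CO2 in kg
while the lodged figure is in tonnes. It is not the implementation in
`calculator_shadow.py`.

```python
from __future__ import annotations

import logging
from typing import Callable, NamedTuple

logger = logging.getLogger("calculator_shadow_sketch")

SAP_ABS_TOL = 0.5       # absolute tolerance on the SAP score
REL_TOL = 0.01          # 1% relative tolerance on PEUI and CO2
KG_PER_TONNE = 1000.0


class Lodged(NamedTuple):
    sap: int
    peui: float
    co2_tonnes: float


class Calculated(NamedTuple):
    sap_continuous: float
    peui: float
    co2_kg: float  # assumption: calculator emits kg, lodged is in tonnes


def relative_diff(calculated: float, lodged: float) -> float:
    # A zero lodged value diverges only if the calculated value is non-zero.
    if lodged == 0:
        return 0.0 if calculated == 0 else float("inf")
    return abs(calculated - lodged) / abs(lodged)


def divergent_fields(calc: Calculated, lodged: Lodged) -> list[str]:
    fields = []
    if abs(calc.sap_continuous - lodged.sap) > SAP_ABS_TOL:
        fields.append("sap")
    if relative_diff(calc.peui, lodged.peui) > REL_TOL:
        fields.append("peui")
    if relative_diff(calc.co2_kg / KG_PER_TONNE, lodged.co2_tonnes) > REL_TOL:
        fields.append("co2")
    return fields


def observe_in_shadow(
    property_id: int,
    sap_version: float,
    lodged: Lodged,
    calculate: Callable[[], Calculated],
) -> list[str] | None:
    """Returns the divergent field names, or None if the calculator raised."""
    try:
        calc = calculate()
    except Exception as exc:  # deliberately broad: shadow must never abort the batch
        logger.error("shadow raise property_id=%s sap_version=%s type=%s",
                     property_id, sap_version, type(exc).__name__)
        return None
    fields = divergent_fields(calc, lodged)
    if fields:
        logger.warning("shadow divergence property_id=%s sap_version=%s fields=%s",
                       property_id, sap_version, fields)
    return fields
```

With a lodged SAP of 70, a continuous SAP of 70.4 stays inside tolerance while
70.6 is reported; a calculated 2010 kg against a lodged 2.0 tonnes is a 0.5%
difference and is not reported.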

diff --git a/CONTEXT.md b/CONTEXT.md
index 4e31c0a9..a41d597a 100644
--- a/CONTEXT.md
+++ b/CONTEXT.md
@@ -82,7 +82,7 @@ The EpcPropertyData scored by the modelling pipeline for a single Property, deri
 _Avoid_: modelling EPC, working EPC, resolved EPC, derived EPC
 
 **Rebaselining**:
-Re-predicting a Property's SAP score, CO2 emissions, Primary Energy Intensity, space heating kWh, and hot water kWh via ML so the modelling pipeline scores it against the current SAP10 methodology. Triggered when either (a) the Effective EPC was lodged under a pre-SAP10 schema (`sap_version < 10.0`), so the recorded scores reflect a superseded methodology, or (b) Site Notes / Landlord Overrides changed the physical state of the Property (walls / heating / windows / etc.) so the lodged scores no longer reflect what's installed. Both triggers may fire together. Produces Effective Performance; Lodged Performance is preserved unchanged. kWh is included as ML targets per ADR-0007 — see [[epc-ml-transform]].
+Re-predicting a Property's SAP score, CO2 emissions, Primary Energy Intensity, space heating kWh, and hot water kWh via **SAP10 Calculation** (the deterministic `Sap10Calculator`, which superseded the old ML-API rebaseliner; an ML residual head over the calculator is future — ADR-0009/0013) so the modelling pipeline scores it against the current SAP10 methodology. Triggered when either (a) the Effective EPC was lodged under a pre-SAP10 schema (`sap_version < 10.0`), so the recorded scores reflect a superseded methodology, or (b) Site Notes / Landlord Overrides changed the physical state of the Property (walls / heating / windows / etc.) so the lodged scores no longer reflect what's installed. Both triggers may fire together. Produces Effective Performance; Lodged Performance is preserved unchanged. kWh is included as ML targets per ADR-0007 — see [[epc-ml-transform]].
 _Avoid_: re-scoring, re-prediction, performance recomputation, refresh (for cache-freshness)
 
 **Baseline Performance**:
@@ -94,12 +94,12 @@ The SAP / EPC Band / carbon emissions / Primary Energy Intensity recorded on the
 _Avoid_: original performance, raw EPC values, recorded baseline
 
 **Effective Performance**:
-The SAP / EPC Band / carbon emissions / Primary Energy Intensity the modelling pipeline actually scored against — equal to Lodged Performance when no Rebaselining trigger fires, replaced by ML output when triggered. The half of Baseline Performance that says "what we modelled".
+The SAP / EPC Band / carbon emissions / Primary Energy Intensity the modelling pipeline actually scored against — equal to Lodged Performance when no Rebaselining trigger fires, replaced by **SAP10 Calculation** output (the deterministic `Sap10Calculator`, which superseded the old ML-API rebaseliner; an ML residual head over the calculator is future — ADR-0009/0013) when triggered. The half of Baseline Performance that says "what we modelled".
 _Avoid_: modelled performance, rebaselined performance (only correct when rebaselining ran), scored values
 
 **Calculated SAP10 Performance**:
-The SAP score, EPC Band, CO2 emissions, Primary Energy Intensity, space heating kWh, and hot water kWh produced by **SAP10 Calculation** from a Property's EpcPropertyData. Distinct from Effective Performance (ML output) and Lodged Performance (gov register) during the validation phase. Surfaced alongside Effective Performance in the UI; may supersede Effective Performance in a later ADR once parity is confirmed against the cert-reported SAP across ≥1000 sample certs lodged on the calculator's target spec version (see [[sap-spec-version]]). ADR-0009 (as amended by ADR-0010).
-_Avoid_: calculator output, computed performance, worksheet performance, SAP10 output
+The SAP score, EPC Band, CO2 emissions, Primary Energy Intensity, space heating kWh, and hot water kWh produced by **SAP10 Calculation** from a Property's EpcPropertyData. It is **not** a separately-persisted third value-set beside Lodged and Effective: in every baselining scenario the calculator's output *is* the **Effective Performance** (real lodged SAP10 EPC with no overrides ⇒ Calculated = Lodged = Effective; overrides or an estimated / pre-SAP10 EPC ⇒ Calculated = Effective, there being no lodged SAP10 figure to compare against). The calculator is therefore the mechanism that produces Effective Performance, having superseded the old ML-API rebaseliner. While it is being hardened it runs in **shadow** for the first baselining slice — computed on every Property, compared to Lodged, and any divergence (SAP > 0.5, or PEUI / CO2 beyond tolerance) or strict-raise **logged, not persisted** — then is promoted to drive Effective Performance once overrides / estimation land (ADR-0013). The ≥1000-cert parity confirmation against the cert-reported SAP (see [[sap-spec-version]]) gates that promotion. ADR-0009 introduced the term, as amended by ADR-0010 and realized by ADR-0013.
+_Avoid_: calculator output, computed performance, worksheet performance, SAP10 output, calculated value-set (it is not a stored third set)
 
 **SAP10 Calculation**:
 The process that runs the deterministic SAP 10.2 (14-03-2025 amendment) worksheet over a Property's EpcPropertyData and emits **Calculated SAP10 Performance**. Implemented by the `Sap10Calculator` service class in `domain/sap10_calculator/` (`calculator.py`). Reads cert fabric/heating/geometry fields, applies the RdSAP 10 (10-06-2025) cert→input mapping, executes the 12-month heat balance per SAP 10.2 §§1-14, looks up boiler/heat-pump performance in the **PCDB** when the cert lodges a product index, and returns a `SapResult` carrying the five Calculated SAP10 Performance quantities plus a monthly breakdown and worksheet-line audit trail. Distinct from **Rebaselining**, which is ML-based. ADR-0009 originally targeted SAP 10.3 (13-01-2026); ADR-0010 retargets to SAP 10.2 (14-03-2025) until the cert corpus migrates.
diff --git a/applications/ara_first_run/handler.py b/applications/ara_first_run/handler.py
index 761fd207..8aca4fea 100644
--- a/applications/ara_first_run/handler.py
+++ b/applications/ara_first_run/handler.py
@@ -10,7 +10,9 @@ from sqlmodel import Session
 from applications.ara_first_run.ara_first_run_trigger_body import (
     AraFirstRunTriggerBody,
 )
+from domain.property_baseline.calculator_shadow import LoggingCalculatorShadow
 from domain.property_baseline.rebaseliner import StubRebaseliner
+from domain.sap10_calculator.calculator import Sap10Calculator
 from infrastructure.postgres.config import PostgresConfig
 from infrastructure.postgres.engine import make_engine
 from orchestration.property_baseline_orchestrator import PropertyBaselineOrchestrator
@@ -81,6 +83,9 @@ def build_first_run_pipeline(
         baseline=PropertyBaselineOrchestrator(
             unit_of_work=unit_of_work,
             rebaseliner=StubRebaseliner(),
+            # Shadow only: validates the calculator over the wild cohort without
+            # gating the load-bearing baseline write (ADR-0013).
+            calculator_shadow=LoggingCalculatorShadow(Sap10Calculator()),
         ),
         modelling=ModellingOrchestrator(
             scenario_repo=ScenarioRepository(),
diff --git a/docs/adr/0013-calculator-produces-effective-performance-shadow-first.md b/docs/adr/0013-calculator-produces-effective-performance-shadow-first.md
new file mode 100644
index 00000000..206aa2f5
--- /dev/null
+++ b/docs/adr/0013-calculator-produces-effective-performance-shadow-first.md
@@ -0,0 +1,88 @@
+---
+Status: accepted
+---
+
+# The `Sap10Calculator` produces Effective Performance (it is the Rebaseliner); Calculated SAP10 Performance is not a persisted third value-set, and is wired in shadow first
+
+Refines [ADR-0004](0004-baseline-performance-lodged-effective-pair.md) (the Lodged/Effective
+pair), [ADR-0009](0009-deterministic-sap-calculator.md)/[ADR-0010](0010-sap10-calculator-spec-target-and-validation.md)
+(the calculator + the **Calculated SAP10 Performance** term), [ADR-0011](0011-composable-stage-orchestrators.md)
+(the `Rebaseliner` seam) and [ADR-0012](0012-unit-of-work-per-stage-batch-transaction.md)
+(all-or-nothing per batch). Decided in a `/grill-with-docs` session (2026-06-01) before wiring
+`Sap10Calculator` into `PropertyBaselineOrchestrator`.
+
+## Context
+
+The old `model_engine` (`backend/engine/engine.py`) called out to an **ML API**
+(`model_api.predict_all` over `BASELINE_MODEL_PREFIXES`) to rebaseline the properties that needed
+it. The rebuild replaces that round-trip with the **deterministic `Sap10Calculator`, run live**.
+
+The handover and CONTEXT (line 100) framed **Calculated SAP10 Performance** as a *third* value-set
+persisted *alongside* Lodged and Effective (`calculated_*` columns). Walking the baselining
+scenarios shows that framing reifies a distinction that does not exist in the domain:
+
+- real lodged SAP10 EPC, no overrides ⇒ Calculated = Lodged = Effective;
+- real EPC + property/landlord overrides ⇒ Calculated = Lodged-plus-overrides = Effective;
+- estimated EPC (± overrides), or a pre-SAP10 EPC ⇒ Calculated = Effective (no lodged SAP10 to
+  compare against — Lodged Performance exists only for a *real lodged* EPC).
+
+In every scenario **Effective = Calculated**. There is no third quantity.
+
+## Decision
+
+**The calculator is the mechanism that produces Effective Performance** — i.e. the deterministic
+`Rebaseliner` (ADR-0011's seam), superseding the old ML-API rebaseliner. "Calculated SAP10
+Performance" is the *name of that output during validation*, **not** a separately-persisted third
+value-set. No `calculated_*` columns are added; `property_baseline_performance` keeps its
+Lodged/Effective shape (ADR-0004). The ADR-0009 ML model is repositioned as a *future residual head*
+over the calculator, not the baseline producer.
+
+**Shadow-first, then promotion.** The calculator still strict-raises (`UnmappedSapCode`,
+`MissingMainFuelType`, `UnresolvedPcdbCombiLoss`) on cert mappings it has not yet hardened, and the
+strict-typing of `EpcPropertyData` that will close most of those gaps is still pending. A ~40,000
+property test cohort is about to flow through baselining. So this lands in two steps:
+
+1. **This slice — shadow.** Performance is still **defined by the input data**: `StubRebaseliner`
+   keeps producing Effective (`= Lodged` for the only live scenario, real SAP10 + no overrides).
+   The calculator runs *beside* it, on every Property's Effective EPC, **purely to be battle-tested
+   in the wild**. It is **not load-bearing**, therefore:
+   - a calculator raise is **caught and logged at `error`, never aborts the batch** — otherwise one
+     unmappable cert would lose the load-bearing Lodged/Effective write for the whole batch, and
+     over a 40k run most batches would never baseline;
+   - on success, its output is **compared to Lodged and logged, not persisted** — `warning` when
+     `|sap_continuous − lodged_sap| > 0.5`, or PEUI / CO2 diverge beyond tolerance (CO2 after the
+     kg→tonnes conversion). Each log is tagged with the cert's `sap_version` so SAP-10.2 divergence
+     (a real calculator signal) is separable from older-spec drift (expected — see
+     [ADR-0010](0010-sap10-calculator-spec-target-and-validation.md) Validation Cohort).
+
+2. **Next slice or two — load-bearing.** When overrides + EPC estimation land (days away),
+   `StubRebaseliner` is replaced by a calculator-backed `Rebaseliner`: the calculator's output
+   **becomes Effective Performance**. The failure posture **flips to abort** per ADR-0012 — now that
+   the calculator *is* the baseline, a silent wrong answer is the expensive outcome, so a raise must
+   fail the batch noisily. Same exception, opposite handling, because the calculator went from
+   shadow to load-bearing. The shadow logging is then retired.
+
+## Considered options
+
+- **A third persisted `calculated_*` value-set on `PropertyBaselinePerformance`** (the handover's
+  recommendation) — rejected: `Effective = Calculated` in every scenario, so the columns would
+  store a distinction with no domain reality, and the future "supersede effective" promotion would
+  become a data move where none would otherwise be needed.
+- **Promote the calculator to drive Effective immediately** — rejected for this one slice: it still
+  strict-raises on un-hardened mappings, so over the imminent 40k run it would gate the
+  load-bearing baseline write. Shadow-first surfaces every gap as an aggregatable error log without
+  blocking baselining.
+- **A separate `calculator_shadow` validation table** — held in reserve: log-only is enough while
+  the calculator is moving and the shadow step is a 1–2 day stepping stone; we add a queryable table
+  only if log aggregation proves too weak.
+
+## Consequences
+
+- `property_baseline_performance` is **unchanged** this slice — no migration.
+- CONTEXT **Calculated SAP10 Performance**, **Effective Performance**, and **Rebaselining** are
+  updated: the calculator (not ML) is the rebaseliner mechanism in the rebuilt engine; Calculated is
+  not a stored third set.
+- The shadow runner's broad `except` is deliberate (the point is to discover *what* breaks in the
+  wild); each caught exception is logged with its type and `property_id`.
+- This decision is short-lived in its shadow form by design; the durable half — "the calculator
+  produces Effective Performance; there is no third value-set" — outlives it.
diff --git a/domain/property_baseline/calculator_shadow.py b/domain/property_baseline/calculator_shadow.py
new file mode 100644
index 00000000..ba7927d8
--- /dev/null
+++ b/domain/property_baseline/calculator_shadow.py
@@ -0,0 +1,141 @@
+from __future__ import annotations
+
+import logging
+from abc import ABC, abstractmethod
+from typing import TYPE_CHECKING, Optional, Protocol
+
+from domain.property_baseline.performance import Performance
+
+if TYPE_CHECKING:
+    from datatypes.epc.domain.epc_property_data import EpcPropertyData
+    from domain.sap10_calculator.calculator import SapResult
+
+logger = logging.getLogger(__name__)
+
+# A continuous SAP this far from the lodged integer would round to a different
+# band-driving score; PEUI / CO2 scale with dwelling size so they use a relative
+# tolerance (ADR-0013). Starting dials — tune against the wild-cohort logs.
+_SAP_ABS_TOL = 0.5
+_REL_TOL = 0.01
+_KG_PER_TONNE = 1000.0
+
+
+class CalculatorShadow(ABC):
+    """Runs SAP10 Calculation in shadow beside the load-bearing baseline write
+    and reports divergence from Lodged Performance (ADR-0013).
+
+    The calculator is not yet load-bearing — it is still being hardened, and a
+    large test cohort is about to flow through baselining. So an implementation
+    **must never raise**: a shadow failure may not abort the batch (ADR-0012's
+    all-or-nothing governs only the load-bearing Lodged/Effective write). It
+    observes, compares against Lodged, and logs; it does not feed Effective
+    Performance. The seam is retired when the calculator is promoted to the
+    Rebaseliner and its output *becomes* Effective Performance.
+    """
+
+    @abstractmethod
+    def observe(
+        self,
+        *,
+        property_id: int,
+        effective_epc: "EpcPropertyData",
+        lodged: Performance,
+    ) -> None: ...
+
+
+def _relative_diff(calculated: float, lodged: float) -> float:
+    """|calculated − lodged| / |lodged|; a zero lodged value diverges iff
+    calculated is non-zero (avoids a divide-by-zero on degenerate certs)."""
+    if lodged == 0:
+        return 0.0 if calculated == 0 else float("inf")
+    return abs(calculated - lodged) / abs(lodged)
+
+
+class Calculator(Protocol):
+    """The slice of `Sap10Calculator` the shadow needs: cert in, result out.
+    `Sap10Calculator` satisfies it structurally — no coupling to its module."""
+
+    def calculate(self, epc: "EpcPropertyData") -> "SapResult": ...
+
+
+class LoggingCalculatorShadow(CalculatorShadow):
+    """Runs the calculator and logs, never persists, never raises (ADR-0013).
+
+    A strict-raise (an un-mapped cert) is caught and logged at ``error`` so the
+    wild-cohort gap is greppable; a successful result whose SAP / PEUI / CO2
+    diverges from Lodged beyond tolerance is logged at ``warning``. Every line
+    is tagged with ``property_id`` and the cert's ``sap_version`` so SAP-10.2
+    divergence (a real calculator signal) is separable from older-spec drift.
+    """
+
+    def __init__(self, calculator: Calculator) -> None:
+        self._calculator = calculator
+
+    def observe(
+        self,
+        *,
+        property_id: int,
+        effective_epc: "EpcPropertyData",
+        lodged: Performance,
+    ) -> None:
+        sap_version = effective_epc.sap_version
+        try:
+            # Broad by design: the point is to discover *what* breaks in the
+            # wild, and a shadow failure must never abort the batch (ADR-0013).
+            result = self._calculator.calculate(effective_epc)
+        except Exception as exc:
+            logger.error(
+                "SAP10 shadow calculation failed for property_id=%s "
+                "sap_version=%s: %r",
+                property_id,
+                sap_version,
+                exc,
+            )
+            return
+        if abs(result.sap_score_continuous - lodged.sap_score) > _SAP_ABS_TOL:
+            self._warn_divergence(
+                quantity="sap_score",
+                property_id=property_id,
+                sap_version=sap_version,
+                lodged=lodged.sap_score,
+                calculated=result.sap_score_continuous,
+            )
+        if _relative_diff(
+            result.primary_energy_kwh_per_m2, lodged.primary_energy_intensity
+        ) > _REL_TOL:
+            self._warn_divergence(
+                quantity="primary_energy_intensity",
+                property_id=property_id,
+                sap_version=sap_version,
+                lodged=lodged.primary_energy_intensity,
+                calculated=result.primary_energy_kwh_per_m2,
+            )
+        # Lodged CO2 is tonnes/yr; the calculator emits kg/yr (ADR-0013).
+        calculated_co2_t = result.co2_kg_per_yr / _KG_PER_TONNE
+        if _relative_diff(calculated_co2_t, lodged.co2_emissions) > _REL_TOL:
+            self._warn_divergence(
+                quantity="co2_emissions",
+                property_id=property_id,
+                sap_version=sap_version,
+                lodged=lodged.co2_emissions,
+                calculated=calculated_co2_t,
+            )
+
+    def _warn_divergence(
+        self,
+        *,
+        quantity: str,
+        property_id: int,
+        sap_version: Optional[float],
+        lodged: float,
+        calculated: float,
+    ) -> None:
+        logger.warning(
+            "SAP10 shadow divergence on %s for property_id=%s sap_version=%s: "
+            "lodged=%s calculated=%s",
+            quantity,
+            property_id,
+            sap_version,
+            lodged,
+            calculated,
+        )
diff --git a/orchestration/property_baseline_orchestrator.py b/orchestration/property_baseline_orchestrator.py
index df2bf579..119889bd 100644
--- a/orchestration/property_baseline_orchestrator.py
+++ b/orchestration/property_baseline_orchestrator.py
@@ -6,6 +6,7 @@ from datatypes.epc.domain.epc_property_data import (
     EpcPropertyData,
     RenewableHeatIncentive,
 )
+from domain.property_baseline.calculator_shadow import CalculatorShadow
 from domain.property_baseline.property_baseline_performance import PropertyBaselinePerformance
 from domain.property_baseline.performance import lodged_performance
 from domain.property_baseline.rebaseliner import Rebaseliner
@@ -32,9 +33,11 @@ class PropertyBaselineOrchestrator:
         *,
         unit_of_work: Callable[[], UnitOfWork],
         rebaseliner: Rebaseliner,
+        calculator_shadow: CalculatorShadow,
     ) -> None:
         self._unit_of_work = unit_of_work
         self._rebaseliner = rebaseliner
+        self._calculator_shadow = calculator_shadow
 
     def run(self, property_ids: list[int]) -> None:
         with self._unit_of_work() as uow:
@@ -54,6 +57,14 @@ class PropertyBaselineOrchestrator:
                     water_heating_kwh=rhi.water_heating_kwh,
                 )
                 uow.property_baseline.save(baseline, property_id)
+                # Shadow only: validate the calculator in the wild without
+                # gating the load-bearing write above (ADR-0013). `observe`
+                # never raises, so it cannot abort the batch.
+                self._calculator_shadow.observe(
+                    property_id=property_id,
+                    effective_epc=effective_epc,
+                    lodged=lodged,
+                )
             uow.commit()
 
 
diff --git a/tests/domain/property_baseline/test_calculator_shadow.py b/tests/domain/property_baseline/test_calculator_shadow.py
new file mode 100644
index 00000000..81718b72
--- /dev/null
+++ b/tests/domain/property_baseline/test_calculator_shadow.py
@@ -0,0 +1,166 @@
+from __future__ import annotations
+
+import logging
+from typing import Optional
+
+import pytest
+
+from datatypes.epc.domain.epc import Epc
+from datatypes.epc.domain.epc_property_data import EpcPropertyData
+from domain.property_baseline.calculator_shadow import LoggingCalculatorShadow
+from domain.property_baseline.performance import Performance
+from domain.sap10_calculator.calculator import SapResult
+from domain.sap10_calculator.exceptions import UnmappedSapCode
+
+
+def _epc(*, sap_version: Optional[float]) -> EpcPropertyData:
+    epc = object.__new__(EpcPropertyData)
+    epc.sap_version = sap_version
+    return epc
+
+
+def _lodged() -> Performance:
+    return Performance(
+        sap_score=72, epc_band=Epc.C, co2_emissions=1.8, primary_energy_intensity=180
+    )
+
+
+def _sap_result(
+    *,
+    sap_score_continuous: float = 72.0,
+    primary_energy_kwh_per_m2: float = 180.0,
+    co2_kg_per_yr: float = 1800.0,
+) -> SapResult:
+    """A `SapResult` whose three compared quantities default to *matching*
+    `_lodged()`; each test perturbs one axis."""
+    return SapResult(
+        sap_score=round(sap_score_continuous),
+        sap_score_continuous=sap_score_continuous,
+        ecf=0.0,
+        total_fuel_cost_gbp=0.0,
+        co2_kg_per_yr=co2_kg_per_yr,
+        space_heating_kwh_per_yr=0.0,
+        space_cooling_kwh_per_yr=0.0,
+        fabric_energy_efficiency_kwh_per_m2_yr=0.0,
+        main_heating_fuel_kwh_per_yr=0.0,
+        main_2_heating_fuel_kwh_per_yr=0.0,
+        secondary_heating_fuel_kwh_per_yr=0.0,
+        space_cooling_fuel_kwh_per_yr=0.0,
+        hot_water_kwh_per_yr=0.0,
+        pumps_fans_kwh_per_yr=0.0,
+        lighting_kwh_per_yr=0.0,
+        primary_energy_kwh_per_yr=0.0,
+        primary_energy_kwh_per_m2=primary_energy_kwh_per_m2,
+        monthly=(),
+        intermediate={},
+    )
+
+
+class _RaisingCalculator:
+    def calculate(self, epc: EpcPropertyData) -> SapResult:
+        raise UnmappedSapCode("heat_emitter_type", 99)
+
+
+class _StubCalculator:
+    def __init__(self, result: SapResult) -> None:
+        self._result = result
+
+    def calculate(self, epc: EpcPropertyData) -> SapResult:
+        return self._result
+
+
+def test_observe_swallows_a_calculator_raise_and_logs_error(
+    caplog: pytest.LogCaptureFixture,
+) -> None:
+    # Arrange — the calculator strict-raises on a cert it cannot yet map.
+    shadow = LoggingCalculatorShadow(_RaisingCalculator())
+    epc = _epc(sap_version=10.2)
+
+    # Act — observe must not propagate the raise (ADR-0013: shadow is not
+    # load-bearing, so it cannot abort the batch).
+    with caplog.at_level(logging.ERROR):
+        shadow.observe(property_id=42, effective_epc=epc, lodged=_lodged())
+
+    # Assert — exactly one error record, tagged with property_id + sap_version
+    # and carrying the exception so the wild-cohort gap is greppable.
+    assert len(caplog.records) == 1
+    message = caplog.records[0].getMessage()
+    assert caplog.records[0].levelno == logging.ERROR
+    assert "property_id=42" in message
+    assert "sap_version=10.2" in message
+    assert "heat_emitter_type" in message
+
+
+def test_observe_warns_when_sap_diverges_beyond_half_a_point(
+    caplog: pytest.LogCaptureFixture,
+) -> None:
+    # Arrange — calculated SAP 75.0 vs lodged 72 is 3.0 out (> 0.5).
+    shadow = LoggingCalculatorShadow(
+        _StubCalculator(_sap_result(sap_score_continuous=75.0))
+    )
+    epc = _epc(sap_version=10.2)
+
+    # Act
+    with caplog.at_level(logging.WARNING):
+        shadow.observe(property_id=42, effective_epc=epc, lodged=_lodged())
+
+    # Assert — one warning, naming the diverging quantity + the tags.
+    assert len(caplog.records) == 1
+    message = caplog.records[0].getMessage()
+    assert caplog.records[0].levelno == logging.WARNING
+    assert "sap_score" in message
+    assert "property_id=42" in message
+    assert "sap_version=10.2" in message
+
+
+def test_observe_warns_when_peui_diverges_beyond_one_percent(
+    caplog: pytest.LogCaptureFixture,
+) -> None:
+    # Arrange — calculated PEUI 200 vs lodged 180 is ~11% out (> 1%).
+    shadow = LoggingCalculatorShadow(
+        _StubCalculator(_sap_result(primary_energy_kwh_per_m2=200.0))
+    )
+    epc = _epc(sap_version=10.2)
+
+    # Act
+    with caplog.at_level(logging.WARNING):
+        shadow.observe(property_id=42, effective_epc=epc, lodged=_lodged())
+
+    # Assert
+    assert len(caplog.records) == 1
+    assert "primary_energy_intensity" in caplog.records[0].getMessage()
+
+
+def test_observe_warns_when_co2_diverges_beyond_one_percent_after_kg_to_tonnes(
+    caplog: pytest.LogCaptureFixture,
+) -> None:
+    # Arrange — calculator emits kg/yr; 2000 kg = 2.0 t vs lodged 1.8 t (~11%).
+    shadow = LoggingCalculatorShadow(
+        _StubCalculator(_sap_result(co2_kg_per_yr=2000.0))
+    )
+    epc = _epc(sap_version=10.2)
+
+    # Act
+    with caplog.at_level(logging.WARNING):
+        shadow.observe(property_id=42, effective_epc=epc, lodged=_lodged())
+
+    # Assert — the kg→tonnes conversion is applied before comparison, so a
+    # matching 1800 kg would *not* fire (guarded by the silent-when-aligned test).
+    assert len(caplog.records) == 1
+    assert "co2_emissions" in caplog.records[0].getMessage()
+
+
+def test_observe_is_silent_when_the_calculator_agrees_with_lodged(
+    caplog: pytest.LogCaptureFixture,
+) -> None:
+    # Arrange — all three quantities at the matching defaults (SAP 72, PEUI 180,
+    # 1800 kg ≡ 1.8 t): nothing should be logged.
+    shadow = LoggingCalculatorShadow(_StubCalculator(_sap_result()))
+    epc = _epc(sap_version=10.2)
+
+    # Act
+    with caplog.at_level(logging.WARNING):
+        shadow.observe(property_id=42, effective_epc=epc, lodged=_lodged())
+
+    # Assert
+    assert caplog.records == []
diff --git a/tests/orchestration/fakes.py b/tests/orchestration/fakes.py
index 3e2feef0..23b1fc90 100644
--- a/tests/orchestration/fakes.py
+++ b/tests/orchestration/fakes.py
@@ -10,6 +10,8 @@ from types import TracebackType
 from typing import Any, Optional
 
 from datatypes.epc.domain.epc_property_data import EpcPropertyData
+from domain.property_baseline.calculator_shadow import CalculatorShadow
+from domain.property_baseline.performance import Performance
 from domain.property_baseline.property_baseline_performance import PropertyBaselinePerformance
 from domain.property.properties import Properties
 from domain.property.property import Property
@@ -88,6 +90,23 @@ class FakePropertyBaselineRepo(PropertyBaselineRepository):
         raise NotImplementedError
 
 
+class FakeCalculatorShadow(CalculatorShadow):
+    """Records each `observe` call so a test can assert the orchestrator runs
+    the shadow per property without dragging in the real calculator."""
+
+    def __init__(self) -> None:
+        self.observed: list[tuple[int, EpcPropertyData, Performance]] = []
+
+    def observe(
+        self,
+        *,
+        property_id: int,
+        effective_epc: EpcPropertyData,
+        lodged: Performance,
+    ) -> None:
+        self.observed.append((property_id, effective_epc, lodged))
+
+
 class FakeUnitOfWork(UnitOfWork):
     """A unit that holds in-memory repos and counts commits."""
 
diff --git a/tests/orchestration/test_ara_first_run_pipeline_integration.py b/tests/orchestration/test_ara_first_run_pipeline_integration.py
index 381f3f21..357ea7f2 100644
--- a/tests/orchestration/test_ara_first_run_pipeline_integration.py
+++ b/tests/orchestration/test_ara_first_run_pipeline_integration.py
@@ -36,6 +36,7 @@ from repositories.geospatial.geospatial_repository import GeospatialRepository
 from repositories.materials.materials_repository import MaterialsRepository
 from repositories.postgres_unit_of_work import PostgresUnitOfWork
 from repositories.scenario.scenario_repository import ScenarioRepository
+from tests.orchestration.fakes import FakeCalculatorShadow
 
 _JSON_SAMPLES = Path(__file__).resolve().parents[2] / "backend/epc_api/json_samples"
 
@@ -111,7 +112,9 @@ def test_first_run_baselines_through_repos_and_is_idempotent_on_rerun(
             solar_fetcher=_UnusedSolarFetcher(),
         ),
         baseline=PropertyBaselineOrchestrator(
-            unit_of_work=unit_of_work, rebaseliner=StubRebaseliner()
+            unit_of_work=unit_of_work,
+            rebaseliner=StubRebaseliner(),
+            calculator_shadow=FakeCalculatorShadow(),
         ),
         modelling=ModellingOrchestrator(
             scenario_repo=ScenarioRepository(),
diff --git a/tests/orchestration/test_property_baseline_orchestrator.py b/tests/orchestration/test_property_baseline_orchestrator.py
index cb67d176..b14574f0 100644
--- a/tests/orchestration/test_property_baseline_orchestrator.py
+++ b/tests/orchestration/test_property_baseline_orchestrator.py
@@ -13,6 +13,7 @@ from domain.property_baseline.rebaseliner import RebaselineNotImplemented, StubR
 from domain.property.property import Property, PropertyIdentity
 from orchestration.property_baseline_orchestrator import PropertyBaselineOrchestrator
 from tests.orchestration.fakes import (
+    FakeCalculatorShadow,
     FakePropertyBaselineRepo,
     FakePropertyRepo,
     FakeUnitOfWork,
@@ -37,6 +38,34 @@ def _property(*, sap_version: float) -> Property:
     )
 
 
+def test_run_invokes_the_calculator_shadow_per_property_and_still_persists() -> None:
+    # Arrange
+    property_baseline_repo = FakePropertyBaselineRepo()
+    shadow = FakeCalculatorShadow()
+    prop = _property(sap_version=10.2)
+    uow = FakeUnitOfWork(
+        property=FakePropertyRepo({10: prop}),
+        property_baseline=property_baseline_repo,
+    )
+    orchestrator = PropertyBaselineOrchestrator(
+        unit_of_work=lambda: uow,
+        rebaseliner=StubRebaseliner(),
+        calculator_shadow=shadow,
+    )
+
+    # Act
+    orchestrator.run([10])
+
+    # Assert — the load-bearing write + single commit are unchanged, and the
+    # shadow observed the Effective EPC + Lodged Performance once (ADR-0013).
+    lodged = Performance(
+        sap_score=72, epc_band=Epc.C, co2_emissions=1.8, primary_energy_intensity=180
+    )
+    assert len(property_baseline_repo.saved) == 1
+    assert uow.commits == 1
+    assert shadow.observed == [(10, prop.effective_epc, lodged)]
+
+
 def test_run_establishes_persists_and_commits_the_batch_once() -> None:
     # Arrange
     property_baseline_repo = FakePropertyBaselineRepo()
@@ -45,7 +74,9 @@ def test_run_establishes_persists_and_commits_the_batch_once() -> None:
         property_baseline=property_baseline_repo,
     )
     orchestrator = PropertyBaselineOrchestrator(
-        unit_of_work=lambda: uow, rebaseliner=StubRebaseliner()
+        unit_of_work=lambda: uow,
+        rebaseliner=StubRebaseliner(),
+        calculator_shadow=FakeCalculatorShadow(),
     )
 
     # Act
@@ -79,7 +110,9 @@ def test_run_raises_on_a_pre_sap10_property_and_does_not_commit() -> None:
         property_baseline=property_baseline_repo,
     )
     orchestrator = PropertyBaselineOrchestrator(
-        unit_of_work=lambda: uow, rebaseliner=StubRebaseliner()
+        unit_of_work=lambda: uow,
+        rebaseliner=StubRebaseliner(),
+        calculator_shadow=FakeCalculatorShadow(),
     )
 
     # Act / Assert — the raise propagates; the batch is neither persisted nor

From 57867832f6bb57ee4fbfeaf7f3ee2be615719f6f Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar <kconnkowlessar@gmail.com>
Date: Tue, 2 Jun 2026 09:20:50 +0000
Subject: [PATCH 04/12] docs(adr): Bill Derivation (ADR-0014) + calculator goes
 load-bearing (ADR-0013 amend)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Pin the bills design from a /grill-with-docs session:
- ADR-0014: whole-home annual bill from SAP10 Calculation's delivered kWh per
  end use, re-priced at real Fuel Rates (NOT the calculator's SAP-notional
  total_fuel_cost_gbp, which uses RdSAP Table 32 standardised prices, about
  half the real electricity price). Fuel enum + FuelRates +
  FuelRatesRepository static snapshot; per-section + total flat columns;
  raise on unpriced fuel (house coal / heat network are the named gaps).
- ADR-0013 amendment: the shadow stepping-stone is collapsed — the calculator
  is load-bearing now. effective=calculated for sap_version<10.2 (StubRebaseliner
  floor 10.0->10.2); >=10.2 keeps lodged + logs divergence; a strict-raise
  aborts the batch (load-bearing for bills regardless of version).
- CONTEXT: EPC Energy Derivation -> Bill Derivation (no "service" suffix);
  Baseline Performance energy block = per-end-use kWh + per-section bill + total;
  Fuel Rates = committed static snapshot; Rebaselining trigger threshold 10.2.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
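A minimal sketch of the version split described in the ADR-0013 amendment
above, for orientation only. `Perf`, `effective_performance`, and the
`calculate` callable are hypothetical names, not code from this series; only
the 10.2 threshold and the 0.5 SAP tolerance come from the patches. It assumes
a non-null `sap_version` and compares SAP only (PEUI / CO2 are left out).

```python
from __future__ import annotations

import logging
from dataclasses import dataclass
from typing import Callable

logger = logging.getLogger("baseline_sketch")

TARGET_SAP_VERSION = 10.2  # the calculator's target spec (ADR-0013 amendment)
SAP_ABS_TOL = 0.5  # same starting dial as calculator_shadow.py


@dataclass(frozen=True)
class Perf:
    """Hypothetical stand-in for `Performance`, reduced to the SAP score."""

    sap_score: float


def effective_performance(
    *,
    property_id: int,
    sap_version: float,
    lodged: Perf,
    calculate: Callable[[], Perf],
) -> Perf:
    # No try/except here: a strict-raise from the calculator propagates and
    # aborts the batch, because the result is load-bearing for bills.
    calculated = calculate()
    if sap_version >= TARGET_SAP_VERSION:
        # Current-spec cert: keep the lodged figures, log any divergence.
        if abs(calculated.sap_score - lodged.sap_score) > SAP_ABS_TOL:
            logger.warning(
                "SAP divergence for property_id=%s sap_version=%s: "
                "lodged=%s calculated=%s",
                property_id,
                sap_version,
                lodged.sap_score,
                calculated.sap_score,
            )
        return lodged
    # Superseded methodology: the calculator's output is Effective Performance.
    return calculated
```

The point of the shape is that the same calculator call serves both branches,
so the raise behaviour cannot differ by version.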
 CONTEXT.md                                    | 20 ++---
 ...uces-effective-performance-shadow-first.md | 21 +++++
 ...14-bill-derivation-from-real-fuel-rates.md | 89 +++++++++++++++++++
 3 files changed, 120 insertions(+), 10 deletions(-)
 create mode 100644 docs/adr/0014-bill-derivation-from-real-fuel-rates.md
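A rough sketch of the Bill Derivation arithmetic that ADR-0014 pins, under
stated assumptions. `Fuel`, `FuelRate`, `UnpricedFuel`, and `derive_bill` are
hypothetical illustrations rather than the `BillDerivation` class itself, the
rates in the usage lines are round placeholders (not real Fuel Rates), and the
SEG export credit is deliberately left out.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Fuel(Enum):
    MAINS_GAS = "mains_gas"
    ELECTRICITY = "electricity"
    HOUSE_COAL = "house_coal"


@dataclass(frozen=True)
class FuelRate:
    pence_per_kwh: float
    standing_charge_gbp_per_yr: float  # zero for unmetered fuels


class UnpricedFuel(Exception):
    """Raised when an end use burns a fuel the snapshot has no rate for."""


def derive_bill(
    end_use_kwh: dict[str, tuple[Fuel, float]],
    rates: dict[Fuel, FuelRate],
) -> dict[str, float]:
    sections: dict[str, float] = {}
    fuels_used: set[Fuel] = set()
    for end_use, (fuel, kwh) in end_use_kwh.items():
        rate = rates.get(fuel)
        if rate is None:
            # Raise rather than silently pricing at zero.
            raise UnpricedFuel(fuel.value)
        # Each end use is billed at its own fuel's unit rate (pence to GBP).
        sections[end_use] = kwh * rate.pence_per_kwh / 100.0
        fuels_used.add(fuel)
    # Standing charges roll up once per fuel, not once per end use.
    standing = sum(rates[f].standing_charge_gbp_per_yr for f in fuels_used)
    total = sum(sections.values()) + standing
    return {**sections, "standing_charges": standing, "total": total}


# Placeholder rates, chosen only to make the arithmetic easy to follow.
PLACEHOLDER_RATES = {
    Fuel.MAINS_GAS: FuelRate(pence_per_kwh=10.0, standing_charge_gbp_per_yr=100.0),
    Fuel.ELECTRICITY: FuelRate(pence_per_kwh=30.0, standing_charge_gbp_per_yr=200.0),
}

bill = derive_bill(
    {
        "heating": (Fuel.MAINS_GAS, 10_000.0),
        "lighting": (Fuel.ELECTRICITY, 500.0),
    },
    PLACEHOLDER_RATES,
)
```

With these placeholders the per-section costs and the total come from
different sums, which is why the amendment stores them as separate columns.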

diff --git a/CONTEXT.md b/CONTEXT.md
index a41d597a..3580b93e 100644
--- a/CONTEXT.md
+++ b/CONTEXT.md
@@ -82,11 +82,11 @@ The EpcPropertyData scored by the modelling pipeline for a single Property, deri
 _Avoid_: modelling EPC, working EPC, resolved EPC, derived EPC
 
 **Rebaselining**:
-Re-predicting a Property's SAP score, CO2 emissions, Primary Energy Intensity, space heating kWh, and hot water kWh via **SAP10 Calculation** (the deterministic `Sap10Calculator`, which superseded the old ML-API rebaseliner; an ML residual head over the calculator is future — ADR-0009/0013) so the modelling pipeline scores it against the current SAP10 methodology. Triggered when either (a) the Effective EPC was lodged under a pre-SAP10 schema (`sap_version < 10.0`), so the recorded scores reflect a superseded methodology, or (b) Site Notes / Landlord Overrides changed the physical state of the Property (walls / heating / windows / etc.) so the lodged scores no longer reflect what's installed. Both triggers may fire together. Produces Effective Performance; Lodged Performance is preserved unchanged. kWh is included as ML targets per ADR-0007 — see [[epc-ml-transform]].
+Re-predicting a Property's SAP score, CO2 emissions, Primary Energy Intensity, space heating kWh, and hot water kWh via **SAP10 Calculation** (the deterministic `Sap10Calculator`, which superseded the old ML-API rebaseliner; an ML residual head over the calculator is future — ADR-0009/0013) so the modelling pipeline scores it against the current SAP10 methodology. Triggered when either (a) the Effective EPC was lodged under a methodology the calculator supersedes (`sap_version < 10.2`, the calculator's target spec), so the recorded scores reflect a superseded methodology, or (b) Site Notes / Landlord Overrides changed the physical state of the Property (walls / heating / windows / etc.) so the lodged scores no longer reflect what's installed. Both triggers may fire together. Produces Effective Performance; Lodged Performance is preserved unchanged. kWh is included as ML targets per ADR-0007 — see [[epc-ml-transform]].
 _Avoid_: re-scoring, re-prediction, performance recomputation, refresh (for cache-freshness)
 
 **Baseline Performance**:
-A Property's current performance aggregate, holding both Lodged Performance and Effective Performance plus annual space heating kWh, hot water kWh, fuel split, and bills derived from the Effective EPC — kWh values come from the EPC's recorded fields for SAP10 baselines or from ML when Rebaselining fires; bills are derived deterministically from kWh × current Fuel Rates. Persisted as one row; surfaced as one block in the UI.
+A Property's current performance aggregate, holding both Lodged Performance and Effective Performance plus the energy block: delivered kWh **per end use** (heating, hot water, lighting, appliances, cooking, pumps/fans, …) and the **annual bill** composed into per-section costs plus a total, produced by **Bill Derivation** from SAP10 Calculation's per-end-use kWh × current Fuel Rates. Persisted as one row (flat typed columns, per-section kWh + cost + total); surfaced as one block in the UI.
 _Avoid_: baseline predictions, predicted baseline, rebaselined values
 
 **Lodged Performance**:
@@ -98,7 +98,7 @@ The SAP / EPC Band / carbon emissions / Primary Energy Intensity the modelling p
 _Avoid_: modelled performance, rebaselined performance (only correct when rebaselining ran), scored values
 
 **Calculated SAP10 Performance**:
-The SAP score, EPC Band, CO2 emissions, Primary Energy Intensity, space heating kWh, and hot water kWh produced by **SAP10 Calculation** from a Property's EpcPropertyData. It is **not** a separately-persisted third value-set beside Lodged and Effective: in every baselining scenario the calculator's output *is* the **Effective Performance** (real lodged SAP10 EPC with no overrides ⇒ Calculated = Lodged = Effective; overrides or an estimated / pre-SAP10 EPC ⇒ Calculated = Effective, there being no lodged SAP10 figure to compare against). The calculator is therefore the mechanism that produces Effective Performance, having superseded the old ML-API rebaseliner. While it is being hardened it runs in **shadow** for the first baselining slice — computed on every Property, compared to Lodged, and any divergence (SAP > 0.5, or PEUI / CO2 beyond tolerance) or strict-raise **logged, not persisted** — then is promoted to drive Effective Performance once overrides / estimation land (ADR-0013). The ≥1000-cert parity confirmation against the cert-reported SAP (see [[sap-spec-version]]) gates that promotion. ADR-0009 introduced the term, as amended by ADR-0010 and realized by ADR-0013.
+The SAP score, EPC Band, CO2 emissions, Primary Energy Intensity, space heating kWh, and hot water kWh produced by **SAP10 Calculation** from a Property's EpcPropertyData. It is **not** a separately-persisted third value-set beside Lodged and Effective: in every baselining scenario the calculator's output *is* the **Effective Performance** (real lodged SAP10 EPC with no overrides ⇒ Calculated = Lodged = Effective; overrides or an estimated / pre-SAP10 EPC ⇒ Calculated = Effective, there being no lodged SAP10 figure to compare against). The calculator is therefore the mechanism that produces Effective Performance, having superseded the old ML-API rebaseliner. The calculator is **load-bearing**: for `sap_version < 10.2` (lodged under a superseded methodology) its output *is* the Effective Performance; for `≥ 10.2` the API's lodged figures are kept and the calculator runs **alongside, logging any divergence** (SAP > 0.5, PEUI/CO2 beyond tolerance) as a validation signal (see [[sap-spec-version]]). It is load-bearing for **Bill Derivation regardless of version** (the EPC lodges no per-end-use kWh), so a calculator strict-raise **aborts the batch** and the un-mapped cert is fixed immediately. ADR-0009 introduced the term, amended by ADR-0010, realized by ADR-0013 (whose shadow stepping-stone is superseded) and ADR-0014.
 _Avoid_: calculator output, computed performance, worksheet performance, SAP10 output, calculated value-set (it is not a stored third set)
 
 **SAP10 Calculation**:
@@ -117,9 +117,9 @@ _Avoid_: parity cohort, validation set, corpus sample
 The process that translates an Optimised Package into cert-field changes and produces the "ending state snapshot" EpcPropertyData that Plan Phase persists. Implemented by the `MeasureApplicator` service class in `domain/sap/` (or a sibling package). Each Measure Type's translation rules (e.g. `loft_insulation` → `roof_insulation_thickness_mm = 270mm`, `ashp` → `main_heating_details[0]` replacement) live here. Pure function — does not run SAP10 Calculation itself; the caller chains `MeasureApplicator.apply(epc, package) → Sap10Calculator.calculate(post_epc)`. ADR-0009.
 _Avoid_: measure overrides (rejected during ADR-0009 grill — phantom mid-layer), package applier, retrofit simulator
 
-**EPC Energy Derivation**:
-The process that derives a Property's fuel split and annual bills from its space heating kWh and hot water kWh values plus the heating fuel deduced from SAP fields. kWh values themselves come from the EPC's recorded fields (`renewable_heat_incentive.space_heating_kwh` and `.water_heating_kwh`) for SAP10 baselines, or from ML prediction when Rebaselining fires or when scoring a post-measure state. Bills are computed deterministically from delivered kWh × current Fuel Rates + standing charges + SEG credits. The UCL Correction is no longer applied at runtime — it is folded into ML training labels (see [[epc-ml-transform]] and ADR-0007).
-_Avoid_: kWh prediction (kWh is now an ML target — see Rebaselining), baseline kWh, energy estimation
+**Bill Derivation**:
+The deterministic process that derives a Property's annual energy **bill**, composed into per-end-use sections (heating, hot water, lighting, appliances, cooking, pumps/fans, …) plus a **total**, by pricing **SAP10 Calculation**'s delivered kWh per end use at **current Fuel Rates** — each end use billed at its fuel's rate, rolled up per fuel for **standing charges** (metered fuels only — gas/electricity; oil/LPG/solid have none) minus **SEG** export credit on PV. Implemented by `BillDerivation` in `domain/property_baseline/` (deterministic, ADR-0006). Reads Fuel Rates from a committed static snapshot via `FuelRatesRepository` (no live ETL yet). **Distinct from the calculator's `total_fuel_cost_gbp`**, which is the SAP-rating notional cost at RdSAP Table 32 standardised prices (~half the real electricity price) — not what the household pays. Raises on a fuel it has no rate for (e.g. house coal, heat network). ADR-0014.
+_Avoid_: EPC Energy Derivation (renamed), EpcEnergyDerivationService (no "service" suffix), kWh prediction, baseline kWh, energy estimation
 
 **UCL Correction**:
 The per-band linear correction (Few et al. 2023, _Energy & Buildings_ 288 113024) that aligns EPC-modelled Primary Energy Intensity with metered consumption. Folded into ML training labels at fit time (per ADR-0007) rather than applied at runtime — the trained model emits metered-equivalent PEUI directly, avoiding the discontinuities at EPC band boundaries that arose when the per-band linear correction was applied post-prediction. Calibrated against gas-heated, non-PV homes in England and Wales rated under SAP 2012; the current implementation extrapolates it to all properties (open question §15.14).
@@ -174,11 +174,11 @@ _Avoid_: code list, code dictionary, vocab
 ### Reference data
 
 **Fuel Rates**:
-The current per-fuel rate (pence/kWh) and standing charge used to compute a Property's bills; time-versioned and regional, refreshed from Ofgem's published caps via an ETL. The Smart Export Guarantee rate sits in the same set as `electricity_export`. Consumed by EPC Energy Derivation.
+The current per-fuel rate (pence/kWh) and standing charge used to compute a Property's bills; time-versioned and regional. Sourced for now from a **committed static snapshot** (national, Ofgem-cap period for gas/electricity + DESNZ/NEP for off-gas fuels), read via `FuelRatesRepository`; an Ofgem-cap ETL automating the refresh is future, not a prerequisite. The Smart Export Guarantee rate sits in the same set as `electricity_export`. Consumed by Bill Derivation.
 _Avoid_: fuel prices (commodity prices, different concept), tariff, energy cost
 
 **Carbon Factors**:
-The per-fuel CO2 emission factor (kgCO2e/kWh) used to compute a Property's carbon emissions; time-versioned, refreshed from Defra's annual publication. Consumed by EPC Energy Derivation.
+The per-fuel CO2 emission factor (kgCO2e/kWh) used to compute a Property's carbon emissions; time-versioned, refreshed from Defra's annual publication. Consumed by Bill Derivation.
 _Avoid_: emission factors (ambiguous), CO2 rates
 
 ### Outputs
@@ -277,7 +277,7 @@ _Avoid_: API key, auth token, secret
 - When a **Property** has both **Site Notes** and a public **EPC**, the newer of the two derives the **Effective EPC**. **Landlord Overrides** apply only when the **EPC** is the source — never when **Site Notes** are.
 - A Property's **Baseline Performance** holds two halves: **Lodged Performance** (the gov register's SAP / band / carbon / heat) and **Effective Performance** (what the modelling pipeline scored against). The two are equal unless **Rebaselining** fires.
 - **Rebaselining** produces **Effective Performance** by ML re-prediction across SAP score, CO2 emissions, Primary Energy Intensity, space heating kWh, and hot water kWh, when either (a) the Effective EPC was lodged under a pre-SAP10 schema, or (b) the Effective EPC's physical state diverges from the lodged EPC. **Lodged Performance** is never overwritten.
-- **EPC Energy Derivation** derives **fuel split** and **bills** from kWh values (sourced from the EPC's `renewable_heat_incentive` fields for baseline SAP10 properties, or from ML when Rebaselining fires), reading current **Fuel Rates** and **Carbon Factors** from their respective repos.
+- **Bill Derivation** derives **fuel split** and **bills** from kWh values (sourced from the EPC's `renewable_heat_incentive` fields for baseline SAP10 properties, or from ML when Rebaselining fires), reading current **Fuel Rates** and **Carbon Factors** from their respective repos.
 - The **EPC Prediction Service** uses **Comparable Properties** for both gap-filling and producing **EPC Anomaly Flags**.
 - A **Scenario** carries one or more ordered **Scenario Phases**. Triggering the model against N Scenarios produces N **Plans** per Property; each Plan carries an ordered list of **Plan Phases** matching the Scenario's shape.
 - Each **Plan Phase** holds its **Optimised Package**, the ending state snapshot, and any **Rolled-over Options** that flow as candidates into the next Plan Phase. A single-phase Scenario is one Scenario Phase with all measure types allowed; the same machinery handles it.
@@ -289,7 +289,7 @@ _Avoid_: API key, auth token, secret
 
 > **Dev:** "A landlord uploads a corrected boiler for one of their properties. What happens?"
 >
-> **Domain expert:** "That's a **Landlord Override** on the heating fields. Save it against the **Property**. The **Effective EPC** has changed, so **Rebaselining** runs to re-predict SAP / carbon / PEUI / space heating kWh / hot water kWh, and **EPC Energy Derivation** re-runs to update the fuel split and bills based on the new kWh values and fuel deduction. With fresh **Baseline Performance** we regenerate **Recommendations**."
+> **Domain expert:** "That's a **Landlord Override** on the heating fields. Save it against the **Property**. The **Effective EPC** has changed, so **Rebaselining** runs to re-predict SAP / carbon / PEUI / space heating kWh / hot water kWh, and **Bill Derivation** re-runs to update the fuel split and bills based on the new kWh values and fuel deduction. With fresh **Baseline Performance** we regenerate **Recommendations**."
 
 > **Dev:** "What if the same Property also has Site Notes?"
 >
diff --git a/docs/adr/0013-calculator-produces-effective-performance-shadow-first.md b/docs/adr/0013-calculator-produces-effective-performance-shadow-first.md
index 206aa2f5..6dd9a044 100644
--- a/docs/adr/0013-calculator-produces-effective-performance-shadow-first.md
+++ b/docs/adr/0013-calculator-produces-effective-performance-shadow-first.md
@@ -86,3 +86,24 @@ property test cohort is about to flow through baselining. So this lands in two s
   wild); each caught exception is logged with its type and `property_id`.
 - This decision is short-lived in its shadow form by design; the durable half — "the calculator
   produces Effective Performance; there is no third value-set" — outlives it.
+
+## Amendment (2026-06-02): shadow collapsed — the calculator is load-bearing now
+
+The shadow stepping-stone was right in shape but wrong in duration: the calculator was ready, and
+wiring [Bill Derivation](0014-bill-derivation-from-real-fuel-rates.md) onto its delivered-kWh
+breakdown makes it load-bearing for *bills on every property* — so the "shadow until overrides /
+estimation land" timeline collapses to now. The durable decision stands (calculator produces
+Effective Performance; no third value-set); only the timing changes:
+
+- **`sap_version < 10.2`** → effective performance **is** the calculator's output (the
+  `StubRebaseliner` floor moves `10.0 → 10.2`; mechanism is the calculator, not ML).
+- **`sap_version ≥ 10.2`** → effective = the API's lodged figures; the calculator still runs
+  **alongside, logging divergence** (the surviving half of the shadow runner) as a validation signal.
+- **Failure posture flips to abort:** the calculator is load-bearing for Bill Derivation regardless
+  of version, so a strict-raise **aborts the batch** (ADR-0012) — the un-mapped cert is fixed
+  immediately rather than skipped. The shadow's catch-and-log of raises is retired; divergence
+  *warnings* on `≥ 10.2` certs remain.
+
+The `≥1000-cert parity` gate from ADR-0009/0010 still governs whether the calculator's figures are
+*trusted as definitive* for the SAP-10.2 cohort, but it no longer gates *wiring* — pre-10.2 certs
+have no current-spec lodged figure to fall back to, so the calculator is the only source there.
diff --git a/docs/adr/0014-bill-derivation-from-real-fuel-rates.md b/docs/adr/0014-bill-derivation-from-real-fuel-rates.md
new file mode 100644
index 00000000..7c033085
--- /dev/null
+++ b/docs/adr/0014-bill-derivation-from-real-fuel-rates.md
@@ -0,0 +1,89 @@
+---
+Status: accepted
+---
+
+# Bill Derivation: whole-home annual bill from the calculator's delivered kWh × real Fuel Rates (not SAP prices)
+
+Lifts the bills/fuel-split deferral in [ADR-0004](0004-baseline-performance-lodged-effective-pair.md)
+and its migration note, and builds on [ADR-0013](0013-calculator-produces-effective-performance-shadow-first.md)
+(the calculator is load-bearing). Decided in a `/grill-with-docs` session (2026-06-02).
+
+## Context
+
+ADR-0004's amendment deferred fuel split + bills "because bills require a current Fuel Rates
+source (Ofgem-cap ETL) that does not yet exist." A static snapshot lifts that blocker. The old
+`backend/ml_models/AnnualBillSavings.py` is the fragile reference (a blended `PRICE_FACTOR`, two
+disagreeing rate sources, a standing-charge precedence bug, a 10× unit slip) — we rewrite, not port.
+
+## Decisions
+
+### 1. The bill is whole-home, composed per end use, from the calculator's delivered kWh
+
+`SAP10 Calculation` already emits delivered (post-efficiency, billable) kWh for every regulated end
+use — main/secondary heating, hot water, pumps/fans, lighting, cooling — and computes appliances +
+cooking electricity internally (Appendix L L13-L20). **`BillDerivation`** consumes that per-end-use
+breakdown and produces per-section costs + a total. The EPC lodges no per-end-use kWh, so the
+calculator is the only source — which is why it is **load-bearing for bills regardless of
+`sap_version`** (a raise aborts the batch, ADR-0013).
+
+### 2. Bills use real Fuel Rates, not the calculator's `total_fuel_cost_gbp`
+
+The calculator's fuel cost is the SAP-rating notional cost at **RdSAP Table 32 standardised
+prices** — deliberately frozen for rating comparability, and ~half the real electricity price
+(Table 32 elec ~13 p/kWh vs Ofgem Apr–Jun 2026 cap ~24.7 p/kWh). Billing on it would roughly halve
+an electric/heat-pump home's bill. So `BillDerivation` **re-prices** the delivered kWh at current
+**Fuel Rates**, and the calculator's `total_fuel_cost_gbp` is used only for the SAP rating.
+
+### 3. Fuel Rates = committed static snapshot, read via `FuelRatesRepository`
+
+A national snapshot (Ofgem-cap period for gas/electricity, DESNZ/NEP for off-gas fuels), keyed by a
+canonical **`Fuel`** enum (`MAINS_GAS, ELECTRICITY, ELECTRICITY_OFF_PEAK, OIL, LPG, COAL, SMOKELESS,
+WOOD_LOGS, WOOD_PELLETS, HEAT_NETWORK`), each priced entry carrying `unit_rate_p_per_kwh` +
+`standing_charge_p_per_day`, plus a top-level `seg_export_p_per_kwh`. The calculator's per-end-use
+SAP fuel codes map to this enum via the existing `is_gas_code` / `is_electric_fuel_code` /
+`is_liquid_fuel_code` helpers — so the snapshot and the calculator meet at one vocabulary, not raw
+SAP codes. Read through a `FuelRatesRepository` port (ADR-0011: a Repo reads stored reference data
+by key); an Ofgem-cap ETL automating the refresh is future work behind the same port, not a
+prerequisite. National now; the 14 cap regions are a later refinement behind the same port.
+
+### 4. Bill arithmetic
+
+Total = Σ (per-end-use delivered kWh × that end use's fuel unit rate) + per-meter **standing
+charges** (metered fuels only — gas/electricity; oil/LPG/solid have none) − **SEG** export credit on
+PV. Off-peak electricity splits day/night via the calculator's existing Table 12a high/low-rate
+fractions.
+
+### 5. Strict-raise on an unpriced fuel
+
+`BillDerivation` **raises** on a fuel it has no rate for — same discipline as the calculator. Two
+named gaps surface immediately rather than billing at a wrong default:
+- **House coal** — no standard domestic price (its domestic sale is illegal in England).
+- **Communal / heat network** — scheme-specific, no national tariff. This is the one gap that is
+  common in practice (flats), so a heat-network rate model is a named follow-up.
+
+### 6. Persistence: flat per-section columns on `property_baseline_performance`
+
+The energy block lands as **flat typed columns** on the existing row (ADR-0004's flat-column rule
+holds — the SAP end-uses are a *fixed enumerable set*, so there is no column explosion and no
+variable-shape JSON): per-section `*_kwh` + `*_cost_gbp` (heating, hot water, lighting, appliances,
+cooking, pumps/fans), `standing_charges_gbp`, `seg_credit_gbp`, and `total_annual_bill_gbp`. The
+production migration is FE-owned (Drizzle); `docs/migrations/` updated.
+
+## Consequences
+
+- `BillDerivation` is named for the operation, **no "Service" suffix** (user preference).
+- A `Fuel` enum + a SAP-code→`Fuel` mapping become first-class; `FuelRates` + `FuelRatesRepository`
+  + a committed snapshot file are new.
+- Carbon emissions are unaffected (they stay on Lodged/Effective Performance from the calculator's
+  CO2 factors); this ADR is about £ bills only.
+- The snapshot goes stale on the Ofgem-cap cadence (quarterly); the file records its period, and the
+  ETL that automates refresh is the deferred follow-up.
+
+## Considered alternatives
+
+- **Bill from `RenewableHeatIncentive` heating+HW kWh only** (CONTEXT's original scope) — rejected:
+  the user wants the whole-home bill, and heating+HW omits lighting/appliances/cooking, which only
+  the calculator supplies.
+- **Bill at SAP Table 32 prices** — rejected: standardised rating prices, ~half real electricity.
+- **JSON `bill_breakdown` block** — rejected: end-uses are fixed-cardinality, so flat columns are
+  clean and stay queryable (ADR-0004).
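
Reviewer note (not part of the ADR file): a minimal sketch of the re-pricing argument in Decision 2, assuming a hypothetical all-electric home. The two unit prices are the ADR's own approximate figures; the annual kWh, the constant names and the helper `energy_cost_gbp` are made up for illustration and are not part of the codebase. Standing charges and SEG credit are left out so the comparison isolates the unit price.

```python
# Illustrative figures only. The prices are the approximate values quoted in
# ADR-0014 Decision 2; the annual kWh is a hypothetical all-electric home.
SAP_TABLE_32_ELEC_P_PER_KWH = 13.0  # "~13 p/kWh" standardised rating price
REAL_ELEC_P_PER_KWH = 24.7          # "~24.7 p/kWh" Ofgem Apr-Jun 2026 cap
DELIVERED_KWH = 6000.0              # hypothetical annual delivered electricity


def energy_cost_gbp(kwh: float, unit_rate_p_per_kwh: float) -> float:
    """Unit-rate cost only (no standing charge, no SEG credit), in pounds."""
    return kwh * unit_rate_p_per_kwh / 100.0


notional_gbp = energy_cost_gbp(DELIVERED_KWH, SAP_TABLE_32_ELEC_P_PER_KWH)
real_gbp = energy_cost_gbp(DELIVERED_KWH, REAL_ELEC_P_PER_KWH)
ratio = notional_gbp / real_gbp
print(f"notional £{notional_gbp:.2f} vs real £{real_gbp:.2f} (ratio {ratio:.2f})")
```

Under these assumptions the notional cost comes out at a little over half the re-priced cost, which is the gap Decision 2 describes.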

From 14b45a1b3e4e2bfcbcf18e298e4de0ea860ebbff Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar <kconnkowlessar@gmail.com>
Date: Tue, 2 Jun 2026 09:29:07 +0000
Subject: [PATCH 05/12] feat(fuel-rates): FuelRates snapshot + repository
 foundation (ADR-0014)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Slice 1 of Bill Derivation — the reference-data foundation that later slices
price the calculator's per-end-use kWh against:

- Fuel enum (canonical billing fuels; the join key between the calculator's
  SAP-code fuels and the rates snapshot). COAL + HEAT_NETWORK are members with
  no national rate.
- FuelRates value object: unit_rate_p_per_kwh / standing_charge_p_per_day /
  seg_export_p_per_kwh; raises UnpricedFuel on a fuel it has no rate for rather
  than billing at a wrong default.
- FuelRatesRepository port (ADR-0011 Repo-reads-stored-reference-data) +
  StaticFileFuelRatesRepository reading a committed JSON snapshot.
- Snapshot fuel_rates_2026_q2.json: GB national, Apr-Jun 2026 Ofgem cap
  (gas/electricity) + DESNZ/NEP May 2026 (off-gas). The file carries the full
  researched data; the value object exposes only single-rate fuels in this slice.
  Off-peak (day/night), house coal and heat network raise UnpricedFuel until later
  slices.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
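Reviewer note: the snapshot's ELECTRICITY_OFF_PEAK entry carries day/night rates that this slice deliberately skips. The sketch below shows one way a later slice might price them, assuming the calculator supplies a high-rate fraction (ADR-0014 Decision 4) and that "high rate" corresponds to the day rate. The function name `off_peak_cost_gbp` and the 0.2 fraction in the usage line are hypothetical, not taken from Table 12a or from the codebase.

```python
# Day/night unit rates copied from fuel_rates_2026_q2.json (pence per kWh).
DAY_P_PER_KWH = 29.73
NIGHT_P_PER_KWH = 13.89


def off_peak_cost_gbp(kwh: float, high_rate_fraction: float) -> float:
    """Price delivered kWh split between day (high) and night (low) rates.

    high_rate_fraction is the share of kWh billed at the day rate; the sketch
    assumes the caller obtains it from the calculator.
    """
    if not 0.0 <= high_rate_fraction <= 1.0:
        raise ValueError(f"high_rate_fraction out of range: {high_rate_fraction}")
    day_kwh = kwh * high_rate_fraction
    night_kwh = kwh - day_kwh
    return (day_kwh * DAY_P_PER_KWH + night_kwh * NIGHT_P_PER_KWH) / 100.0


# Hypothetical usage: 5,000 kWh with 20% billed at the day rate.
example_cost_gbp = off_peak_cost_gbp(5000.0, 0.2)
```

The standing charge is omitted here; it would be added once per meter, as for the single-rate fuels.
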
 domain/fuel_rates/__init__.py                 |  0
 domain/fuel_rates/fuel.py                     | 43 +++++++++++++++++
 domain/fuel_rates/fuel_rates.py               | 46 +++++++++++++++++++
 repositories/fuel_rates/__init__.py           |  0
 .../fuel_rates/data/fuel_rates_2026_q2.json   | 27 +++++++++++
 .../fuel_rates/fuel_rates_repository.py       | 17 +++++++
 .../static_file_fuel_rates_repository.py      | 43 +++++++++++++++++
 tests/domain/fuel_rates/__init__.py           |  0
 tests/domain/fuel_rates/test_fuel_rates.py    | 33 +++++++++++++
 tests/repositories/fuel_rates/__init__.py     |  0
 .../test_static_file_fuel_rates_repository.py | 45 ++++++++++++++++++
 11 files changed, 254 insertions(+)
 create mode 100644 domain/fuel_rates/__init__.py
 create mode 100644 domain/fuel_rates/fuel.py
 create mode 100644 domain/fuel_rates/fuel_rates.py
 create mode 100644 repositories/fuel_rates/__init__.py
 create mode 100644 repositories/fuel_rates/data/fuel_rates_2026_q2.json
 create mode 100644 repositories/fuel_rates/fuel_rates_repository.py
 create mode 100644 repositories/fuel_rates/static_file_fuel_rates_repository.py
 create mode 100644 tests/domain/fuel_rates/__init__.py
 create mode 100644 tests/domain/fuel_rates/test_fuel_rates.py
 create mode 100644 tests/repositories/fuel_rates/__init__.py
 create mode 100644 tests/repositories/fuel_rates/test_static_file_fuel_rates_repository.py

diff --git a/domain/fuel_rates/__init__.py b/domain/fuel_rates/__init__.py
new file mode 100644
index 00000000..e69de29b
diff --git a/domain/fuel_rates/fuel.py b/domain/fuel_rates/fuel.py
new file mode 100644
index 00000000..fff51f57
--- /dev/null
+++ b/domain/fuel_rates/fuel.py
@@ -0,0 +1,43 @@
+from __future__ import annotations
+
+from enum import Enum
+
+
+class Fuel(Enum):
+    """A canonical billing fuel — the join key between the calculator's
+    per-end-use fuel (mapped from SAP fuel codes) and the Fuel Rates snapshot
+    (ADR-0014). Member names match the snapshot's keys.
+
+    ``COAL`` (traditional house coal) and ``HEAT_NETWORK`` are carried as
+    members so a cert lodging them maps to a Fuel, but they have no national
+    rate — pricing them raises ``UnpricedFuel`` (house coal's domestic sale is
+    illegal in England; heat networks are scheme-specific).
+    """
+
+    MAINS_GAS = "MAINS_GAS"
+    ELECTRICITY = "ELECTRICITY"
+    ELECTRICITY_OFF_PEAK = "ELECTRICITY_OFF_PEAK"
+    OIL = "OIL"
+    LPG = "LPG"
+    COAL = "COAL"
+    SMOKELESS = "SMOKELESS"
+    WOOD_LOGS = "WOOD_LOGS"
+    WOOD_PELLETS = "WOOD_PELLETS"
+    HEAT_NETWORK = "HEAT_NETWORK"
+
+
+class UnpricedFuel(ValueError):
+    """Bill Derivation was asked for a rate on a fuel the current Fuel Rates
+    snapshot does not price (ADR-0014).
+
+    Raised rather than billing at a wrong default so the gap surfaces
+    immediately — house coal and heat networks have no national rate, and
+    off-peak electricity needs the day/night split that a later slice adds.
+    """
+
+    def __init__(self, fuel: Fuel) -> None:
+        super().__init__(
+            f"no rate for fuel {fuel.name} in the current Fuel Rates snapshot; "
+            f"add it to the snapshot or map this end use to a priced fuel"
+        )
+        self.fuel = fuel
diff --git a/domain/fuel_rates/fuel_rates.py b/domain/fuel_rates/fuel_rates.py
new file mode 100644
index 00000000..a5b2eb73
--- /dev/null
+++ b/domain/fuel_rates/fuel_rates.py
@@ -0,0 +1,46 @@
+from __future__ import annotations
+
+from collections.abc import Mapping
+from dataclasses import dataclass
+
+from domain.fuel_rates.fuel import Fuel, UnpricedFuel
+
+
+@dataclass(frozen=True)
+class FuelRate:
+    """One fuel's current tariff: unit price + daily standing charge.
+
+    Off-gas fuels (oil / LPG / solid / wood) carry a ``0.0`` standing charge —
+    they are delivered, not metered, so there is no daily charge.
+    """
+
+    unit_rate_p_per_kwh: float
+    standing_charge_p_per_day: float
+
+
+@dataclass(frozen=True)
+class FuelRates:
+    """A current Fuel Rates snapshot — the rate per billing Fuel plus the SEG
+    export credit (ADR-0014). ``period`` records which window it is for, since
+    a committed snapshot goes stale on the Ofgem-cap (quarterly) cadence.
+
+    Pricing a fuel the snapshot does not carry raises ``UnpricedFuel`` rather
+    than defaulting — see [[reference-unmapped-sap-code]] for the same strict
+    discipline on the calculator side.
+    """
+
+    period: str
+    seg_export_p_per_kwh: float
+    rates: Mapping[Fuel, FuelRate]
+
+    def unit_rate_p_per_kwh(self, fuel: Fuel) -> float:
+        return self._rate(fuel).unit_rate_p_per_kwh
+
+    def standing_charge_p_per_day(self, fuel: Fuel) -> float:
+        return self._rate(fuel).standing_charge_p_per_day
+
+    def _rate(self, fuel: Fuel) -> FuelRate:
+        rate = self.rates.get(fuel)
+        if rate is None:
+            raise UnpricedFuel(fuel)
+        return rate
diff --git a/repositories/fuel_rates/__init__.py b/repositories/fuel_rates/__init__.py
new file mode 100644
index 00000000..e69de29b
diff --git a/repositories/fuel_rates/data/fuel_rates_2026_q2.json b/repositories/fuel_rates/data/fuel_rates_2026_q2.json
new file mode 100644
index 00000000..2b81bd30
--- /dev/null
+++ b/repositories/fuel_rates/data/fuel_rates_2026_q2.json
@@ -0,0 +1,27 @@
+{
+  "period": "2026-04 to 2026-06",
+  "basis": "GB national average; Ofgem price cap (gas/electricity), DESNZ/NEP May 2026 (off-gas fuels)",
+  "sources": {
+    "gas_electricity": "Ofgem energy price cap unit rates and standing charges, announced 2026-02-25, cap period Apr-Jun 2026",
+    "off_gas": "DESNZ QEP petroleum table (oil, May 2026) + Nottingham Energy Partnership May 2026 comparison (LPG, smokeless, wood)",
+    "seg": "Solar Energy UK SEG league table, updated 2026-05-12"
+  },
+  "seg_export_p_per_kwh": 15.0,
+  "fuels": {
+    "MAINS_GAS": { "unit_rate_p_per_kwh": 5.74, "standing_charge_p_per_day": 29.09 },
+    "ELECTRICITY": { "unit_rate_p_per_kwh": 24.67, "standing_charge_p_per_day": 57.21 },
+    "ELECTRICITY_OFF_PEAK": { "day_p_per_kwh": 29.73, "night_p_per_kwh": 13.89, "standing_charge_p_per_day": 56.99 },
+    "OIL": { "unit_rate_p_per_kwh": 9.16, "standing_charge_p_per_day": 0.0 },
+    "LPG": { "unit_rate_p_per_kwh": 17.61, "standing_charge_p_per_day": 0.0 },
+    "SMOKELESS": { "unit_rate_p_per_kwh": 10.0, "standing_charge_p_per_day": 0.0 },
+    "WOOD_LOGS": { "unit_rate_p_per_kwh": 8.83, "standing_charge_p_per_day": 0.0 },
+    "WOOD_PELLETS": { "unit_rate_p_per_kwh": 7.99, "standing_charge_p_per_day": 0.0, "_note": "bagged pellets; blown bulk is 6.76 p/kWh" },
+    "COAL": null,
+    "HEAT_NETWORK": null
+  },
+  "_gaps": {
+    "COAL": "no standard domestic price (traditional house coal sale for domestic use is illegal in England)",
+    "HEAT_NETWORK": "scheme-specific; no national tariff or price-cap unit rate",
+    "ELECTRICITY_OFF_PEAK": "day/night split; priced once the off-peak slice adds the day/night accessor"
+  }
+}
diff --git a/repositories/fuel_rates/fuel_rates_repository.py b/repositories/fuel_rates/fuel_rates_repository.py
new file mode 100644
index 00000000..a6d2b2d2
--- /dev/null
+++ b/repositories/fuel_rates/fuel_rates_repository.py
@@ -0,0 +1,17 @@
+from __future__ import annotations
+
+from abc import ABC, abstractmethod
+
+from domain.fuel_rates.fuel_rates import FuelRates
+
+
+class FuelRatesRepository(ABC):
+    """Reads the current Fuel Rates used to price a Property's bill (ADR-0014).
+
+    A Repo, not a Fetcher (ADR-0011): it reads stored reference data, no live
+    API call. The adapter backs onto a committed static snapshot today; an
+    Ofgem-cap ETL is a future adapter behind this same port.
+    """
+
+    @abstractmethod
+    def get_current(self) -> FuelRates: ...
diff --git a/repositories/fuel_rates/static_file_fuel_rates_repository.py b/repositories/fuel_rates/static_file_fuel_rates_repository.py
new file mode 100644
index 00000000..cbfd5062
--- /dev/null
+++ b/repositories/fuel_rates/static_file_fuel_rates_repository.py
@@ -0,0 +1,43 @@
+from __future__ import annotations
+
+import json
+from pathlib import Path
+from typing import Any, Optional
+
+from domain.fuel_rates.fuel import Fuel
+from domain.fuel_rates.fuel_rates import FuelRate, FuelRates
+from repositories.fuel_rates.fuel_rates_repository import FuelRatesRepository
+
+_DEFAULT_SNAPSHOT = Path(__file__).parent / "data" / "fuel_rates_2026_q2.json"
+
+
+class StaticFileFuelRatesRepository(FuelRatesRepository):
+    """Reads Fuel Rates from a committed JSON snapshot (ADR-0014).
+
+    Only **single-rate** fuels (those carrying a ``unit_rate_p_per_kwh``) are
+    exposed. Off-peak (day/night) and the unpriced gaps (null entries — house
+    coal, heat network) are skipped, so pricing them raises ``UnpricedFuel``.
+    The day/night accessor for off-peak lands in a later slice.
+    """
+
+    def __init__(self, snapshot_path: Optional[Path] = None) -> None:
+        self._snapshot_path = snapshot_path or _DEFAULT_SNAPSHOT
+
+    def get_current(self) -> FuelRates:
+        payload: dict[str, Any] = json.loads(self._snapshot_path.read_text(encoding="utf-8"))
+        fuels: dict[str, Any] = payload["fuels"]
+        rates: dict[Fuel, FuelRate] = {}
+        for name, entry in fuels.items():
+            if entry is None:
+                continue  # an unpriced gap (house coal / heat network)
+            if "unit_rate_p_per_kwh" not in entry:
+                continue  # off-peak day/night — priced in a later slice
+            rates[Fuel[name]] = FuelRate(
+                unit_rate_p_per_kwh=float(entry["unit_rate_p_per_kwh"]),
+                standing_charge_p_per_day=float(entry["standing_charge_p_per_day"]),
+            )
+        return FuelRates(
+            period=str(payload["period"]),
+            seg_export_p_per_kwh=float(payload["seg_export_p_per_kwh"]),
+            rates=rates,
+        )
diff --git a/tests/domain/fuel_rates/__init__.py b/tests/domain/fuel_rates/__init__.py
new file mode 100644
index 00000000..e69de29b
diff --git a/tests/domain/fuel_rates/test_fuel_rates.py b/tests/domain/fuel_rates/test_fuel_rates.py
new file mode 100644
index 00000000..a7319274
--- /dev/null
+++ b/tests/domain/fuel_rates/test_fuel_rates.py
@@ -0,0 +1,33 @@
+from __future__ import annotations
+
+import pytest
+
+from domain.fuel_rates.fuel import Fuel, UnpricedFuel
+from domain.fuel_rates.fuel_rates import FuelRate, FuelRates
+
+
+def _rates() -> FuelRates:
+    return FuelRates(
+        period="test",
+        seg_export_p_per_kwh=15.0,
+        rates={Fuel.MAINS_GAS: FuelRate(unit_rate_p_per_kwh=5.74, standing_charge_p_per_day=29.09)},
+    )
+
+
+def test_unit_rate_and_standing_charge_read_back_for_a_priced_fuel() -> None:
+    # Arrange
+    rates = _rates()
+
+    # Act / Assert
+    assert rates.unit_rate_p_per_kwh(Fuel.MAINS_GAS) == 5.74
+    assert rates.standing_charge_p_per_day(Fuel.MAINS_GAS) == 29.09
+
+
+def test_a_fuel_absent_from_the_snapshot_raises_unpriced_fuel() -> None:
+    # Arrange — LPG is not in this snapshot.
+    rates = _rates()
+
+    # Act / Assert — the raise carries the offending fuel for the operator.
+    with pytest.raises(UnpricedFuel) as excinfo:
+        rates.unit_rate_p_per_kwh(Fuel.LPG)
+    assert excinfo.value.fuel is Fuel.LPG
diff --git a/tests/repositories/fuel_rates/__init__.py b/tests/repositories/fuel_rates/__init__.py
new file mode 100644
index 00000000..e69de29b
diff --git a/tests/repositories/fuel_rates/test_static_file_fuel_rates_repository.py b/tests/repositories/fuel_rates/test_static_file_fuel_rates_repository.py
new file mode 100644
index 00000000..38d3a0a6
--- /dev/null
+++ b/tests/repositories/fuel_rates/test_static_file_fuel_rates_repository.py
@@ -0,0 +1,45 @@
+from __future__ import annotations
+
+import pytest
+
+from domain.fuel_rates.fuel import Fuel, UnpricedFuel
+from repositories.fuel_rates.static_file_fuel_rates_repository import (
+    StaticFileFuelRatesRepository,
+)
+
+
+def test_get_current_loads_the_committed_snapshot_mains_gas_rate() -> None:
+    # Arrange
+    repository = StaticFileFuelRatesRepository()
+
+    # Act
+    rates = repository.get_current()
+
+    # Assert — the committed Apr–Jun 2026 snapshot prices mains gas at 5.74 p/kWh.
+    assert rates.unit_rate_p_per_kwh(Fuel.MAINS_GAS) == 5.74
+
+
+def test_snapshot_prices_metered_and_delivered_fuels_plus_seg() -> None:
+    # Arrange
+    rates = StaticFileFuelRatesRepository().get_current()
+
+    # Act / Assert — electricity carries a daily standing charge; oil is
+    # delivered (no meter) so its standing charge is 0; SEG is a flat credit.
+    assert rates.unit_rate_p_per_kwh(Fuel.ELECTRICITY) == 24.67
+    assert rates.standing_charge_p_per_day(Fuel.ELECTRICITY) == 57.21
+    assert rates.unit_rate_p_per_kwh(Fuel.OIL) == 9.16
+    assert rates.standing_charge_p_per_day(Fuel.OIL) == 0.0
+    assert rates.seg_export_p_per_kwh == 15.0
+
+
+@pytest.mark.parametrize(
+    "fuel", [Fuel.HEAT_NETWORK, Fuel.COAL, Fuel.ELECTRICITY_OFF_PEAK]
+)
+def test_unpriced_fuels_raise_rather_than_defaulting(fuel: Fuel) -> None:
+    # Arrange — house coal + heat network have no national rate, and off-peak
+    # needs the day/night split a later slice adds (ADR-0014).
+    rates = StaticFileFuelRatesRepository().get_current()
+
+    # Act / Assert
+    with pytest.raises(UnpricedFuel):
+        rates.unit_rate_p_per_kwh(fuel)

From 8ae3b56f41f3f01e733786720d2b126b2b393196 Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar <kconnkowlessar@gmail.com>
Date: Tue, 2 Jun 2026 09:38:44 +0000
Subject: [PATCH 06/12] feat(baseline): BillDerivation prices an energy
 breakdown at Fuel Rates (ADR-0014)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Slice 2 of Bill Derivation. BillDerivation(fuel_rates).derive(breakdown) takes a
delivered-energy breakdown (per-section EnergyLine(section, fuel, kwh) +
exported_kwh) and produces a Bill: per-section kWh + cost, standing charges,
SEG credit, and total.

- Each end-use line billed at its fuel's unit rate.
- Standing charge added ONCE per distinct fuel used (a meter, not an end use);
  off-gas fuels carry 0 so contribute nothing — no metered/unmetered special case.
- SEG export credit subtracted.
- Deterministic (ADR-0006); raises UnpricedFuel (via FuelRates) on an unpriced
  fuel (e.g. heat network) rather than billing at a wrong default.

Pure domain — no calculator dependency; the SapResult->EnergyBreakdown adapter
is slice 3.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
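Reviewer note: the arithmetic described above, condensed into a standalone sketch so the three terms (energy, standing charges, SEG credit) can be checked by hand. Rates are copied from the Apr-Jun 2026 snapshot; the plain dicts, the `annual_bill_gbp` helper and the example home are illustrative stand-ins, not the `FuelRates` / `BillDerivation` API in this patch.

```python
from collections.abc import Sequence

# Pence values copied from fuel_rates_2026_q2.json.
UNIT_RATE_P = {"MAINS_GAS": 5.74, "ELECTRICITY": 24.67}
STANDING_P_PER_DAY = {"MAINS_GAS": 29.09, "ELECTRICITY": 57.21}
SEG_EXPORT_P_PER_KWH = 15.0


def annual_bill_gbp(lines: Sequence[tuple[str, float]], exported_kwh: float) -> float:
    """lines are (fuel, delivered kWh) pairs.

    A KeyError on an unknown fuel stands in for UnpricedFuel in this sketch.
    """
    energy_p = sum(kwh * UNIT_RATE_P[fuel] for fuel, kwh in lines)
    # One standing charge per distinct fuel actually used (a meter, not an end use).
    meters = {fuel for fuel, kwh in lines if kwh > 0}
    standing_p = sum(STANDING_P_PER_DAY[fuel] * 365.0 for fuel in meters)
    seg_p = exported_kwh * SEG_EXPORT_P_PER_KWH
    return (energy_p + standing_p - seg_p) / 100.0


# Hypothetical home: gas heating + gas hot water + electric lighting, 1,000 kWh exported.
bill = annual_bill_gbp(
    [("MAINS_GAS", 8000.0), ("MAINS_GAS", 2000.0), ("ELECTRICITY", 500.0)],
    exported_kwh=1000.0,
)
```

By hand: energy is 69,735p, standing charges are (29.09 + 57.21) × 365 = 31,499.5p, and the SEG credit is 15,000p, giving £862.345.
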
 domain/property_baseline/bill.py              | 58 +++++++++++
 domain/property_baseline/bill_derivation.py   | 71 ++++++++++++++
 .../property_baseline/test_bill_derivation.py | 95 +++++++++++++++++++
 3 files changed, 224 insertions(+)
 create mode 100644 domain/property_baseline/bill.py
 create mode 100644 domain/property_baseline/bill_derivation.py
 create mode 100644 tests/domain/property_baseline/test_bill_derivation.py

diff --git a/domain/property_baseline/bill.py b/domain/property_baseline/bill.py
new file mode 100644
index 00000000..fcc49329
--- /dev/null
+++ b/domain/property_baseline/bill.py
@@ -0,0 +1,58 @@
+from __future__ import annotations
+
+from collections.abc import Mapping, Sequence
+from dataclasses import dataclass
+from enum import Enum
+
+from domain.fuel_rates.fuel import Fuel
+
+
+class BillSection(Enum):
+    """A user-meaningful slice of the annual energy bill — the calculator's raw
+    end uses folded into the sections the UI shows (ADR-0014)."""
+
+    HEATING = "HEATING"
+    HOT_WATER = "HOT_WATER"
+    LIGHTING = "LIGHTING"
+    APPLIANCES = "APPLIANCES"
+    COOKING = "COOKING"
+    PUMPS_FANS = "PUMPS_FANS"
+
+
+@dataclass(frozen=True)
+class EnergyLine:
+    """One section's delivered energy on one fuel. A section may have more than
+    one line (e.g. gas main heating + electric secondary heating)."""
+
+    section: BillSection
+    fuel: Fuel
+    kwh: float
+
+
+@dataclass(frozen=True)
+class EnergyBreakdown:
+    """A Property's delivered energy per end use, the input to Bill Derivation —
+    produced from SAP10 Calculation in a later slice. ``exported_kwh`` is PV
+    generation exported to the grid, credited at the SEG rate."""
+
+    lines: Sequence[EnergyLine]
+    exported_kwh: float = 0.0
+
+
+@dataclass(frozen=True)
+class BillSectionCost:
+    """One section's rolled-up delivered kWh and annual cost (£)."""
+
+    kwh: float
+    cost_gbp: float
+
+
+@dataclass(frozen=True)
+class Bill:
+    """A Property's annual energy bill, composed per section plus the per-meter
+    standing charges and the SEG export credit, and the total (ADR-0014)."""
+
+    sections: Mapping[BillSection, BillSectionCost]
+    standing_charges_gbp: float
+    seg_credit_gbp: float
+    total_gbp: float
diff --git a/domain/property_baseline/bill_derivation.py b/domain/property_baseline/bill_derivation.py
new file mode 100644
index 00000000..2aceeeb3
--- /dev/null
+++ b/domain/property_baseline/bill_derivation.py
@@ -0,0 +1,71 @@
+from __future__ import annotations
+
+from collections import defaultdict
+from typing import Final
+
+from domain.fuel_rates.fuel import Fuel
+from domain.fuel_rates.fuel_rates import FuelRates
+from domain.property_baseline.bill import (
+    Bill,
+    BillSection,
+    BillSectionCost,
+    EnergyBreakdown,
+)
+
+_DAYS_PER_YEAR: Final[float] = 365.0
+_PENCE_PER_POUND: Final[float] = 100.0
+
+
+class BillDerivation:
+    """Derives a Property's annual energy Bill by pricing a delivered-energy
+    breakdown at current Fuel Rates (ADR-0014).
+
+    Each end-use line is billed at its fuel's unit rate; **standing charges are
+    added once per distinct fuel used** (a meter, not an end use — off-gas fuels
+    carry a 0 standing charge so they contribute nothing); the SEG export credit
+    is subtracted. Deterministic (ADR-0006). Raises ``UnpricedFuel`` (via
+    ``FuelRates``) on a fuel the snapshot does not price.
+    """
+
+    def __init__(self, fuel_rates: FuelRates) -> None:
+        self._rates = fuel_rates
+
+    def derive(self, breakdown: EnergyBreakdown) -> Bill:
+        section_kwh: defaultdict[BillSection, float] = defaultdict(float)
+        section_cost_p: defaultdict[BillSection, float] = defaultdict(float)
+        fuels_used: set[Fuel] = set()
+        for line in breakdown.lines:
+            section_kwh[line.section] += line.kwh
+            section_cost_p[line.section] += (
+                line.kwh * self._rates.unit_rate_p_per_kwh(line.fuel)
+            )
+            if line.kwh > 0:
+                fuels_used.add(line.fuel)
+
+        sections = {
+            section: BillSectionCost(
+                kwh=section_kwh[section], cost_gbp=section_cost_p[section] / _PENCE_PER_POUND
+            )
+            for section in section_kwh
+        }
+        standing_charges_gbp = (
+            sum(
+                (self._rates.standing_charge_p_per_day(fuel) * _DAYS_PER_YEAR for fuel in fuels_used),
+                0.0,
+            )
+            / _PENCE_PER_POUND
+        )
+        seg_credit_gbp = (
+            breakdown.exported_kwh * self._rates.seg_export_p_per_kwh / _PENCE_PER_POUND
+        )
+        total_gbp = (
+            sum((section.cost_gbp for section in sections.values()), 0.0)
+            + standing_charges_gbp
+            - seg_credit_gbp
+        )
+        return Bill(
+            sections=sections,
+            standing_charges_gbp=standing_charges_gbp,
+            seg_credit_gbp=seg_credit_gbp,
+            total_gbp=total_gbp,
+        )
diff --git a/tests/domain/property_baseline/test_bill_derivation.py b/tests/domain/property_baseline/test_bill_derivation.py
new file mode 100644
index 00000000..73239d0f
--- /dev/null
+++ b/tests/domain/property_baseline/test_bill_derivation.py
@@ -0,0 +1,95 @@
+from __future__ import annotations
+
+import pytest
+
+from domain.fuel_rates.fuel import Fuel, UnpricedFuel
+from domain.fuel_rates.fuel_rates import FuelRate, FuelRates
+from domain.property_baseline.bill import BillSection, EnergyBreakdown, EnergyLine
+from domain.property_baseline.bill_derivation import BillDerivation
+
+
+def _rates() -> FuelRates:
+    return FuelRates(
+        period="test",
+        seg_export_p_per_kwh=15.0,
+        rates={
+            Fuel.MAINS_GAS: FuelRate(unit_rate_p_per_kwh=5.74, standing_charge_p_per_day=29.09),
+            Fuel.ELECTRICITY: FuelRate(unit_rate_p_per_kwh=24.67, standing_charge_p_per_day=57.21),
+            Fuel.OIL: FuelRate(unit_rate_p_per_kwh=9.16, standing_charge_p_per_day=0.0),
+        },
+    )
+
+
+def test_derive_prices_a_single_gas_heating_line_with_its_standing_charge() -> None:
+    # Arrange — 10,000 kWh of mains-gas heating.
+    breakdown = EnergyBreakdown(
+        lines=[EnergyLine(section=BillSection.HEATING, fuel=Fuel.MAINS_GAS, kwh=10000.0)]
+    )
+    derivation = BillDerivation(_rates())
+
+    # Act
+    bill = derivation.derive(breakdown)
+
+    # Assert — heating = 10000 × 5.74p = £574; standing = 29.09p × 365 = £106.1785.
+    assert abs(bill.sections[BillSection.HEATING].cost_gbp - 574.0) <= 1e-9
+    assert abs(bill.standing_charges_gbp - 106.1785) <= 1e-9
+    assert abs(bill.total_gbp - 680.1785) <= 1e-9
+
+
+def test_two_sections_on_the_same_fuel_share_one_standing_charge() -> None:
+    # Arrange — gas heating + gas hot water are one meter, not two.
+    breakdown = EnergyBreakdown(
+        lines=[
+            EnergyLine(section=BillSection.HEATING, fuel=Fuel.MAINS_GAS, kwh=8000.0),
+            EnergyLine(section=BillSection.HOT_WATER, fuel=Fuel.MAINS_GAS, kwh=2000.0),
+        ]
+    )
+
+    # Act
+    bill = BillDerivation(_rates()).derive(breakdown)
+
+    # Assert — one gas standing charge (29.09p × 365 = £106.1785), not two.
+    assert abs(bill.standing_charges_gbp - 106.1785) <= 1e-9
+    assert abs(bill.sections[BillSection.HOT_WATER].cost_gbp - 114.8) <= 1e-9
+
+
+def test_distinct_fuels_each_add_their_own_standing_charge() -> None:
+    # Arrange — gas heating + electric lighting: two meters.
+    breakdown = EnergyBreakdown(
+        lines=[
+            EnergyLine(section=BillSection.HEATING, fuel=Fuel.MAINS_GAS, kwh=8000.0),
+            EnergyLine(section=BillSection.LIGHTING, fuel=Fuel.ELECTRICITY, kwh=500.0),
+        ]
+    )
+
+    # Act
+    bill = BillDerivation(_rates()).derive(breakdown)
+
+    # Assert — gas 29.09 + elec 57.21 = 86.30 p/day × 365 = £314.995.
+    assert abs(bill.standing_charges_gbp - 314.995) <= 1e-9
+
+
+def test_exported_pv_is_credited_at_the_seg_rate() -> None:
+    # Arrange — 1000 kWh exported at 15p, against a single gas heating line.
+    breakdown = EnergyBreakdown(
+        lines=[EnergyLine(section=BillSection.HEATING, fuel=Fuel.MAINS_GAS, kwh=10000.0)],
+        exported_kwh=1000.0,
+    )
+
+    # Act
+    bill = BillDerivation(_rates()).derive(breakdown)
+
+    # Assert — SEG credit £150 subtracted from the £680.1785 gross.
+    assert abs(bill.seg_credit_gbp - 150.0) <= 1e-9
+    assert abs(bill.total_gbp - 530.1785) <= 1e-9
+
+
+def test_an_unpriced_fuel_in_a_line_raises() -> None:
+    # Arrange — a heat-network line; the snapshot prices no heat network.
+    breakdown = EnergyBreakdown(
+        lines=[EnergyLine(section=BillSection.HEATING, fuel=Fuel.HEAT_NETWORK, kwh=5000.0)]
+    )
+
+    # Act / Assert
+    with pytest.raises(UnpricedFuel):
+        BillDerivation(_rates()).derive(breakdown)

From 5f65b9be62a6c84a97133a44851ca9452e2b9154 Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar <kconnkowlessar@gmail.com>
Date: Tue, 2 Jun 2026 09:50:10 +0000
Subject: [PATCH 07/12] feat(baseline): SAP fuel-code -> Fuel mapping for
 billing (ADR-0014)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Slice 3 of Bill Derivation. sap_code_to_fuel(code) maps a SAP 10.2 / Table 32
fuel code to the canonical billing Fuel. The mapping is bounded to the ~47
Table 32 codes: the code identifies the carrier and is orthogonal to the PCDB
product index, so all PCDB heat pumps share one electricity code. Codes are
grouped as mains gas / LPG / oil + bio-liquids / coal / smokeless / wood /
electricity (standard + off-peak) / heat network. An unmapped code (dual fuel,
grid export) raises UnmappedSapCode rather than guessing.

Also: ADR-0014 deferred/TODO section records the stubbed appliances+cooking
(pending the SapResult fields), the off-peak day/night split, the heat-network
rate gap, and regional rates / ETL.

The SapResult -> EnergyBreakdown adapter (next slice) is gated on the
appliances/cooking fields landing on SapResult.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
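Note on the table construction: merging groups with `**dict.fromkeys(...)`
keeps the last value when a code appears in two groups, so a duplicated code
would be silently reassigned rather than flagged. A minimal sketch of a
disjointness guard follows. It assumes stand-in `Fuel` and `UnmappedSapCode`
classes and a hypothetical `build_code_to_fuel` helper (the helper and the
`GROUPS` list are illustrative only and are not part of this patch); only three
of the groups are shown.

```python
from enum import Enum
from typing import Dict, Iterable, List, Tuple


class Fuel(Enum):
    """Stand-in for domain.fuel_rates.fuel.Fuel (three members only)."""

    MAINS_GAS = "mains_gas"
    ELECTRICITY = "electricity"
    HEAT_NETWORK = "heat_network"


class UnmappedSapCode(Exception):
    """Stand-in for domain.sap10_calculator.exceptions.UnmappedSapCode."""

    def __init__(self, field: str, code: int) -> None:
        super().__init__(f"unmapped {field}: {code}")


# Three of the groups from the patch, written as (codes, fuel) pairs.
GROUPS: List[Tuple[Iterable[int], Fuel]] = [
    ([1, 7], Fuel.MAINS_GAS),
    ([30], Fuel.ELECTRICITY),
    (range(41, 59), Fuel.HEAT_NETWORK),
]


def build_code_to_fuel(
    groups: Iterable[Tuple[Iterable[int], Fuel]],
) -> Dict[int, Fuel]:
    """Hypothetical helper: flatten the groups, rejecting a repeated code."""
    table: Dict[int, Fuel] = {}
    for codes, fuel in groups:
        for code in codes:
            if code in table:
                raise ValueError(f"SAP code {code} appears in two groups")
            table[code] = fuel
    return table


CODE_TO_FUEL = build_code_to_fuel(GROUPS)


def sap_code_to_fuel(code: int) -> Fuel:
    """Same lookup-or-raise shape as the function added in this patch."""
    fuel = CODE_TO_FUEL.get(code)
    if fuel is None:
        raise UnmappedSapCode("fuel_code", code)
    return fuel
```

The lookup behaviour is unchanged; the only difference is that a code listed
twice fails at import time instead of taking whichever group came last.
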
 ...14-bill-derivation-from-real-fuel-rates.md | 14 +++++++
 domain/property_baseline/sap_fuel.py          | 41 ++++++++++++++++++
 .../domain/property_baseline/test_sap_fuel.py | 42 +++++++++++++++++++
 3 files changed, 97 insertions(+)
 create mode 100644 domain/property_baseline/sap_fuel.py
 create mode 100644 tests/domain/property_baseline/test_sap_fuel.py

diff --git a/docs/adr/0014-bill-derivation-from-real-fuel-rates.md b/docs/adr/0014-bill-derivation-from-real-fuel-rates.md
index 7c033085..cf01b02a 100644
--- a/docs/adr/0014-bill-derivation-from-real-fuel-rates.md
+++ b/docs/adr/0014-bill-derivation-from-real-fuel-rates.md
@@ -79,6 +79,20 @@ production migration is FE-owned (Drizzle); `docs/migrations/` updated.
 - The snapshot goes stale on the Ofgem-cap cadence (quarterly); the file records its period, and the
   ETL that automates refresh is the deferred follow-up.
 
+## Deferred / TODO
+
+- **Appliances + cooking kWh** are computed inside `cert_to_inputs` (Appendix L L13-L20) but not
+  yet threaded onto `SapResult`. Until they are, the `SapResult` → `EnergyBreakdown` adapter
+  **stubs them at 0 kWh**, so the bill total currently understates by the unregulated electricity
+  load. Khalim is adding the fields to `SapResult` directly; the adapter wires the
+  `APPLIANCES`/`COOKING` sections in as soon as they land.
+- **Off-peak (Economy 7) day/night split** — the snapshot carries the E7 day/night rates, but
+  `FuelRates` exposes single-rate fuels only; the day/night accessor + the calculator's Table 12a
+  high/low-rate split land in a later slice.
+- **Heat-network rate model** — heat-network certs raise `UnpricedFuel` for now (the one common gap).
+- **Regional rates + Ofgem-cap ETL** — national snapshot now; both are later refinements behind the
+  same `FuelRatesRepository` port.
+
 ## Considered alternatives
 
 - **Bill from `RenewableHeatIncentive` heating+HW kWh only** (CONTEXT's original scope) — rejected:
diff --git a/domain/property_baseline/sap_fuel.py b/domain/property_baseline/sap_fuel.py
new file mode 100644
index 00000000..cd7c6efc
--- /dev/null
+++ b/domain/property_baseline/sap_fuel.py
@@ -0,0 +1,41 @@
+from __future__ import annotations
+
+from typing import Final
+
+from domain.fuel_rates.fuel import Fuel
+from domain.sap10_calculator.exceptions import UnmappedSapCode
+
+# SAP 10.2 / Table 32 fuel code -> canonical billing Fuel (ADR-0014). Bounded to
+# the ~47 Table 32 fuel codes (the keys of `table_12.UNIT_PRICE_P_PER_KWH`) — the
+# carrier, NOT the PCDB product, so a thousand PCDB heat pumps all share one code.
+# Input is a normalised Table 32 fuel code (the calculator sets `main_fuel_type`
+# to Table 32 codes); an unmapped code raises `UnmappedSapCode` rather than
+# guessing — a bounded, self-surfacing backlog [[reference-unmapped-sap-code]].
+_CODE_TO_FUEL: Final[dict[int, Fuel]] = {
+    **dict.fromkeys([1, 7], Fuel.MAINS_GAS),  # mains gas, grid biogas
+    **dict.fromkeys([2, 3, 5, 9], Fuel.LPG),
+    **dict.fromkeys([4, 71, 73, 75, 76], Fuel.OIL),  # heating oil + bio-liquids
+    **dict.fromkeys([11, 15], Fuel.COAL),  # house coal, anthracite
+    **dict.fromkeys([12], Fuel.SMOKELESS),
+    **dict.fromkeys([20, 21], Fuel.WOOD_LOGS),  # logs, chips
+    **dict.fromkeys([22, 23], Fuel.WOOD_PELLETS),
+    **dict.fromkeys([30], Fuel.ELECTRICITY),  # standard tariff
+    # 7/10/18-hour off-peak tariffs + 24-hour heating tariff — priced once the
+    # off-peak day/night slice lands; ELECTRICITY_OFF_PEAK is unpriced until then.
+    **dict.fromkeys([31, 32, 33, 34, 35, 38, 40], Fuel.ELECTRICITY_OFF_PEAK),
+    # "heat from ..." community/heat-network + distribution codes (41-58).
+    **dict.fromkeys(range(41, 59), Fuel.HEAT_NETWORK),
+}
+
+
+def sap_code_to_fuel(code: int) -> Fuel:
+    """Map a SAP 10.2 / Table 32 fuel code to its canonical billing Fuel.
+
+    Raises ``UnmappedSapCode`` on a code with no single billing carrier — e.g.
+    dual fuel (10) or the grid-export codes (36/60), which are not an end use's
+    input fuel.
+    """
+    fuel = _CODE_TO_FUEL.get(code)
+    if fuel is None:
+        raise UnmappedSapCode("fuel_code", code)
+    return fuel
diff --git a/tests/domain/property_baseline/test_sap_fuel.py b/tests/domain/property_baseline/test_sap_fuel.py
new file mode 100644
index 00000000..24dcf193
--- /dev/null
+++ b/tests/domain/property_baseline/test_sap_fuel.py
@@ -0,0 +1,42 @@
+from __future__ import annotations
+
+import pytest
+
+from domain.fuel_rates.fuel import Fuel
+from domain.property_baseline.sap_fuel import sap_code_to_fuel
+from domain.sap10_calculator.exceptions import UnmappedSapCode
+
+
+def test_mains_gas_code_maps_to_mains_gas() -> None:
+    # Arrange / Act / Assert — Table 32 code 1 is mains gas.
+    assert sap_code_to_fuel(1) == Fuel.MAINS_GAS
+
+
+@pytest.mark.parametrize(
+    ("code", "fuel"),
+    [
+        (1, Fuel.MAINS_GAS),
+        (2, Fuel.LPG),
+        (4, Fuel.OIL),
+        (76, Fuel.OIL),  # bioethanol — a liquid fuel row
+        (11, Fuel.COAL),  # house coal
+        (15, Fuel.COAL),  # anthracite
+        (12, Fuel.SMOKELESS),
+        (20, Fuel.WOOD_LOGS),
+        (23, Fuel.WOOD_PELLETS),
+        (30, Fuel.ELECTRICITY),  # standard tariff
+        (32, Fuel.ELECTRICITY_OFF_PEAK),  # 7-hour tariff
+        (41, Fuel.HEAT_NETWORK),  # heat from electric heat pump (community)
+        (50, Fuel.HEAT_NETWORK),  # electricity for distribution pumping
+    ],
+)
+def test_table_32_codes_map_to_their_billing_fuel(code: int, fuel: Fuel) -> None:
+    # Arrange / Act / Assert
+    assert sap_code_to_fuel(code) == fuel
+
+
+def test_an_unmapped_code_raises_rather_than_guessing() -> None:
+    # Arrange — code 10 (dual fuel) has no single billing fuel.
+    # Act / Assert
+    with pytest.raises(UnmappedSapCode):
+        sap_code_to_fuel(10)

From 15da2d39702bb42ef592dd8c99a55fb4cb197c6d Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar <kconnkowlessar@gmail.com>
Date: Tue, 2 Jun 2026 10:04:24 +0000
Subject: [PATCH 08/12] =?UTF-8?q?feat(baseline):=20CalculatorRebaseliner?=
 =?UTF-8?q?=20=E2=80=94=20calculator=20goes=20load-bearing=20(ADR-0013=20a?=
 =?UTF-8?q?mend)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Slice 5a: the promotion. Replaces StubRebaseliner in production and collapses the
shadow runner into the rebaseliner (ADR-0013 amendment).

- CalculatorRebaseliner runs Sap10Calculator on every Property:
  * sap_version < 10.2 -> Effective Performance IS the calculator output
    (band via Epc.from_sap_score, CO2 kg->t, PEUI rounded), reason "pre_sap10".
  * sap_version >= 10.2 (or unset) -> Effective = lodged (API figures
    on-target), and the calculator only logs divergence (SAP off by > 0.5,
    PEUI/CO2 off by > 1% relative) as a validation signal.
  * a calculator raise propagates -> batch aborts (ADR-0012); fix the cert at once.
- Rebaseliner.rebaseline gains property_id (for the divergence log).
- LoggingCalculatorShadow / the calculator_shadow seam removed from the
  orchestrator; its divergence-comparison logic now lives in the rebaseliner.
- StubRebaseliner kept (signature updated) for orchestrator/repo unit tests.

The SapResult->EnergyBreakdown adapter + BillDerivation wiring (to populate the
bill block) follow once the appliances/cooking SapResult fields land.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
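Note on the decision rule: a minimal standalone sketch of the version branch and
the divergence tolerances described above, assuming plain floats in place of
`Performance` and `SapResult`. The function names `effective_source` and
`divergent_quantities` are hypothetical and exist only for illustration; the
constants mirror `_SAP10_2_FLOOR`, `_SAP_ABS_TOL` and `_REL_TOL` in
calculator_rebaseliner.py.

```python
from typing import List, Optional, Tuple

SAP10_2_FLOOR = 10.2  # certs below this are rebaselined to the calculator
SAP_ABS_TOL = 0.5     # absolute tolerance on the SAP score
REL_TOL = 0.01        # relative tolerance on PEUI and CO2
KG_PER_TONNE = 1000.0


def relative_diff(calculated: float, lodged: float) -> float:
    """|calculated - lodged| / |lodged|, with a guard for a zero lodged value."""
    if lodged == 0:
        return 0.0 if calculated == 0 else float("inf")
    return abs(calculated - lodged) / abs(lodged)


def effective_source(sap_version: Optional[float]) -> Tuple[str, str]:
    """Return (which figures become Effective Performance, rebaseline reason)."""
    if sap_version is not None and sap_version < SAP10_2_FLOOR:
        return "calculated", "pre_sap10"
    # At or above 10.2, or with no version recorded, the lodged figures stand.
    return "lodged", "none"


def divergent_quantities(
    lodged_sap: float,
    calculated_sap: float,
    lodged_peui: float,
    calculated_peui: float,
    lodged_co2_t: float,
    calculated_co2_kg: float,
) -> List[str]:
    """Names of the quantities that would be logged as divergent."""
    out: List[str] = []
    if abs(calculated_sap - lodged_sap) > SAP_ABS_TOL:
        out.append("sap_score")
    if relative_diff(calculated_peui, lodged_peui) > REL_TOL:
        out.append("primary_energy_intensity")
    # Lodged CO2 is in tonnes/yr; the calculator reports kg/yr.
    if relative_diff(calculated_co2_kg / KG_PER_TONNE, lodged_co2_t) > REL_TOL:
        out.append("co2_emissions")
    return out
```

With lodged SAP 72 and calculated SAP 76 (other figures equal), only
`sap_score` is flagged, which matches the single warning asserted in
test_calculator_rebaseliner.py. A cert with no `sap_version` takes the lodged
branch.
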
 applications/ara_first_run/handler.py         |  11 +-
 .../calculator_rebaseliner.py                 | 113 ++++++++++++
 domain/property_baseline/calculator_shadow.py | 141 ---------------
 domain/property_baseline/rebaseliner.py       |  12 +-
 .../property_baseline_orchestrator.py         |  13 +-
 .../test_calculator_rebaseliner.py            | 134 ++++++++++++++
 .../test_calculator_shadow.py                 | 166 ------------------
 .../property_baseline/test_rebaseliner.py     |   4 +-
 tests/orchestration/fakes.py                  |  19 --
 ...test_ara_first_run_pipeline_integration.py |   2 -
 .../test_property_baseline_orchestrator.py    |  31 ----
 11 files changed, 262 insertions(+), 384 deletions(-)
 create mode 100644 domain/property_baseline/calculator_rebaseliner.py
 delete mode 100644 domain/property_baseline/calculator_shadow.py
 create mode 100644 tests/domain/property_baseline/test_calculator_rebaseliner.py
 delete mode 100644 tests/domain/property_baseline/test_calculator_shadow.py

diff --git a/applications/ara_first_run/handler.py b/applications/ara_first_run/handler.py
index 8aca4fea..e82da40f 100644
--- a/applications/ara_first_run/handler.py
+++ b/applications/ara_first_run/handler.py
@@ -10,8 +10,7 @@ from sqlmodel import Session
 from applications.ara_first_run.ara_first_run_trigger_body import (
     AraFirstRunTriggerBody,
 )
-from domain.property_baseline.calculator_shadow import LoggingCalculatorShadow
-from domain.property_baseline.rebaseliner import StubRebaseliner
+from domain.property_baseline.calculator_rebaseliner import CalculatorRebaseliner
 from domain.sap10_calculator.calculator import Sap10Calculator
 from infrastructure.postgres.config import PostgresConfig
 from infrastructure.postgres.engine import make_engine
@@ -82,10 +81,10 @@ def build_first_run_pipeline(
         ),
         baseline=PropertyBaselineOrchestrator(
             unit_of_work=unit_of_work,
-            rebaseliner=StubRebaseliner(),
-            # Shadow only: validates the calculator over the wild cohort without
-            # gating the load-bearing baseline write (ADR-0013).
-            calculator_shadow=LoggingCalculatorShadow(Sap10Calculator()),
+            # The calculator is load-bearing: effective=calculated for pre-10.2
+            # certs, lodged + divergence-logged at/above 10.2; a raise aborts the
+            # batch (ADR-0013 amendment).
+            rebaseliner=CalculatorRebaseliner(Sap10Calculator()),
         ),
         modelling=ModellingOrchestrator(
             scenario_repo=ScenarioRepository(),
diff --git a/domain/property_baseline/calculator_rebaseliner.py b/domain/property_baseline/calculator_rebaseliner.py
new file mode 100644
index 00000000..b2443784
--- /dev/null
+++ b/domain/property_baseline/calculator_rebaseliner.py
@@ -0,0 +1,113 @@
+from __future__ import annotations
+
+import logging
+from typing import TYPE_CHECKING, Optional, Protocol
+
+from datatypes.epc.domain.epc import Epc
+from domain.property_baseline.performance import Performance
+from domain.property_baseline.rebaseliner import Rebaseliner, RebaselineReason
+
+if TYPE_CHECKING:
+    from datatypes.epc.domain.epc_property_data import EpcPropertyData
+    from domain.sap10_calculator.calculator import SapResult
+
+logger = logging.getLogger(__name__)
+
+# The calculator targets SAP 10.2 (14-03-2025). A cert lodged below this carries
+# a superseded methodology and is rebaselined to the calculator's output; at or
+# above it, the API's lodged figures are kept and the calculator only validates.
+_SAP10_2_FLOOR = 10.2
+_SAP_ABS_TOL = 0.5
+_REL_TOL = 0.01
+_KG_PER_TONNE = 1000.0
+
+
+class Calculator(Protocol):
+    """The slice of `Sap10Calculator` the rebaseliner needs — `Sap10Calculator`
+    satisfies it structurally, so this module never imports it at runtime."""
+
+    def calculate(self, epc: "EpcPropertyData") -> "SapResult": ...
+
+
+def performance_from_sap_result(result: "SapResult") -> Performance:
+    """The four rated quantities, read off a `SapResult`: band derived from the
+    score, CO2 converted kg→tonnes, PEUI rounded to the lodged integer scale."""
+    return Performance(
+        sap_score=result.sap_score,
+        epc_band=Epc.from_sap_score(result.sap_score),
+        co2_emissions=result.co2_kg_per_yr / _KG_PER_TONNE,
+        primary_energy_intensity=round(result.primary_energy_kwh_per_m2),
+    )
+
+
+def _relative_diff(calculated: float, lodged: float) -> float:
+    if lodged == 0:
+        return 0.0 if calculated == 0 else float("inf")
+    return abs(calculated - lodged) / abs(lodged)
+
+
+class CalculatorRebaseliner(Rebaseliner):
+    """Produces Effective Performance from the deterministic `Sap10Calculator`
+    (ADR-0013 amendment — the calculator is load-bearing).
+
+    Runs the calculator on every Property. For a cert lodged under a superseded
+    methodology (``sap_version < 10.2``) the calculator's output **is** Effective
+    Performance. At or above 10.2 the API's lodged figures are kept and the
+    calculator only **logs divergence** (a validation signal). A calculator
+    strict-raise propagates — the batch aborts (ADR-0012) and the un-mapped cert
+    is fixed immediately.
+    """
+
+    def __init__(self, calculator: Calculator) -> None:
+        self._calculator = calculator
+
+    def rebaseline(
+        self, property_id: int, effective_epc: "EpcPropertyData", lodged: Performance
+    ) -> tuple[Performance, RebaselineReason]:
+        # A raise (UnmappedSapCode, etc.) propagates: the calculator is
+        # load-bearing, so the batch aborts and the cert is fixed at once.
+        result = self._calculator.calculate(effective_epc)
+        sap_version = effective_epc.sap_version
+        if sap_version is not None and sap_version < _SAP10_2_FLOOR:
+            return performance_from_sap_result(result), "pre_sap10"
+        self._log_divergence(
+            property_id=property_id, sap_version=sap_version, result=result, lodged=lodged
+        )
+        return lodged, "none"
+
+    def _log_divergence(
+        self,
+        *,
+        property_id: int,
+        sap_version: Optional[float],
+        result: "SapResult",
+        lodged: Performance,
+    ) -> None:
+        if abs(result.sap_score_continuous - lodged.sap_score) > _SAP_ABS_TOL:
+            self._warn(property_id, sap_version, "sap_score", lodged.sap_score, result.sap_score_continuous)
+        if _relative_diff(result.primary_energy_kwh_per_m2, lodged.primary_energy_intensity) > _REL_TOL:
+            self._warn(
+                property_id, sap_version, "primary_energy_intensity",
+                lodged.primary_energy_intensity, result.primary_energy_kwh_per_m2,
+            )
+        calculated_co2_t = result.co2_kg_per_yr / _KG_PER_TONNE
+        if _relative_diff(calculated_co2_t, lodged.co2_emissions) > _REL_TOL:
+            self._warn(property_id, sap_version, "co2_emissions", lodged.co2_emissions, calculated_co2_t)
+
+    def _warn(
+        self,
+        property_id: int,
+        sap_version: Optional[float],
+        quantity: str,
+        lodged: float,
+        calculated: float,
+    ) -> None:
+        logger.warning(
+            "SAP10 calculator divergence on %s for property_id=%s sap_version=%s: "
+            "lodged=%s calculated=%s",
+            quantity,
+            property_id,
+            sap_version,
+            lodged,
+            calculated,
+        )
diff --git a/domain/property_baseline/calculator_shadow.py b/domain/property_baseline/calculator_shadow.py
deleted file mode 100644
index ba7927d8..00000000
--- a/domain/property_baseline/calculator_shadow.py
+++ /dev/null
@@ -1,141 +0,0 @@
-from __future__ import annotations
-
-import logging
-from abc import ABC, abstractmethod
-from typing import TYPE_CHECKING, Optional, Protocol
-
-from domain.property_baseline.performance import Performance
-
-if TYPE_CHECKING:
-    from datatypes.epc.domain.epc_property_data import EpcPropertyData
-    from domain.sap10_calculator.calculator import SapResult
-
-logger = logging.getLogger(__name__)
-
-# A continuous SAP this far from the lodged integer would round to a different
-# band-driving score; PEUI / CO2 scale with dwelling size so they use a relative
-# tolerance (ADR-0013). Starting dials — tune against the wild-cohort logs.
-_SAP_ABS_TOL = 0.5
-_REL_TOL = 0.01
-_KG_PER_TONNE = 1000.0
-
-
-class CalculatorShadow(ABC):
-    """Runs SAP10 Calculation in shadow beside the load-bearing baseline write
-    and reports divergence from Lodged Performance (ADR-0013).
-
-    The calculator is not yet load-bearing — it is still being hardened, and a
-    large test cohort is about to flow through baselining. So an implementation
-    **must never raise**: a shadow failure may not abort the batch (ADR-0012's
-    all-or-nothing governs only the load-bearing Lodged/Effective write). It
-    observes, compares against Lodged, and logs; it does not feed Effective
-    Performance. The seam is retired when the calculator is promoted to the
-    Rebaseliner and its output *becomes* Effective Performance.
-    """
-
-    @abstractmethod
-    def observe(
-        self,
-        *,
-        property_id: int,
-        effective_epc: "EpcPropertyData",
-        lodged: Performance,
-    ) -> None: ...
-
-
-def _relative_diff(calculated: float, lodged: float) -> float:
-    """|calculated − lodged| / |lodged|; a zero lodged value diverges iff
-    calculated is non-zero (avoids a divide-by-zero on degenerate certs)."""
-    if lodged == 0:
-        return 0.0 if calculated == 0 else float("inf")
-    return abs(calculated - lodged) / abs(lodged)
-
-
-class Calculator(Protocol):
-    """The slice of `Sap10Calculator` the shadow needs: cert in, result out.
-    `Sap10Calculator` satisfies it structurally — no coupling to its module."""
-
-    def calculate(self, epc: "EpcPropertyData") -> "SapResult": ...
-
-
-class LoggingCalculatorShadow(CalculatorShadow):
-    """Runs the calculator and logs, never persists, never raises (ADR-0013).
-
-    A strict-raise (an un-mapped cert) is caught and logged at ``error`` so the
-    wild-cohort gap is greppable; a successful result whose SAP / PEUI / CO2
-    diverges from Lodged beyond tolerance is logged at ``warning``. Every line
-    is tagged with ``property_id`` and the cert's ``sap_version`` so SAP-10.2
-    divergence (a real calculator signal) is separable from older-spec drift.
-    """
-
-    def __init__(self, calculator: Calculator) -> None:
-        self._calculator = calculator
-
-    def observe(
-        self,
-        *,
-        property_id: int,
-        effective_epc: "EpcPropertyData",
-        lodged: Performance,
-    ) -> None:
-        sap_version = effective_epc.sap_version
-        try:
-            # Broad by design: the point is to discover *what* breaks in the
-            # wild, and a shadow failure must never abort the batch (ADR-0013).
-            result = self._calculator.calculate(effective_epc)
-        except Exception as exc:
-            logger.error(
-                "SAP10 shadow calculation failed for property_id=%s "
-                "sap_version=%s: %r",
-                property_id,
-                sap_version,
-                exc,
-            )
-            return
-        if abs(result.sap_score_continuous - lodged.sap_score) > _SAP_ABS_TOL:
-            self._warn_divergence(
-                quantity="sap_score",
-                property_id=property_id,
-                sap_version=sap_version,
-                lodged=lodged.sap_score,
-                calculated=result.sap_score_continuous,
-            )
-        if _relative_diff(
-            result.primary_energy_kwh_per_m2, lodged.primary_energy_intensity
-        ) > _REL_TOL:
-            self._warn_divergence(
-                quantity="primary_energy_intensity",
-                property_id=property_id,
-                sap_version=sap_version,
-                lodged=lodged.primary_energy_intensity,
-                calculated=result.primary_energy_kwh_per_m2,
-            )
-        # Lodged CO2 is tonnes/yr; the calculator emits kg/yr (ADR-0013).
-        calculated_co2_t = result.co2_kg_per_yr / _KG_PER_TONNE
-        if _relative_diff(calculated_co2_t, lodged.co2_emissions) > _REL_TOL:
-            self._warn_divergence(
-                quantity="co2_emissions",
-                property_id=property_id,
-                sap_version=sap_version,
-                lodged=lodged.co2_emissions,
-                calculated=calculated_co2_t,
-            )
-
-    def _warn_divergence(
-        self,
-        *,
-        quantity: str,
-        property_id: int,
-        sap_version: Optional[float],
-        lodged: float,
-        calculated: float,
-    ) -> None:
-        logger.warning(
-            "SAP10 shadow divergence on %s for property_id=%s sap_version=%s: "
-            "lodged=%s calculated=%s",
-            quantity,
-            property_id,
-            sap_version,
-            lodged,
-            calculated,
-        )
diff --git a/domain/property_baseline/rebaseliner.py b/domain/property_baseline/rebaseliner.py
index a80552ea..2fd60df9 100644
--- a/domain/property_baseline/rebaseliner.py
+++ b/domain/property_baseline/rebaseliner.py
@@ -36,20 +36,22 @@ class Rebaseliner(ABC):
 
     @abstractmethod
     def rebaseline(
-        self, effective_epc: EpcPropertyData, lodged: Performance
+        self, property_id: int, effective_epc: EpcPropertyData, lodged: Performance
     ) -> tuple[Performance, RebaselineReason]: ...
 
 
 class StubRebaseliner(Rebaseliner):
-    """The no-ML stub for the validation phase.
+    """A no-calculator stub for tests that don't want the real calculator.
 
     SAP10 certs pass through untouched — Effective Performance equals Lodged,
-    reason ``"none"``. A pre-SAP10 cert genuinely needs ML rebaselining, which is
-    not implemented yet (#1135), so it raises rather than fabricating a "none".
+    reason ``"none"``. A pre-SAP10 cert genuinely needs rebaselining, which this
+    stub does not do, so it raises rather than fabricating a "none". Production
+    uses ``CalculatorRebaseliner`` (the calculator is load-bearing — ADR-0013
+    amendment); this stub stays for orchestrator/repo unit tests.
     """
 
     def rebaseline(
-        self, effective_epc: EpcPropertyData, lodged: Performance
+        self, property_id: int, effective_epc: EpcPropertyData, lodged: Performance
     ) -> tuple[Performance, RebaselineReason]:
         sap_version = effective_epc.sap_version
         if sap_version is not None and sap_version < _SAP10_FLOOR:
diff --git a/orchestration/property_baseline_orchestrator.py b/orchestration/property_baseline_orchestrator.py
index 119889bd..bf82a514 100644
--- a/orchestration/property_baseline_orchestrator.py
+++ b/orchestration/property_baseline_orchestrator.py
@@ -6,7 +6,6 @@ from datatypes.epc.domain.epc_property_data import (
     EpcPropertyData,
     RenewableHeatIncentive,
 )
-from domain.property_baseline.calculator_shadow import CalculatorShadow
 from domain.property_baseline.property_baseline_performance import PropertyBaselinePerformance
 from domain.property_baseline.performance import lodged_performance
 from domain.property_baseline.rebaseliner import Rebaseliner
@@ -33,11 +32,9 @@ class PropertyBaselineOrchestrator:
         *,
         unit_of_work: Callable[[], UnitOfWork],
         rebaseliner: Rebaseliner,
-        calculator_shadow: CalculatorShadow,
     ) -> None:
         self._unit_of_work = unit_of_work
         self._rebaseliner = rebaseliner
-        self._calculator_shadow = calculator_shadow
 
     def run(self, property_ids: list[int]) -> None:
         with self._unit_of_work() as uow:
@@ -46,7 +43,7 @@ class PropertyBaselineOrchestrator:
                 effective_epc = prop.effective_epc
                 lodged = lodged_performance(effective_epc)
                 effective, reason = self._rebaseliner.rebaseline(
-                    effective_epc, lodged
+                    property_id, effective_epc, lodged
                 )
                 rhi = _require_rhi(effective_epc)
                 baseline = PropertyBaselinePerformance(
@@ -57,14 +54,6 @@ class PropertyBaselineOrchestrator:
                     water_heating_kwh=rhi.water_heating_kwh,
                 )
                 uow.property_baseline.save(baseline, property_id)
-                # Shadow only: validate the calculator in the wild without
-                # gating the load-bearing write above (ADR-0013). `observe`
-                # never raises, so it cannot abort the batch.
-                self._calculator_shadow.observe(
-                    property_id=property_id,
-                    effective_epc=effective_epc,
-                    lodged=lodged,
-                )
             uow.commit()
 
 
diff --git a/tests/domain/property_baseline/test_calculator_rebaseliner.py b/tests/domain/property_baseline/test_calculator_rebaseliner.py
new file mode 100644
index 00000000..ea1230fc
--- /dev/null
+++ b/tests/domain/property_baseline/test_calculator_rebaseliner.py
@@ -0,0 +1,134 @@
+from __future__ import annotations
+
+import logging
+from typing import Optional
+
+import pytest
+
+from datatypes.epc.domain.epc import Epc
+from datatypes.epc.domain.epc_property_data import EpcPropertyData
+from domain.property_baseline.calculator_rebaseliner import CalculatorRebaseliner
+from domain.property_baseline.performance import Performance
+from domain.sap10_calculator.calculator import SapResult
+from domain.sap10_calculator.exceptions import UnmappedSapCode
+
+
+def _epc(*, sap_version: Optional[float]) -> EpcPropertyData:
+    epc = object.__new__(EpcPropertyData)
+    epc.sap_version = sap_version
+    return epc
+
+
+def _lodged() -> Performance:
+    return Performance(
+        sap_score=72, epc_band=Epc.C, co2_emissions=1.8, primary_energy_intensity=180
+    )
+
+
+def _sap_result(
+    *,
+    sap_score: int = 72,
+    co2_kg_per_yr: float = 1800.0,
+    primary_energy_kwh_per_m2: float = 180.0,
+) -> SapResult:
+    return SapResult(
+        sap_score=sap_score,
+        sap_score_continuous=float(sap_score),
+        ecf=0.0,
+        total_fuel_cost_gbp=0.0,
+        co2_kg_per_yr=co2_kg_per_yr,
+        space_heating_kwh_per_yr=0.0,
+        space_cooling_kwh_per_yr=0.0,
+        fabric_energy_efficiency_kwh_per_m2_yr=0.0,
+        main_heating_fuel_kwh_per_yr=0.0,
+        main_2_heating_fuel_kwh_per_yr=0.0,
+        secondary_heating_fuel_kwh_per_yr=0.0,
+        space_cooling_fuel_kwh_per_yr=0.0,
+        hot_water_kwh_per_yr=0.0,
+        pumps_fans_kwh_per_yr=0.0,
+        lighting_kwh_per_yr=0.0,
+        primary_energy_kwh_per_yr=0.0,
+        primary_energy_kwh_per_m2=primary_energy_kwh_per_m2,
+        monthly=(),
+        intermediate={},
+    )
+
+
+class _StubCalculator:
+    def __init__(self, result: SapResult) -> None:
+        self._result = result
+
+    def calculate(self, epc: EpcPropertyData) -> SapResult:
+        return self._result
+
+
+def test_pre_10_2_cert_is_rebaselined_to_the_calculator_output() -> None:
+    # Arrange — a SAP 10.0 cert: lodged figures are a superseded methodology, so
+    # the calculator's output becomes Effective Performance (ADR-0013 amendment).
+    calculator = _StubCalculator(
+        _sap_result(sap_score=70, co2_kg_per_yr=1900.0, primary_energy_kwh_per_m2=185.4)
+    )
+    rebaseliner = CalculatorRebaseliner(calculator)
+    epc = _epc(sap_version=10.0)
+
+    # Act
+    effective, reason = rebaseliner.rebaseline(
+        property_id=10, effective_epc=epc, lodged=_lodged()
+    )
+
+    # Assert — calculated Performance: band from the score, CO2 kg->t, PEUI rounded.
+    assert effective == Performance(
+        sap_score=70, epc_band=Epc.C, co2_emissions=1.9, primary_energy_intensity=185
+    )
+    assert reason == "pre_sap10"
+
+
+def test_a_10_2_cert_keeps_the_lodged_figures() -> None:
+    # Arrange — a SAP 10.2 cert: the API's lodged figures are on-target, so they
+    # stand; the calculator runs only to validate.
+    calculator = _StubCalculator(_sap_result(sap_score=72))
+    rebaseliner = CalculatorRebaseliner(calculator)
+    epc = _epc(sap_version=10.2)
+
+    # Act
+    effective, reason = rebaseliner.rebaseline(
+        property_id=10, effective_epc=epc, lodged=_lodged()
+    )
+
+    # Assert
+    assert effective == _lodged()
+    assert reason == "none"
+
+
+def test_a_10_2_cert_logs_divergence_when_the_calculator_disagrees(
+    caplog: pytest.LogCaptureFixture,
+) -> None:
+    # Arrange — calculated SAP 76 vs lodged 72 (> 0.5 out) on a 10.2 cert.
+    calculator = _StubCalculator(_sap_result(sap_score=76))
+    rebaseliner = CalculatorRebaseliner(calculator)
+    epc = _epc(sap_version=10.2)
+
+    # Act
+    with caplog.at_level(logging.WARNING):
+        rebaseliner.rebaseline(property_id=42, effective_epc=epc, lodged=_lodged())
+
+    # Assert — a divergence warning, tagged with property_id + sap_version.
+    assert len(caplog.records) == 1
+    message = caplog.records[0].getMessage()
+    assert "sap_score" in message
+    assert "property_id=42" in message
+    assert "sap_version=10.2" in message
+
+
+def test_a_calculator_raise_propagates_and_aborts() -> None:
+    # Arrange — the calculator is load-bearing, so a raise is not swallowed.
+    class _Raising:
+        def calculate(self, epc: EpcPropertyData) -> SapResult:
+            raise UnmappedSapCode("heat_emitter_type", 99)
+
+    rebaseliner = CalculatorRebaseliner(_Raising())
+    epc = _epc(sap_version=10.0)
+
+    # Act / Assert
+    with pytest.raises(UnmappedSapCode):
+        rebaseliner.rebaseline(property_id=10, effective_epc=epc, lodged=_lodged())
diff --git a/tests/domain/property_baseline/test_calculator_shadow.py b/tests/domain/property_baseline/test_calculator_shadow.py
deleted file mode 100644
index 81718b72..00000000
--- a/tests/domain/property_baseline/test_calculator_shadow.py
+++ /dev/null
@@ -1,166 +0,0 @@
-from __future__ import annotations
-
-import logging
-from typing import Optional
-
-import pytest
-
-from datatypes.epc.domain.epc import Epc
-from datatypes.epc.domain.epc_property_data import EpcPropertyData
-from domain.property_baseline.calculator_shadow import LoggingCalculatorShadow
-from domain.property_baseline.performance import Performance
-from domain.sap10_calculator.calculator import SapResult
-from domain.sap10_calculator.exceptions import UnmappedSapCode
-
-
-def _epc(*, sap_version: Optional[float]) -> EpcPropertyData:
-    epc = object.__new__(EpcPropertyData)
-    epc.sap_version = sap_version
-    return epc
-
-
-def _lodged() -> Performance:
-    return Performance(
-        sap_score=72, epc_band=Epc.C, co2_emissions=1.8, primary_energy_intensity=180
-    )
-
-
-def _sap_result(
-    *,
-    sap_score_continuous: float = 72.0,
-    primary_energy_kwh_per_m2: float = 180.0,
-    co2_kg_per_yr: float = 1800.0,
-) -> SapResult:
-    """A `SapResult` whose three compared quantities default to *matching*
-    `_lodged()`; each test perturbs one axis."""
-    return SapResult(
-        sap_score=round(sap_score_continuous),
-        sap_score_continuous=sap_score_continuous,
-        ecf=0.0,
-        total_fuel_cost_gbp=0.0,
-        co2_kg_per_yr=co2_kg_per_yr,
-        space_heating_kwh_per_yr=0.0,
-        space_cooling_kwh_per_yr=0.0,
-        fabric_energy_efficiency_kwh_per_m2_yr=0.0,
-        main_heating_fuel_kwh_per_yr=0.0,
-        main_2_heating_fuel_kwh_per_yr=0.0,
-        secondary_heating_fuel_kwh_per_yr=0.0,
-        space_cooling_fuel_kwh_per_yr=0.0,
-        hot_water_kwh_per_yr=0.0,
-        pumps_fans_kwh_per_yr=0.0,
-        lighting_kwh_per_yr=0.0,
-        primary_energy_kwh_per_yr=0.0,
-        primary_energy_kwh_per_m2=primary_energy_kwh_per_m2,
-        monthly=(),
-        intermediate={},
-    )
-
-
-class _RaisingCalculator:
-    def calculate(self, epc: EpcPropertyData) -> SapResult:
-        raise UnmappedSapCode("heat_emitter_type", 99)
-
-
-class _StubCalculator:
-    def __init__(self, result: SapResult) -> None:
-        self._result = result
-
-    def calculate(self, epc: EpcPropertyData) -> SapResult:
-        return self._result
-
-
-def test_observe_swallows_a_calculator_raise_and_logs_error(
-    caplog: pytest.LogCaptureFixture,
-) -> None:
-    # Arrange — the calculator strict-raises on a cert it cannot yet map.
-    shadow = LoggingCalculatorShadow(_RaisingCalculator())
-    epc = _epc(sap_version=10.2)
-
-    # Act — observe must not propagate the raise (ADR-0013: shadow is not
-    # load-bearing, so it cannot abort the batch).
-    with caplog.at_level(logging.ERROR):
-        shadow.observe(property_id=42, effective_epc=epc, lodged=_lodged())
-
-    # Assert — exactly one error record, tagged with property_id + sap_version
-    # and carrying the exception so the wild-cohort gap is greppable.
-    assert len(caplog.records) == 1
-    message = caplog.records[0].getMessage()
-    assert caplog.records[0].levelno == logging.ERROR
-    assert "property_id=42" in message
-    assert "sap_version=10.2" in message
-    assert "heat_emitter_type" in message
-
-
-def test_observe_warns_when_sap_diverges_beyond_half_a_point(
-    caplog: pytest.LogCaptureFixture,
-) -> None:
-    # Arrange — calculated SAP 75.0 vs lodged 72 is 3.0 out (> 0.5).
-    shadow = LoggingCalculatorShadow(
-        _StubCalculator(_sap_result(sap_score_continuous=75.0))
-    )
-    epc = _epc(sap_version=10.2)
-
-    # Act
-    with caplog.at_level(logging.WARNING):
-        shadow.observe(property_id=42, effective_epc=epc, lodged=_lodged())
-
-    # Assert — one warning, naming the diverging quantity + the tags.
-    assert len(caplog.records) == 1
-    message = caplog.records[0].getMessage()
-    assert caplog.records[0].levelno == logging.WARNING
-    assert "sap_score" in message
-    assert "property_id=42" in message
-    assert "sap_version=10.2" in message
-
-
-def test_observe_warns_when_peui_diverges_beyond_one_percent(
-    caplog: pytest.LogCaptureFixture,
-) -> None:
-    # Arrange — calculated PEUI 200 vs lodged 180 is ~11% out (> 1%).
-    shadow = LoggingCalculatorShadow(
-        _StubCalculator(_sap_result(primary_energy_kwh_per_m2=200.0))
-    )
-    epc = _epc(sap_version=10.2)
-
-    # Act
-    with caplog.at_level(logging.WARNING):
-        shadow.observe(property_id=42, effective_epc=epc, lodged=_lodged())
-
-    # Assert
-    assert len(caplog.records) == 1
-    assert "primary_energy_intensity" in caplog.records[0].getMessage()
-
-
-def test_observe_warns_when_co2_diverges_beyond_one_percent_after_kg_to_tonnes(
-    caplog: pytest.LogCaptureFixture,
-) -> None:
-    # Arrange — calculator emits kg/yr; 2000 kg = 2.0 t vs lodged 1.8 t (~11%).
-    shadow = LoggingCalculatorShadow(
-        _StubCalculator(_sap_result(co2_kg_per_yr=2000.0))
-    )
-    epc = _epc(sap_version=10.2)
-
-    # Act
-    with caplog.at_level(logging.WARNING):
-        shadow.observe(property_id=42, effective_epc=epc, lodged=_lodged())
-
-    # Assert — the kg→tonnes conversion is applied before comparison, so a
-    # matching 1800 kg would *not* fire (guarded by the silent-when-aligned test).
-    assert len(caplog.records) == 1
-    assert "co2_emissions" in caplog.records[0].getMessage()
-
-
-def test_observe_is_silent_when_the_calculator_agrees_with_lodged(
-    caplog: pytest.LogCaptureFixture,
-) -> None:
-    # Arrange — all three quantities at the matching defaults (SAP 72, PEUI 180,
-    # 1800 kg ≡ 1.8 t): nothing should be logged.
-    shadow = LoggingCalculatorShadow(_StubCalculator(_sap_result()))
-    epc = _epc(sap_version=10.2)
-
-    # Act
-    with caplog.at_level(logging.WARNING):
-        shadow.observe(property_id=42, effective_epc=epc, lodged=_lodged())
-
-    # Assert
-    assert caplog.records == []
diff --git a/tests/domain/property_baseline/test_rebaseliner.py b/tests/domain/property_baseline/test_rebaseliner.py
index 8f669aed..f760dbf0 100644
--- a/tests/domain/property_baseline/test_rebaseliner.py
+++ b/tests/domain/property_baseline/test_rebaseliner.py
@@ -29,7 +29,7 @@ def test_sap10_epc_is_not_rebaselined_so_effective_equals_lodged() -> None:
     rebaseliner = StubRebaseliner()
 
     # Act
-    effective, reason = rebaseliner.rebaseline(epc, lodged)
+    effective, reason = rebaseliner.rebaseline(10, epc, lodged)
 
     # Assert — Effective Performance equals Lodged, reason "none".
     assert effective == lodged
@@ -45,4 +45,4 @@ def test_pre_sap10_epc_raises_because_rebaselining_is_not_implemented() -> None:
 
     # Act / Assert
     with pytest.raises(RebaselineNotImplemented):
-        rebaseliner.rebaseline(epc, _lodged())
+        rebaseliner.rebaseline(10, epc, _lodged())
diff --git a/tests/orchestration/fakes.py b/tests/orchestration/fakes.py
index 23b1fc90..3e2feef0 100644
--- a/tests/orchestration/fakes.py
+++ b/tests/orchestration/fakes.py
@@ -10,8 +10,6 @@ from types import TracebackType
 from typing import Any, Optional
 
 from datatypes.epc.domain.epc_property_data import EpcPropertyData
-from domain.property_baseline.calculator_shadow import CalculatorShadow
-from domain.property_baseline.performance import Performance
 from domain.property_baseline.property_baseline_performance import PropertyBaselinePerformance
 from domain.property.properties import Properties
 from domain.property.property import Property
@@ -90,23 +88,6 @@ class FakePropertyBaselineRepo(PropertyBaselineRepository):
         raise NotImplementedError
 
 
-class FakeCalculatorShadow(CalculatorShadow):
-    """Records each `observe` call so a test can assert the orchestrator runs
-    the shadow per property without dragging in the real calculator."""
-
-    def __init__(self) -> None:
-        self.observed: list[tuple[int, EpcPropertyData, Performance]] = []
-
-    def observe(
-        self,
-        *,
-        property_id: int,
-        effective_epc: EpcPropertyData,
-        lodged: Performance,
-    ) -> None:
-        self.observed.append((property_id, effective_epc, lodged))
-
-
 class FakeUnitOfWork(UnitOfWork):
     """A unit that holds in-memory repos and counts commits."""
 
diff --git a/tests/orchestration/test_ara_first_run_pipeline_integration.py b/tests/orchestration/test_ara_first_run_pipeline_integration.py
index 357ea7f2..e60ac716 100644
--- a/tests/orchestration/test_ara_first_run_pipeline_integration.py
+++ b/tests/orchestration/test_ara_first_run_pipeline_integration.py
@@ -36,7 +36,6 @@ from repositories.geospatial.geospatial_repository import GeospatialRepository
 from repositories.materials.materials_repository import MaterialsRepository
 from repositories.postgres_unit_of_work import PostgresUnitOfWork
 from repositories.scenario.scenario_repository import ScenarioRepository
-from tests.orchestration.fakes import FakeCalculatorShadow
 
 _JSON_SAMPLES = Path(__file__).resolve().parents[2] / "backend/epc_api/json_samples"
 
@@ -114,7 +113,6 @@ def test_first_run_baselines_through_repos_and_is_idempotent_on_rerun(
         baseline=PropertyBaselineOrchestrator(
             unit_of_work=unit_of_work,
             rebaseliner=StubRebaseliner(),
-            calculator_shadow=FakeCalculatorShadow(),
         ),
         modelling=ModellingOrchestrator(
             scenario_repo=ScenarioRepository(),
diff --git a/tests/orchestration/test_property_baseline_orchestrator.py b/tests/orchestration/test_property_baseline_orchestrator.py
index b14574f0..12c3d660 100644
--- a/tests/orchestration/test_property_baseline_orchestrator.py
+++ b/tests/orchestration/test_property_baseline_orchestrator.py
@@ -13,7 +13,6 @@ from domain.property_baseline.rebaseliner import RebaselineNotImplemented, StubR
 from domain.property.property import Property, PropertyIdentity
 from orchestration.property_baseline_orchestrator import PropertyBaselineOrchestrator
 from tests.orchestration.fakes import (
-    FakeCalculatorShadow,
     FakePropertyBaselineRepo,
     FakePropertyRepo,
     FakeUnitOfWork,
@@ -38,34 +37,6 @@ def _property(*, sap_version: float) -> Property:
     )
 
 
-def test_run_invokes_the_calculator_shadow_per_property_and_still_persists() -> None:
-    # Arrange
-    property_baseline_repo = FakePropertyBaselineRepo()
-    shadow = FakeCalculatorShadow()
-    prop = _property(sap_version=10.2)
-    uow = FakeUnitOfWork(
-        property=FakePropertyRepo({10: prop}),
-        property_baseline=property_baseline_repo,
-    )
-    orchestrator = PropertyBaselineOrchestrator(
-        unit_of_work=lambda: uow,
-        rebaseliner=StubRebaseliner(),
-        calculator_shadow=shadow,
-    )
-
-    # Act
-    orchestrator.run([10])
-
-    # Assert — the load-bearing write + single commit are unchanged, and the
-    # shadow observed the Effective EPC + Lodged Performance once (ADR-0013).
-    lodged = Performance(
-        sap_score=72, epc_band=Epc.C, co2_emissions=1.8, primary_energy_intensity=180
-    )
-    assert len(property_baseline_repo.saved) == 1
-    assert uow.commits == 1
-    assert shadow.observed == [(10, prop.effective_epc, lodged)]
-
-
 def test_run_establishes_persists_and_commits_the_batch_once() -> None:
     # Arrange
     property_baseline_repo = FakePropertyBaselineRepo()
@@ -76,7 +47,6 @@ def test_run_establishes_persists_and_commits_the_batch_once() -> None:
     orchestrator = PropertyBaselineOrchestrator(
         unit_of_work=lambda: uow,
         rebaseliner=StubRebaseliner(),
-        calculator_shadow=FakeCalculatorShadow(),
     )
 
     # Act
@@ -112,7 +82,6 @@ def test_run_raises_on_a_pre_sap10_property_and_does_not_commit() -> None:
     orchestrator = PropertyBaselineOrchestrator(
         unit_of_work=lambda: uow,
         rebaseliner=StubRebaseliner(),
-        calculator_shadow=FakeCalculatorShadow(),
     )
 
     # Act / Assert — the raise propagates; the batch is neither persisted nor

From 2c8c299fde6c6ea142f365fc7ff2b33404c98d32 Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar <kconnkowlessar@gmail.com>
Date: Tue, 2 Jun 2026 10:13:23 +0000
Subject: [PATCH 09/12] docs(migration): add the Bill Derivation block to the
 property_baseline table (ADR-0014)

Slice 5b: update the FE-owned migration spec so the other repo can create the
bill columns in parallel.

- Bill block: per-section delivered kWh + cost (heating, hot water, lighting,
  appliances, cooking, pumps/fans, cooling) + standing_charges_gbp,
  seg_credit_gbp, total_annual_bill_gbp, fuel_rates_period.
- space_heating_kwh / water_heating_kwh (RHI recorded demand) marked SUPERSEDED
  by heating_kwh / hot_water_kwh (calculator delivered fuel); kept until the bill
  populates, then dropped.
- Cooling section kept (mostly 0 but affects the bill, cheap to store).
- Records the calculator-load-bearing posture (effective_* may differ from
  lodged_* for pre-10.2) and that columns are defined now / populated when the
  SapResult->EnergyBreakdown adapter + BillDerivation wiring land.
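
A minimal sketch of how the total composes from the block above (section
costs + standing charges - SEG credit). `BillSection` and
`total_annual_bill_gbp` are hypothetical names for illustration, not the
BillDerivation API, and the figures are made up rather than real Fuel Rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BillSection:
    """One end use: delivered fuel kWh and its priced cost (hypothetical type)."""

    kwh: float
    cost_gbp: float


def total_annual_bill_gbp(
    sections: dict[str, BillSection],
    standing_charges_gbp: float,
    seg_credit_gbp: float,
) -> float:
    """Sum of section costs, plus standing charges, minus the SEG export credit."""
    section_total = sum(section.cost_gbp for section in sections.values())
    return section_total + standing_charges_gbp - seg_credit_gbp


# Illustrative figures only. Cooling is usually 0 but still counts toward the sum.
sections = {
    "heating": BillSection(kwh=9000.0, cost_gbp=630.0),
    "hot_water": BillSection(kwh=2000.0, cost_gbp=140.0),
    "lighting": BillSection(kwh=300.0, cost_gbp=75.0),
    "cooling": BillSection(kwh=0.0, cost_gbp=0.0),
}
total = total_annual_bill_gbp(sections, standing_charges_gbp=300.0, seg_credit_gbp=45.0)
```

Keeping kWh and cost together per section is one way to let the same
structure feed both the `*_kwh` and `*_cost_gbp` columns.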

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
 .../property-baseline-performance-table.md    | 50 +++++++++++++++----
 1 file changed, 39 insertions(+), 11 deletions(-)

diff --git a/docs/migrations/property-baseline-performance-table.md b/docs/migrations/property-baseline-performance-table.md
index 33e2171a..d4846843 100644
--- a/docs/migrations/property-baseline-performance-table.md
+++ b/docs/migrations/property-baseline-performance-table.md
@@ -27,17 +27,45 @@ straight lift-and-shift of the columns below.
 | `effective_co2_emissions_t_per_yr` | float | tonnes CO₂/yr (whole dwelling) |
 | `effective_primary_energy_intensity_kwh_per_m2_yr` | int | kWh/m²/yr |
 | `rebaseline_reason` | text | `none` \| `pre_sap10` \| `physical_state_changed` \| `both` |
-| `space_heating_kwh` | float | off `renewable_heat_incentive`; deterministic (ADR-0006) |
-| `water_heating_kwh` | float | off `renewable_heat_incentive` |
+| `space_heating_kwh` | float | EPC `renewable_heat_incentive` recorded demand. **Superseded** by `heating_kwh` (delivered) when the bill block populates; kept until then to avoid an empty-kWh gap, dropped in the population slice. |
+| `water_heating_kwh` | float | EPC `renewable_heat_incentive`; **superseded** by `hot_water_kwh`. |
 
-This slice has no ML rebaselining, so `effective_* == lodged_*` and `rebaseline_reason = 'none'`
-for every row written (a pre-SAP10 cert raises rather than persisting a wrong-but-plausible row —
-see #1135). The `effective_*` columns exist now so the table shape is stable when ML lands.
+### Bill block (ADR-0014) — the energy bill, composed per section
 
-## Deferred (follow-up — EPC Energy Derivation + Fuel Rates)
+Produced by **Bill Derivation**: the calculator's **delivered** kWh per end use priced at current
+**Fuel Rates** (a committed snapshot, not SAP's standardised prices), per section + the total.
+Per-section kWh is *delivered fuel* (demand ÷ efficiency — what the household pays for), distinct
+from the recorded-demand `space_heating_kwh`/`water_heating_kwh` above which it supersedes.
 
-`fuel_split` and `bills` are **not** in this table yet. They are produced by
-`EpcEnergyDerivationService`, which needs a current **Fuel Rates** source (Ofgem-cap ETL) that does
-not exist yet. They land together in the follow-up so this table is not migrated twice. Likely
-shape: a `bills`-style block (per-fuel kWh + standing charge + SEG) — to be specified in that
-slice's migration note.
+| Column | Type | Notes |
+|---|---|---|
+| `fuel_rates_period` | text | which Fuel Rates snapshot priced this bill (e.g. `"2026-04 to 2026-06"`) — provenance |
+| `heating_kwh` | float | delivered fuel kWh (main + secondary heating) |
+| `heating_cost_gbp` | float | priced at the heating fuel's current rate |
+| `hot_water_kwh` | float | |
+| `hot_water_cost_gbp` | float | |
+| `lighting_kwh` | float | |
+| `lighting_cost_gbp` | float | |
+| `appliances_kwh` | float | unregulated load — **0 until the appliances/cooking fields land on `SapResult`** (ADR-0014 TODO) |
+| `appliances_cost_gbp` | float | |
+| `cooking_kwh` | float | unregulated load — 0 until `SapResult` carries it |
+| `cooking_cost_gbp` | float | |
+| `pumps_fans_kwh` | float | |
+| `pumps_fans_cost_gbp` | float | |
+| `cooling_kwh` | float | mostly 0 in UK homes; carried for completeness as it affects the bill |
+| `cooling_cost_gbp` | float | |
+| `standing_charges_gbp` | float | daily standing charge × 365, once per distinct metered fuel (off-gas fuels have none) |
+| `seg_credit_gbp` | float | SEG export credit on PV (subtracted) |
+| `total_annual_bill_gbp` | float | Σ section costs + standing charges − SEG |
+
+The calculator is **load-bearing** (ADR-0013 amendment): for `sap_version < 10.2` the `effective_*`
+columns hold the calculator's output (so `effective_* != lodged_*` legitimately); at/above 10.2 they
+mirror the lodged figures and divergence is logged. A cert the calculator cannot score aborts the
+batch rather than persisting a wrong row.
+
+### Population timing
+
+The bill columns are **defined now so the FE can create them**, but are populated only once the
+`SapResult` → `EnergyBreakdown` adapter + `BillDerivation` wiring land (gated on the appliances /
+cooking `SapResult` fields). Until then the SQLModel mirror in `infrastructure/postgres/` adds these
+columns as nullable; the Drizzle migration can create them nullable in parallel.

From bce4a9f7ec6e23626034b5c2f057dc8039bccec4 Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar <kconnkowlessar@gmail.com>
Date: Tue, 2 Jun 2026 13:45:48 +0000
Subject: [PATCH 10/12] refactor(baseline): SapCalculator ABC replaces the
 Calculator Protocol
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

PR feedback: prefer an abstract base the calculator inherits from over a
structural Protocol. Define `SapCalculator(ABC)` in the calculator package
(the engine owns its own contract) and have `Sap10Calculator` inherit it;
a future methodology is another subclass. Placing the ABC with the engine —
not in property_baseline — keeps the dependency pointing consumer -> engine
(sap10_calculator imports nothing from property_baseline). Consistent with
the repo's existing port convention (FuelRatesRepository(ABC)).

CalculatorRebaseliner keeps its reference to SapCalculator type-only (under
TYPE_CHECKING), so the module still does not import the calculator at
runtime. Test fakes now inherit the ABC since structural conformance no
longer applies.
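
A rough sketch of the difference this makes for fakes, using stand-in
types (`SapCalculatorSketch`, with `str`/`float` in place of
`EpcPropertyData`/`SapResult`; all names here are hypothetical). A class
that only matches the method shape is no longer an instance of the
contract, and a subclass that forgets `calculate` cannot be instantiated.

```python
from abc import ABC, abstractmethod


class SapCalculatorSketch(ABC):
    """Stand-in for the calculator contract; str/float replace the real types."""

    @abstractmethod
    def calculate(self, epc: str) -> float: ...


class StubCalculator(SapCalculatorSketch):
    """A test fake that inherits the ABC and returns a canned result."""

    def __init__(self, result: float) -> None:
        self._result = result

    def calculate(self, epc: str) -> float:
        return self._result


class DuckTyped:
    """Right method shape, but no inheritance: not an instance of the ABC."""

    def calculate(self, epc: str) -> float:
        return 0.0


class Incomplete(SapCalculatorSketch):
    """Forgets to implement calculate(), so it stays abstract."""


stub = StubCalculator(72.0)
try:
    Incomplete()  # abstract classes raise TypeError on instantiation
    incomplete_error = None
except TypeError as exc:
    incomplete_error = exc
```

Under the old Protocol, `DuckTyped` would have type-checked as a
calculator; with the ABC a type checker is expected to reject it where
the contract is annotated.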

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
 .../property_baseline/calculator_rebaseliner.py | 13 +++----------
 domain/sap10_calculator/calculator.py           | 17 ++++++++++++++++-
 .../test_calculator_rebaseliner.py              |  6 +++---
 3 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/domain/property_baseline/calculator_rebaseliner.py b/domain/property_baseline/calculator_rebaseliner.py
index b2443784..cbfaace7 100644
--- a/domain/property_baseline/calculator_rebaseliner.py
+++ b/domain/property_baseline/calculator_rebaseliner.py
@@ -1,7 +1,7 @@
 from __future__ import annotations
 
 import logging
-from typing import TYPE_CHECKING, Optional, Protocol
+from typing import TYPE_CHECKING, Optional
 
 from datatypes.epc.domain.epc import Epc
 from domain.property_baseline.performance import Performance
@@ -9,7 +9,7 @@ from domain.property_baseline.rebaseliner import Rebaseliner, RebaselineReason
 
 if TYPE_CHECKING:
     from datatypes.epc.domain.epc_property_data import EpcPropertyData
-    from domain.sap10_calculator.calculator import SapResult
+    from domain.sap10_calculator.calculator import SapCalculator, SapResult
 
 logger = logging.getLogger(__name__)
 
@@ -22,13 +22,6 @@ _REL_TOL = 0.01
 _KG_PER_TONNE = 1000.0
 
 
-class Calculator(Protocol):
-    """The slice of `Sap10Calculator` the rebaseliner needs — `Sap10Calculator`
-    satisfies it structurally, so this module does not import the calculator."""
-
-    def calculate(self, epc: "EpcPropertyData") -> "SapResult": ...
-
-
 def performance_from_sap_result(result: "SapResult") -> Performance:
     """The four rated quantities, read off a `SapResult`: band derived from the
     score, CO2 converted kg→tonnes, PEUI rounded to the lodged integer scale."""
@@ -58,7 +51,7 @@ class CalculatorRebaseliner(Rebaseliner):
     is fixed immediately.
     """
 
-    def __init__(self, calculator: Calculator) -> None:
+    def __init__(self, calculator: "SapCalculator") -> None:
         self._calculator = calculator
 
     def rebaseline(
diff --git a/domain/sap10_calculator/calculator.py b/domain/sap10_calculator/calculator.py
index 47366741..43226da1 100644
--- a/domain/sap10_calculator/calculator.py
+++ b/domain/sap10_calculator/calculator.py
@@ -41,6 +41,7 @@ Appendix L + U. RdSAP10 Table 32 (p.95) for fuel prices/CO2/PE factors.
 
 from __future__ import annotations
 
+from abc import ABC, abstractmethod
 from dataclasses import dataclass, field
 from typing import Final, Optional, TYPE_CHECKING
 
@@ -751,7 +752,21 @@ def calculate_sap_from_inputs(inputs: CalculatorInputs) -> SapResult:
     )
 
 
-class Sap10Calculator:
+class SapCalculator(ABC):
+    """The contract a SAP calculator satisfies: an `EpcPropertyData` in, a
+    typed `SapResult` out. `Sap10Calculator` is the SAP 10.2 implementation;
+    a future methodology (e.g. SAP 10.3 / a successor) is another subclass.
+
+    Consumers (e.g. `CalculatorRebaseliner`) depend on this abstraction, not
+    on a concrete calculator — so the engine can be swapped without touching
+    them.
+    """
+
+    @abstractmethod
+    def calculate(self, epc: "EpcPropertyData") -> SapResult: ...
+
+
+class Sap10Calculator(SapCalculator):
     """Deterministic SAP 10.2 calculator entry point. Maps an
     `EpcPropertyData` to typed `CalculatorInputs` via the RdSAP-driven
     `cert_to_inputs` mapper and runs the 12-month worksheet loop.
diff --git a/tests/domain/property_baseline/test_calculator_rebaseliner.py b/tests/domain/property_baseline/test_calculator_rebaseliner.py
index ea1230fc..f22e152f 100644
--- a/tests/domain/property_baseline/test_calculator_rebaseliner.py
+++ b/tests/domain/property_baseline/test_calculator_rebaseliner.py
@@ -9,7 +9,7 @@ from datatypes.epc.domain.epc import Epc
 from datatypes.epc.domain.epc_property_data import EpcPropertyData
 from domain.property_baseline.calculator_rebaseliner import CalculatorRebaseliner
 from domain.property_baseline.performance import Performance
-from domain.sap10_calculator.calculator import SapResult
+from domain.sap10_calculator.calculator import SapCalculator, SapResult
 from domain.sap10_calculator.exceptions import UnmappedSapCode
 
 
@@ -54,7 +54,7 @@ def _sap_result(
     )
 
 
-class _StubCalculator:
+class _StubCalculator(SapCalculator):
     def __init__(self, result: SapResult) -> None:
         self._result = result
 
@@ -122,7 +122,7 @@ def test_a_10_2_cert_logs_divergence_when_the_calculator_disagrees(
 
 def test_a_calculator_raise_propagates_and_aborts() -> None:
     # Arrange — the calculator is load-bearing, so a raise is not swallowed.
-    class _Raising:
+    class _Raising(SapCalculator):
         def calculate(self, epc: EpcPropertyData) -> SapResult:
             raise UnmappedSapCode("heat_emitter_type", 99)
 

From 389e39012dd4e039b6fa16427c012cbd333ef28c Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar <kconnkowlessar@gmail.com>
Date: Tue, 2 Jun 2026 13:49:34 +0000
Subject: [PATCH 11/12] style(baseline): typehint call-return locals in
 CalculatorRebaseliner
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

PR feedback: annotate locals assigned from a method-call return or
attribute access, even though pyright infers them — the type is visible at
the assignment without chasing the callee. `result: SapResult` and
`sap_version: Optional[float]` in rebaseline(). Local annotations are not
evaluated at runtime, so the TYPE_CHECKING-only SapResult import stands.
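
A small sketch of why that holds, using a stdlib type as a stand-in for
`SapResult` (the helper names are hypothetical). Annotations on local
variables inside a function body are not evaluated at runtime (PEP 526),
so a name imported only under TYPE_CHECKING can annotate a local without
raising NameError.

```python
from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:
    # Seen by the type checker only; this import does not run at runtime.
    from fractions import Fraction


def _make_half():
    # Imported locally so the module-level name `Fraction` stays unbound.
    import fractions

    return fractions.Fraction(1, 2)


def denominator_of_half() -> Optional[int]:
    # `Fraction` is unbound at runtime, yet this annotated assignment is fine
    # because local annotations are never evaluated.
    value: Fraction = _make_half()
    denominator: Optional[int] = value.denominator
    return denominator
```

The same would not be safe for a module-level or class-level annotation
without `from __future__ import annotations`, since those are evaluated.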

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
 domain/property_baseline/calculator_rebaseliner.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/domain/property_baseline/calculator_rebaseliner.py b/domain/property_baseline/calculator_rebaseliner.py
index cbfaace7..c6519c83 100644
--- a/domain/property_baseline/calculator_rebaseliner.py
+++ b/domain/property_baseline/calculator_rebaseliner.py
@@ -59,8 +59,8 @@ class CalculatorRebaseliner(Rebaseliner):
     ) -> tuple[Performance, RebaselineReason]:
         # A raise (UnmappedSapCode, etc.) propagates: the calculator is
         # load-bearing, so the batch aborts and the cert is fixed at once.
-        result = self._calculator.calculate(effective_epc)
-        sap_version = effective_epc.sap_version
+        result: SapResult = self._calculator.calculate(effective_epc)
+        sap_version: Optional[float] = effective_epc.sap_version
         if sap_version is not None and sap_version < _SAP10_2_FLOOR:
             return performance_from_sap_result(result), "pre_sap10"
         self._log_divergence(

From 0fb5da2f79094c0939f16078e3c584266bac96f6 Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar <kconnkowlessar@gmail.com>
Date: Tue, 2 Jun 2026 13:59:25 +0000
Subject: [PATCH 12/12] refactor(baseline): Performance.from_sap_result
 replaces the loose mapper

PR feedback: the SapResult -> Performance mapping should be a method, not a
free function you must know exists in the rebaseliner. Put the factory on
the target as `Performance.from_sap_result`, beside its sibling
`lodged_performance` and mirroring `Epc.from_sap_score` (the factory this
mapping already calls).

Not a `SapResult.to_performance()`: that would make the SAP calculator
import `Performance` (a property_baseline type), re-introducing the
engine->consumer coupling removed by the SapCalculator ABC. SapResult is a
TYPE_CHECKING-only import in performance.py (the body only reads attributes),
so the calculator module is not pulled in at runtime.
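
A self-contained sketch of the factory's shape, with stand-ins for the
real types: `SapResultSketch` carries only the three fields the mapping
reads, and `band_from_sap_score` substitutes for `Epc.from_sap_score`
using the standard EPC band floors (an assumption about that factory's
behaviour). The input figures are the ones the rebaseliner test uses.

```python
from dataclasses import dataclass

_KG_PER_TONNE = 1000.0


@dataclass(frozen=True)
class SapResultSketch:
    """Stand-in carrying only the SapResult fields the factory reads."""

    sap_score: int
    co2_kg_per_yr: float
    primary_energy_kwh_per_m2: float


def band_from_sap_score(score: int) -> str:
    """Stand-in for Epc.from_sap_score: standard band floors, A (92+) to G."""
    for floor, band in ((92, "A"), (81, "B"), (69, "C"), (55, "D"), (39, "E"), (21, "F")):
        if score >= floor:
            return band
    return "G"


@dataclass(frozen=True)
class PerformanceSketch:
    sap_score: int
    epc_band: str
    co2_emissions: float  # tonnes CO2 per year
    primary_energy_intensity: int  # kWh/m2/yr, integer like the lodged value

    @classmethod
    def from_sap_result(cls, result: SapResultSketch) -> "PerformanceSketch":
        # Band from the score, CO2 kg -> tonnes, PEUI rounded to an integer.
        return cls(
            sap_score=result.sap_score,
            epc_band=band_from_sap_score(result.sap_score),
            co2_emissions=result.co2_kg_per_yr / _KG_PER_TONNE,
            primary_energy_intensity=round(result.primary_energy_kwh_per_m2),
        )


effective = PerformanceSketch.from_sap_result(SapResultSketch(70, 1900.0, 185.4))
```

Because the factory lives on the target type, the stand-in result class
needs no knowledge of `PerformanceSketch`, which is the dependency
direction this commit is protecting.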

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
 .../calculator_rebaseliner.py                 | 14 +------------
 domain/property_baseline/performance.py       | 20 ++++++++++++++++++-
 2 files changed, 20 insertions(+), 14 deletions(-)

diff --git a/domain/property_baseline/calculator_rebaseliner.py b/domain/property_baseline/calculator_rebaseliner.py
index c6519c83..184f56b0 100644
--- a/domain/property_baseline/calculator_rebaseliner.py
+++ b/domain/property_baseline/calculator_rebaseliner.py
@@ -3,7 +3,6 @@ from __future__ import annotations
 import logging
 from typing import TYPE_CHECKING, Optional
 
-from datatypes.epc.domain.epc import Epc
 from domain.property_baseline.performance import Performance
 from domain.property_baseline.rebaseliner import Rebaseliner, RebaselineReason
 
@@ -22,17 +21,6 @@ _REL_TOL = 0.01
 _KG_PER_TONNE = 1000.0
 
 
-def performance_from_sap_result(result: "SapResult") -> Performance:
-    """The four rated quantities, read off a `SapResult`: band derived from the
-    score, CO2 converted kg→tonnes, PEUI rounded to the lodged integer scale."""
-    return Performance(
-        sap_score=result.sap_score,
-        epc_band=Epc.from_sap_score(result.sap_score),
-        co2_emissions=result.co2_kg_per_yr / _KG_PER_TONNE,
-        primary_energy_intensity=round(result.primary_energy_kwh_per_m2),
-    )
-
-
 def _relative_diff(calculated: float, lodged: float) -> float:
     if lodged == 0:
         return 0.0 if calculated == 0 else float("inf")
@@ -62,7 +50,7 @@ class CalculatorRebaseliner(Rebaseliner):
         result: SapResult = self._calculator.calculate(effective_epc)
         sap_version: Optional[float] = effective_epc.sap_version
         if sap_version is not None and sap_version < _SAP10_2_FLOOR:
-            return performance_from_sap_result(result), "pre_sap10"
+            return Performance.from_sap_result(result), "pre_sap10"
         self._log_divergence(
             property_id=property_id, sap_version=sap_version, result=result, lodged=lodged
         )
diff --git a/domain/property_baseline/performance.py b/domain/property_baseline/performance.py
index 1db38846..b2ab45ce 100644
--- a/domain/property_baseline/performance.py
+++ b/domain/property_baseline/performance.py
@@ -1,12 +1,16 @@
 from __future__ import annotations
 
 from dataclasses import dataclass
-from typing import Optional, TypeVar
+from typing import Optional, TYPE_CHECKING, TypeVar
 
 from datatypes.epc.domain.epc import Epc
 from datatypes.epc.domain.epc_property_data import EpcPropertyData
 
+if TYPE_CHECKING:
+    from domain.sap10_calculator.calculator import SapResult
+
 _T = TypeVar("_T")
+_KG_PER_TONNE = 1000.0
 
 
 @dataclass(frozen=True)
@@ -24,6 +28,20 @@ class Performance:
     co2_emissions: float
     primary_energy_intensity: int
 
+    @classmethod
+    def from_sap_result(cls, result: "SapResult") -> "Performance":
+        """The four rated quantities, read off a calculator `SapResult`
+        (ADR-0013): band derived from the score, CO2 converted kg→tonnes, PEUI
+        rounded to the lodged integer scale. The `from_*` factory mirrors
+        `Epc.from_sap_score`; living on the target keeps the SAP calculator
+        free of any `property_baseline` dependency."""
+        return cls(
+            sap_score=result.sap_score,
+            epc_band=Epc.from_sap_score(result.sap_score),
+            co2_emissions=result.co2_kg_per_yr / _KG_PER_TONNE,
+            primary_energy_intensity=round(result.primary_energy_kwh_per_m2),
+        )
+
 
 def _require(value: Optional[_T], field: str) -> _T:
     if value is None: