docs(modelling): design the plan_recommendations retirement (ADR-0017 amendment)

Rewrite the migration spec into the full expand/contract sequence (add plan_id → backfill → dual-write → cut reads → drop) with the two load-bearing rules: backfill before any read cuts over, and dual-write the m2m until all reads are off it (the Drizzle FE reads the tables directly, so the repos can't deploy atomically). Amend ADR-0017 from "m2m retired for new writes" to "m2m dropped + one SQLModel definition per table under infrastructure/postgres/modelling/". Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 11:17:27 +00:00 · 2026-06-03 20:24:30 +00:00 · 2026-06-03 20:24:30 +00:00 · b76d0f814b
commit b76d0f814b
parent ae5bbd0646
2 changed files with 56 additions and 14 deletions
--- a/docs/adr/0017-plan-persistence-evolve-live-tables.md
+++ b/docs/adr/0017-plan-persistence-evolve-live-tables.md
@ -24,6 +24,14 @@ The rebuild's persistence convention is SQLModel `table=True` rows in `infrastru
 - **Keep the `plan_recommendations` m2m** — rejected: the join's cascade delete is the known performance killer this change exists to remove.
 - **JSONB blob for the package** — rejected: the FE queries per-measure columns; flat typed columns are the existing contract.

+## Amendment (2026-06-03) — retire `plan_recommendations`, consolidate the models
+
+The original decision *retired the m2m for new writes* but left it in place. This amendment **drops it** and **consolidates the model definitions**, decided in a `/grill-with-docs` session:
+
+- **`plan_recommendations` is dropped.** All readers (`portfolio_functions`, `Outputs`, `export/property_scenarios`) and writers are cut onto `recommendation.plan_id`. The m2m is one-to-many in practice (a measure is never shared across Plans), so a single FK models it faithfully. The cross-repo expand/contract sequence (add → backfill → dual-write → cut reads → drop) is specified in [docs/migrations/recommendation-plan-id.md](../migrations/recommendation-plan-id.md); the two load-bearing rules are **backfill before any read cuts over** and **dual-write the m2m until all reads are off it** (the FE reads via Drizzle directly, so the repos cannot deploy atomically).
+- **One model per physical table, in `infrastructure/postgres/modelling/`.** The drift hazard the original ADR accepted (two ORM definitions of `plan`/`recommendation`) is resolved by **consolidating the whole `backend/app/db/models/recommendations.py` cluster** (`plan`, `recommendation`, `recommendation_materials`, `scenario`, `installed_measure`, the `PlanType`/`MeasureType` enums) into single **SQLModel** `…Row` definitions in a new `infrastructure/postgres/modelling/` subpackage, carrying full legacy column parity plus `recommendation.plan_id`. `backend/app/db/models/recommendations.py` becomes a **re-export shim** (the established `epc_property.py` pattern), aliasing the legacy names to the canonical `…Row` classes so `backend/` callers keep working. The rebuild's partial `PlanRow`/`RecommendationRow`/`ScenarioRow` mirrors are absorbed into these.
+- **Scope:** `backend/` + the rebuild only. The `etl/` and `sfr/` reporting scripts that read the m2m are deferred to a later pass.
+
 ## Consequences

 - **Two ORM definitions of `plan`/`recommendation`** coexist (legacy SQLAlchemy + new SQLModel mirror), a drift hazard — mitigated by this being the established mirror pattern and the physical table being the single contract. Retiring the legacy models is later, separate work.
--- a/docs/migrations/recommendation-plan-id.md
+++ b/docs/migrations/recommendation-plan-id.md
@ -1,28 +1,62 @@
-# `recommendation.plan_id` — FE-owned migration
+# Retire `plan_recommendations` — link measures by `recommendation.plan_id`

-**Context:** #1157 of the Modelling-stage rebuild. The `ModellingOrchestrator` persists a **Plan** and its selected **Plan Measures** (rows of the live `recommendation` table). To link a measure to its Plan it adds **`recommendation.plan_id`**, replacing the `plan_recommendations` many-to-many join for new writes (the m2m's cascade delete is pathologically slow — see [ADR-0017](../adr/0017-plan-persistence-evolve-live-tables.md)).
+**Context:** Modelling-stage rebuild. The `ModellingOrchestrator` persists a **Plan** and its selected **Plan Measures** (rows of the live `recommendation` table). A measure belongs to exactly one Plan, so the `plan_recommendations` many-to-many is replaced by a direct **`recommendation.plan_id`** FK and then **dropped**. The m2m's cascade delete is the known performance killer this change removes (see [ADR-0017](../adr/0017-plan-persistence-evolve-live-tables.md)).

-The SQLModel mirror is defined in `infrastructure/postgres/` so the ephemeral-Postgres tests build it via `SQLModel.metadata.create_all`. The **production migration is FE-owned (Drizzle ORM)**.
+The plan/recommendation/scenario tables are read **directly by the Drizzle FE** and written by both the legacy `engine.py` path and the rebuild. So this is an expand/contract migration on a live, **two-repo** (Python backend + Drizzle FE) schema. The **DB migrations are FE-owned (Drizzle)**; this doc is their spec and pins the ordering so the repos stay in step.

-## Change
+## Cardinality

-Add one column to the existing `recommendation` table:
+`plan_recommendations` is **one-to-many in practice, never many-to-many**: both writers (`upload_recommendations`, `bulk_upload_recommendations_and_materials`) create *fresh* `recommendation` rows per Plan and link each to a single `plan_id`. A recommendation is never shared across Plans, so a single `recommendation.plan_id` FK models reality faithfully and the backfill is a clean 1:1.
+
+## Sequence (expand → backfill → migrate reads → contract)
+
+The two hard rules: **backfill before any reader cuts to `plan_id`** (else every historical Plan — all `plan_id = NULL`, linked only via the m2m — vanishes from the FE), and **dual-write the m2m through the transition** (so backend and FE reads can each cut to `plan_id` independently, in any order, with zero breakage; the m2m write is removed only at the end).
+
+| # | Step | Owner | Safe because |
+|---|---|---|---|
+| 1 | **Add `recommendation.plan_id`** — `bigint`, FK → `plan.id`, **`ON DELETE CASCADE`**, indexed, **nullable** | FE (Drizzle) | additive; legacy rows keep `NULL` |
+| 2 | **Backfill** `plan_id` from the m2m (see SQL below) | FE (Drizzle data migration) | every existing measure gets its Plan before any read cuts over |
+| 3 | **Dual-write**: writers set `plan_id` **and** keep writing the m2m | backend | both old (m2m) and new (`plan_id`) readers work |
+| 4 | **Cut reads to `plan_id`** — backend (`portfolio_functions`, `Outputs`, `export/property_scenarios`) **and** the Drizzle FE | backend + FE | backfill (2) means no NULLs; dual-write (3) means order between repos is free |
+| 5 | **Stop writing the m2m** | backend | no reader uses it after (4) |
+| 6 | **Drop `plan_recommendations`** | FE (Drizzle) + backend (remove model) | unreferenced after (5) |
+
+### Backfill SQL (step 2)
+
+```sql
+UPDATE recommendation r
+SET    plan_id = pr.plan_id
+FROM   plan_recommendations pr
+WHERE  pr.recommendation_id = r.id
+  AND  r.plan_id IS NULL;
+```
+
+Guard before dropping the m2m: assert no recommendation maps to more than one Plan (a data anomaly the writers can't produce, but worth checking on real data):
+
+```sql
+SELECT recommendation_id, count(*)
+FROM   plan_recommendations
+GROUP  BY recommendation_id
+HAVING count(*) > 1;
+-- expect zero rows
+```
+
+## Step 1 — column definition

 | Column | Type | Notes |
 |---|---|---|
-| `plan_id` | bigint, FK → `plan.id`, **`ON DELETE CASCADE`**, indexed | the Plan this measure belongs to. Nullable during transition (legacy rows predate it); new writes always set it. |
+| `plan_id` | bigint, FK → `plan.id`, **`ON DELETE CASCADE`**, indexed, nullable | the Plan this measure belongs to. Nullable during transition; every new write sets it. |

- **Index `plan_id`** — the orchestrator's idempotent replace deletes a Plan and relies on the cascade to remove its measures; reads fetch a Plan's measures by `plan_id`.
- **`ON DELETE CASCADE`** is what makes "delete the Plan → its measures go too" a single statement, replacing the m2m cleanup.
+- **Index `plan_id`** — the rebuild's idempotent replace deletes a Plan and relies on the cascade to remove its measures; reads fetch a Plan's measures by `plan_id`.
+- **`ON DELETE CASCADE`** makes "delete the Plan → its measures go too" a single statement, replacing the m2m cleanup.

-## Transition / sequencing
+## This repo's part (all of steps 3–6, gated on 1+2 being live)

-1. **Add `plan_id` (nullable)** — this migration. New `ModellingOrchestrator` writes populate it; legacy writers and existing rows are unaffected.
-2. **Cut legacy readers** off `plan_recommendations` onto `plan_id` (separate work, not in #1157).
-3. **Drop `plan_recommendations`** once no reader remains (separate migration).
+The user's instruction is to implement the backend end-to-end **as if the FE has already applied steps 1 and 2** (the `plan_id` column exists and is backfilled). Concretely, in `backend/` + the rebuild:

-Existing live `recommendation` rows keep `plan_id = NULL` until/unless re-modelled; they remain reachable via the legacy `plan_recommendations` join during the transition.
+- The plan/recommendation/scenario/installed-measure models are **consolidated into `infrastructure/postgres/modelling/`** as single SQLModel definitions (`…Row`), `recommendation` carrying `plan_id`; `backend/app/db/models/recommendations.py` becomes a re-export shim (ADR-0017 amendment).
+- Writers set `plan_id`; readers join on `plan_id`; the m2m write/cleanup and the `PlanRecommendations` model are removed.

 ## Not changed here

-No new columns for contingency (per-measure contingency stays summed into `plan.contingency_cost`, matching legacy), no `phase` column (multi-phase deferred, ADR-0005), and the energy/bill columns are populated by a later Bill Derivation slice (ADR-0017).
+No new contingency columns (per-measure contingency stays summed into `plan.contingency_cost`); no `phase` column (multi-phase deferred, ADR-0005). The `etl/` and `sfr/` reporting scripts that read the m2m are **out of scope** — handled in a later pass.