Rewrite the migration spec into the full expand/contract sequence (add plan_id → backfill → dual-write → cut reads → drop) with the two load-bearing rules: backfill before any read cuts over, and dual-write the m2m until all reads are off it (the Drizzle FE reads the tables directly, so the repos can't deploy atomically). Amend ADR-0017 from "m2m retired for new writes" to "m2m dropped + one SQLModel definition per table under infrastructure/postgres/modelling/". Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Retire plan_recommendations — link measures by recommendation.plan_id
Context: Modelling-stage rebuild. The ModellingOrchestrator persists a Plan and its selected Plan Measures (rows of the live recommendation table). A measure belongs to exactly one Plan, so the plan_recommendations many-to-many is replaced by a direct recommendation.plan_id FK and then dropped. The m2m's cascade delete is the known performance killer this change removes (see ADR-0017).
The plan/recommendation/scenario tables are read directly by the Drizzle FE and written by both the legacy engine.py path and the rebuild. So this is an expand/contract migration on a live, two-repo (Python backend + Drizzle FE) schema. The DB migrations are FE-owned (Drizzle); this doc is their spec and pins the ordering so the repos stay in step.
Cardinality
plan_recommendations is one-to-many in practice, never many-to-many: both writers (upload_recommendations, bulk_upload_recommendations_and_materials) create fresh recommendation rows per Plan and link each to a single plan_id. A recommendation is never shared across Plans, so a single recommendation.plan_id FK models reality faithfully and the backfill is a clean 1:1.
Sequence (expand → backfill → migrate reads → contract)
The two hard rules: (1) backfill before any reader cuts to plan_id, else every historical Plan (all plan_id = NULL, linked only via the m2m) vanishes from the FE; (2) dual-write the m2m through the transition, so backend and FE reads can each cut to plan_id independently, in any order, with zero breakage. The m2m write is removed only at the end.
| # | Step | Owner | Safe because |
|---|---|---|---|
| 1 | Add recommendation.plan_id: bigint, FK → plan.id, ON DELETE CASCADE, indexed, nullable | FE (Drizzle) | additive; legacy rows keep NULL |
| 2 | Backfill plan_id from the m2m (see SQL below) | FE (Drizzle data migration) | every existing measure gets its Plan before any read cuts over |
| 3 | Dual-write: writers set plan_id and keep writing the m2m | backend | both old (m2m) and new (plan_id) readers work |
| 4 | Cut reads to plan_id: backend (portfolio_functions, Outputs, export/property_scenarios) and the Drizzle FE | backend + FE | backfill (2) means no NULLs; dual-write (3) means order between repos is free |
| 5 | Stop writing the m2m | backend | no reader uses it after (4) |
| 6 | Drop plan_recommendations | FE (Drizzle) + backend (remove model) | unreferenced after (5) |
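Step 3's dual-write can be sketched as a single transaction that sets plan_id and still inserts the m2m row. This is a minimal illustration only: it uses Python's stdlib sqlite3 as a stand-in for Postgres, the tables are reduced to their key columns, and write_measure is a hypothetical name, not the real upload_recommendations or bulk_upload_recommendations_and_materials.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """In-memory stand-in for the three tables, reduced to their keys."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE plan (id INTEGER PRIMARY KEY);
        CREATE TABLE recommendation (
            id INTEGER PRIMARY KEY,
            plan_id INTEGER REFERENCES plan(id)  -- nullable during transition
        );
        CREATE TABLE plan_recommendations (
            plan_id INTEGER NOT NULL,
            recommendation_id INTEGER NOT NULL
        );
        """
    )
    return conn


def write_measure(conn: sqlite3.Connection, plan_id: int) -> int:
    """Hypothetical writer: set plan_id AND keep writing the m2m (step 3)."""
    with conn:  # one transaction: both writes commit or neither does
        cur = conn.execute(
            "INSERT INTO recommendation (plan_id) VALUES (?)", (plan_id,)
        )
        rec_id = cur.lastrowid
        conn.execute(
            "INSERT INTO plan_recommendations (plan_id, recommendation_id) "
            "VALUES (?, ?)",
            (plan_id, rec_id),
        )
    return rec_id


conn = make_db()
conn.execute("INSERT INTO plan (id) VALUES (1)")
rec_id = write_measure(conn, plan_id=1)
```

Keeping both writes in one transaction means a reader on either path never sees a measure that the other path cannot. Step 5 then amounts to deleting the second INSERT.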
Backfill SQL (step 2)
UPDATE recommendation r
SET plan_id = pr.plan_id
FROM plan_recommendations pr
WHERE pr.recommendation_id = r.id
AND r.plan_id IS NULL;
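Guard before dropping the m2m, and ideally before running the backfill: assert no recommendation maps to more than one Plan (a data anomaly the writers can't produce, but worth checking on real data). If such a row existed, the UPDATE ... FROM above would pick one of its plan_ids arbitrarily: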
SELECT recommendation_id, count(*)
FROM plan_recommendations
GROUP BY recommendation_id
HAVING count(*) > 1;
-- expect zero rows
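The guard and the backfill can be exercised together before any read cuts over. The sketch below is an assumption-laden illustration, not the Drizzle data migration itself: it runs on stdlib sqlite3 rather than Postgres, uses a portable correlated-subquery form of the backfill instead of UPDATE ... FROM, and the helper names (shared_recommendations, backfill, unfilled_linked_rows) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE recommendation (id INTEGER PRIMARY KEY, plan_id INTEGER);
    CREATE TABLE plan_recommendations (plan_id INTEGER, recommendation_id INTEGER);
    -- Legacy state: measures linked only through the m2m.
    INSERT INTO recommendation (id, plan_id) VALUES (10, NULL), (11, NULL);
    INSERT INTO plan_recommendations VALUES (1, 10), (2, 11);
    """
)


def shared_recommendations(conn):
    """The guard query: recommendations linked to more than one Plan."""
    return conn.execute(
        "SELECT recommendation_id, count(*) FROM plan_recommendations "
        "GROUP BY recommendation_id HAVING count(*) > 1"
    ).fetchall()


def backfill(conn):
    """Correlated-subquery equivalent of the UPDATE ... FROM backfill."""
    with conn:
        conn.execute(
            "UPDATE recommendation SET plan_id = ("
            "  SELECT pr.plan_id FROM plan_recommendations pr"
            "  WHERE pr.recommendation_id = recommendation.id) "
            "WHERE plan_id IS NULL"
        )


def unfilled_linked_rows(conn):
    """Measures that have an m2m link but still no plan_id after backfill."""
    return conn.execute(
        "SELECT r.id FROM recommendation r "
        "JOIN plan_recommendations pr ON pr.recommendation_id = r.id "
        "WHERE r.plan_id IS NULL"
    ).fetchall()


assert shared_recommendations(conn) == []  # guard passes on this sample data
backfill(conn)
remaining = unfilled_linked_rows(conn)
```

An empty result from unfilled_linked_rows is the condition step 4 relies on: every measure reachable through the m2m is also reachable through plan_id.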
Step 1 — column definition
| Column | Type | Notes |
|---|---|---|
| plan_id | bigint, FK → plan.id, ON DELETE CASCADE, indexed, nullable | the Plan this measure belongs to. Nullable during transition; every new write sets it. |
- Index plan_id: the rebuild's idempotent replace deletes a Plan and relies on the cascade to remove its measures; reads fetch a Plan's measures by plan_id.
- ON DELETE CASCADE makes "delete the Plan → its measures go too" a single statement, replacing the m2m cleanup.
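The cascade behaviour described above can be demonstrated on a toy schema. This is a sketch under stated assumptions: stdlib sqlite3 stands in for Postgres (SQLite only enforces foreign keys once PRAGMA foreign_keys is on), the index name is invented, and replace_plan is a hypothetical approximation of the rebuild's idempotent replace, which may differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript(
    """
    CREATE TABLE plan (id INTEGER PRIMARY KEY);
    CREATE TABLE recommendation (
        id INTEGER PRIMARY KEY,
        plan_id INTEGER REFERENCES plan(id) ON DELETE CASCADE
    );
    -- Hypothetical index name; serves both the cascade and reads by plan_id.
    CREATE INDEX recommendation_plan_id_idx ON recommendation (plan_id);
    INSERT INTO plan (id) VALUES (1), (2);
    INSERT INTO recommendation (id, plan_id) VALUES (10, 1), (11, 1), (12, 2);
    """
)


def replace_plan(conn, plan_id, measure_ids):
    """Hypothetical idempotent replace: delete the Plan, then re-create it."""
    with conn:
        # Single statement: the FK cascade removes the Plan's old measures.
        conn.execute("DELETE FROM plan WHERE id = ?", (plan_id,))
        conn.execute("INSERT INTO plan (id) VALUES (?)", (plan_id,))
        conn.executemany(
            "INSERT INTO recommendation (id, plan_id) VALUES (?, ?)",
            [(m, plan_id) for m in measure_ids],
        )


replace_plan(conn, 1, [20])
rows = conn.execute(
    "SELECT id, plan_id FROM recommendation ORDER BY id"
).fetchall()
# rows == [(12, 2), (20, 1)]: measures 10 and 11 went with the old Plan 1
```

No separate cleanup of a link table is needed, and Plan 2's measure is untouched.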
This repo's part (all of steps 3–6, gated on 1+2 being live)
The user's instruction is to implement the backend end-to-end as if the FE has already applied steps 1 and 2 (the plan_id column exists and is backfilled). Concretely, in backend/ + the rebuild:
- The plan/recommendation/scenario/installed-measure models are consolidated into infrastructure/postgres/modelling/ as single SQLModel definitions (…Row), recommendation carrying plan_id; backend/app/db/models/recommendations.py becomes a re-export shim (ADR-0017 amendment).
- Writers set plan_id; readers join on plan_id; the m2m write/cleanup and the PlanRecommendations model are removed.
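While the m2m is still dual-written, the reader change can be checked for parity before the old path is deleted. The sketch below compares the two read paths on a toy dataset; it uses stdlib sqlite3 rather than the real SQLModel definitions, and both function names are hypothetical, not the actual readers in portfolio_functions or the export code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE recommendation (id INTEGER PRIMARY KEY, plan_id INTEGER);
    CREATE TABLE plan_recommendations (plan_id INTEGER, recommendation_id INTEGER);
    -- State after backfill + dual-write: both links agree.
    INSERT INTO recommendation (id, plan_id) VALUES (10, 1), (11, 1), (12, 2);
    INSERT INTO plan_recommendations VALUES (1, 10), (1, 11), (2, 12);
    """
)


def measures_via_m2m(conn, plan_id):
    """Old read path: join through plan_recommendations."""
    rows = conn.execute(
        "SELECT r.id FROM recommendation r "
        "JOIN plan_recommendations pr ON pr.recommendation_id = r.id "
        "WHERE pr.plan_id = ? ORDER BY r.id",
        (plan_id,),
    )
    return [row[0] for row in rows]


def measures_via_plan_id(conn, plan_id):
    """New read path: filter on recommendation.plan_id directly."""
    rows = conn.execute(
        "SELECT id FROM recommendation WHERE plan_id = ? ORDER BY id",
        (plan_id,),
    )
    return [row[0] for row in rows]


old = measures_via_m2m(conn, 1)
new = measures_via_plan_id(conn, 1)
```

If old and new agree for every Plan, the reader can switch to the plan_id path and the m2m join can be removed without changing results.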
Not changed here
No new contingency columns (per-measure contingency stays summed into plan.contingency_cost); no phase column (multi-phase deferred, ADR-0005). The etl/ and sfr/ reporting scripts that read the m2m are out of scope — handled in a later pass.