Model/docs/migrations/recommendation-plan-id.md
Khalim Conn-Kowlessar b76d0f814b docs(modelling): design the plan_recommendations retirement (ADR-0017 amendment)
Rewrite the migration spec into the full expand/contract sequence (add plan_id
→ backfill → dual-write → cut reads → drop) with the two load-bearing rules:
backfill before any read cuts over, and dual-write the m2m until all reads are
off it (the Drizzle FE reads the tables directly, so the repos can't deploy
atomically). Amend ADR-0017 from "m2m retired for new writes" to "m2m dropped +
one SQLModel definition per table under infrastructure/postgres/modelling/".

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 20:24:30 +00:00

Retire plan_recommendations — link measures by recommendation.plan_id

Context: Modelling-stage rebuild. The ModellingOrchestrator persists a Plan and its selected Plan Measures (rows of the live recommendation table). A measure belongs to exactly one Plan, so the plan_recommendations many-to-many is replaced by a direct recommendation.plan_id FK and then dropped. The m2m's cascade delete is the known performance killer this change removes (see ADR-0017).

The plan/recommendation/scenario tables are read directly by the Drizzle FE and written by both the legacy engine.py path and the rebuild. So this is an expand/contract migration on a live, two-repo (Python backend + Drizzle FE) schema. The DB migrations are FE-owned (Drizzle); this doc is their spec and pins the ordering so the repos stay in step.

Cardinality

plan_recommendations is one-to-many in practice, never many-to-many: both writers (upload_recommendations, bulk_upload_recommendations_and_materials) create fresh recommendation rows per Plan and link each to a single plan_id. A recommendation is never shared across Plans, so a single recommendation.plan_id FK models reality faithfully and the backfill is a clean 1:1.

Sequence (expand → backfill → migrate reads → contract)

Two hard rules govern the sequence:

  • Backfill before any reader cuts to plan_id. Otherwise every historical Plan (all plan_id = NULL, linked only via the m2m) vanishes from the FE.
  • Dual-write the m2m through the transition, so backend and FE reads can each cut to plan_id independently, in any order, with zero breakage. The m2m write is removed only at the end.

| # | Step | Owner | Safe because |
|---|------|-------|--------------|
| 1 | Add recommendation.plan_id (bigint, FK → plan.id, ON DELETE CASCADE, indexed, nullable) | FE (Drizzle) | additive; legacy rows keep NULL |
| 2 | Backfill plan_id from the m2m (see SQL below) | FE (Drizzle data migration) | every existing measure gets its Plan before any read cuts over |
| 3 | Dual-write: writers set plan_id and keep writing the m2m | backend | both old (m2m) and new (plan_id) readers work |
| 4 | Cut reads to plan_id: backend (portfolio_functions, Outputs, export/property_scenarios) and the Drizzle FE | backend + FE | backfill (2) means no NULLs; dual-write (3) means order between repos is free |
| 5 | Stop writing the m2m | backend | no reader uses it after (4) |
| 6 | Drop plan_recommendations | FE (Drizzle) + backend (remove model) | unreferenced after (5) |
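A minimal sketch of the step-3 dual-write, to make the "both readers work" claim concrete. It uses an in-memory SQLite database as a stand-in for Postgres, and the three-table schema is a simplified assumption. write_measure_dual is a hypothetical name, not one of the real writers.

```python
import sqlite3

# Simplified stand-in schema (assumption): the real tables live in Postgres
# and carry many more columns.
SCHEMA = """
CREATE TABLE plan (id INTEGER PRIMARY KEY);
CREATE TABLE recommendation (
    id      INTEGER PRIMARY KEY,
    plan_id INTEGER REFERENCES plan(id) ON DELETE CASCADE
);
CREATE TABLE plan_recommendations (
    plan_id           INTEGER NOT NULL REFERENCES plan(id),
    recommendation_id INTEGER NOT NULL REFERENCES recommendation(id)
);
"""


def write_measure_dual(conn: sqlite3.Connection, plan_id: int) -> int:
    """Hypothetical step-3 writer: set plan_id AND keep writing the m2m row."""
    with conn:  # one transaction: both links commit, or neither does
        cur = conn.execute(
            "INSERT INTO recommendation (plan_id) VALUES (?)", (plan_id,)
        )
        rec_id = cur.lastrowid
        conn.execute(
            "INSERT INTO plan_recommendations (plan_id, recommendation_id) "
            "VALUES (?, ?)",
            (plan_id, rec_id),
        )
    return rec_id


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO plan (id) VALUES (1)")
rec_id = write_measure_dual(conn, plan_id=1)

# Old reader (via the m2m) and new reader (via plan_id) see the same measure.
via_m2m = conn.execute(
    "SELECT recommendation_id FROM plan_recommendations WHERE plan_id = 1"
).fetchall()
via_fk = conn.execute(
    "SELECT id FROM recommendation WHERE plan_id = 1"
).fetchall()
```

Writing both links in one transaction is the design choice that keeps the two read paths interchangeable while step 4 rolls out repo by repo.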

Backfill SQL (step 2)

UPDATE recommendation r
SET    plan_id = pr.plan_id
FROM   plan_recommendations pr
WHERE  pr.recommendation_id = r.id
  AND  r.plan_id IS NULL;
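The backfill can be rehearsed locally before it runs as a Drizzle data migration. The sketch below is an assumption-laden stand-in: in-memory SQLite instead of Postgres, a cut-down schema, and a correlated subquery in place of UPDATE ... FROM so that it also runs on older SQLite builds. The Postgres statement above is unchanged. The backfill helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE plan (id INTEGER PRIMARY KEY);
CREATE TABLE recommendation (
    id      INTEGER PRIMARY KEY,
    plan_id INTEGER REFERENCES plan(id)
);
CREATE TABLE plan_recommendations (
    plan_id           INTEGER NOT NULL,
    recommendation_id INTEGER NOT NULL
);
INSERT INTO plan (id) VALUES (1), (2);
-- Historical measures: linked only via the m2m, plan_id still NULL.
INSERT INTO recommendation (id, plan_id) VALUES (10, NULL), (11, NULL), (12, NULL);
INSERT INTO plan_recommendations (plan_id, recommendation_id)
VALUES (1, 10), (1, 11), (2, 12);
-- A measure written by a dual-writing backend already has plan_id set.
INSERT INTO recommendation (id, plan_id) VALUES (13, 2);
INSERT INTO plan_recommendations (plan_id, recommendation_id) VALUES (2, 13);
""")

# Same effect as the UPDATE ... FROM statement, expressed portably.
BACKFILL = """
UPDATE recommendation
SET    plan_id = (SELECT pr.plan_id
                  FROM   plan_recommendations pr
                  WHERE  pr.recommendation_id = recommendation.id)
WHERE  plan_id IS NULL
  AND  EXISTS (SELECT 1
               FROM   plan_recommendations pr
               WHERE  pr.recommendation_id = recommendation.id)
"""


def backfill(conn: sqlite3.Connection) -> int:
    """Run the backfill in one transaction; return the number of rows updated."""
    with conn:
        return conn.execute(BACKFILL).rowcount


first_run = backfill(conn)   # touches only the rows that were still NULL
second_run = backfill(conn)  # nothing left to do, so the statement is re-runnable
rows = conn.execute(
    "SELECT id, plan_id FROM recommendation ORDER BY id"
).fetchall()
```

The `plan_id IS NULL` filter is what makes a re-run harmless: rows already populated by a dual-writing backend are never rewritten.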

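Guard before dropping the m2m (and ideally before running the backfill): assert no recommendation maps to more than one Plan. The writers can't produce this anomaly, but it is worth checking on real data, because for a recommendation with several m2m rows the UPDATE ... FROM above would use only one of them, and which one is not predictable: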

SELECT recommendation_id, count(*)
FROM   plan_recommendations
GROUP  BY recommendation_id
HAVING count(*) > 1;
-- expect zero rows
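One way to turn the guard into a hard gate is to wrap it in a check that refuses to continue when the query returns rows. This is a sketch only: SQLite stands in for Postgres, and shared_recommendations and assert_single_plan are hypothetical helper names.

```python
import sqlite3

GUARD = """
SELECT recommendation_id, count(*)
FROM   plan_recommendations
GROUP  BY recommendation_id
HAVING count(*) > 1
"""


def shared_recommendations(conn: sqlite3.Connection) -> list:
    """Return (recommendation_id, n_links) for rows linked to more than one Plan."""
    return conn.execute(GUARD).fetchall()


def assert_single_plan(conn: sqlite3.Connection) -> None:
    """Raise if any recommendation has more than one m2m row."""
    offenders = shared_recommendations(conn)
    if offenders:
        raise RuntimeError(f"recommendations linked to multiple Plans: {offenders}")


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE plan_recommendations (plan_id INTEGER, recommendation_id INTEGER)"
)
conn.executemany(
    "INSERT INTO plan_recommendations VALUES (?, ?)",
    [(1, 10), (1, 11), (2, 12)],
)
clean = shared_recommendations(conn)  # the expected state: no offenders

# Synthetic anomaly for illustration: recommendation 10 linked to a second Plan.
conn.execute("INSERT INTO plan_recommendations VALUES (2, 10)")
anomalous = shared_recommendations(conn)
```

Calling assert_single_plan(conn) at this point raises, which is the behaviour a migration script would want before it proceeds to the drop.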

Step 1 — column definition

| Column | Type | Notes |
|--------|------|-------|
| plan_id | bigint, FK → plan.id, ON DELETE CASCADE, indexed, nullable | The Plan this measure belongs to. Nullable during transition; every new write sets it. |
  • Index plan_id — the rebuild's idempotent replace deletes a Plan and relies on the cascade to remove its measures; reads fetch a Plan's measures by plan_id.
  • ON DELETE CASCADE makes "delete the Plan → its measures go too" a single statement, replacing the m2m cleanup.
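The cascade behaviour described in the bullets above can be sketched as follows. This is an illustration under stated assumptions: SQLite stands in for Postgres (SQLite needs foreign-key enforcement switched on explicitly), the schema is reduced to the two relevant columns, and the index name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-only detail: foreign keys are not enforced unless this is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE plan (id INTEGER PRIMARY KEY);
CREATE TABLE recommendation (
    id      INTEGER PRIMARY KEY,
    plan_id INTEGER REFERENCES plan(id) ON DELETE CASCADE  -- nullable in transition
);
-- Hypothetical index name; supports both the cascade and reads by plan_id.
CREATE INDEX recommendation_plan_id_idx ON recommendation (plan_id);
INSERT INTO plan (id) VALUES (1), (2);
INSERT INTO recommendation (id, plan_id) VALUES (10, 1), (11, 1), (12, 2);
""")

# "Delete the Plan and its measures go too" as a single statement.
with conn:
    conn.execute("DELETE FROM plan WHERE id = 1")

remaining = conn.execute(
    "SELECT id, plan_id FROM recommendation ORDER BY id"
).fetchall()
```

Only Plan 2's measure survives the delete, with no separate cleanup statement against a link table.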

This repo's part (all of steps 3 to 6, gated on 1+2 being live)

The backend work is implemented end-to-end on the assumption that the FE has already applied steps 1 and 2 (the plan_id column exists and is backfilled). Concretely, in backend/ + the rebuild:

  • The plan/recommendation/scenario/installed-measure models are consolidated into infrastructure/postgres/modelling/ as single SQLModel definitions (…Row), recommendation carrying plan_id; backend/app/db/models/recommendations.py becomes a re-export shim (ADR-0017 amendment).
  • Writers set plan_id; readers join on plan_id; the m2m write/cleanup and the PlanRecommendations model are removed.
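Before a reader is switched from the m2m join to plan_id, it may be worth checking that both read paths return the same measures for a Plan. The sketch below is illustrative only: SQLite stands in for Postgres, the queries are simplified guesses at the shape of the real reads, and reads_agree is a hypothetical helper.

```python
import sqlite3

# Old read path: measures of a Plan via the m2m.
READ_VIA_M2M = """
SELECT r.id
FROM   recommendation r
JOIN   plan_recommendations pr ON pr.recommendation_id = r.id
WHERE  pr.plan_id = ?
ORDER  BY r.id
"""

# New read path: measures of a Plan via the direct FK.
READ_VIA_FK = "SELECT id FROM recommendation WHERE plan_id = ? ORDER BY id"


def reads_agree(conn: sqlite3.Connection, plan_id: int) -> bool:
    """True when the m2m read and the plan_id read return the same measures."""
    old = conn.execute(READ_VIA_M2M, (plan_id,)).fetchall()
    new = conn.execute(READ_VIA_FK, (plan_id,)).fetchall()
    return old == new


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE recommendation (id INTEGER PRIMARY KEY, plan_id INTEGER);
CREATE TABLE plan_recommendations (plan_id INTEGER, recommendation_id INTEGER);
-- Plan 1 is backfilled. Plan 2 is historical: linked only via the m2m.
INSERT INTO recommendation (id, plan_id) VALUES (10, 1), (11, 1), (12, NULL);
INSERT INTO plan_recommendations (plan_id, recommendation_id)
VALUES (1, 10), (1, 11), (2, 12);
""")

backfilled_plan_ok = reads_agree(conn, 1)
unbackfilled_plan_ok = reads_agree(conn, 2)
```

The second check fails on purpose: a Plan that has not been backfilled shows no measures through plan_id, which is the failure mode the backfill-first rule exists to prevent.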

Not changed here

No new contingency columns (per-measure contingency stays summed into plan.contingency_cost); no phase column (multi-phase deferred, ADR-0005). The etl/ and sfr/ reporting scripts that read the m2m are out of scope — handled in a later pass.