diff --git a/docs/adr/0017-plan-persistence-evolve-live-tables.md b/docs/adr/0017-plan-persistence-evolve-live-tables.md index b3e51c1a..44fd6ef8 100644 --- a/docs/adr/0017-plan-persistence-evolve-live-tables.md +++ b/docs/adr/0017-plan-persistence-evolve-live-tables.md @@ -24,6 +24,14 @@ The rebuild's persistence convention is SQLModel `table=True` rows in `infrastru - **Keep the `plan_recommendations` m2m** — rejected: the join's cascade delete is the known performance killer this change exists to remove. - **JSONB blob for the package** — rejected: the FE queries per-measure columns; flat typed columns are the existing contract. +## Amendment (2026-06-03) — retire `plan_recommendations`, consolidate the models + +The original decision *retired the m2m for new writes* but left it in place. This amendment **drops it** and **consolidates the model definitions**, decided in a `/grill-with-docs` session: + +- **`plan_recommendations` is dropped.** All readers (`portfolio_functions`, `Outputs`, `export/property_scenarios`) and writers are cut onto `recommendation.plan_id`. The m2m is one-to-many in practice (a measure is never shared across Plans), so a single FK models it faithfully. The cross-repo expand/contract sequence (add → backfill → dual-write → cut reads → drop) is specified in [docs/migrations/recommendation-plan-id.md](../migrations/recommendation-plan-id.md); the two load-bearing rules are **backfill before any read cuts over** and **dual-write the m2m until all reads are off it** (the FE reads via Drizzle directly, so the repos cannot deploy atomically). +- **One model per physical table, in `infrastructure/postgres/modelling/`.** The drift hazard the original ADR accepted (two ORM definitions of `plan`/`recommendation`) is resolved by **consolidating the whole `backend/app/db/models/recommendations.py` cluster** (`plan`, `recommendation`, `recommendation_materials`, `scenario`, `installed_measure`, the `PlanType`/`MeasureType` enums) into single **SQLModel** `…Row` definitions in a new `infrastructure/postgres/modelling/` subpackage, carrying full legacy column parity plus `recommendation.plan_id`. `backend/app/db/models/recommendations.py` becomes a **re-export shim** (the established `epc_property.py` pattern), aliasing the legacy names to the canonical `…Row` classes so `backend/` callers keep working. The rebuild's partial `PlanRow`/`RecommendationRow`/`ScenarioRow` mirrors are absorbed into these. +- **Scope:** `backend/` + the rebuild only. The `etl/` and `sfr/` reporting scripts that read the m2m are deferred to a later pass. + ## Consequences - **Two ORM definitions of `plan`/`recommendation`** coexist (legacy SQLAlchemy + new SQLModel mirror), a drift hazard — mitigated by this being the established mirror pattern and the physical table being the single contract. Retiring the legacy models is later, separate work. diff --git a/docs/migrations/recommendation-plan-id.md b/docs/migrations/recommendation-plan-id.md index a9a42bd2..887f788f 100644 --- a/docs/migrations/recommendation-plan-id.md +++ b/docs/migrations/recommendation-plan-id.md @@ -1,28 +1,62 @@ -# `recommendation.plan_id` — FE-owned migration +# Retire `plan_recommendations` — link measures by `recommendation.plan_id` -**Context:** #1157 of the Modelling-stage rebuild. The `ModellingOrchestrator` persists a **Plan** and its selected **Plan Measures** (rows of the live `recommendation` table). To link a measure to its Plan it adds **`recommendation.plan_id`**, replacing the `plan_recommendations` many-to-many join for new writes (the m2m's cascade delete is pathologically slow — see [ADR-0017](../adr/0017-plan-persistence-evolve-live-tables.md)). +**Context:** Modelling-stage rebuild. The `ModellingOrchestrator` persists a **Plan** and its selected **Plan Measures** (rows of the live `recommendation` table). A measure belongs to exactly one Plan, so the `plan_recommendations` many-to-many is replaced by a direct **`recommendation.plan_id`** FK and then **dropped**. The m2m's cascade delete is the known performance killer this change removes (see [ADR-0017](../adr/0017-plan-persistence-evolve-live-tables.md)). -The SQLModel mirror is defined in `infrastructure/postgres/` so the ephemeral-Postgres tests build it via `SQLModel.metadata.create_all`. The **production migration is FE-owned (Drizzle ORM)**. +The plan/recommendation/scenario tables are read **directly by the Drizzle FE** and written by both the legacy `engine.py` path and the rebuild. So this is an expand/contract migration on a live, **two-repo** (Python backend + Drizzle FE) schema. The **DB migrations are FE-owned (Drizzle)**; this doc is their spec and pins the ordering so the repos stay in step. -## Change +## Cardinality -Add one column to the existing `recommendation` table: +`plan_recommendations` is **one-to-many in practice, never many-to-many**: both writers (`upload_recommendations`, `bulk_upload_recommendations_and_materials`) create *fresh* `recommendation` rows per Plan and link each to a single `plan_id`. A recommendation is never shared across Plans, so a single `recommendation.plan_id` FK models reality faithfully and the backfill is a clean 1:1. + +## Sequence (expand → backfill → migrate reads → contract) + +The two hard rules: **backfill before any reader cuts to `plan_id`** (else every historical Plan — all `plan_id = NULL`, linked only via the m2m — vanishes from the FE), and **dual-write the m2m through the transition** (so backend and FE reads can each cut to `plan_id` independently, in any order, with zero breakage; the m2m write is removed only at the end). + +| # | Step | Owner | Safe because | +|---|---|---|---| +| 1 | **Add `recommendation.plan_id`** — `bigint`, FK → `plan.id`, **`ON DELETE CASCADE`**, indexed, **nullable** | FE (Drizzle) | additive; legacy rows keep `NULL` | +| 2 | **Backfill** `plan_id` from the m2m (see SQL below) | FE (Drizzle data migration) | every existing measure gets its Plan before any read cuts over | +| 3 | **Dual-write**: writers set `plan_id` **and** keep writing the m2m | backend | both old (m2m) and new (`plan_id`) readers work | +| 4 | **Cut reads to `plan_id`** — backend (`portfolio_functions`, `Outputs`, `export/property_scenarios`) **and** the Drizzle FE | backend + FE | backfill (2) means no NULLs; dual-write (3) means order between repos is free | +| 5 | **Stop writing the m2m** | backend | no reader uses it after (4) | +| 6 | **Drop `plan_recommendations`** | FE (Drizzle) + backend (remove model) | unreferenced after (5) | + +### Backfill SQL (step 2) + +```sql +UPDATE recommendation r +SET plan_id = pr.plan_id +FROM plan_recommendations pr +WHERE pr.recommendation_id = r.id + AND r.plan_id IS NULL; +``` + +Guard before dropping the m2m: assert no recommendation maps to more than one Plan (a data anomaly the writers can't produce, but worth checking on real data): + +```sql +SELECT recommendation_id, count(*) +FROM plan_recommendations +GROUP BY recommendation_id +HAVING count(*) > 1; +-- expect zero rows +``` + +## Step 1 — column definition | Column | Type | Notes | |---|---|---| -| `plan_id` | bigint, FK → `plan.id`, **`ON DELETE CASCADE`**, indexed | the Plan this measure belongs to. Nullable during transition (legacy rows predate it); new writes always set it. | +| `plan_id` | bigint, FK → `plan.id`, **`ON DELETE CASCADE`**, indexed, nullable | the Plan this measure belongs to. Nullable during transition; every new write sets it. | -- **Index `plan_id`** — the orchestrator's idempotent replace deletes a Plan and relies on the cascade to remove its measures; reads fetch a Plan's measures by `plan_id`. -- **`ON DELETE CASCADE`** is what makes "delete the Plan → its measures go too" a single statement, replacing the m2m cleanup. +- **Index `plan_id`** — the rebuild's idempotent replace deletes a Plan and relies on the cascade to remove its measures; reads fetch a Plan's measures by `plan_id`. +- **`ON DELETE CASCADE`** makes "delete the Plan → its measures go too" a single statement, replacing the m2m cleanup. -## Transition / sequencing +## This repo's part (all of steps 3–6, gated on 1+2 being live) -1. **Add `plan_id` (nullable)** — this migration. New `ModellingOrchestrator` writes populate it; legacy writers and existing rows are unaffected. -2. **Cut legacy readers** off `plan_recommendations` onto `plan_id` (separate work, not in #1157). -3. **Drop `plan_recommendations`** once no reader remains (separate migration). +The user's instruction is to implement the backend end-to-end **as if the FE has already applied steps 1 and 2** (the `plan_id` column exists and is backfilled). Concretely, in `backend/` + the rebuild: -Existing live `recommendation` rows keep `plan_id = NULL` until/unless re-modelled; they remain reachable via the legacy `plan_recommendations` join during the transition. +- The plan/recommendation/scenario/installed-measure models are **consolidated into `infrastructure/postgres/modelling/`** as single SQLModel definitions (`…Row`), `recommendation` carrying `plan_id`; `backend/app/db/models/recommendations.py` becomes a re-export shim (ADR-0017 amendment). +- Writers set `plan_id`; readers join on `plan_id`; the m2m write/cleanup and the `PlanRecommendations` model are removed. ## Not changed here -No new columns for contingency (per-measure contingency stays summed into `plan.contingency_cost`, matching legacy), no `phase` column (multi-phase deferred, ADR-0005), and the energy/bill columns are populated by a later Bill Derivation slice (ADR-0017). +No new contingency columns (per-measure contingency stays summed into `plan.contingency_cost`); no `phase` column (multi-phase deferred, ADR-0005). The `etl/` and `sfr/` reporting scripts that read the m2m are **out of scope** — handled in a later pass.