diff --git a/docs/migrations/recommendation-material-id.md b/docs/migrations/recommendation-material-id.md new file mode 100644 index 00000000..eee00195 --- /dev/null +++ b/docs/migrations/recommendation-material-id.md @@ -0,0 +1,45 @@ +# Retire `recommendation_materials` — reference the Product by `recommendation.material_id` + +**Context:** Modelling-stage rebuild. A Plan Measure installs a single **Product**, so the per-material `recommendation_materials` child table (depth / quantity / quantity_unit / estimated_cost per row) is replaced by a single **`recommendation.material_id`** on the row and then **dropped**. Same motivation and shape as the [`plan_recommendations` retirement](./recommendation-plan-id.md): the child table's cascade-delete + indexes are a known performance killer on large deletes. The `plan`/`recommendation`/`recommendation_materials` tables are read directly by the Drizzle FE and written by both the legacy `engine.py` path and the rebuild, so this is an **expand/contract migration on a live, two-repo schema**. The **DB migrations are FE-owned (Drizzle)**; this doc pins the ordering so the repos stay in step. See [ADR-0017](../adr/0017-plan-persistence-evolve-live-tables.md). + +## Cardinality + +`recommendation_materials` is **one-to-many in practice** (one recommendation → its material lines), but for the four modelled fabric measures each Option installs exactly **one** Product, so a single `recommendation.material_id` models reality faithfully. A future *bundle* Option that genuinely needs multiple Products (e.g. boiler + cylinder insulation) is out of scope and revisited when those measures land — it is a new decision, not a regression of this one. + +## Status + +**Expand half landed in the backend** (this branch): the `ModellingOrchestrator` now threads the catalogue id `Product.id → MeasureOption.material_id → PlanMeasure.material_id → recommendation.material_id`, and `RecommendationModel` declares the column. The repo SQLModel is a **read-only mirror** — it does not migrate the live DB. + +**The contraction is the owner's, starting next (with its own ADR):** cut the legacy writers (`recommendations_functions.upload_recommendations` / `bulk_upload_recommendations_and_materials`) off `recommendation_materials`, backfill `material_id`, drop the child table, and decide the disposition of `depth` / `quantity` / `quantity_unit` (kept-for-reference vs dropped — see the grilling notes; `quantity` has reference value). + +## Sequence (expand → backfill → migrate reads → contract) + +The hard rule: **add the `material_id` column live before the backend that writes it deploys** (else the rebuild's `recommendation` INSERT fails on an unknown column). + +| # | Step | Owner | Safe because | +|---|---|---|---| +| 1 | **Add `recommendation.material_id`** — `bigint`, indexed, **nullable**, no FK constraint (mirror convention; the live FK to `material` is the FE's call) | FE (Drizzle) | additive; legacy rows keep `NULL` | +| 2 | **Deploy the rebuild backend** (writes `material_id` from the catalogue) | backend | column exists from (1); nullable so unbilled / JSON-catalogue measures write `NULL` | +| 3 | **Backfill** `material_id` from `recommendation_materials` (single-material rows) | FE (Drizzle data migration) | every existing measure gets its Product before any read cuts over | +| 4 | **Cut FE reads** off `recommendation_materials` onto `material_id` | FE | backfill (3) means no NULLs for single-material measures | +| 5 | **Stop writing `recommendation_materials`** (legacy writers) | backend | no reader uses it after (4) | +| 6 | **Drop `recommendation_materials`** + remove the `RecommendationMaterialModel` mirror | FE (Drizzle) + backend | unreferenced after (5) | + +### Backfill SQL sketch (step 3) + +```sql +UPDATE recommendation r +SET material_id = rm.material_id +FROM recommendation_materials rm +WHERE rm.recommendation_id = r.id + AND r.material_id IS NULL; +``` + +Guard before dropping the child table: assert no recommendation maps to more than one material (the modelled fabric measures never produce this; worth checking on real data before the drop): + +```sql +SELECT recommendation_id, count(*) +FROM recommendation_materials +GROUP BY recommendation_id +HAVING count(*) > 1; +``` diff --git a/domain/modelling/generators/floor_recommendation.py b/domain/modelling/generators/floor_recommendation.py index f2360677..b078a230 100644 --- a/domain/modelling/generators/floor_recommendation.py +++ b/domain/modelling/generators/floor_recommendation.py @@ -87,5 +87,6 @@ def recommend_floor_insulation( } ), cost=cost, + material_id=product.id, ) return Recommendation(surface="Ground floor", options=(option,)) diff --git a/domain/modelling/generators/roof_recommendation.py b/domain/modelling/generators/roof_recommendation.py index aa09b620..6b689733 100644 --- a/domain/modelling/generators/roof_recommendation.py +++ b/domain/modelling/generators/roof_recommendation.py @@ -59,5 +59,6 @@ def recommend_loft_insulation( } ), cost=cost, + material_id=product.id, ) return Recommendation(surface="Roof", options=(option,)) diff --git a/domain/modelling/generators/ventilation_recommendation.py b/domain/modelling/generators/ventilation_recommendation.py index f3eaebec..0fed5c7b 100644 --- a/domain/modelling/generators/ventilation_recommendation.py +++ b/domain/modelling/generators/ventilation_recommendation.py @@ -53,6 +53,7 @@ def recommend_ventilation( ventilation=VentilationOverlay(mechanical_ventilation_kind=_MEV_KIND) ), cost=cost, + material_id=product.id, ) return Recommendation(surface="Ventilation", options=(option,)) diff --git a/domain/modelling/generators/wall_recommendation.py b/domain/modelling/generators/wall_recommendation.py index 6b263ba2..86c81f41 100644 --- a/domain/modelling/generators/wall_recommendation.py +++ b/domain/modelling/generators/wall_recommendation.py @@ -63,5 +63,6 @@ def recommend_cavity_wall( } ), cost=cost, + material_id=product.id, ) return Recommendation(surface="Main wall", options=(option,)) diff --git a/domain/modelling/plan.py b/domain/modelling/plan.py index cfaaf9ff..7016ecda 100644 --- a/domain/modelling/plan.py +++ b/domain/modelling/plan.py @@ -38,6 +38,10 @@ class PlanMeasure: impact: MeasureImpact kwh_savings: Optional[float] = None energy_cost_savings: Optional[float] = None + # The catalogue id of the Product installed (from the selected Option), + # persisted as ``recommendation.material_id``. None when priced from a + # catalogue with no ids. + material_id: Optional[int] = None @dataclass(frozen=True) diff --git a/domain/modelling/product.py b/domain/modelling/product.py index fe5c78f3..afd46897 100644 --- a/domain/modelling/product.py +++ b/domain/modelling/product.py @@ -7,6 +7,7 @@ not "material". Read via a `ProductRepository`. """ from dataclasses import dataclass +from typing import Optional @dataclass(frozen=True) @@ -14,3 +15,8 @@ class Product: measure_type: str unit_cost_per_m2: float contingency_rate: float + # The catalogue row id, threaded onto the persisted Plan Measure as + # ``recommendation.material_id`` (the single-material reference that replaces + # the retired ``recommendation_materials`` BOM). Optional: the JSON + # stopgap catalogue carries no ids. + id: Optional[int] = None diff --git a/domain/modelling/recommendation.py b/domain/modelling/recommendation.py index 4a287ee9..96331c44 100644 --- a/domain/modelling/recommendation.py +++ b/domain/modelling/recommendation.py @@ -32,6 +32,10 @@ class MeasureOption: description: str overlay: EpcSimulation cost: Optional[Cost] = None + # The catalogue id of the Product this Option installs (Product.id), carried + # through to the persisted Plan Measure's ``material_id``. None when priced + # from a catalogue with no ids. + material_id: Optional[int] = None @dataclass(frozen=True) diff --git a/infrastructure/postgres/modelling/recommendation_table.py b/infrastructure/postgres/modelling/recommendation_table.py index 77af71fc..62b3ff6f 100644 --- a/infrastructure/postgres/modelling/recommendation_table.py +++ b/infrastructure/postgres/modelling/recommendation_table.py @@ -49,6 +49,10 @@ class RecommendationModel(SQLModel, table=True): type: str measure_type: Optional[str] = Field(default=None) description: str + # The single Product this measure installs — the live ``material_id`` column + # that replaces the retired ``recommendation_materials`` BOM (one material + # per Plan Measure). Plain int, out-of-cluster (mirror convention). + material_id: Optional[int] = Field(default=None, index=True) estimated_cost: Optional[float] = Field(default=None) starting_u_value: Optional[float] = Field(default=None) new_u_value: Optional[float] = Field(default=None) @@ -75,6 +79,7 @@ class RecommendationModel(SQLModel, table=True): type=measure.measure_type, measure_type=measure.measure_type, description=measure.description, + material_id=measure.material_id, estimated_cost=measure.cost.total, sap_points=measure.impact.sap_points, co2_equivalent_savings=( diff --git a/orchestration/modelling_orchestrator.py b/orchestration/modelling_orchestrator.py index 7fc9b491..1a6ee852 100644 --- a/orchestration/modelling_orchestrator.py +++ b/orchestration/modelling_orchestrator.py @@ -259,4 +259,5 @@ def _plan_measure( impact=impact, kwh_savings=before.total_consumption_kwh - after.total_consumption_kwh, energy_cost_savings=before.total_gbp - after.total_gbp, + material_id=option.material_id, ) diff --git a/repositories/product/product_postgres_repository.py b/repositories/product/product_postgres_repository.py index 13926885..5a46348c 100644 --- a/repositories/product/product_postgres_repository.py +++ b/repositories/product/product_postgres_repository.py @@ -31,4 +31,5 @@ class ProductPostgresRepository(ProductRepository): measure_type=measure_type, unit_cost_per_m2=row.total_cost, contingency_rate=contingency_rate(measure_type), + id=row.id, ) diff --git a/tests/orchestration/test_ara_first_run_pipeline_integration.py b/tests/orchestration/test_ara_first_run_pipeline_integration.py index c830325c..9292ef09 100644 --- a/tests/orchestration/test_ara_first_run_pipeline_integration.py +++ b/tests/orchestration/test_ara_first_run_pipeline_integration.py @@ -302,6 +302,12 @@ def test_modelling_optimises_and_persists_a_multi_measure_plan( "suspended_floor_insulation", "mechanical_ventilation", } + # Each persisted measure carries the catalogue id of the Product it installs + # (the MaterialRow ids seeded above), replacing the retired + # recommendation_materials BOM with a single material_id on the row. + assert by_type["cavity_wall_insulation"].material_id == 1 + assert by_type["suspended_floor_insulation"].material_id == 2 + assert by_type["mechanical_ventilation"].material_id == 3 for rec in rec_rows: assert rec.default is True assert rec.already_installed is False