diff --git a/CONTEXT.md b/CONTEXT.md
index 4634df4f..9cf31602 100644
--- a/CONTEXT.md
+++ b/CONTEXT.md
@@ -227,7 +227,7 @@ A "selecting A requires B" edge between **Recommendations**, for couplings that
 _Avoid_: best-practice measure (legacy term), forced measure
 
 **Optimised Package**:
-The subset of a Property's Recommendations selected by the Optimiser Service for installation, chosen to satisfy the Scenario's goal subject to budget.
+The subset of a Property's Recommendations selected by the Optimiser Service for installation. For an **Increasing EPC** goal the objective is **least-cost-to-target**: the cheapest package that reaches the goal band — so it **stops at the target and does not overshoot** into a higher band, leaving surplus budget unspent. When the target is **unreachable within budget**, it falls back to the **maximum improvement the budget buys** (best effort, below target). With **no budget** it is simply the cheapest package that reaches the target. Reaching the target is judged on the **true whole-package re-score** (ADR-0016), not on summed per-measure scores. (Other goals — Energy Savings, Reducing CO₂ — don't yet set a target and currently maximise improvement within budget; future work.)
 _Avoid_: selected measures, default measures, optimal solution, recommended bundle
 
 **Measure Type**:
diff --git a/docs/adr/0016-package-rescore-over-warm-start-optimisation.md b/docs/adr/0016-package-rescore-over-warm-start-optimisation.md
index 6b1395ea..2c87af1e 100644
--- a/docs/adr/0016-package-rescore-over-warm-start-optimisation.md
+++ b/docs/adr/0016-package-rescore-over-warm-start-optimisation.md
@@ -20,3 +20,24 @@ This resolves the open question deferred in **ADR-0005 §14**.
 - Per-Option scores are *approximate by design* (independent-vs-baseline) and must never be persisted or surfaced as a measure's "true" impact — only the package re-score is truthful. Measure-level impact shown to users is derived from the final scored package, not from step A.
 - **Three distinct scoring roles, each with one job:** (1) per-Option independent-vs-baseline → optimiser *input* (approximate signal, never surfaced); (2) whole-package re-score → truthful *package total*; (3) **final-package marginal cascade** → per-measure *attribution* for display. Role 3 runs only on the *selected* set, applied in **best-practice prescribed order** (walls → roof → ventilation → … per the legacy `Recommendations` class), so `attribution(mᵢ) = score(m₁..mᵢ) − score(m₁..mᵢ₋₁)`; the marginals **telescope exactly to the package total** (role 2) with no residual. The "drop a middle measure" inaccuracy cannot occur because the actual final set is scored, not a hypothetical. The selected package is the cascade unit; ordering within it follows the best-practice sequence.
 - **The package-scoring primitive is reusable.** "Compose selected overlays → throwaway `EpcPropertyData` → calculator" serves both the optimiser's package re-score (role 2) and a future endpoint that re-scores a *user-assembled* plan live (the FE toggling Rolled-over Options on/off). Because the calculator is fast, live re-score is the **accurate** path the moment a user deviates from the optimiser's selection. Note the trap this avoids: summing stored per-measure figures across a user-edited selection re-introduces the sub-additivity overestimate — a user-edited plan must be re-scored as a package, never summed from stored attributions.
+
+## Amendment (2026-06-03): the optimiser objective is **least-cost-to-target**, not maximum gain
+
+The original decision above got the **warm-start objective wrong**. It framed the grouped knapsack as *maximise SAP gain subject to budget* and the target as a *floor* the repair tops up to. The rebuild faithfully implemented that — and it is the wrong objective. The legacy `StrategicOptimiser.solve()` (`recommendations/optimiser/StrategicOptimiser.py`, **Case 1**) is the intended behaviour, and it is the opposite primary objective:
+
+> **min cost** subject to `gain ≥ target` **and** `cost ≤ budget`; only if that is infeasible, **max gain** subject to `cost ≤ budget`.
+
+For an **Increasing EPC** goal the objective is therefore **least-cost-to-target** — the cheapest package that reaches the goal band. This is the common case (most users want "reach band C as cheaply as possible," not "spend the budget for maximum SAP").
+
+- **No budget** → cheapest package that reaches the target, no spend cap (legacy Case 3).
+- **Budget, target reachable** → cheapest package that reaches the target band; it **stops at the target and does not overshoot** into a higher band, leaving surplus budget unspent (the "don't overshoot" property falls out of cost-minimisation — you stop at the cheapest package in band C, so you never climb into B). The within-band headroom is *not* maximised — least cost wins, e.g. SAP 70 @ £2k is chosen over SAP 75 @ £3k.
+- **Budget, target unreachable** → fall back to **maximum improvement within budget** (best effort below target). "Unreachable" is judged on the **true re-scored** SAP after repair, not the signal.
+- Goals **other than Increasing EPC** set no target and stay max-gain-within-budget (a separate deferred front).
+
+**What is unchanged:** the warm-start-on-signal → inject dependencies → re-score-for-truth → greedy-repair structure, the three scoring roles, and the dependency-injection rule all stand. We **keep** the signal-based warm-start (and re-score+repair) rather than exhaustively re-scoring every candidate package, for the same scalability reason the original rejected full enumeration — the cross-product is tiny at fabric-only scale today but explodes as heating/PV/windows land. Only the warm-start's *selection rule* changes (min-cost-to-target instead of max-gain), plus the two points below.
+
+**Target predicate.** Reaching the target is `sap_continuous ≥ band_floor` (e.g. ≥ 69.0 for C) — the continuous band floor, the conservative choice (it sits ~0.5 SAP above the rounding threshold of 68.5, so the rounded SAP lands safely in band). The legacy `allow_slack` buffer is **not** carried over: it existed to hedge the MILP's approximate summed gains, a hedge our re-score + repair already provides. Combined with the "recommend slightly more than land short" preference, the conservative floor + repair-to-true-target reliably hit the band, often with a little headroom, while the *recommended* cost remains a safe over-estimate.
+
+**Ventilation-aware selection.** Because a forced Measure Dependency (ventilation) carries a real cost (~£900) and a negative SAP (typically −1 to −3, occasionally −5), the warm-start must **price the dependency it will trigger**, not just inject it afterwards. So the dependency is folded into each candidate during selection (via the same `_inject`, with the ventilation Option carrying a real negative role-1 signal instead of a `0.0` placeholder) — otherwise the min-cost selection (i) ignores the £900 a wall drags in, so a wall-free package that reaches target can be cheaper than the "least-cost" pick, and (ii) at large negative ventilation can select a small-gain wall whose mandatory ventilation makes it net-negative, which repair cannot un-pick. **Enforcement is now in two places:** *presence* — `_inject` on the final selected set on every path (warm-start, each repair step, max-gain fallback), guaranteeing ventilation whenever a trigger is present; *awareness* — the same `_inject` folded into candidate evaluation so the objective prices it. Presence was always guaranteed by ADR-0016; awareness is the new part.
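
The selection rule, target predicate and ventilation pricing described above can be illustrated with a small sketch. This is not the Optimiser Service's implementation: it brute-forces every subset of a toy candidate list (the ADR explicitly avoids enumeration at scale), it judges the target on summed role-1 signals rather than on the true package re-score, and it has no repair step. The `Option` and `Selection` types, `inject` (a stand-in for `_inject`), `select_package`, and every cost and signal figure are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Optional, Sequence, Tuple

BAND_C_FLOOR = 69.0  # continuous band floor for C, as in the target predicate


@dataclass(frozen=True)
class Option:
    name: str
    cost: float                      # GBP, illustrative figure
    signal: float                    # role-1 SAP signal (approximate)
    triggers_ventilation: bool = False


@dataclass(frozen=True)
class Selection:
    names: Tuple[str, ...]
    cost: float
    sap: float
    reached_target: bool


# Hypothetical ventilation Option: a real cost and a real negative signal,
# not a 0.0 placeholder.
VENTILATION = Option("ventilation", cost=900.0, signal=-2.0)


def inject(options: Sequence[Option]) -> Tuple[Option, ...]:
    """Stand-in for `_inject`: add ventilation when any Option triggers it."""
    selected = tuple(options)
    if any(o.triggers_ventilation for o in selected) and VENTILATION not in selected:
        return selected + (VENTILATION,)
    return selected


def select_package(
    candidates: Sequence[Option],
    baseline_sap: float,
    target: float,
    budget: Optional[float] = None,   # None means no spend cap; assumed >= 0 otherwise
) -> Selection:
    hits, fallbacks = [], []
    for size in range(len(candidates) + 1):
        for combo in combinations(candidates, size):
            package = inject(combo)   # awareness: price the dependency up front
            cost = sum(o.cost for o in package)
            if budget is not None and cost > budget:
                continue
            sap = baseline_sap + sum(o.signal for o in package)
            entry = Selection(tuple(o.name for o in package), cost, sap, sap >= target)
            (hits if entry.reached_target else fallbacks).append(entry)
    if hits:
        # Primary objective: cheapest package that reaches the target.
        return min(hits, key=lambda s: s.cost)
    # Fallback: maximum improvement within budget, cheaper package on ties.
    return max(fallbacks, key=lambda s: (s.sap, -s.cost))


CANDIDATES = [
    Option("wall", cost=1500.0, signal=8.0, triggers_ventilation=True),
    Option("roof", cost=2000.0, signal=8.0),
    Option("floor", cost=3000.0, signal=13.0),
    Option("draught", cost=500.0, signal=2.0),
]

reachable = select_package(CANDIDATES, 62.0, BAND_C_FLOOR, budget=5000.0)
unreachable = select_package(CANDIDATES, 62.0, BAND_C_FLOOR, budget=1000.0)
print(reachable)
print(unreachable)
```

With these toy figures the wall alone looks cheapest at £1,500, but once its ventilation is priced in it costs £2,400 and lands at 68.0, short of the 69.0 floor. The roof at £2,000 (SAP 70.0) is therefore chosen, and the £3,000 floor option that would reach SAP 75.0 is left unbought. With a £1,000 budget nothing reaches the floor, so the sketch falls back to the best affordable improvement.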
+
+This supersedes the original framing of the warm-start objective (lines above describing "maximise gain … undershoots the goal") and the "re-solving the MILP" fallback note; the rest of ADR-0016 stands.
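
Among the parts of ADR-0016 that stand is the role-3 marginal cascade, whose telescoping property is easy to check on a toy model. The sketch below assumes a hypothetical `toy_score` in place of the real calculator and uses only the first three entries of the prescribed order named in the ADR (the rest is elided there); all SAP figures are invented for illustration.

```python
from typing import Callable, Dict, Sequence, Tuple

# Only the first three entries of the prescribed order appear in the ADR.
PRESCRIBED_ORDER = ("walls", "roof", "ventilation")

# Hypothetical independent-vs-baseline gains (role 1), in SAP points.
INDEPENDENT_GAIN = {"walls": 6, "roof": 4, "ventilation": -2}


def toy_score(package: Sequence[str]) -> int:
    """Hypothetical whole-package score: sub-additive when walls and roof combine."""
    sap = 60 + sum(INDEPENDENT_GAIN[m] for m in package)
    if "walls" in package and "roof" in package:
        sap -= 1  # interaction penalty, so the gains do not simply add up
    return sap


def marginal_cascade(
    selected: Sequence[str],
    score: Callable[[Tuple[str, ...]], int],
) -> Dict[str, int]:
    """attribution(m_i) = score(m_1..m_i) - score(m_1..m_{i-1}), in prescribed order."""
    ordered = sorted(selected, key=PRESCRIBED_ORDER.index)
    attributions: Dict[str, int] = {}
    previous = score(())
    for i, measure in enumerate(ordered, start=1):
        current = score(tuple(ordered[:i]))
        attributions[measure] = current - previous
        previous = current
    return attributions


selected = ("roof", "ventilation", "walls")
attribution = marginal_cascade(selected, toy_score)
package_gain = toy_score(selected) - toy_score(())
summed_signals = sum(INDEPENDENT_GAIN[m] for m in selected)
print(attribution, package_gain, summed_signals)
```

In this toy the attributions sum to the package gain of 7 with no residual, while the summed independent signals give 8. That gap is the sub-additivity overestimate the ADR warns about when stored per-measure figures are summed instead of re-scoring the package.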