diff --git a/CONTEXT.md b/CONTEXT.md index 36ae6d4c..2e8c5d00 100644 --- a/CONTEXT.md +++ b/CONTEXT.md @@ -234,6 +234,16 @@ _Avoid_: selected measures, default measures, optimal solution, recommended bund The catalogue classification of a retrofit measure (e.g. `solar_pv`, `loft_insulation`, `ashp`); one or more Recommendations reference the same Measure Type with property-specific cost and impact. _Avoid_: measure (ambiguous), category +### Valuation + +**Property Valuation**: +The current open-market value of a Property — an externally-sourced **Baseline** attribute (customer upload or, later, an estimate), **absent for most Properties** and never derived from the EPC. +_Avoid_: valuation (ambiguous with Valuation Uplift), market price, current value, house price + +**Valuation Uplift**: +The estimated increase in a Property's market value produced by a **Plan's** retrofit — **plan-conditional** (it depends on the Plan's target **EPC Band**) and **percentage-primary**: always expressible as a % from the Band jump (current → target), and as an absolute £ amount **only when a Property Valuation is known**. Capped so the £ uplift never exceeds twice the Plan's cost (the cap can only bite once a Property Valuation supplies the £ form — see ADR-0018). +_Avoid_: valuation increase, value gain, financial uplift, property_valuation_increase (pick one — Valuation Uplift is canonical) + ### Address matching **Lexiscore**: @@ -297,6 +307,7 @@ _Avoid_: API key, auth token, secret - Triggering the model against N **Scenarios** produces N **Plans** per Property. Each **Plan** holds one **Optimised Package** — its selected **Plan Measures** — plus the Property's post-retrofit figures. - A **Scenario Snapshot** is pinned at trigger time per (task, scenario) so mid-run edits to the live Scenario do not affect an in-flight modelling job. - A **Recommendation** references one **Measure Type** and carries property-specific cost and impact. +- A **Property Valuation** (current market value) is a Baseline attribute and is mostly absent; a **Valuation Uplift** is a Plan output, always a percentage from the **EPC Band** jump and an absolute £ only when a Property Valuation exists. - **Address Matching** uses a **User Address** and **Postcode** to find a **UPRN** by scoring **UPRN Candidates** from an EPC search. A **Lexirank** of 1 with no **Ambiguous Match** and a **Lexiscore** ≥ the **Score Threshold** produces a **Best Match**. ## Example dialogue @@ -333,4 +344,5 @@ _Avoid_: API key, auth token, secret - **"EPC"** is overloaded as both the document and the rating band letter. Use **EPC** for the document, **EPC Band** for the letter. - **"re-scoring"** has two meanings in the codebase — **Rebaselining** (re-predicting baseline performance after an EPC change) and post-optimisation measure re-prediction. Prefer **Rebaselining** for the former; for the latter, the **Optimiser Service** step does its own scoring without a special name. - **"phase"** (sequencing measures into ordered steps within a Scenario/Plan) was a speculative, prospective-client feature and is **deferred — out of scope** (see ADR-0005). It is *not* a current domain term: a **Scenario** carries one set of measures, a **Plan** one **Optimised Package**. The only live use of "phase" is cut-over timeline language in the PRD ("Phase 0 — Status quo"), which is project-management vocabulary and does not enter code. +- **"valuation"** was used for both a Property's current market value and the increase a retrofit produces — resolved into two distinct terms: **Property Valuation** (current value, a Baseline attribute) and **Valuation Uplift** (the plan-conditional, percentage-primary increase). The bare word "valuation" should be qualified to one of these. - **"stale"** appears in two senses: cache-freshness ("a Repo record is stale and the orchestrator should refetch") — a legitimate operational concept; and as loose shorthand for the EPC's recorded cost fields being unusable. The cost fields are not stale — they are pinned to the inspection-date fuel rates by design. Use "pinned to inspection date" or "pre-SAP10 schema" (whichever applies) instead. diff --git a/docs/adr/0018-valuation-uplift-percentage-primary.md b/docs/adr/0018-valuation-uplift-percentage-primary.md new file mode 100644 index 00000000..b6328adb --- /dev/null +++ b/docs/adr/0018-valuation-uplift-percentage-primary.md @@ -0,0 +1,40 @@ +# Valuation Uplift is percentage-primary + +The Modelling rebuild needs a financial-uplift output (the increase in a Property's +market value from a retrofit). The legacy model (`backend/ml_models/Valuation.py`) is +**value-primary**: it starts from a current market value and returns absolute pounds. +But that current value — a **Property Valuation** — is sourced from a customer upload +(or a ~93-entry hardcoded demo stub) and is **absent for the overwhelming majority of +Properties**, including every property in an EPC-only scale corpus. A value-primary +model therefore produces nothing for almost all inputs. + +We model **Valuation Uplift** as **percentage-primary** instead: the uplift is computed +purely from the **EPC Band** jump (current → target) and is always returned as a +percentage; the absolute £ form (`lower/upper/average_value`, `post_retrofit_value`) is +derived **only when a Property Valuation is supplied**, otherwise left `None`. This means +every Plan gets an inspectable uplift even with no market value, and it cleanly separates +the two concepts the word "valuation" was blurring — the externally-sourced **Property +Valuation** (a Baseline attribute) from the plan-conditional **Valuation Uplift** (a Plan +output). The domain function lives in `domain/modelling/valuation.py` (Modelling is the +consumer that knows the target band; relocatable to a neutral package later, as +`domain/billing/` was, if Baseline takes ownership of Property Valuation). + +## Consequences + +- The percentage uplift compounds the legacy's four hardcoded broker tables + (MoneySupermarket, Lloyds, Knight Frank, Rightmove), taking min/max/average across the + sources that cover the band step. These 2022-era figures are ported verbatim as + committed reference data; they are a provenance snapshot, not a live source. +- The **2× ROI cap** (uplift ≤ twice the retrofit cost) is a £ comparison, so it can only + bite once a Property Valuation supplies the £ form; the bare percentages are uncapped. +- The model is a pure function of the before/after **EPC Band** — it does **not** use the + continuous SAP score, so it needs no precision work beyond the band the Plan already + computes. + +## Deferred (not in this phase) + +- **Property Valuation sourcing** — the upload-CSV ingestion slice, the Property field + + persisted column, and the decision to retire or keep the demo `UPRN_VALUE_LOOKUP` stub. + Where it persists (Baseline/performance table vs. a separate valuation table) is open. +- **Per-measure `property_valuation_increase`** and **`rental_yield_increase`** — the + legacy path never populated either; uplift is a plan-level figure for now.