docs(modelling): record Valuation Uplift design (ADR-0018 + glossary)

From the grill-with-docs pass on the depth+scale phase. Splits the
overloaded "valuation" into two glossary terms — Property Valuation
(current market value, a Baseline attribute, mostly missing) and
Valuation Uplift (plan-conditional, percentage-primary; absolute £ only
when a Property Valuation exists, 2x ROI cap on the £ form). ADR-0018
records the percentage-primary decision and why (the EPC scale corpus
has no market values, so a value-primary model produces nothing), plus
the deferred sourcing / per-measure / rental-yield items.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Khalim Conn-Kowlessar 2026-06-04 07:52:33 +00:00
parent d5f1fc335b
commit 26d7fc036e
2 changed files with 52 additions and 0 deletions

@ -234,6 +234,16 @@ _Avoid_: selected measures, default measures, optimal solution, recommended bund
The catalogue classification of a retrofit measure (e.g. `solar_pv`, `loft_insulation`, `ashp`); one or more Recommendations reference the same Measure Type with property-specific cost and impact.
_Avoid_: measure (ambiguous), category
### Valuation
**Property Valuation**:
The current open-market value of a Property — an externally-sourced **Baseline** attribute (customer upload or, later, an estimate), **absent for most Properties** and never derived from the EPC.
_Avoid_: valuation (ambiguous with Valuation Uplift), market price, current value, house price
**Valuation Uplift**:
The estimated increase in a Property's market value produced by a **Plan's** retrofit — **plan-conditional** (it depends on the Plan's target **EPC Band**) and **percentage-primary**: always expressible as a % from the Band jump (current → target), and as an absolute £ amount **only when a Property Valuation is known**. Capped so the £ uplift never exceeds twice the Plan's cost (the cap can only bite once a Property Valuation supplies the £ form — see ADR-0018).
_Avoid_: valuation increase, value gain, financial uplift, property_valuation_increase (pick one — Valuation Uplift is canonical)
### Address matching
**Lexiscore**:
@ -297,6 +307,7 @@ _Avoid_: API key, auth token, secret
- Triggering the model against N **Scenarios** produces N **Plans** per Property. Each **Plan** holds one **Optimised Package** — its selected **Plan Measures** — plus the Property's post-retrofit figures.
- A **Scenario Snapshot** is pinned at trigger time per (task, scenario) so mid-run edits to the live Scenario do not affect an in-flight modelling job.
- A **Recommendation** references one **Measure Type** and carries property-specific cost and impact.
- A **Property Valuation** (current market value) is a Baseline attribute and is mostly absent; a **Valuation Uplift** is a Plan output, always a percentage from the **EPC Band** jump and an absolute £ only when a Property Valuation exists.
- **Address Matching** uses a **User Address** and **Postcode** to find a **UPRN** by scoring **UPRN Candidates** from an EPC search. A **Lexirank** of 1 with no **Ambiguous Match** and a **Lexiscore** ≥ the **Score Threshold** produces a **Best Match**.
## Example dialogue
@ -333,4 +344,5 @@ _Avoid_: API key, auth token, secret
- **"EPC"** is overloaded as both the document and the rating band letter. Use **EPC** for the document, **EPC Band** for the letter.
- **"re-scoring"** has two meanings in the codebase — **Rebaselining** (re-predicting baseline performance after an EPC change) and post-optimisation measure re-prediction. Prefer **Rebaselining** for the former; for the latter, the **Optimiser Service** step does its own scoring without a special name.
- **"phase"** (sequencing measures into ordered steps within a Scenario/Plan) was a speculative, prospective-client feature and is **deferred — out of scope** (see ADR-0005). It is *not* a current domain term: a **Scenario** carries one set of measures, a **Plan** one **Optimised Package**. The only live use of "phase" is cut-over timeline language in the PRD ("Phase 0 — Status quo"), which is project-management vocabulary and does not enter code.
- **"valuation"** was used for both a Property's current market value and the increase a retrofit produces — resolved into two distinct terms: **Property Valuation** (current value, a Baseline attribute) and **Valuation Uplift** (the plan-conditional, percentage-primary increase). The bare word "valuation" should be qualified to one of these.
- **"stale"** appears in two senses: cache-freshness ("a Repo record is stale and the orchestrator should refetch") — a legitimate operational concept; and as loose shorthand for the EPC's recorded cost fields being unusable. The cost fields are not stale — they are pinned to the inspection-date fuel rates by design. Use "pinned to inspection date" or "pre-SAP10 schema" (whichever applies) instead.

@ -0,0 +1,40 @@
# Valuation Uplift is percentage-primary
The Modelling rebuild needs a financial-uplift output (the increase in a Property's
market value from a retrofit). The legacy model (`backend/ml_models/Valuation.py`) is
**value-primary**: it starts from a current market value and returns absolute pounds.
But that current value — a **Property Valuation** — is sourced from a customer upload
(or a ~93-entry hardcoded demo stub) and is **absent for the overwhelming majority of
Properties**, including every property in an EPC-only scale corpus. A value-primary
model therefore produces nothing for almost all inputs.
We model **Valuation Uplift** as **percentage-primary** instead: the uplift is computed
purely from the **EPC Band** jump (current → target) and is always returned as a
percentage; the absolute £ form (`lower/upper/average_value`, `post_retrofit_value`) is
derived **only when a Property Valuation is supplied**, otherwise left `None`. This means
every Plan gets an inspectable uplift even with no market value, and it cleanly separates
the two concepts the word "valuation" was blurring — the externally-sourced **Property
Valuation** (a Baseline attribute) from the plan-conditional **Valuation Uplift** (a Plan
output). The domain function lives in `domain/modelling/valuation.py`, because Modelling
is the consumer that knows the target band. It can be relocated to a neutral package
later, as `domain/billing/` was, if Baseline takes ownership of Property Valuation.
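
A minimal sketch of the percentage-primary shape described above, not the actual contents of `domain/modelling/valuation.py`. The names `ValuationUplift`, `valuation_uplift` and the `*_pct` fields are hypothetical; only `lower/upper/average_value` and `post_retrofit_value` come from the text. The percentage range is passed in rather than looked up, the ROI cap is left out, and deriving `post_retrofit_value` from the average uplift is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ValuationUplift:
    # Percentage form: always present (hypothetical field names).
    lower_pct: float
    upper_pct: float
    average_pct: float
    # Absolute £ form: stays None unless a Property Valuation is supplied.
    lower_value: Optional[float] = None
    upper_value: Optional[float] = None
    average_value: Optional[float] = None
    post_retrofit_value: Optional[float] = None


def valuation_uplift(
    lower_pct: float,
    upper_pct: float,
    average_pct: float,
    property_valuation: Optional[float] = None,
) -> ValuationUplift:
    """Return the uplift as percentages, adding the £ form only when possible."""
    if property_valuation is None:
        # No market value: the Plan still gets an inspectable percentage uplift.
        return ValuationUplift(lower_pct, upper_pct, average_pct)

    def to_pounds(pct: float) -> float:
        return property_valuation * pct / 100.0

    average_value = to_pounds(average_pct)
    return ValuationUplift(
        lower_pct,
        upper_pct,
        average_pct,
        lower_value=to_pounds(lower_pct),
        upper_value=to_pounds(upper_pct),
        average_value=average_value,
        # Assumption: the post-retrofit value uses the average uplift.
        post_retrofit_value=property_valuation + average_value,
    )
```

Called without a valuation, the function returns percentages with every £ field `None`; called with one, the same percentages are scaled into pounds.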
## Consequences
- The percentage uplift compounds the legacy's four hardcoded broker tables
(MoneySupermarket, Lloyds, Knight Frank, Rightmove), taking min/max/average across the
sources that cover the band step. These 2022-era figures are ported verbatim as
committed reference data; they are a provenance snapshot, not a live source.
- The **2× ROI cap** (uplift ≤ twice the retrofit cost) is a £ comparison, so it can only
bite once a Property Valuation supplies the £ form; the bare percentages are uncapped.
- The model is a pure function of the before/after **EPC Band** — it does **not** use the
continuous SAP score, so it needs no precision work beyond the band the Plan already
computes.
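
The aggregation and cap in the bullets above can be sketched as follows. This is an illustration under stated assumptions, not the ported code: `percentage_range` and `capped_uplift_value` are hypothetical names, each source table is assumed to map a `(current, target)` band step directly to a percentage, and any compounding across multi-band jumps is not modelled.

```python
from statistics import mean
from typing import Mapping, Optional, Tuple

BandStep = Tuple[str, str]  # (current EPC Band, target EPC Band)


def percentage_range(
    tables: Mapping[str, Mapping[BandStep, float]],
    current: str,
    target: str,
) -> Optional[Tuple[float, float, float]]:
    """Min, max and average uplift % across the sources that cover the band step."""
    step = (current, target)
    covered = [table[step] for table in tables.values() if step in table]
    if not covered:
        return None  # no source covers this step
    return min(covered), max(covered), mean(covered)


def capped_uplift_value(
    uplift_pct: float,
    plan_cost: float,
    property_valuation: Optional[float] = None,
) -> Optional[float]:
    """£ uplift limited to twice the Plan's cost; None when there is no £ form."""
    if property_valuation is None:
        # Without a Property Valuation there is no £ figure for the cap to bite on.
        return None
    raw_value = property_valuation * uplift_pct / 100.0
    return min(raw_value, 2.0 * plan_cost)
```

With a 4% uplift on a £200,000 Property Valuation, the raw uplift is £8,000. A Plan costing £3,000 caps that at £6,000, while a Plan costing £5,000 leaves it at £8,000 because the cap of £10,000 is not reached. The percentage itself is unchanged in both cases.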
## Deferred (not in this phase)
- **Property Valuation sourcing** — the upload-CSV ingestion slice, the Property field +
persisted column, and the decision to retire or keep the demo `UPRN_VALUE_LOOKUP` stub.
Where it persists (Baseline/performance table vs. a separate valuation table) is open.
- **Per-measure `property_valuation_increase`** and **`rental_yield_increase`** — the
legacy path never populated either; uplift is a plan-level figure for now.