mirror of https://github.com/Hestia-Homes/Model.git, synced 2026-06-08 11:17:27 +00:00
Adds ADR-0010 superseding ADR-0009's spec-version target, PCDB
sequencing, and cert-calibration layer. Captures the conclusions
of a grill-with-docs session:

1. Active spec target is SAP 10.2 (14-03-2025), not SAP 10.3 — no
SAP-10.3-lodged certs exist in the corpus to validate against.
2. table_12_cert_calibration is deleted (not "re-derived at the
end"). It was pre-March-2025 spec prices fit against a mixture
distribution of two spec-version regimes, with downstream-
component bugs absorbed into the fit — not Elmhurst deviation.
3. Validation Cohort: filter the corpus to inspection_date ≥
2025-07-01 so every cert in the probe was lodged on SAP 10.2
(14-03-2025) prices. One spec, one signal.
4. PCDB integration is promoted from "Session C deferred" to
prerequisite P4 — dominates residual variance on heat pumps and
the 78% of gas-boiler certs lodging main_heating_data_source=1.
5. Trace mode (SapResult.intermediate) and BRE worked-example
fixtures replace the 7 cert-based golden fixtures, which
contained compensating errors.
6. Strict-type EpcPropertyData via codes.csv-derived canonical
enums (P6) — the in-source motivation lives at
dimensions.py:74-82 (Khalim's comment, included in this commit).
7. Worksheet-faithful structure is a sweep-time principle: each
worksheet module mirrors SAP 10.2 worksheet line numbering.

CONTEXT.md additions:
- Refined "Calculated SAP10 Performance" and "SAP10 Calculation"
to reference SAP 10.2 + ADR-0010.
- New term "SAP Spec Version" — domain-meaningful because the
same EpcPropertyData yields different sap_score under different
spec revisions.
- New term "Validation Cohort" — the version-locked sub-corpus.

HANDOVER_SYSTEMATIC_REVIEW.md is rewritten section-by-section to
reflect ADR-0010: §1 framing, §2 status pointer, new §2.5 with the
six prerequisites P1–P6 in dependency order, §3 diagnosis (cert-cal
was stale prices, not Elmhurst deviation), §4 scope (PCDB IN,
SAP 10.3 stays OUT), §5 approach (worksheet-faithful principle as
§5.5), §7 tension dissolved, §7b findings re-framed, §8 dead-ends
re-classified as conditional, §9 cohort filter, §10 fixture
strategy, §11 trace mode as prerequisite, §12 prereqs-first,
§13 Phase 0/Phase 1 workflow, §14 ADR-0010 reference, §15 final
note.

P2.1 (commit ac1aa56a) already lands the first ADR-0010 slice
(probe swap to spec prices).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
306 lines
27 KiB
Markdown
# Ara

The Domna product for domestic retrofit modelling: ingests open-source EPC data, lets users correct or supersede it with their own surveys, and produces optimised retrofit packages for each property in a portfolio.

## Language

### Product

**Ara**:
The Domna product. Latin for "the altar"; named under Domna's classical-naming convention. Covers both the modelling product and the backend that powers it.
_Avoid_: ARA (acronym style), v2 backend, the new backend

**Domna**:
The company. Roman name; sibling to Ara in the same naming convention.

### Energy Performance Certificates

**EPC**:
An Energy Performance Certificate — a government-issued document rating a dwelling's energy efficiency from A (best) to G (worst).
_Avoid_: energy certificate, energy report

**Certificate Number**:
The unique identifier assigned to an EPC by the government registry.
_Avoid_: cert number, EPC ID

**Registration Date**:
The date an EPC was lodged with the government register; used to identify the most recent certificate for a property.
_Avoid_: assessment date, submission date

**EPC Band**:
A single letter A–G representing a property's current or potential energy efficiency rating.
_Avoid_: energy rating, EPC grade, EPC score

**Schema Type**:
The versioned RdSAP or SAP schema that describes the structure of an EPC's raw data (e.g. `RdSAP-Schema-21.0.1`).
_Avoid_: schema version, EPC format

**Domestic Certificate**:
An EPC issued for a residential dwelling, as opposed to a commercial one.
_Avoid_: residential EPC, home EPC

### Properties and addresses

**Property**:
The Ara domain aggregate representing a single dwelling under modelling: its identity, source data, enrichments, and modelling outputs.
_Avoid_: dwelling, unit, home, asset

**Properties**:
A first-class collection of Property objects; the unit of bulk operation in services.
_Avoid_: property list, batch (used for SQS chunks)

**UPRN**:
Unique Property Reference Number — the government-issued permanent identifier for a physical address in the UK.
_Avoid_: property ID, address ID, code

**Postcode**:
A UK postal code used to group nearby addresses; the primary search key for finding EPC records.
_Avoid_: zip code, postal code

**User Address**:
A free-text address string provided by a user or imported from a customer dataset, before any normalisation or matching.
_Avoid_: user input, raw address, user_inputed_address

**Comparable Properties**:
The reference cohort matched to a target Property by both geographic proximity (postcode prefix / UPRN range) and physical similarity (property type, built form, age band); used by the EPC Prediction Service for gap-filling and anomaly detection.
_Avoid_: neighbours, similar properties, peer set

### Source data

**Site Notes**:
The full-coverage record produced by a Domna survey of a single Property; carries every EPC field the modelling pipeline requires, and when present supersedes the public EPC for that Property — except when the public EPC is newer.
_Avoid_: energy assessment, site survey, field survey, Domna survey, Hestia survey

**Landlord Overrides**:
Property data supplied by a landlord that may correct or supplement the public EPC for a single Property; triggers Rebaselining when applied; not applicable when Site Notes are present.
_Avoid_: patches (deprecated), corrections, manual EPC, edits
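
The precedence between Site Notes, the public EPC, and Landlord Overrides can be sketched as below. This is a minimal illustration, not the repo's implementation: `SourceRecord`, `resolve_effective_source`, and the `recorded_on` field are hypothetical names, and treating equal dates as a win for Site Notes is an assumption the glossary does not settle.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class SourceRecord:
    """Hypothetical stand-in for either Site Notes or a public EPC."""
    kind: str          # "site_notes" or "epc"
    recorded_on: date  # survey date or Registration Date


def resolve_effective_source(
    site_notes: Optional[SourceRecord],
    epc: Optional[SourceRecord],
    has_landlord_overrides: bool,
) -> tuple[SourceRecord, bool]:
    """Return (source, apply_overrides) for building the Effective EPC."""
    if site_notes is None and epc is None:
        raise ValueError("a Property needs Site Notes or a public EPC")
    # Site Notes supersede the public EPC unless the EPC is strictly newer
    # (tie-breaking towards Site Notes is an assumption).
    if site_notes is not None and (
        epc is None or epc.recorded_on <= site_notes.recorded_on
    ):
        return site_notes, False  # overrides never apply to Site Notes
    return epc, has_landlord_overrides
```

Returning the `apply_overrides` flag alongside the source keeps the "overrides only when the EPC is the source" rule in one place instead of leaving each caller to re-derive it.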

### Modelling

**Effective EPC**:
The EpcPropertyData scored by the modelling pipeline for a single Property, derived from either Site Notes alone or the public EPC with Landlord Overrides applied; carries source-derived physical fields and originally recorded performance values, with model-rebaselined performance held separately in Baseline Performance.
_Avoid_: modelling EPC, working EPC, resolved EPC, derived EPC

**Rebaselining**:
Re-predicting a Property's SAP score, CO2 emissions, Primary Energy Intensity, space heating kWh, and hot water kWh via ML so the modelling pipeline scores it against the current SAP10 methodology. Triggered when either (a) the Effective EPC was lodged under a pre-SAP10 schema (`sap_version < 10.0`), so the recorded scores reflect a superseded methodology, or (b) Site Notes / Landlord Overrides changed the physical state of the Property (walls, heating, windows, etc.) so the lodged scores no longer reflect what's installed. Both triggers may fire together. Produces Effective Performance; Lodged Performance is preserved unchanged. The two kWh quantities are included as ML targets per ADR-0007; see [[epc-ml-transform]].
_Avoid_: re-scoring, re-prediction, performance recomputation, refresh (for cache-freshness)
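
A minimal sketch of the two triggers, assuming the schema version is available as a float and that the caller has already worked out whether Site Notes or Landlord Overrides changed the physical state. `RebaselineDecision`, `decide_rebaselining`, and `effective_performance` are illustrative names, not identifiers from the codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RebaselineDecision:
    """Records which trigger fired; both may be true at once."""
    pre_sap10_schema: bool
    physical_state_changed: bool

    @property
    def should_rebaseline(self) -> bool:
        return self.pre_sap10_schema or self.physical_state_changed


def decide_rebaselining(
    sap_version: float, physical_state_changed: bool
) -> RebaselineDecision:
    # Trigger (a): lodged under a pre-SAP10 schema.
    # Trigger (b): Site Notes / Landlord Overrides changed what is installed.
    return RebaselineDecision(
        pre_sap10_schema=sap_version < 10.0,
        physical_state_changed=physical_state_changed,
    )


def effective_performance(
    lodged: dict, ml_predicted: dict, decision: RebaselineDecision
) -> dict:
    """Effective Performance equals Lodged Performance unless a trigger fired."""
    return ml_predicted if decision.should_rebaseline else lodged
```

Keeping the two flags separate, rather than collapsing them into one boolean, leaves room to report why a Property was rebaselined.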

**Baseline Performance**:
A Property's current performance aggregate, holding both Lodged Performance and Effective Performance plus annual space heating kWh, hot water kWh, fuel split, and bills derived from the Effective EPC — kWh values come from the EPC's recorded fields for SAP10 baselines or from ML when Rebaselining fires; bills are derived deterministically from kWh × current Fuel Rates. Persisted as one row; surfaced as one block in the UI.
_Avoid_: baseline predictions, predicted baseline, rebaselined values

**Lodged Performance**:
The SAP / EPC Band / carbon emissions / heat demand recorded on the public EPC (or the Site Notes' as-surveyed values when Site Notes are the source) — unmodified by modelling. The half of Baseline Performance that says "what the government register says about this Property".
_Avoid_: original performance, raw EPC values, recorded baseline

**Effective Performance**:
The SAP / EPC Band / carbon emissions / heat demand the modelling pipeline actually scored against — equal to Lodged Performance when no Rebaselining trigger fires, replaced by ML output when triggered. The half of Baseline Performance that says "what we modelled".
_Avoid_: modelled performance, rebaselined performance (only correct when rebaselining ran), scored values

**Calculated SAP10 Performance**:
The SAP score, EPC Band, CO2 emissions, Primary Energy Intensity, space heating kWh, and hot water kWh produced by **SAP10 Calculation** from a Property's EpcPropertyData. Distinct from Effective Performance (ML output) and Lodged Performance (gov register) during the validation phase. Surfaced alongside Effective Performance in the UI; may supersede Effective Performance in a later ADR once parity is confirmed against the cert-reported SAP across ≥1000 sample certs lodged on the calculator's target spec version (see [[sap-spec-version]]). ADR-0009 (as amended by ADR-0010).
_Avoid_: calculator output, computed performance, worksheet performance, SAP10 output

**SAP10 Calculation**:
The process that runs the deterministic SAP 10.2 (14-03-2025 amendment) worksheet over a Property's EpcPropertyData and emits **Calculated SAP10 Performance**. Implemented by the `Sap10Calculator` service class in `domain/sap/`. Reads cert fabric/heating/geometry fields, applies the RdSAP 10 (10-06-2025) cert→input mapping, executes the 12-month heat balance per SAP 10.2 §§1-14, looks up boiler/heat-pump performance in the **PCDB** when the cert lodges a product index, and returns a `SapResult` carrying the Calculated SAP10 Performance quantities plus a monthly breakdown and worksheet-line audit trail. Distinct from **Rebaselining**, which is ML-based. ADR-0009 originally targeted SAP 10.3 (13-01-2026); ADR-0010 retargets to SAP 10.2 (14-03-2025) until the cert corpus migrates.
_Avoid_: SAP calculation (ambiguous with the gov calculator), SAP scoring, calculator run, SAP 10.3 calculation (active target is 10.2 — see [[sap-spec-version]])

**SAP Spec Version**:
The dated revision of the SAP specification that produced a given SAP/PEUI/CO2 value. Domain-meaningful because the same EpcPropertyData yields different `sap_score` under different spec versions — fuel-price tables, CO2 factors, PCDB references, and rating-equation deflators all change between revisions. **Lodged Performance** carries the version current when the cert was lodged (mostly SAP 10.1 / SAP 10.2 pre- and post-14-03-2025 amendment in the corpus). **Calculated SAP10 Performance** is locked to SAP 10.2 (14-03-2025). A 1-to-1 Lodged-vs-Calculated comparison therefore only makes sense within a **Validation Cohort** of certs lodged on the same spec version.
_Avoid_: SAP version (ambiguous with the `sap_version` field on the cert, which only carries the major version like 10.2 — not the amendment date), spec revision

**Validation Cohort**:
The subset of corpus certs used to validate **SAP10 Calculation** against **Lodged Performance**, filtered to certs lodged after the calculator's target **SAP Spec Version** rolled out in commercial assessor software — currently `inspection_date ≥ 2025-07-01` (a buffer past 14-03-2025 to allow vendor rollout). Smaller than the full corpus but each cert is comparable under the same spec, so probe MAE is a clean signal of calculator-vs-spec correctness rather than spec-version mixture noise. ADR-0010.
_Avoid_: parity cohort, validation set, corpus sample
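
The cohort filter and the probe metric it feeds can be sketched as follows. The 2025-07-01 cutoff comes from the entry above; representing certs as dicts with `inspection_date` and `lodged_sap` keys, and the `calculate` callable, are assumptions made for illustration.

```python
from datetime import date
from typing import Callable

COHORT_START = date(2025, 7, 1)  # buffer past the 14-03-2025 amendment


def validation_cohort(certs: list[dict], cutoff: date = COHORT_START) -> list[dict]:
    """Keep only certs inspected on or after the cutoff."""
    return [c for c in certs if c["inspection_date"] >= cutoff]


def probe_mae(cohort: list[dict], calculate: Callable[[dict], float]) -> float:
    """Mean absolute error between calculated and lodged SAP over the cohort."""
    if not cohort:
        raise ValueError("empty Validation Cohort")
    errors = [abs(calculate(c) - c["lodged_sap"]) for c in cohort]
    return sum(errors) / len(errors)
```

Filtering before scoring means a cert lodged under an older spec version never contributes to the MAE, which is the point of the cohort.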

**Measure Application**:
The process that translates an Optimised Package into cert-field changes and produces the "ending state snapshot" EpcPropertyData that Plan Phase persists. Implemented by the `MeasureApplicator` service class in `domain/sap/` (or a sibling package). Each Measure Type's translation rules (e.g. `loft_insulation` → `roof_insulation_thickness_mm = 270mm`, `ashp` → `main_heating_details[0]` replacement) live here. Pure function — does not run SAP10 Calculation itself; the caller chains `MeasureApplicator.apply(epc, package) → Sap10Calculator.calculate(post_epc)`. ADR-0009.
_Avoid_: measure overrides (rejected during ADR-0009 grill — phantom mid-layer), package applier, retrofit simulator
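
A sketch of the "pure function" property, using plain dicts in place of EpcPropertyData. The two rules mirror the examples in the entry above; the `RULES` table, `apply_measures`, and the placeholder heating record are hypothetical, since the glossary does not give the real replacement payload.

```python
import copy
from typing import Callable


def _apply_loft_insulation(epc: dict) -> None:
    epc["roof_insulation_thickness_mm"] = 270


def _apply_ashp(epc: dict) -> None:
    # Placeholder record; the real replacement fields are not specified here.
    epc["main_heating_details"][0] = {"system": "ashp"}


# One translation rule per Measure Type (illustrative subset).
RULES: dict[str, Callable[[dict], None]] = {
    "loft_insulation": _apply_loft_insulation,
    "ashp": _apply_ashp,
}


def apply_measures(epc: dict, package: list[str]) -> dict:
    """Return the post-measure state without mutating the input."""
    post = copy.deepcopy(epc)
    for measure_type in package:
        RULES[measure_type](post)  # an unknown Measure Type raises KeyError
    return post
```

Because the input is deep-copied, the caller can hold the pre-measure state and pass the returned post-measure state to the calculator as a separate step.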

**EPC Energy Derivation**:
The process that derives a Property's fuel split and annual bills from its space heating kWh and hot water kWh values plus the heating fuel deduced from SAP fields. kWh values themselves come from the EPC's recorded fields (`renewable_heat_incentive.space_heating_existing_dwelling` and `.water_heating`) for SAP10 baselines, or from ML prediction when Rebaselining fires or when scoring a post-measure state. Bills are computed deterministically from delivered kWh × current Fuel Rates + standing charges + SEG credits. The UCL Correction is no longer applied at runtime — it is folded into ML training labels (see [[epc-ml-transform]] and ADR-0007).
_Avoid_: kWh prediction (kWh is now an ML target — see Rebaselining), baseline kWh, energy estimation
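
The bill arithmetic described above can be sketched as below. It assumes standing charges are quoted in pence per day and that SEG credits reduce the bill; the `FuelRate` class, the fuel keys other than `electricity_export`, and all figures in the usage example are made up for illustration, not real Fuel Rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FuelRate:
    pence_per_kwh: float
    standing_charge_pence_per_day: float = 0.0  # assumed unit


def annual_bill_gbp(
    delivered_kwh: dict[str, float],
    rates: dict[str, FuelRate],
    exported_kwh: float = 0.0,
) -> float:
    """Delivered kWh x rate + standing charges, less SEG export credit."""
    pence = 0.0
    for fuel, kwh in delivered_kwh.items():
        rate = rates[fuel]
        pence += kwh * rate.pence_per_kwh
        pence += 365 * rate.standing_charge_pence_per_day
    if exported_kwh:
        # The SEG rate sits in the same set under `electricity_export`.
        pence -= exported_kwh * rates["electricity_export"].pence_per_kwh
    return round(pence / 100, 2)


rates = {
    "mains_gas": FuelRate(6.0, 30.0),
    "electricity": FuelRate(25.0, 50.0),
    "electricity_export": FuelRate(15.0),
}
bill = annual_bill_gbp({"mains_gas": 10_000, "electricity": 2_000}, rates)
```

With these illustrative numbers the gas line is 60,000p of energy plus 10,950p of standing charge, and the electricity line is 50,000p plus 18,250p, so `bill` comes to 1392.0.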

**UCL Correction**:
The per-band linear correction (Few et al. 2023, _Energy & Buildings_ 288 113024) that aligns EPC-modelled Primary Energy Intensity with metered consumption. Folded into ML training labels at fit time (per ADR-0007) rather than applied at runtime — the trained model emits metered-equivalent PEUI directly, avoiding the discontinuities at EPC band boundaries that arose when the per-band linear correction was applied post-prediction. Calibrated against gas-heated, non-PV homes in England and Wales rated under SAP 2012; the current implementation extrapolates it to all properties (open question §15.14).
_Avoid_: UCL adjustment, energy correction, metered correction

**EPC Anomaly Flag**:
A per-field indicator that a Property's value for an EPC field differs significantly from Comparable Properties; advisory only — surfaces in the UI to prompt user review, does not block modelling.
_Avoid_: outlier, mismatch, divergence flag

### ML training

**EPC ML Transform**:
The versioned class at `packages/domain/src/domain/ml/transform.py` that maps an EpcPropertyData to a fixed-width row of features + targets. The single ML-data contract between this repo and the AutoGluon training repo. Owns the windows compression, building-parts compression, Top-N Code Taxonomy, and UCL folding decisions. Each version is tagged on the deployed scoring lambda; a mismatch is a deploy-time failure.
_Avoid_: feature builder, ML mapper, EPC vectoriser

**Feature Schema Version**:
The semver version of the EPC ML Transform (e.g. `0.1.0`), included in the parquet output path and the deployed scoring lambda's tag. MAJOR bump when columns are removed or renamed; MINOR when optional columns are added; PATCH for non-behavioural fixes.
_Avoid_: transform version, schema version (overloaded with the SAP RdSAP schema version on EPCs), model version
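
The bump rule can be sketched by diffing column sets. `next_version` is a hypothetical helper; it assumes every added column is optional and that a rename shows up as one removed plus one added name, which lands in the MAJOR branch.

```python
def next_version(current: str, old_columns: set[str], new_columns: set[str]) -> str:
    """Pick the next Feature Schema Version from a column-set diff."""
    major, minor, patch = (int(part) for part in current.split("."))
    if old_columns - new_columns:  # something was removed or renamed
        return f"{major + 1}.0.0"
    if new_columns - old_columns:  # only additions (assumed optional)
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"  # non-behavioural fix
```

A column-set diff cannot see behavioural changes to an existing column, so a real release check would need more than this.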

**Primary Energy Intensity** (**PEUI**):
A Property's total annual primary energy use per square metre of floor area (kWh/m²/yr), the SAP10 quantity recorded as `energy_consumption_current` on the EPC. Covers all end uses (heating, hot water, lighting, appliances, cooking) weighted by SAP primary energy factors per fuel. The quantity the UCL Correction aligns to metered consumption.
_Avoid_: heat demand (which colloquially means the building's space heating thermal requirement — a distinct concept), energy demand, total energy use, kWh per square metre

**PV Capacity Source**:
A flag on the EPC ML Transform feature set indicating whether a Property's PV capacity is `measured` (from `sap_energy_source.photovoltaic_supply[].peak_power`), `estimated_from_roof_area` (the `percent_roof_area` fallback used when the surveyor could not confirm array configuration), or `none` (no PV present). Lets the model weight the correct capacity signal per property.
_Avoid_: PV source, PV configuration type, solar source
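
One way to derive the flag is sketched below. The three values come from the entry above; the enum, the function signature, and the "greater than zero" tests are assumptions about how presence is detected.

```python
from enum import Enum
from typing import Optional, Sequence


class PvCapacitySource(str, Enum):
    MEASURED = "measured"
    ESTIMATED_FROM_ROOF_AREA = "estimated_from_roof_area"
    NONE = "none"


def pv_capacity_source(
    peak_powers: Sequence[float], percent_roof_area: Optional[float]
) -> PvCapacitySource:
    """Prefer measured peak power; fall back to the roof-area estimate."""
    if any(p > 0 for p in peak_powers):
        return PvCapacitySource.MEASURED
    if percent_roof_area is not None and percent_roof_area > 0:
        return PvCapacitySource.ESTIMATED_FROM_ROOF_AREA
    return PvCapacitySource.NONE
```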

**Top-N Code Taxonomy**:
The empirical top-N SAP code list (covering ~95% of mass on the training sample) committed by the EPC ML Transform for each list-aggregated categorical field (`wall_construction`, `glazing_type`, `frame_material`, etc.). Rare codes go into a per-field `_other` bucket. The taxonomy is locked at each Feature Schema Version; changes warrant a MINOR bump (adding) or MAJOR bump (removing codes).
_Avoid_: code list, code dictionary, vocab
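
A sketch of how such a taxonomy could be derived for one field: take codes in descending frequency until the coverage target is met, then map everything else to `_other`. The function names and the exact stopping rule are assumptions; the transform's real selection logic is not shown in this glossary.

```python
from collections import Counter
from typing import Iterable

OTHER = "_other"


def build_taxonomy(codes: Iterable[str], coverage: float = 0.95) -> list[str]:
    """Most frequent codes, in order, until `coverage` of the mass is kept."""
    counts = Counter(codes)
    total = sum(counts.values())
    kept: list[str] = []
    covered = 0
    for code, n in counts.most_common():
        if covered / total >= coverage:
            break
        kept.append(code)
        covered += n
    return kept


def encode(code: str, taxonomy: list[str]) -> str:
    """Map a raw code to itself if committed, else to the rare-code bucket."""
    return code if code in taxonomy else OTHER
```

Committing the returned list, rather than recomputing it at scoring time, is what lets the taxonomy stay locked for a given Feature Schema Version.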

### Reference data

**Fuel Rates**:
The current per-fuel rate (pence/kWh) and standing charge used to compute a Property's bills; time-versioned and regional, refreshed from Ofgem's published caps via an ETL. The Smart Export Guarantee rate sits in the same set as `electricity_export`. Consumed by EPC Energy Derivation.
_Avoid_: fuel prices (commodity prices, different concept), tariff, energy cost

**Carbon Factors**:
The per-fuel CO2 emission factor (kgCO2e/kWh) used to compute a Property's carbon emissions; time-versioned, refreshed from Defra's annual publication. Consumed by EPC Energy Derivation.
_Avoid_: emission factors (ambiguous), CO2 rates

### Outputs

**Scenario**:
A named portfolio-level retrofit plan, built by a user in the scenario-builder UI and persisted before any modelling fires; carries the overall goal (e.g. Increasing EPC), budget, exclusions, housing type, and an ordered list of Scenario Phases. The model is triggered against one or more Scenarios at once; each Scenario yields one Plan per Property.
_Avoid_: project, batch, run-set

**Scenario Phase**:
One ordered step inside a Scenario, carrying a measure-type allowlist (e.g. "loft insulation and walls in phase 1; ASHP in phase 2"), an optional phase budget, and an optional phase target. A single-phase Scenario is one Scenario Phase with all measure types allowed and the full budget on it — there is no special-case path.
_Avoid_: scenario stage, scenario step, tranche

**Scenario Snapshot**:
A frozen copy of a Scenario pinned at trigger time, keyed by (task, scenario); used by the modelling pipeline so mid-run edits to the live Scenario do not affect an in-flight job. Snapshots are read-only and may be garbage-collected after the task completes.
_Avoid_: scenario version, frozen scenario, pinned scenario

**Plan**:
The per-Property output of one Scenario's modelling run; carries an ordered list of Plan Phases matching the Scenario's Phase shape. A Property modelled against N Scenarios in one trigger ends up with N Plans.
_Avoid_: recommendation set, output, result

**Plan Phase**:
The per-Property output of one Scenario Phase: the Optimised Package selected for that phase, the ending state snapshot (the Property's SAP / kWh / bills after the package is applied), and any Rolled-over Options that flow as candidates into the next Plan Phase.
_Avoid_: plan stage, plan step

**Rolled-over Options**:
Recommendations generated but not selected by the Optimiser in a given Plan Phase, that remain eligible as candidates in subsequent Plan Phases. Exact roll-over rule (automatic vs user-marked) is under design.
_Avoid_: deferred measures, leftover recommendations

**Recommendation**:
A single proposed retrofit measure for a Property, with its cost, SAP impact, kWh savings, carbon savings, and parts list.
_Avoid_: suggestion, option

**Optimised Package**:
The subset of a Property's Recommendations selected by the Optimiser Service for installation, chosen to satisfy the Scenario's goal subject to budget.
_Avoid_: selected measures, default measures, optimal solution, recommended bundle

**Measure Type**:
The catalogue classification of a retrofit measure (e.g. `solar_pv`, `loft_insulation`, `ashp`); one or more Recommendations reference the same Measure Type with property-specific cost and impact.
_Avoid_: measure (ambiguous), category

### Address matching

**Lexiscore**:
A similarity score in [0, 1] between a User Address and a candidate EPC address; combines token overlap and character-level similarity.
_Avoid_: score, match score, similarity

**Lexirank**:
Dense rank of candidates sorted by Lexiscore descending; rank 1 = best match.
_Avoid_: rank, position

**UPRN Candidate**:
An EPC Search Result that is a plausible match for a given User Address, before scoring decides the winner.
_Avoid_: match candidate, result

**Score Threshold**:
The minimum Lexiscore (currently 0.6) below which no match is returned even if a candidate exists.
_Avoid_: minimum score, cutoff

**Ambiguous Match**:
A matching outcome where two or more candidates share Lexirank 1, making it impossible to select a unique winner.
_Avoid_: tie, draw, duplicate

**Best Match**:
The single UPRN Candidate with Lexirank 1 that meets or exceeds the Score Threshold.
_Avoid_: winner, top result
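
The matching terms above compose as in this sketch, which takes precomputed Lexiscores keyed by UPRN and applies dense ranking, the tie check, and the 0.6 threshold. The function names are illustrative, and treating ties as exact float equality is an assumption.

```python
from typing import Optional

SCORE_THRESHOLD = 0.6


def lexirank(scores: dict[str, float]) -> dict[str, int]:
    """Dense rank by Lexiscore descending: ties share a rank, no gaps."""
    ordered = sorted(set(scores.values()), reverse=True)
    rank_of = {score: i + 1 for i, score in enumerate(ordered)}
    return {uprn: rank_of[score] for uprn, score in scores.items()}


def best_match(
    scores: dict[str, float], threshold: float = SCORE_THRESHOLD
) -> Optional[str]:
    """Return the Best Match UPRN, or None if ambiguous or below threshold."""
    ranks = lexirank(scores)
    top = [uprn for uprn, rank in ranks.items() if rank == 1]
    if len(top) != 1:  # no candidates, or an Ambiguous Match
        return None
    uprn = top[0]
    return uprn if scores[uprn] >= threshold else None
```

Returning `None` for both the ambiguous and the below-threshold case keeps the sketch short; a caller that needs to tell them apart would return a richer result.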

### API and integration

**EPC Search Result**:
A lightweight record returned by the government domestic search endpoint — address lines, postcode, UPRN, band, and certificate number, but not full certificate data.
_Avoid_: search row, EPC row, result

**EPC Property Data**:
The fully mapped domain object produced after fetching and parsing a complete EPC certificate; the schema the modelling pipeline operates against.
_Avoid_: EPC data, certificate data, parsed EPC

**Old EPC API**:
The retired government API (`epc.opendatacommunities.org`) using HTTP Basic auth; decommissioned 30 May 2026.
_Avoid_: legacy API

**New EPC API**:
The replacement government API (`api.get-energy-performance-data.communities.gov.uk`) using Bearer Token auth.
_Avoid_: new API, current API

**Bearer Token**:
The auth credential required by the New EPC API; stored in the `EPC_AUTH_TOKEN` environment variable.
_Avoid_: API key, auth token, secret

## Relationships
- A **Property** represents a single physical dwelling for modelling; identified by `(portfolio_id, UPRN)` or `(portfolio_id, landlord_property_id)`.
- A **Property** has zero or more **EPCs** across time, exactly one **Effective EPC**, zero or one set of **Site Notes**, and zero or one set of **Landlord Overrides**.
- An **EPC** belongs to exactly one **Property** and has one **Certificate Number**.
- An **EPC** carries an **EPC Band** and a **Registration Date**; the EPC with the most recent Registration Date is the Property's current certificate.
- A **UPRN** identifies a physical dwelling permanently; it does not change when the property changes owner — but each portfolio gets its own **Property** keyed against it.
- When a **Property** has both **Site Notes** and a public **EPC**, the newer of the two is the source of the **Effective EPC**. **Landlord Overrides** apply only when the **EPC** is the source, never when **Site Notes** are.
- A Property's **Baseline Performance** holds two halves: **Lodged Performance** (the gov register's SAP / band / carbon / heat) and **Effective Performance** (what the modelling pipeline scored against). The two are equal unless **Rebaselining** fires.
- **Rebaselining** produces **Effective Performance** by ML re-prediction across SAP score, CO2 emissions, Primary Energy Intensity, space heating kWh, and hot water kWh, when either (a) the Effective EPC was lodged under a pre-SAP10 schema, or (b) the Effective EPC's physical state diverges from the lodged EPC. **Lodged Performance** is never overwritten.
- **EPC Energy Derivation** derives **fuel split** and **bills** from kWh values (sourced from the EPC's `renewable_heat_incentive` fields for baseline SAP10 properties, or from ML when Rebaselining fires), reading current **Fuel Rates** and **Carbon Factors** from their respective repos.
- The **EPC Prediction Service** uses **Comparable Properties** for both gap-filling and producing **EPC Anomaly Flags**.
- A **Scenario** carries one or more ordered **Scenario Phases**. Triggering the model against N Scenarios produces N **Plans** per Property; each Plan carries an ordered list of **Plan Phases** matching the Scenario's shape.
- Each **Plan Phase** holds its **Optimised Package**, the ending state snapshot, and any **Rolled-over Options** that flow as candidates into the next Plan Phase. A single-phase Scenario is one Scenario Phase with all measure types allowed; the same machinery handles it.
- A **Scenario Snapshot** is pinned at trigger time per (task, scenario) so mid-run edits to the live Scenario do not affect an in-flight modelling job.
- A **Recommendation** references one **Measure Type** and carries property-specific cost and impact.
- **Address Matching** uses a **User Address** and **Postcode** to find a **UPRN** by scoring **UPRN Candidates** from an EPC search. A **Lexirank** of 1 with no **Ambiguous Match** and a **Lexiscore** ≥ the **Score Threshold** produces a **Best Match**.

## Example dialogue
> **Dev:** "A landlord uploads a corrected boiler for one of their properties. What happens?"
>
> **Domain expert:** "That's a **Landlord Override** on the heating fields. Save it against the **Property**. The **Effective EPC** has changed, so **Rebaselining** runs to re-predict SAP / carbon / PEUI / space heating kWh / hot water kWh, and **EPC Energy Derivation** re-runs to update the fuel split and bills based on the new kWh values and fuel deduction. With fresh **Baseline Performance** we regenerate **Recommendations**."

> **Dev:** "What if the same Property also has Site Notes?"
>
> **Domain expert:** "**Site Notes** supersede the public **EPC**, so **Landlord Overrides** don't apply. We model from the **Site Notes** version of the **Effective EPC**. If the public **EPC** is newer than the **Site Notes**, that's the one exception — we use the newer one."

> **Dev:** "After modelling we end up with a list of measures. Which ones get installed?"
>
> **Domain expert:** "The **Optimiser Service** picks the **Optimised Package** — a subset of **Recommendations** that hits the **Scenario** goal within budget. The rest stay in the **Plan** as alternatives the user can swap in."

> **Dev:** "I'm looking at a property where the EPC says cavity walls but every other house on the street has solid. Is that a bug?"
>
> **Domain expert:** "That's an **EPC Anomaly Flag**. We compute it against the **Comparable Properties** for that postcode. It's advisory — the UI surfaces it and the landlord can apply a **Landlord Override** if it's wrong."

> **Dev:** "The property card shows two SAP scores side by side. Why?"
>
> **Domain expert:** "Those are **Lodged Performance** and **Effective Performance**. **Lodged** is what the gov register says — the EPC was rated under SAP 2012. **Effective** is what we scored against — we ran **Rebaselining** to predict the SAP10-equivalent rating because the methodology changed. Both stay on the **Baseline Performance** so users can see what's on record and what we're modelling against."

> **Dev:** "A landlord wants a 3-year retrofit plan — fabric work this year, heat pump next, solar after. How do we model that?"
>
> **Domain expert:** "Three **Scenario Phases** in one **Scenario**. Phase 1 allows fabric measures with this year's budget, phase 2 allows the heat pump with next year's budget, phase 3 allows solar. When we model, the **Optimiser Service** runs per phase against the rolling state — the heat pump is scored against the post-insulation property, not the original one. Each **Plan Phase** captures the **Optimised Package** plus the ending SAP / bills, and any **Rolled-over Options** that didn't make this phase's budget become candidates next phase."

## Flagged ambiguities
- **"property"** was historically warned against in favour of "dwelling"; that has been inverted. **Property** is now canonical for the Ara domain aggregate. Legacy code still uses "dwelling" in places — treat as alias.
- **"energy assessment"** in the existing codebase (`energy_assessment_functions`, `energy_assessments_by_uprn`) refers to what is now canonically called **Site Notes**. New code uses **Site Notes**.
- **"patch"** / `patch_epc` in the existing codebase has been merged into **Landlord Overrides**; the original concept is deprecated.
- **"already_installed measures"** in the existing codebase is likely subsumed by **Landlord Overrides** ("we have a heat pump now" → override the heating fields). Final call deferred to implementation.
- **"address"** appears as both the raw **User Address** (free-text) and a structured field on an **EPC Search Result** (normalised lines). Always qualify: "user address" vs "EPC address" or "address line 1".
- **"score"** is used for `AddressMatch.score()` output, the `lexiscore` column, and informally. Prefer **Lexiscore** in domain discussions; reserve "score" for method-level code comments.
- **"user_inputed_address"** in `backend/address2UPRN/main.py` is a misspelling and a synonym for **User Address** — the canonical term. New code should use `user_address`.
- **"EPC"** is overloaded as both the document and the rating band letter. Use **EPC** for the document, **EPC Band** for the letter.
- **"re-scoring"** has two meanings in the codebase — **Rebaselining** (re-predicting baseline performance after an EPC change) and post-optimisation measure re-prediction. Prefer **Rebaselining** for the former; for the latter, the **Optimiser Service** step does its own scoring without a special name.
- **"phase"** appears in two unrelated contexts: as cut-over timeline language in the PRD ("Phase 0 — Status quo", "Phase 1 — Forced cut-over") and as a domain concept in **Scenario Phase** / **Plan Phase**. Only the latter is a glossary term; cut-over phases are project-management vocabulary that does not enter code.
- **"stale"** appears in two senses: cache-freshness ("a Repo record is stale and the orchestrator should refetch") — a legitimate operational concept; and as loose shorthand for the EPC's recorded cost fields being unusable. The cost fields are not stale — they are pinned to the inspection-date fuel rates by design. Use "pinned to inspection date" or "pre-SAP10 schema" (whichever applies) instead.