Model/docs/adr/0004-baseline-performance-lodged-effective-pair.md
Khalim Conn-Kowlessar 457d959b1f refactor(property-baseline): rename baseline → property_baseline aggregate (PR #1139 review)
Wholesale rename of the Baseline aggregate to PropertyBaseline for clarity /
to disambiguate from baselines that appear elsewhere in Modelling. Scoped to
this aggregate only — the distinct Rebaselining term (rebaseline_reason,
StubRebaseliner, RebaselineNotImplemented) is deliberately untouched.

- domain/baseline → domain/property_baseline; BaselinePerformance →
  PropertyBaselinePerformance.
- repositories/baseline → repositories/property_baseline; BaselineRepository
  / BaselinePostgresRepository → PropertyBaseline*.
- orchestration/baseline_orchestrator.py → property_baseline_orchestrator.py;
  BaselineOrchestrator → PropertyBaselineOrchestrator. BaselineStage →
  PropertyBaselineStage.
- infrastructure/postgres: baseline_performance_table.py →
  property_baseline_performance_table.py; table `baseline_performance` →
  `property_baseline_performance`; Model renamed.
- UnitOfWork attribute `.baseline` → `.property_baseline`.
- Docs: ADR-0004 references + migration doc (renamed to
  property-baseline-performance-table.md) updated.

CONTEXT.md glossary term ("Baseline Performance") left as-is pending a
ubiquitous-language call (raised on the PR). 123 tests pass; pyright strict
clean (only the unrelated pre-existing moto import errors remain).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-01 16:28:48 +00:00


PropertyBaselinePerformance stores both lodged and effective values

A Property's current performance has two states we care about: the rating that was lodged on the government register (the "lodged" SAP / band / carbon / heat) and the rating produced by the modelling pipeline against the current Effective EPC (the "effective" values, which may have been rebaselined by ML when the EPC was pre-SAP10 or when Landlord Overrides / Site Notes changed physical state). We considered storing a single set of values (the rebaselined figures where a trigger fired, the lodged figures otherwise) and rejected that. Both are stored as a pair on every PropertyBaselinePerformance, and the two halves are equal when no rebaselining trigger fires.
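A minimal sketch of the pair, assuming a small value object per half. The class PerformanceValues, the build helper and all field types are hypothetical and not taken from the codebase; the field names follow the column names listed in the amendment below, and the sample figures are invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PerformanceValues:
    """One half of the pair (hypothetical value object; field types are assumed)."""

    sap_score: int
    epc_band: str
    co2_emissions: float
    primary_energy_intensity: float


@dataclass(frozen=True)
class PropertyBaselinePerformance:
    """Both halves are always populated, even when they are equal."""

    lodged: PerformanceValues
    effective: PerformanceValues

    @classmethod
    def build(
        cls,
        lodged: PerformanceValues,
        rebaselined: Optional[PerformanceValues] = None,
    ) -> "PropertyBaselinePerformance":
        # No rebaselining trigger fired: the effective half mirrors the lodged half.
        effective = lodged if rebaselined is None else rebaselined
        return cls(lodged=lodged, effective=effective)


# Illustrative values only.
lodged = PerformanceValues(
    sap_score=62, epc_band="D", co2_emissions=3.1, primary_energy_intensity=240.0
)
unchanged = PropertyBaselinePerformance.build(lodged)
rebaselined = PropertyBaselinePerformance.build(
    lodged, PerformanceValues(58, "D", 3.4, 255.0)
)
```

A consumer reads unchanged.lodged and unchanged.effective the same way in both cases, so it never needs to branch on whether rebaselining happened.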

The pair lets the FE show "this is what the gov register says vs this is the SAP10-equivalent we modelled against" side by side without a second query, and keeps the audit trail clean: a user looking at a property's plan can see exactly which figure drove the recommendation pipeline. Storing only one set forces a downstream consumer to recompute the missing one from raw EPC fields when it needs both, which is the kind of derivation creep we want to keep out of the FE.

The cost is a wider row plus the discipline that every PropertyBaselinePerformance populates both halves, even when they're equal. Annual kWh, fuel split and bills are not paired: they are always derived deterministically by EpcEnergyDerivationService against the Effective state, because the EPC's recorded cost fields use fuel rates pinned to the inspection date and the UCL correction depends on the modelled band.

Consequences

  • Reversing this means rewriting every consumer that has learned to read both values. Hard to roll back once the FE depends on the pair.
  • The rebaseline trigger has two possible reasons, pre_sap10 and physical_state_changed, which can fire separately or together. Store the reason alongside the pair so we know why a property was rebaselined when debugging.
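One way the stored reason could be derived from the two triggers is sketched below. The enum, the function name derive_rebaseline_reason and the choice to encode the combined case as a single "both" value are assumptions for illustration; the ADR fixes only the two reason names.

```python
from enum import Enum
from typing import Optional


class RebaselineReason(str, Enum):
    """Hypothetical encoding; only the two trigger names come from the ADR."""

    PRE_SAP10 = "pre_sap10"
    PHYSICAL_STATE_CHANGED = "physical_state_changed"
    BOTH = "both"  # assumed representation of both triggers firing together


def derive_rebaseline_reason(
    pre_sap10: bool, physical_state_changed: bool
) -> Optional[RebaselineReason]:
    """Return the reason to store, or None when no trigger fired."""
    if pre_sap10 and physical_state_changed:
        return RebaselineReason.BOTH
    if pre_sap10:
        return RebaselineReason.PRE_SAP10
    if physical_state_changed:
        return RebaselineReason.PHYSICAL_STATE_CHANGED
    return None
```

A None reason corresponds to the case where the lodged and effective halves are equal because nothing triggered a rebaseline.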

Amendment (2026-05-30, #1135): standalone property_baseline_performance table

The original consequence read "property_details_epc (or its successor) carries 8 fields instead of 4 for the SAP-equivalent block" — i.e. the pair as columns on the EPC-details table. That is superseded. property_details_epc is being retired: it is too tightly coupled to the schema of the legacy EPC API, which the Ara rebuild is moving off. So the pair has no home there.

PropertyBaselinePerformance instead persists as its own standalone property_baseline_performance table, one row per Property, behind a dedicated PropertyBaselineRepository port (save / get_for_property), mirroring the EPC slice's repo shape. This is the cleaner model regardless of the retirement: PropertyBaselinePerformance is its own aggregate (a Property's current performance), not a detail of any single EPC.
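The port shape described above can be sketched as a typing.Protocol with an in-memory stand-in. The method names save and get_for_property come from this ADR; the signatures, the trimmed-down aggregate, the string property_id and the InMemoryPropertyBaselineRepository class are assumptions made so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass(frozen=True)
class PropertyBaselinePerformance:
    """Trimmed to two fields for the sketch; the real aggregate carries more."""

    property_id: str  # assumed key type
    lodged_sap_score: int
    effective_sap_score: int


class PropertyBaselineRepository(Protocol):
    """Port: one PropertyBaselinePerformance per Property."""

    def save(self, performance: PropertyBaselinePerformance) -> None: ...

    def get_for_property(
        self, property_id: str
    ) -> Optional[PropertyBaselinePerformance]: ...


class InMemoryPropertyBaselineRepository:
    """Hypothetical test double that satisfies the port structurally."""

    def __init__(self) -> None:
        self._rows: Dict[str, PropertyBaselinePerformance] = {}

    def save(self, performance: PropertyBaselinePerformance) -> None:
        # Keyed by property, so a second save replaces the first (one row per Property).
        self._rows[performance.property_id] = performance

    def get_for_property(
        self, property_id: str
    ) -> Optional[PropertyBaselinePerformance]:
        return self._rows.get(property_id)


repo: PropertyBaselineRepository = InMemoryPropertyBaselineRepository()
repo.save(PropertyBaselinePerformance("prop-1", 62, 62))
repo.save(PropertyBaselinePerformance("prop-1", 62, 58))
```

Keying the store by property is what enforces "one row per Property" in this sketch; a Postgres implementation would need an equivalent uniqueness guarantee.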

The row uses flat typed columns, not a JSONB blob, because the FE both surfaces the block and queries the lodged-vs-effective pair. The columns are lodged_{sap_score, epc_band, co2_emissions, primary_energy_intensity}, the four effective_* mirrors, rebaseline_reason, and (for the part of the energy block that needs no derivation) space_heating_kwh / water_heating_kwh. The fourth paired quantity is Primary Energy Intensity, not "heat demand"; see CONTEXT.md (the prose above predates that term being sharpened).
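To illustrate why flat columns make the pair directly queryable, here is a sketch using the standard-library sqlite3 module as a stand-in for Postgres. The column names are the ones listed above; the property_id key, the column types, the nullability and the sample rows are assumptions, and the real table is defined by the SQLModel row and the FE-owned migration, not by this DDL.

```python
import sqlite3
from typing import List

# Assumed types and constraints; only the column names come from the ADR.
DDL = """
CREATE TABLE property_baseline_performance (
    property_id TEXT PRIMARY KEY,
    lodged_sap_score INTEGER NOT NULL,
    lodged_epc_band TEXT NOT NULL,
    lodged_co2_emissions REAL NOT NULL,
    lodged_primary_energy_intensity REAL NOT NULL,
    effective_sap_score INTEGER NOT NULL,
    effective_epc_band TEXT NOT NULL,
    effective_co2_emissions REAL NOT NULL,
    effective_primary_energy_intensity REAL NOT NULL,
    rebaseline_reason TEXT,
    space_heating_kwh REAL,
    water_heating_kwh REAL
)
"""


def properties_with_changed_band(conn: sqlite3.Connection) -> List[str]:
    """Properties whose effective band differs from the lodged band."""
    rows = conn.execute(
        "SELECT property_id FROM property_baseline_performance "
        "WHERE lodged_epc_band <> effective_epc_band "
        "ORDER BY property_id"
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(DDL)
# Invented sample rows: prop-1 was not rebaselined, prop-2 was.
conn.executemany(
    "INSERT INTO property_baseline_performance VALUES (?,?,?,?,?,?,?,?,?,?,?,?)",
    [
        ("prop-1", 62, "D", 3.1, 240.0, 62, "D", 3.1, 240.0, None, 8000.0, 2000.0),
        ("prop-2", 70, "C", 2.4, 190.0, 66, "D", 2.7, 205.0, "pre_sap10", 7000.0, 1800.0),
    ],
)
```

With a JSONB blob the same comparison would need JSON extraction on both sides of the predicate; with typed columns it is a plain column comparison that the database can type-check and index.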

Fuel split and bills — the rest of the EPC Energy Derivation block — are deferred to a follow-up: bills require a current Fuel Rates source (Ofgem-cap ETL) that does not yet exist, and fuel split is produced by the same EpcEnergyDerivationService, so the two land together rather than churning the table twice.

The SQLModel row is defined in infrastructure/postgres/ so the ephemeral-Postgres tests build it via create_all; the production migration is FE-owned (Drizzle ORM) and tracked in docs/migrations/.