feat(baseline): run Sap10Calculator in shadow on Property Baseline (ADR-0013)

Wire Sap10Calculator into PropertyBaselineOrchestrator as a non-load-bearing
shadow runner. For each property it scores the Effective EPC beside the
load-bearing Lodged/Effective write, catches any strict-raise and logs it at
error (never aborting the batch), and on success logs a warning when the
result diverges from Lodged: |SAP continuous - lodged| > 0.5, or PEUI / CO2
> 1% relative (CO2 after kg -> tonnes).
Every line is tagged with sap_version so SAP-10.2 signal separates from
older-spec drift (ADR-0010 Validation Cohort).

Per ADR-0013, Calculated SAP10 Performance is not a persisted third value-set:
effective = calculated in every baselining scenario, so the calculator IS the
mechanism that produces Effective Performance (the Rebaseliner). It runs in
shadow only while being hardened; when overrides/estimation land it is promoted
to drive Effective and the failure posture flips to abort (ADR-0012, calculator
now load-bearing). No table change.

- ADR-0013 + CONTEXT (Calculated SAP10 Performance / Effective Performance /
  Rebaselining) record the decision.
- CalculatorShadow port + LoggingCalculatorShadow + Calculator protocol.
- FakeCalculatorShadow for orchestrator unit tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Khalim Conn-Kowlessar 2026-06-02 08:01:47 +00:00
parent ce33cd94ef
commit 561e1b8b49
9 changed files with 473 additions and 7 deletions


@ -82,7 +82,7 @@ The EpcPropertyData scored by the modelling pipeline for a single Property, deri
_Avoid_: modelling EPC, working EPC, resolved EPC, derived EPC
**Rebaselining**:
Re-predicting a Property's SAP score, CO2 emissions, Primary Energy Intensity, space heating kWh, and hot water kWh via ML so the modelling pipeline scores it against the current SAP10 methodology. Triggered when either (a) the Effective EPC was lodged under a pre-SAP10 schema (`sap_version < 10.0`), so the recorded scores reflect a superseded methodology, or (b) Site Notes / Landlord Overrides changed the physical state of the Property (walls / heating / windows / etc.) so the lodged scores no longer reflect what's installed. Both triggers may fire together. Produces Effective Performance; Lodged Performance is preserved unchanged. kWh is included as ML targets per ADR-0007 — see [[epc-ml-transform]].
Re-predicting a Property's SAP score, CO2 emissions, Primary Energy Intensity, space heating kWh, and hot water kWh via **SAP10 Calculation** (the deterministic `Sap10Calculator`, which superseded the old ML-API rebaseliner; an ML residual head over the calculator is future — ADR-0009/0013) so the modelling pipeline scores it against the current SAP10 methodology. Triggered when either (a) the Effective EPC was lodged under a pre-SAP10 schema (`sap_version < 10.0`), so the recorded scores reflect a superseded methodology, or (b) Site Notes / Landlord Overrides changed the physical state of the Property (walls / heating / windows / etc.) so the lodged scores no longer reflect what's installed. Both triggers may fire together. Produces Effective Performance; Lodged Performance is preserved unchanged. kWh is included as ML targets per ADR-0007 — see [[epc-ml-transform]].
_Avoid_: re-scoring, re-prediction, performance recomputation, refresh (for cache-freshness)
**Baseline Performance**:
@ -94,12 +94,12 @@ The SAP / EPC Band / carbon emissions / Primary Energy Intensity recorded on the
_Avoid_: original performance, raw EPC values, recorded baseline
**Effective Performance**:
The SAP / EPC Band / carbon emissions / Primary Energy Intensity the modelling pipeline actually scored against — equal to Lodged Performance when no Rebaselining trigger fires, replaced by ML output when triggered. The half of Baseline Performance that says "what we modelled".
The SAP / EPC Band / carbon emissions / Primary Energy Intensity the modelling pipeline actually scored against — equal to Lodged Performance when no Rebaselining trigger fires, replaced by **SAP10 Calculation** output (the deterministic `Sap10Calculator`, which superseded the old ML-API rebaseliner; an ML residual head over the calculator is future — ADR-0009/0013) when triggered. The half of Baseline Performance that says "what we modelled".
_Avoid_: modelled performance, rebaselined performance (only correct when rebaselining ran), scored values
**Calculated SAP10 Performance**:
The SAP score, EPC Band, CO2 emissions, Primary Energy Intensity, space heating kWh, and hot water kWh produced by **SAP10 Calculation** from a Property's EpcPropertyData. Distinct from Effective Performance (ML output) and Lodged Performance (gov register) during the validation phase. Surfaced alongside Effective Performance in the UI; may supersede Effective Performance in a later ADR once parity is confirmed against the cert-reported SAP across ≥1000 sample certs lodged on the calculator's target spec version (see [[sap-spec-version]]). ADR-0009 (as amended by ADR-0010).
_Avoid_: calculator output, computed performance, worksheet performance, SAP10 output
The SAP score, EPC Band, CO2 emissions, Primary Energy Intensity, space heating kWh, and hot water kWh produced by **SAP10 Calculation** from a Property's EpcPropertyData. It is **not** a separately-persisted third value-set beside Lodged and Effective: in every baselining scenario the calculator's output *is* the **Effective Performance** (real lodged SAP10 EPC with no overrides ⇒ Calculated = Lodged = Effective; overrides or an estimated / pre-SAP10 EPC ⇒ Calculated = Effective, there being no lodged SAP10 figure to compare against). The calculator is therefore the mechanism that produces Effective Performance, having superseded the old ML-API rebaseliner. While it is being hardened it runs in **shadow** for the first baselining slice — computed on every Property, compared to Lodged, and any divergence (SAP > 0.5, or PEUI / CO2 beyond tolerance) or strict-raise **logged, not persisted** — then is promoted to drive Effective Performance once overrides / estimation land (ADR-0013). The ≥1000-cert parity confirmation against the cert-reported SAP (see [[sap-spec-version]]) gates that promotion. ADR-0009 introduced the term, as amended by ADR-0010 and realized by ADR-0013.
_Avoid_: calculator output, computed performance, worksheet performance, SAP10 output, calculated value-set (it is not a stored third set)
**SAP10 Calculation**:
The process that runs the deterministic SAP 10.2 (14-03-2025 amendment) worksheet over a Property's EpcPropertyData and emits **Calculated SAP10 Performance**. Implemented by the `Sap10Calculator` service class in `domain/sap10_calculator/` (`calculator.py`). Reads cert fabric/heating/geometry fields, applies the RdSAP 10 (10-06-2025) cert→input mapping, executes the 12-month heat balance per SAP 10.2 §§1-14, looks up boiler/heat-pump performance in the **PCDB** when the cert lodges a product index, and returns a `SapResult` carrying the five Calculated SAP10 Performance quantities plus a monthly breakdown and worksheet-line audit trail. Distinct from **Rebaselining**, which is ML-based. ADR-0009 originally targeted SAP 10.3 (13-01-2026); ADR-0010 retargets to SAP 10.2 (14-03-2025) until the cert corpus migrates.


@ -10,7 +10,9 @@ from sqlmodel import Session
from applications.ara_first_run.ara_first_run_trigger_body import (
AraFirstRunTriggerBody,
)
from domain.property_baseline.calculator_shadow import LoggingCalculatorShadow
from domain.property_baseline.rebaseliner import StubRebaseliner
from domain.sap10_calculator.calculator import Sap10Calculator
from infrastructure.postgres.config import PostgresConfig
from infrastructure.postgres.engine import make_engine
from orchestration.property_baseline_orchestrator import PropertyBaselineOrchestrator
@ -81,6 +83,9 @@ def build_first_run_pipeline(
baseline=PropertyBaselineOrchestrator(
unit_of_work=unit_of_work,
rebaseliner=StubRebaseliner(),
# Shadow only: validates the calculator over the wild cohort without
# gating the load-bearing baseline write (ADR-0013).
calculator_shadow=LoggingCalculatorShadow(Sap10Calculator()),
),
modelling=ModellingOrchestrator(
scenario_repo=ScenarioRepository(),


@ -0,0 +1,88 @@
---
Status: accepted
---
# The `Sap10Calculator` produces Effective Performance (it is the Rebaseliner); Calculated SAP10 Performance is not a persisted third value-set, and is wired in shadow first
Refines [ADR-0004](0004-baseline-performance-lodged-effective-pair.md) (the Lodged/Effective
pair), [ADR-0009](0009-deterministic-sap-calculator.md)/[ADR-0010](0010-sap10-calculator-spec-target-and-validation.md)
(the calculator + the **Calculated SAP10 Performance** term), [ADR-0011](0011-composable-stage-orchestrators.md)
(the `Rebaseliner` seam) and [ADR-0012](0012-unit-of-work-per-stage-batch-transaction.md)
(all-or-nothing per batch). Decided in a `/grill-with-docs` session (2026-06-01) before wiring
`Sap10Calculator` into `PropertyBaselineOrchestrator`.
## Context
The old `model_engine` (`backend/engine/engine.py`) called out to an **ML API**
(`model_api.predict_all` over `BASELINE_MODEL_PREFIXES`) to rebaseline the properties that needed
it. The rebuild replaces that round-trip with the **deterministic `Sap10Calculator`, run live**.
The handover and CONTEXT (line 100) framed **Calculated SAP10 Performance** as a *third* value-set
persisted *alongside* Lodged and Effective (`calculated_*` columns). Walking the baselining
scenarios shows that framing reifies a distinction that does not exist in the domain:
- real lodged SAP10 EPC, no overrides ⇒ Calculated = Lodged = Effective;
- real EPC + property/landlord overrides ⇒ Calculated = Lodged-plus-overrides = Effective;
- estimated EPC (± overrides), or a pre-SAP10 EPC ⇒ Calculated = Effective (no lodged SAP10 to
compare against — Lodged Performance exists only for a *real lodged* EPC).
In every scenario **Effective = Calculated**. There is no third quantity.
## Decision
**The calculator is the mechanism that produces Effective Performance** — i.e. the deterministic
`Rebaseliner` (ADR-0011's seam), superseding the old ML-API rebaseliner. "Calculated SAP10
Performance" is the *name of that output during validation*, **not** a separately-persisted third
value-set. No `calculated_*` columns are added; `property_baseline_performance` keeps its
Lodged/Effective shape (ADR-0004). The ADR-0009 ML model is repositioned as a *future residual head*
over the calculator, not the baseline producer.
**Shadow-first, then promotion.** The calculator still strict-raises (`UnmappedSapCode`,
`MissingMainFuelType`, `UnresolvedPcdbCombiLoss`) on cert mappings it has not yet hardened, and the
strict-typing of `EpcPropertyData` that will close most of those gaps is still pending. A ~40,000
property test cohort is about to flow through baselining. So this lands in two steps:
1. **This slice — shadow.** Performance is still **defined by the input data**: `StubRebaseliner`
keeps producing Effective (`= Lodged` for the only live scenario, real SAP10 + no overrides).
The calculator runs *beside* it, on every Property's Effective EPC, **purely to be battle-tested
in the wild**. It is **not load-bearing**, therefore:
- a calculator raise is **caught and logged at `error`, never aborts the batch** — otherwise one
unmappable cert would lose the load-bearing Lodged/Effective write for the whole batch, and
over a 40k run most batches would never baseline;
- on success, its output is **compared to Lodged and logged, not persisted**: at `warning` when
`|sap_continuous - lodged_sap| > 0.5`, or PEUI / CO2 diverge beyond tolerance (CO2 after the
kg→tonnes conversion). Each log is tagged with the cert's `sap_version` so SAP-10.2 divergence
(a real calculator signal) is separable from older-spec drift (expected — see
[ADR-0010](0010-sap10-calculator-spec-target-and-validation.md) Validation Cohort).
2. **Next slice or two — load-bearing.** When overrides + EPC estimation land (days away),
`StubRebaseliner` is replaced by a calculator-backed `Rebaseliner`: the calculator's output
**becomes Effective Performance**. The failure posture **flips to abort** per ADR-0012 — now that
the calculator *is* the baseline, a silent wrong answer is the expensive outcome, so a raise must
fail the batch noisily. Same exception, opposite handling, because the calculator went from
shadow to load-bearing. The shadow logging is then retired.
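
As a rough illustration of the step-1 comparison, the sketch below applies the two kinds of tolerance to made-up numbers. The 0.5-point and 1% thresholds are this slice's starting dials; `LodgedFigures` and `diverging_quantities` are hypothetical names for this sketch, not the shipped `LoggingCalculatorShadow`.

```python
from dataclasses import dataclass

SAP_ABS_TOL = 0.5      # absolute SAP points (starting dial)
REL_TOL = 0.01         # 1% relative, used for PEUI and CO2 (starting dial)
KG_PER_TONNE = 1000.0


@dataclass(frozen=True)
class LodgedFigures:
    """Hypothetical stand-in for Lodged Performance."""
    sap_score: int
    primary_energy_intensity: float  # kWh/m2/yr
    co2_emissions: float             # tonnes/yr


def relative_diff(calculated: float, lodged: float) -> float:
    # A zero lodged value diverges only if the calculated value is non-zero.
    if lodged == 0:
        return 0.0 if calculated == 0 else float("inf")
    return abs(calculated - lodged) / abs(lodged)


def diverging_quantities(
    lodged: LodgedFigures,
    sap_continuous: float,
    peui: float,
    co2_kg_per_yr: float,
) -> list[str]:
    """Name each quantity whose calculated value is outside tolerance."""
    diverged = []
    if abs(sap_continuous - lodged.sap_score) > SAP_ABS_TOL:
        diverged.append("sap_score")
    if relative_diff(peui, lodged.primary_energy_intensity) > REL_TOL:
        diverged.append("primary_energy_intensity")
    # The calculator emits kg/yr; Lodged CO2 is tonnes/yr, so convert first.
    if relative_diff(co2_kg_per_yr / KG_PER_TONNE, lodged.co2_emissions) > REL_TOL:
        diverged.append("co2_emissions")
    return diverged
```

With lodged SAP 72, PEUI 180 and 1.8 t CO2, a calculated 72.4 / 181 / 2000 kg flags only CO2: 2000 kg is 2.0 t, about 11% above 1.8 t, while SAP (0.4 out) and PEUI (about 0.6% out) stay inside tolerance.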
## Considered options
- **A third persisted `calculated_*` value-set on `PropertyBaselinePerformance`** (the handover's
recommendation) — rejected: `Effective = Calculated` in every scenario, so the columns would
store a distinction with no domain reality, and the future "supersede effective" promotion would
be a data move instead of nothing.
- **Promote the calculator to drive Effective immediately** — rejected for this one slice: it still
strict-raises on un-hardened mappings, so over the imminent 40k run it would gate the
load-bearing baseline write. Shadow-first surfaces every gap as an aggregatable error log without
blocking baselining.
- **A separate `calculator_shadow` validation table** — held in reserve: log-only is enough while
the calculator is moving and the shadow step is a 1-2 day stepping stone; we add a queryable table
only if log aggregation proves too weak.
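
To make "aggregatable error log" concrete, here is a minimal sketch of tallying shadow failures by `sap_version` and exception type. It assumes the error message shape used by the logging shadow in this slice (`... property_id=<id> sap_version=<version>: <repr of the exception>`) and that the exception keeps Python's default `repr`; `count_shadow_failures` and the sample lines in the usage note are invented for illustration.

```python
import re
from collections import Counter
from typing import Iterable

# Assumed message shape:
#   "SAP10 shadow calculation failed for property_id=42 sap_version=10.2: UnmappedSapCode(...)"
_FAILED = re.compile(
    r"SAP10 shadow calculation failed for property_id=(?P<property_id>\S+) "
    r"sap_version=(?P<sap_version>[^:\s]+): (?P<exc_type>\w+)\("
)


def count_shadow_failures(lines: Iterable[str]) -> Counter:
    """Count failure lines per (sap_version, exception type); skip other lines."""
    counts: Counter = Counter()
    for line in lines:
        match = _FAILED.search(line)
        if match:
            counts[(match["sap_version"], match["exc_type"])] += 1
    return counts
```

Fed two `UnmappedSapCode` lines tagged `sap_version=10.2` and one `MissingMainFuelType` line tagged with an older version, this returns two buckets, which is the separation of SAP-10.2 signal from older-spec drift that the tagging is meant to allow. If that kind of grouping stops being enough, the reserved table becomes the fallback.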
## Consequences
- `property_baseline_performance` is **unchanged** this slice — no migration.
- CONTEXT **Calculated SAP10 Performance**, **Effective Performance**, and **Rebaselining** are
updated: the calculator (not ML) is the rebaseliner mechanism in the rebuilt engine; Calculated is
not a stored third set.
- The shadow runner's broad `except` is deliberate (the point is to discover *what* breaks in the
wild); each caught exception is logged with its type and `property_id`.
- This decision is short-lived in its shadow form by design; the durable half — "the calculator
produces Effective Performance; there is no third value-set" — outlives it.
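
The Decision's "same exception, opposite handling" can be sketched in a few lines. This is an illustration under stated assumptions, not the repository's code: `UnmappedSapCode` here is a local stand-in class, and `calculate`, `observe_in_shadow` and `rebaseline_load_bearing` are hypothetical names.

```python
import logging

logger = logging.getLogger("shadow_sketch")


class UnmappedSapCode(Exception):
    """Local stand-in for the calculator's strict-raise (hypothetical here)."""


def calculate(cert: dict) -> float:
    """Toy calculator: strict-raises on a code it cannot map."""
    if cert.get("heat_emitter_type") == 99:
        raise UnmappedSapCode("heat_emitter_type", 99)
    return 72.0


def observe_in_shadow(cert: dict) -> None:
    """Shadow posture: a raise is logged and swallowed, so the batch continues."""
    try:
        calculate(cert)
    except Exception as exc:  # broad on purpose: discover what breaks
        logger.error("shadow calculation failed: %r", exc)


def rebaseline_load_bearing(cert: dict) -> float:
    """Load-bearing posture: the raise propagates and the caller's batch aborts."""
    return calculate(cert)
```

The calculator and its exception are identical in both functions; only the caller's handling differs, which is why promotion changes the failure posture without changing the calculator.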


@ -0,0 +1,141 @@
from __future__ import annotations

import logging
from abc import ABC, abstractmethod
from typing import TYPE_CHECKING, Optional, Protocol

from domain.property_baseline.performance import Performance

if TYPE_CHECKING:
    from datatypes.epc.domain.epc_property_data import EpcPropertyData
    from domain.sap10_calculator.calculator import SapResult

logger = logging.getLogger(__name__)

# A continuous SAP this far from the lodged integer would round to a different
# band-driving score; PEUI / CO2 scale with dwelling size so they use a relative
# tolerance (ADR-0013). Starting dials — tune against the wild-cohort logs.
_SAP_ABS_TOL = 0.5
_REL_TOL = 0.01
_KG_PER_TONNE = 1000.0


class CalculatorShadow(ABC):
    """Runs SAP10 Calculation in shadow beside the load-bearing baseline write
    and reports divergence from Lodged Performance (ADR-0013).

    The calculator is not yet load-bearing: it is still being hardened, and a
    large test cohort is about to flow through baselining. So an implementation
    **must never raise**: a shadow failure may not abort the batch (ADR-0012's
    all-or-nothing governs only the load-bearing Lodged/Effective write). It
    observes, compares against Lodged, and logs; it does not feed Effective
    Performance. The seam is retired when the calculator is promoted to the
    Rebaseliner and its output *becomes* Effective Performance.
    """

    @abstractmethod
    def observe(
        self,
        *,
        property_id: int,
        effective_epc: "EpcPropertyData",
        lodged: Performance,
    ) -> None: ...


def _relative_diff(calculated: float, lodged: float) -> float:
    """|calculated - lodged| / |lodged|; a zero lodged value diverges iff
    calculated is non-zero (avoids a divide-by-zero on degenerate certs)."""
    if lodged == 0:
        return 0.0 if calculated == 0 else float("inf")
    return abs(calculated - lodged) / abs(lodged)


class Calculator(Protocol):
    """The slice of `Sap10Calculator` the shadow needs: cert in, result out.
    `Sap10Calculator` satisfies it structurally, with no coupling to its module."""

    def calculate(self, epc: "EpcPropertyData") -> "SapResult": ...


class LoggingCalculatorShadow(CalculatorShadow):
    """Runs the calculator and logs, never persists, never raises (ADR-0013).

    A strict-raise (an un-mapped cert) is caught and logged at ``error`` so the
    wild-cohort gap is greppable; a successful result whose SAP / PEUI / CO2
    diverges from Lodged beyond tolerance is logged at ``warning``. Every line
    is tagged with ``property_id`` and the cert's ``sap_version`` so SAP-10.2
    divergence (a real calculator signal) is separable from older-spec drift.
    """

    def __init__(self, calculator: Calculator) -> None:
        self._calculator = calculator

    def observe(
        self,
        *,
        property_id: int,
        effective_epc: "EpcPropertyData",
        lodged: Performance,
    ) -> None:
        sap_version = effective_epc.sap_version
        try:
            # Broad by design: the point is to discover *what* breaks in the
            # wild, and a shadow failure must never abort the batch (ADR-0013).
            result = self._calculator.calculate(effective_epc)
        except Exception as exc:
            logger.error(
                "SAP10 shadow calculation failed for property_id=%s "
                "sap_version=%s: %r",
                property_id,
                sap_version,
                exc,
            )
            return

        if abs(result.sap_score_continuous - lodged.sap_score) > _SAP_ABS_TOL:
            self._warn_divergence(
                quantity="sap_score",
                property_id=property_id,
                sap_version=sap_version,
                lodged=lodged.sap_score,
                calculated=result.sap_score_continuous,
            )
        if _relative_diff(
            result.primary_energy_kwh_per_m2, lodged.primary_energy_intensity
        ) > _REL_TOL:
            self._warn_divergence(
                quantity="primary_energy_intensity",
                property_id=property_id,
                sap_version=sap_version,
                lodged=lodged.primary_energy_intensity,
                calculated=result.primary_energy_kwh_per_m2,
            )
        # Lodged CO2 is tonnes/yr; the calculator emits kg/yr (ADR-0013).
        calculated_co2_t = result.co2_kg_per_yr / _KG_PER_TONNE
        if _relative_diff(calculated_co2_t, lodged.co2_emissions) > _REL_TOL:
            self._warn_divergence(
                quantity="co2_emissions",
                property_id=property_id,
                sap_version=sap_version,
                lodged=lodged.co2_emissions,
                calculated=calculated_co2_t,
            )

    def _warn_divergence(
        self,
        *,
        quantity: str,
        property_id: int,
        sap_version: Optional[float],
        lodged: float,
        calculated: float,
    ) -> None:
        logger.warning(
            "SAP10 shadow divergence on %s for property_id=%s sap_version=%s: "
            "lodged=%s calculated=%s",
            quantity,
            property_id,
            sap_version,
            lodged,
            calculated,
        )


@ -6,6 +6,7 @@ from datatypes.epc.domain.epc_property_data import (
EpcPropertyData,
RenewableHeatIncentive,
)
from domain.property_baseline.calculator_shadow import CalculatorShadow
from domain.property_baseline.property_baseline_performance import PropertyBaselinePerformance
from domain.property_baseline.performance import lodged_performance
from domain.property_baseline.rebaseliner import Rebaseliner
@ -32,9 +33,11 @@ class PropertyBaselineOrchestrator:
*,
unit_of_work: Callable[[], UnitOfWork],
rebaseliner: Rebaseliner,
calculator_shadow: CalculatorShadow,
) -> None:
self._unit_of_work = unit_of_work
self._rebaseliner = rebaseliner
self._calculator_shadow = calculator_shadow
def run(self, property_ids: list[int]) -> None:
with self._unit_of_work() as uow:
@ -54,6 +57,14 @@ class PropertyBaselineOrchestrator:
water_heating_kwh=rhi.water_heating_kwh,
)
uow.property_baseline.save(baseline, property_id)
# Shadow only: validate the calculator in the wild without
# gating the load-bearing write above (ADR-0013). `observe`
# never raises, so it cannot abort the batch.
self._calculator_shadow.observe(
property_id=property_id,
effective_epc=effective_epc,
lodged=lodged,
)
uow.commit()


@ -0,0 +1,166 @@
from __future__ import annotations
import logging
from typing import Optional
import pytest
from datatypes.epc.domain.epc import Epc
from datatypes.epc.domain.epc_property_data import EpcPropertyData
from domain.property_baseline.calculator_shadow import LoggingCalculatorShadow
from domain.property_baseline.performance import Performance
from domain.sap10_calculator.calculator import SapResult
from domain.sap10_calculator.exceptions import UnmappedSapCode
def _epc(*, sap_version: Optional[float]) -> EpcPropertyData:
epc = object.__new__(EpcPropertyData)
epc.sap_version = sap_version
return epc
def _lodged() -> Performance:
return Performance(
sap_score=72, epc_band=Epc.C, co2_emissions=1.8, primary_energy_intensity=180
)
def _sap_result(
*,
sap_score_continuous: float = 72.0,
primary_energy_kwh_per_m2: float = 180.0,
co2_kg_per_yr: float = 1800.0,
) -> SapResult:
"""A `SapResult` whose three compared quantities default to *matching*
`_lodged()`; each test perturbs one axis."""
return SapResult(
sap_score=round(sap_score_continuous),
sap_score_continuous=sap_score_continuous,
ecf=0.0,
total_fuel_cost_gbp=0.0,
co2_kg_per_yr=co2_kg_per_yr,
space_heating_kwh_per_yr=0.0,
space_cooling_kwh_per_yr=0.0,
fabric_energy_efficiency_kwh_per_m2_yr=0.0,
main_heating_fuel_kwh_per_yr=0.0,
main_2_heating_fuel_kwh_per_yr=0.0,
secondary_heating_fuel_kwh_per_yr=0.0,
space_cooling_fuel_kwh_per_yr=0.0,
hot_water_kwh_per_yr=0.0,
pumps_fans_kwh_per_yr=0.0,
lighting_kwh_per_yr=0.0,
primary_energy_kwh_per_yr=0.0,
primary_energy_kwh_per_m2=primary_energy_kwh_per_m2,
monthly=(),
intermediate={},
)
class _RaisingCalculator:
def calculate(self, epc: EpcPropertyData) -> SapResult:
raise UnmappedSapCode("heat_emitter_type", 99)
class _StubCalculator:
def __init__(self, result: SapResult) -> None:
self._result = result
def calculate(self, epc: EpcPropertyData) -> SapResult:
return self._result
def test_observe_swallows_a_calculator_raise_and_logs_error(
caplog: pytest.LogCaptureFixture,
) -> None:
# Arrange — the calculator strict-raises on a cert it cannot yet map.
shadow = LoggingCalculatorShadow(_RaisingCalculator())
epc = _epc(sap_version=10.2)
# Act — observe must not propagate the raise (ADR-0013: shadow is not
# load-bearing, so it cannot abort the batch).
with caplog.at_level(logging.ERROR):
shadow.observe(property_id=42, effective_epc=epc, lodged=_lodged())
# Assert — exactly one error record, tagged with property_id + sap_version
# and carrying the exception so the wild-cohort gap is greppable.
assert len(caplog.records) == 1
message = caplog.records[0].getMessage()
assert caplog.records[0].levelno == logging.ERROR
assert "property_id=42" in message
assert "sap_version=10.2" in message
assert "heat_emitter_type" in message
def test_observe_warns_when_sap_diverges_beyond_half_a_point(
caplog: pytest.LogCaptureFixture,
) -> None:
# Arrange — calculated SAP 75.0 vs lodged 72 is 3.0 out (> 0.5).
shadow = LoggingCalculatorShadow(
_StubCalculator(_sap_result(sap_score_continuous=75.0))
)
epc = _epc(sap_version=10.2)
# Act
with caplog.at_level(logging.WARNING):
shadow.observe(property_id=42, effective_epc=epc, lodged=_lodged())
# Assert — one warning, naming the diverging quantity + the tags.
assert len(caplog.records) == 1
message = caplog.records[0].getMessage()
assert caplog.records[0].levelno == logging.WARNING
assert "sap_score" in message
assert "property_id=42" in message
assert "sap_version=10.2" in message
def test_observe_warns_when_peui_diverges_beyond_one_percent(
caplog: pytest.LogCaptureFixture,
) -> None:
# Arrange — calculated PEUI 200 vs lodged 180 is ~11% out (> 1%).
shadow = LoggingCalculatorShadow(
_StubCalculator(_sap_result(primary_energy_kwh_per_m2=200.0))
)
epc = _epc(sap_version=10.2)
# Act
with caplog.at_level(logging.WARNING):
shadow.observe(property_id=42, effective_epc=epc, lodged=_lodged())
# Assert
assert len(caplog.records) == 1
assert "primary_energy_intensity" in caplog.records[0].getMessage()
def test_observe_warns_when_co2_diverges_beyond_one_percent_after_kg_to_tonnes(
caplog: pytest.LogCaptureFixture,
) -> None:
# Arrange — calculator emits kg/yr; 2000 kg = 2.0 t vs lodged 1.8 t (~11%).
shadow = LoggingCalculatorShadow(
_StubCalculator(_sap_result(co2_kg_per_yr=2000.0))
)
epc = _epc(sap_version=10.2)
# Act
with caplog.at_level(logging.WARNING):
shadow.observe(property_id=42, effective_epc=epc, lodged=_lodged())
# Assert — the kg→tonnes conversion is applied before comparison, so a
# matching 1800 kg would *not* fire (guarded by the silent-when-aligned test).
assert len(caplog.records) == 1
assert "co2_emissions" in caplog.records[0].getMessage()
def test_observe_is_silent_when_the_calculator_agrees_with_lodged(
caplog: pytest.LogCaptureFixture,
) -> None:
# Arrange — all three quantities at the matching defaults (SAP 72, PEUI 180,
# 1800 kg ≡ 1.8 t): nothing should be logged.
shadow = LoggingCalculatorShadow(_StubCalculator(_sap_result()))
epc = _epc(sap_version=10.2)
# Act
with caplog.at_level(logging.WARNING):
shadow.observe(property_id=42, effective_epc=epc, lodged=_lodged())
# Assert
assert caplog.records == []


@ -10,6 +10,8 @@ from types import TracebackType
from typing import Any, Optional
from datatypes.epc.domain.epc_property_data import EpcPropertyData
from domain.property_baseline.calculator_shadow import CalculatorShadow
from domain.property_baseline.performance import Performance
from domain.property_baseline.property_baseline_performance import PropertyBaselinePerformance
from domain.property.properties import Properties
from domain.property.property import Property
@ -88,6 +90,23 @@ class FakePropertyBaselineRepo(PropertyBaselineRepository):
raise NotImplementedError
class FakeCalculatorShadow(CalculatorShadow):
"""Records each `observe` call so a test can assert the orchestrator runs
the shadow per property without dragging in the real calculator."""
def __init__(self) -> None:
self.observed: list[tuple[int, EpcPropertyData, Performance]] = []
def observe(
self,
*,
property_id: int,
effective_epc: EpcPropertyData,
lodged: Performance,
) -> None:
self.observed.append((property_id, effective_epc, lodged))
class FakeUnitOfWork(UnitOfWork):
"""A unit that holds in-memory repos and counts commits."""


@ -36,6 +36,7 @@ from repositories.geospatial.geospatial_repository import GeospatialRepository
from repositories.materials.materials_repository import MaterialsRepository
from repositories.postgres_unit_of_work import PostgresUnitOfWork
from repositories.scenario.scenario_repository import ScenarioRepository
from tests.orchestration.fakes import FakeCalculatorShadow
_JSON_SAMPLES = Path(__file__).resolve().parents[2] / "backend/epc_api/json_samples"
@ -111,7 +112,9 @@ def test_first_run_baselines_through_repos_and_is_idempotent_on_rerun(
solar_fetcher=_UnusedSolarFetcher(),
),
baseline=PropertyBaselineOrchestrator(
unit_of_work=unit_of_work, rebaseliner=StubRebaseliner()
unit_of_work=unit_of_work,
rebaseliner=StubRebaseliner(),
calculator_shadow=FakeCalculatorShadow(),
),
modelling=ModellingOrchestrator(
scenario_repo=ScenarioRepository(),


@ -13,6 +13,7 @@ from domain.property_baseline.rebaseliner import RebaselineNotImplemented, StubR
from domain.property.property import Property, PropertyIdentity
from orchestration.property_baseline_orchestrator import PropertyBaselineOrchestrator
from tests.orchestration.fakes import (
FakeCalculatorShadow,
FakePropertyBaselineRepo,
FakePropertyRepo,
FakeUnitOfWork,
@ -37,6 +38,34 @@ def _property(*, sap_version: float) -> Property:
)
def test_run_invokes_the_calculator_shadow_per_property_and_still_persists() -> None:
# Arrange
property_baseline_repo = FakePropertyBaselineRepo()
shadow = FakeCalculatorShadow()
prop = _property(sap_version=10.2)
uow = FakeUnitOfWork(
property=FakePropertyRepo({10: prop}),
property_baseline=property_baseline_repo,
)
orchestrator = PropertyBaselineOrchestrator(
unit_of_work=lambda: uow,
rebaseliner=StubRebaseliner(),
calculator_shadow=shadow,
)
# Act
orchestrator.run([10])
# Assert — the load-bearing write + single commit are unchanged, and the
# shadow observed the Effective EPC + Lodged Performance once (ADR-0013).
lodged = Performance(
sap_score=72, epc_band=Epc.C, co2_emissions=1.8, primary_energy_intensity=180
)
assert len(property_baseline_repo.saved) == 1
assert uow.commits == 1
assert shadow.observed == [(10, prop.effective_epc, lodged)]
def test_run_establishes_persists_and_commits_the_batch_once() -> None:
# Arrange
property_baseline_repo = FakePropertyBaselineRepo()
@ -45,7 +74,9 @@ def test_run_establishes_persists_and_commits_the_batch_once() -> None:
property_baseline=property_baseline_repo,
)
orchestrator = PropertyBaselineOrchestrator(
unit_of_work=lambda: uow, rebaseliner=StubRebaseliner()
unit_of_work=lambda: uow,
rebaseliner=StubRebaseliner(),
calculator_shadow=FakeCalculatorShadow(),
)
# Act
@ -79,7 +110,9 @@ def test_run_raises_on_a_pre_sap10_property_and_does_not_commit() -> None:
property_baseline=property_baseline_repo,
)
orchestrator = PropertyBaselineOrchestrator(
unit_of_work=lambda: uow, rebaseliner=StubRebaseliner()
unit_of_work=lambda: uow,
rebaseliner=StubRebaseliner(),
calculator_shadow=FakeCalculatorShadow(),
)
# Act / Assert — the raise propagates; the batch is neither persisted nor