Merge branch 'feature/per-cert-mapper-validation' of https://github.com/Hestia-Homes/Model into feature/per-cert-mapper-validation

Khalim Conn-Kowlessar 2026-06-01 15:16:28 +00:00
commit fb9b32ac3d
94 changed files with 4762 additions and 680 deletions

@ -90,11 +90,11 @@ A Property's current performance aggregate, holding both Lodged Performance and
_Avoid_: baseline predictions, predicted baseline, rebaselined values
**Lodged Performance**:
The SAP / EPC Band / carbon emissions / heat demand recorded on the public EPC (or the Site Notes' as-surveyed values when Site Notes are the source) — unmodified by modelling. The half of Baseline Performance that says "what the government register says about this Property".
The SAP / EPC Band / carbon emissions / Primary Energy Intensity recorded on the public EPC (or the Site Notes' as-surveyed values when Site Notes are the source) — unmodified by modelling. The half of Baseline Performance that says "what the government register says about this Property".
_Avoid_: original performance, raw EPC values, recorded baseline
**Effective Performance**:
The SAP / EPC Band / carbon emissions / heat demand the modelling pipeline actually scored against — equal to Lodged Performance when no Rebaselining trigger fires, replaced by ML output when triggered. The half of Baseline Performance that says "what we modelled".
The SAP / EPC Band / carbon emissions / Primary Energy Intensity the modelling pipeline actually scored against — equal to Lodged Performance when no Rebaselining trigger fires, replaced by ML output when triggered. The half of Baseline Performance that says "what we modelled".
_Avoid_: modelled performance, rebaselined performance (only correct when rebaselining ran), scored values
**Calculated SAP10 Performance**:
@ -118,7 +118,7 @@ The process that translates an Optimised Package into cert-field changes and pro
_Avoid_: measure overrides (rejected during ADR-0009 grill — phantom mid-layer), package applier, retrofit simulator
**EPC Energy Derivation**:
The process that derives a Property's fuel split and annual bills from its space heating kWh and hot water kWh values plus the heating fuel deduced from SAP fields. kWh values themselves come from the EPC's recorded fields (`renewable_heat_incentive.space_heating_existing_dwelling` and `.water_heating`) for SAP10 baselines, or from ML prediction when Rebaselining fires or when scoring a post-measure state. Bills are computed deterministically from delivered kWh × current Fuel Rates + standing charges + SEG credits. The UCL Correction is no longer applied at runtime — it is folded into ML training labels (see [[epc-ml-transform]] and ADR-0007).
The process that derives a Property's fuel split and annual bills from its space heating kWh and hot water kWh values plus the heating fuel deduced from SAP fields. kWh values themselves come from the EPC's recorded fields (`renewable_heat_incentive.space_heating_kwh` and `.water_heating_kwh`) for SAP10 baselines, or from ML prediction when Rebaselining fires or when scoring a post-measure state. Bills are computed deterministically from delivered kWh × current Fuel Rates + standing charges + SEG credits. The UCL Correction is no longer applied at runtime — it is folded into ML training labels (see [[epc-ml-transform]] and ADR-0007).
_Avoid_: kWh prediction (kWh is now an ML target — see Rebaselining), baseline kWh, energy estimation
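
The deterministic bill arithmetic in the EPC Energy Derivation entry can be sketched roughly as follows. This is a minimal illustration, not the repository's implementation: the `FuelRate` type, the `annual_bill_gbp` function, the rate figures, and the convention that an SEG credit is passed as a positive amount and subtracted from the bill are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FuelRate:
    """Hypothetical rate card for one fuel (not a type from the repo)."""

    pence_per_kwh: float
    standing_charge_pence_per_day: float


def annual_bill_gbp(
    delivered_kwh: dict[str, float],
    rates: dict[str, FuelRate],
    seg_credit_gbp: float = 0.0,
) -> float:
    """Delivered kWh x unit rate + standing charges, less any SEG credit."""
    total_pence = 0.0
    for fuel, kwh in delivered_kwh.items():
        rate = rates[fuel]
        total_pence += kwh * rate.pence_per_kwh
        total_pence += 365 * rate.standing_charge_pence_per_day
    # Assumption: the credit is a positive amount that reduces the bill.
    return round(total_pence / 100.0 - seg_credit_gbp, 2)


# Made-up figures, for illustration only.
example_rates = {
    "mains_gas": FuelRate(pence_per_kwh=6.0, standing_charge_pence_per_day=30.0),
    "electricity": FuelRate(pence_per_kwh=25.0, standing_charge_pence_per_day=50.0),
}
example_bill = annual_bill_gbp(
    {"mains_gas": 10_000.0, "electricity": 2_000.0},
    example_rates,
    seg_credit_gbp=40.0,
)
```

Because the function takes only kWh values and a rate table, the same call would work whether the kWh came from the EPC's recorded fields or from ML prediction.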
**UCL Correction**:
@ -129,6 +129,26 @@ _Avoid_: UCL adjustment, energy correction, metered correction
A per-field indicator that a Property's value for an EPC field differs significantly from Comparable Properties; advisory only — surfaces in the UI to prompt user review, does not block modelling.
_Avoid_: outlier, mismatch, divergence flag
### Pipeline composition
The modelling backend is composed from three independently-invocable **stage orchestrators**, chained differently per use case. This composability — not a single end-to-end function — is the point: it is what lets the interactive single-property flow pause between stages where the batch flows do not. (Supersedes the monolithic `model_engine`.)
**Ingestion**:
The first stage. Acquires a Property's external source data — the EPC certificate (New EPC API) and Google Solar insights — and resolves its coordinates, then writes everything to repos. Writes only; runs no modelling business logic. Per ADR-0003 nothing downstream reads across this seam by calling back to a source — downstream stages read the persisted data from repos.
_Avoid_: fetching (a fetch is one source call; Ingestion is the whole write stage), data load
**Baseline** (stage):
The second stage. Reads the persisted source data from repos, hydrates the **Property** aggregate, resolves its **Effective EPC**, and establishes its **Baseline Performance**. Re-scoring after a user override lives here. Distinct from **Baseline Performance** (the aggregate it produces).
_Avoid_: rebaseline (that is a specific ML trigger — see Rebaselining), enrichment
**Modelling** (stage):
The third stage. Takes the baselined Property plus a set of **Scenarios** and produces **Recommendations** → an **Optimised Package** per **Scenario Phase** → **Plans**, persisted to repos. A separate orchestrator from Baseline so the single-property flow can stop after Baseline and only run Modelling when the user hits "play".
_Avoid_: scoring (overloaded), recommendation engine
**First Run**:
The use case where a Property has only a row in the property table (post address→UPRN matching) and no existing **Plan**: the pipeline runs Ingestion → Baseline → Modelling end-to-end over a batch. The first sibling lambda being built (`ara_first_run`).
_Avoid_: initial run, cold run
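
The composability described in this section can be pictured with a small sketch. It assumes a hypothetical `Stage` interface with a single `run` method and uses recording stand-ins instead of the real orchestrators, whose signatures are not specified here; the point is only that the batch flow chains all three stages while the interactive flow stops after Baseline.

```python
from typing import Protocol


class Stage(Protocol):
    """Hypothetical stage interface; the real orchestrators may differ."""

    def run(self, property_ids: list[int]) -> None: ...


class RecordingStage:
    """Stand-in stage that only records that it ran."""

    def __init__(self, name: str, log: list[str]) -> None:
        self.name = name
        self.log = log

    def run(self, property_ids: list[int]) -> None:
        self.log.append(self.name)


def first_run(stages: list[Stage], property_ids: list[int]) -> None:
    """Batch use case: chain every stage end-to-end."""
    for stage in stages:
        stage.run(property_ids)


def interactive_until_play(
    ingestion: Stage, baseline: Stage, property_ids: list[int]
) -> None:
    """Single-property use case: stop after Baseline; Modelling runs later."""
    ingestion.run(property_ids)
    baseline.run(property_ids)


log: list[str] = []
ingestion, baseline, modelling = (
    RecordingStage(name, log) for name in ("ingestion", "baseline", "modelling")
)

first_run([ingestion, baseline, modelling], [101, 102])
batch_order = list(log)

log.clear()
interactive_until_play(ingestion, baseline, [101])
interactive_order = list(log)
```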
### ML training
**EPC ML Transform**:

@ -0,0 +1,34 @@
FROM public.ecr.aws/lambda/python:3.11
# Postgres host/port/database are baked into the image at build time from
# the deploy workflow's --build-arg values (GitHub Actions DEV_DB_* secrets),
# mirroring applications/postcode_splitter/Dockerfile. They map onto the
# POSTGRES_* names PostgresConfig.from_env reads. Username/password are NOT
# baked in -- Terraform injects those as Lambda env vars from Secrets Manager.
ARG DEV_DB_HOST
ARG DEV_DB_PORT
ARG DEV_DB_NAME
ENV POSTGRES_HOST=${DEV_DB_HOST}
ENV POSTGRES_PORT=${DEV_DB_PORT}
ENV POSTGRES_DATABASE=${DEV_DB_NAME}
WORKDIR /var/task
COPY applications/ara_first_run/requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy the layered source the handler imports from. DDD-shaped packages only —
# no pandas, no legacy backend/.
COPY domain/ domain/
COPY infrastructure/ infrastructure/
COPY orchestration/ orchestration/
COPY repositories/ repositories/
COPY utilities/ utilities/
COPY applications/ applications/
# Place the handler at the Lambda task root so the runtime can resolve
# ``main.handler`` without an extra package prefix.
COPY applications/ara_first_run/handler.py /var/task/main.py
CMD ["main.handler"]

@ -0,0 +1,25 @@
from __future__ import annotations

from uuid import UUID

from pydantic import BaseModel


class AraFirstRunTriggerBody(BaseModel):
    """The SQS event the ``ara_first_run`` Lambda is triggered with.

    A thin command. ``task_id``/``sub_task_id`` drive the SubTask lifecycle (the
    ``@subtask_handler`` decorator reads them); the three business fields are what
    the pipeline threads downstream. UPRNs and Scenario definitions are
    deliberately absent: they are read from their source-of-truth tables, not
    carried on the event (issue #1130).

    No ``model_config`` override: Pydantic's default ``extra="ignore"`` lets the
    FastAPI backend add fields to the payload without breaking deployed lambdas.
    """

    task_id: UUID
    sub_task_id: UUID
    portfolio_id: int
    property_ids: list[int]
    scenario_ids: list[int]

@ -0,0 +1,121 @@
from __future__ import annotations

import os
from collections.abc import Callable
from typing import Any, Optional, Protocol

from sqlalchemy import Engine
from sqlmodel import Session

from applications.ara_first_run.ara_first_run_trigger_body import (
    AraFirstRunTriggerBody,
)
from domain.property_baseline.rebaseliner import StubRebaseliner
from infrastructure.postgres.config import PostgresConfig
from infrastructure.postgres.engine import make_engine
from orchestration.property_baseline_orchestrator import PropertyBaselineOrchestrator
from orchestration.ara_first_run_pipeline import AraFirstRunPipeline
from orchestration.ingestion_orchestrator import (
    EpcFetcher,
    IngestionOrchestrator,
    SolarFetcher,
)
from orchestration.modelling_orchestrator import ModellingOrchestrator
from orchestration.task_orchestrator import TaskOrchestrator
from repositories.geospatial.geospatial_repository import GeospatialRepository
from repositories.materials.materials_repository import MaterialsRepository
from repositories.postgres_unit_of_work import PostgresUnitOfWork
from repositories.scenario.scenario_repository import ScenarioRepository
from repositories.unit_of_work import UnitOfWork
from utilities.aws_lambda.subtask_handler import subtask_handler

# Module-scoped so the connection pool is reused across warm Lambda invocations
# rather than rebuilt per invocation (ADR-0012).
_engine: Optional[Engine] = None


def _get_engine() -> Engine:
    global _engine
    if _engine is None:
        _engine = make_engine(PostgresConfig.from_env(dict(os.environ)))
    return _engine

class _RunsFirstRun(Protocol):
    """The slice of AraFirstRunPipeline the handler delegates to."""

    def run(self, command: AraFirstRunTriggerBody) -> None: ...


def dispatch_first_run(body: dict[str, Any], *, pipeline: _RunsFirstRun) -> None:
    """Validate the raw event body and hand the command to the pipeline.

    The handler's entire decision logic, kept as a named seam so it can be
    exercised without the Lambda runtime. No business logic: validate, delegate.
    """
    trigger = AraFirstRunTriggerBody.model_validate(body)
    pipeline.run(trigger)

def build_first_run_pipeline(
    *,
    unit_of_work: Callable[[], UnitOfWork],
    epc_fetcher: EpcFetcher,
    geospatial_repo: GeospatialRepository,
    solar_fetcher: SolarFetcher,
) -> AraFirstRunPipeline:
    """Compose the real three-stage pipeline on a Unit-of-Work factory.

    Each stage opens its own unit(s) and commits per batch (ADR-0012); the
    handler no longer holds a session. The source clients are passed in because
    their config is not settled; see ``_source_clients_from_env``. Modelling is
    stubbed (#1136); its Scenario / Materials ports are seams.
    """
    return AraFirstRunPipeline(
        ingestion=IngestionOrchestrator(
            unit_of_work=unit_of_work,
            epc_fetcher=epc_fetcher,
            geospatial_repo=geospatial_repo,
            solar_fetcher=solar_fetcher,
        ),
        baseline=PropertyBaselineOrchestrator(
            unit_of_work=unit_of_work,
            rebaseliner=StubRebaseliner(),
        ),
        modelling=ModellingOrchestrator(
            scenario_repo=ScenarioRepository(),
            materials_repo=MaterialsRepository(),
        ),
    )

def _source_clients_from_env() -> tuple[EpcFetcher, GeospatialRepository, SolarFetcher]:
    """The Ingestion source clients: EPC API, Google Solar, geospatial S3.

    TODO(deploy): their config (EPC auth token, Google Solar API key, geospatial
    S3 parquet reader), env-var names, and the pandas/s3fs runtime deps are not
    settled; that wiring is a separate Terraform piece, out of scope for #1136.
    Raises until then so the lambda fails loudly rather than half-running.
    """
    raise NotImplementedError(
        "ara_first_run source-client wiring (EPC / Google Solar / geospatial) "
        "is pending the deploy/Terraform piece; see #1136."
    )

@subtask_handler()
def handler(
    body: dict[str, Any], context: Any, task_orchestrator: TaskOrchestrator
) -> None:
    engine = _get_engine()
    unit_of_work: Callable[[], UnitOfWork] = lambda: PostgresUnitOfWork(
        lambda: Session(engine)
    )
    epc_fetcher, geospatial_repo, solar_fetcher = _source_clients_from_env()
    pipeline = build_first_run_pipeline(
        unit_of_work=unit_of_work,
        epc_fetcher=epc_fetcher,
        geospatial_repo=geospatial_repo,
        solar_fetcher=solar_fetcher,
    )
    dispatch_first_run(body, pipeline=pipeline)
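
The module-scoped engine pattern used by `_get_engine` can be illustrated in isolation. The sketch below is an assumption-laden stand-in: `FakeEngine` is a hypothetical placeholder for a pooled database engine, and the bare `handler` here has none of the real decorator or pipeline wiring. It only shows that repeated invocations in one warm process build the resource once.

```python
from typing import Optional


class FakeEngine:
    """Stand-in for a pooled database engine (hypothetical)."""

    instances_built = 0

    def __init__(self) -> None:
        FakeEngine.instances_built += 1


# Module scope: survives between invocations while the process stays warm.
_engine: Optional[FakeEngine] = None


def get_engine() -> FakeEngine:
    """Build the engine on first use and reuse it afterwards."""
    global _engine
    if _engine is None:
        _engine = FakeEngine()
    return _engine


def handler(event: dict, context: object) -> None:
    """Each invocation asks for the engine; only the first one builds it."""
    get_engine()


# Three invocations in the same warm process share one engine.
for _ in range(3):
    handler({}, None)
```

Building lazily rather than at import time also means importing the module (for example in unit tests) does not require database configuration to be present.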

@ -0,0 +1,28 @@
# Local-test environment for the ara_first_run Lambda.
#
# cp .env.local.example .env.local then fill in the values below.
#
# .env.local is gitignored. The container hits a REAL Postgres (the SubTask
# lifecycle store), so every value here points at infrastructure that exists.
#
# NOTE: the DDD code uses different env var names than the repo root .env. The
# mapping (root .env name -> var here) is given per section. Keep comments on
# their own lines — docker-compose's env_file parser folds a trailing "# ..."
# into the value.
# --- Postgres (utilities/aws_lambda/default_orchestrator -> PostgresConfig.from_env) ---
# POSTGRES_HOST <- DB_HOST, PORT <- DB_PORT, USERNAME <- DB_USERNAME,
# PASSWORD <- DB_PASSWORD, DATABASE <- DB_NAME.
POSTGRES_HOST=
POSTGRES_PORT=5432
POSTGRES_USERNAME=
POSTGRES_PASSWORD=
POSTGRES_DATABASE=
# POSTGRES_DRIVER=psycopg2 (optional; defaults to psycopg2)
# --- AWS credentials for boto3 (used by later slices; the SubTask lifecycle
# CloudWatch URL is read from the Lambda runtime's own AWS_* env in prod) ---
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
AWS_DEFAULT_REGION=eu-west-2
# AWS_SESSION_TOKEN= (only if using temporary/SSO credentials)
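
The name mapping in the comments above (root `.env` name to the variable used here) could be applied mechanically. The following is a small sketch under assumptions: `ROOT_TO_LOCAL`, `translate_env` and `render_env_file` are hypothetical helpers written for illustration, not utilities that exist in the repository, and the sample values are invented.

```python
# Mapping taken from the comments in the template: root .env name -> name here.
ROOT_TO_LOCAL = {
    "DB_HOST": "POSTGRES_HOST",
    "DB_PORT": "POSTGRES_PORT",
    "DB_USERNAME": "POSTGRES_USERNAME",
    "DB_PASSWORD": "POSTGRES_PASSWORD",
    "DB_NAME": "POSTGRES_DATABASE",
}


def translate_env(root_env: dict[str, str]) -> dict[str, str]:
    """Rename the root .env keys; keys without a mapping are dropped."""
    return {
        ROOT_TO_LOCAL[key]: value
        for key, value in root_env.items()
        if key in ROOT_TO_LOCAL
    }


def render_env_file(values: dict[str, str]) -> str:
    """Emit KEY=value lines with no trailing comments on value lines."""
    return "\n".join(f"{key}={value}" for key, value in values.items()) + "\n"


# Invented sample values, for illustration only.
sample_root_env = {
    "DB_HOST": "db.example.internal",
    "DB_PORT": "5432",
    "DB_NAME": "ara",
    "UNRELATED_SETTING": "ignored",
}
translated = translate_env(sample_root_env)
rendered = render_env_file(translated)
```

Rendering values on their own lines keeps to the template's advice about comments, since a trailing `# ...` on a value line would end up inside the value.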

@ -0,0 +1,9 @@
services:
  ara-first-run:
    build:
      context: ../../../
      dockerfile: applications/ara_first_run/Dockerfile
    ports:
      - "9002:8080"
    env_file:
      - .env.local

@ -0,0 +1,30 @@
#!/usr/bin/env python3
import json

import requests

HOST = "localhost"
PORT = "9002"
LAMBDA_URL = f"http://{HOST}:{PORT}/2015-03-31/functions/function/invocations"

payload = {
    "Records": [
        {
            "body": json.dumps(
                {
                    "task_id": "e295d89b-a7c5-4a9a-8b4e-b405fab1f298",
                    "sub_task_id": "f4a9944f-41f0-4a33-8669-5016ec574068",
                    "portfolio_id": 42,
                    "property_ids": [101, 102, 103],
                    "scenario_ids": [7, 8],
                }
            )
        }
    ]
}

response = requests.post(LAMBDA_URL, json=payload)
print("Status code:", response.status_code)
print("Response:")
print(response.text)
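
The payload above wraps the command as a JSON string inside `Records[].body`, which is the shape of an SQS-triggered event. As a rough sketch of how such an envelope can be unpacked before validation, the hypothetical helper below decodes each record's body; how the real `@subtask_handler` decorator does this is not shown in this diff and may differ.

```python
import json
from typing import Any


def bodies_from_sqs_event(event: dict[str, Any]) -> list[dict[str, Any]]:
    """Decode the JSON string held in each record's ``body`` field."""
    return [json.loads(record["body"]) for record in event.get("Records", [])]


# Same envelope shape as the local test payload, trimmed for brevity.
event = {
    "Records": [
        {
            "body": json.dumps(
                {
                    "portfolio_id": 42,
                    "property_ids": [101, 102, 103],
                    "scenario_ids": [7, 8],
                }
            )
        }
    ]
}
decoded = bodies_from_sqs_event(event)
```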

@ -0,0 +1,12 @@
#!/usr/bin/env bash
set -euo pipefail

cd "$(dirname "$0")"

if [ ! -f .env.local ]; then
  cp .env.local.example .env.local
  echo "Created .env.local from the template; fill it in, then re-run." >&2
  exit 1
fi

docker compose build --no-cache
docker compose up --force-recreate

@ -0,0 +1,4 @@
boto3
pydantic
sqlmodel
psycopg2-binary

@ -19,7 +19,7 @@ from backend.address2UPRN.scoring import all_uprns_match, rank_address_similarit
from datatypes.epc.domain.historic_epc_matching import (
match_addresses_for_postcode,
)
from backend.epc_client.epc_client_service import EpcClientService
from infrastructure.epc_client.epc_client_service import EpcClientService
from datatypes.epc.domain.historic_epc_matching import ScoredHistoricEpc
logger = setup_logger()

@ -1,659 +1,29 @@
from __future__ import annotations
"""Re-export shim.
from typing import Optional
from sqlmodel import SQLModel, Field
The EPC persistence models moved to ``infrastructure/postgres/epc_property_table.py``
as part of the Ara backend rebuild (PRD Hestia-Homes/Model#1128, Slice 1 #1129).
This shim keeps the dying ``backend/`` callers working until cut-over. New code must
import from ``infrastructure.postgres.epc_property_table`` directly.
"""
from datatypes.epc.domain.epc_property_data import (
EpcPropertyData,
EnergyElement,
MainHeatingDetail,
SapBuildingPart,
SapFloorDimension,
SapFlatDetails,
SapWindow,
from infrastructure.postgres.epc_property_table import (
EpcBuildingPartModel,
EpcEnergyElementModel,
EpcFlatDetailsModel,
EpcFloorDimensionModel,
EpcMainHeatingDetailModel,
EpcPropertyEnergyPerformanceModel,
EpcPropertyModel,
EpcWindowModel,
)
class EpcPropertyModel(SQLModel, table=True):
__tablename__ = "epc_property"
id: Optional[int] = Field(default=None, primary_key=True)
property_id: Optional[int] = Field(default=None)
portfolio_id: Optional[int] = Field(default=None)
uploaded_file_id: Optional[int] = Field(default=None)
# Identity / admin
uprn: Optional[int] = Field(default=None)
uprn_source: Optional[str] = Field(default=None)
report_reference: Optional[str] = Field(default=None)
report_type: Optional[str] = Field(default=None)
assessment_type: Optional[str] = Field(default=None)
sap_version: Optional[float] = Field(default=None)
schema_type: Optional[str] = Field(default=None)
schema_versions_original: Optional[str] = Field(default=None)
status: Optional[str] = Field(default=None)
calculation_software_version: Optional[str] = Field(default=None)
# Address
address_line_1: Optional[str] = Field(default=None)
address_line_2: Optional[str] = Field(default=None)
post_town: Optional[str] = Field(default=None)
postcode: Optional[str] = Field(default=None)
region_code: Optional[str] = Field(default=None)
country_code: Optional[str] = Field(default=None)
language_code: Optional[str] = Field(default=None)
# Property description
dwelling_type: str
property_type: Optional[str] = Field(default=None)
built_form: Optional[str] = Field(default=None)
tenure: str
transaction_type: str
inspection_date: str # store as ISO string; cast on read if needed
completion_date: Optional[str] = Field(default=None)
registration_date: Optional[str] = Field(default=None)
total_floor_area_m2: float
measurement_type: Optional[int] = Field(default=None)
# Flags
solar_water_heating: bool
has_hot_water_cylinder: bool
has_fixed_air_conditioning: bool
has_conservatory: Optional[bool] = Field(default=None)
has_heated_separate_conservatory: Optional[bool] = Field(default=None)
conservatory_type: Optional[int] = Field(default=None)
# Counts
door_count: int
wet_rooms_count: int
extensions_count: int
heated_rooms_count: int
open_chimneys_count: int
habitable_rooms_count: int
insulated_door_count: int
cfl_fixed_lighting_bulbs_count: int
led_fixed_lighting_bulbs_count: int
incandescent_fixed_lighting_bulbs_count: int
blocked_chimneys_count: Optional[int] = Field(default=None)
draughtproofed_door_count: Optional[int] = Field(default=None)
energy_rating_average: Optional[int] = Field(default=None)
low_energy_fixed_lighting_bulbs_count: Optional[int] = Field(default=None)
fixed_lighting_outlets_count: Optional[int] = Field(default=None)
low_energy_fixed_lighting_outlets_count: Optional[int] = Field(default=None)
number_of_storeys: Optional[int] = Field(default=None)
any_unheated_rooms: Optional[bool] = Field(default=None)
# Misc
hydro: Optional[bool] = Field(default=None)
photovoltaic_array: Optional[bool] = Field(default=None)
waste_water_heat_recovery: Optional[str] = Field(default=None)
pressure_test: Optional[int] = Field(default=None)
pressure_test_certificate_number: Optional[int] = Field(default=None)
percent_draughtproofed: Optional[int] = Field(default=None)
insulated_door_u_value: Optional[float] = Field(default=None)
multiple_glazed_proportion: Optional[int] = Field(default=None)
windows_transmission_u_value: Optional[float] = Field(default=None)
windows_transmission_data_source: Optional[int] = Field(default=None)
windows_transmission_solar_transmittance: Optional[float] = Field(default=None)
# Energy source
energy_mains_gas: bool
energy_meter_type: str
energy_pv_battery_count: int
energy_wind_turbines_count: int
energy_gas_smart_meter_present: bool
energy_is_dwelling_export_capable: bool
energy_wind_turbines_terrain_type: str
energy_electricity_smart_meter_present: bool
energy_pv_connection: Optional[str] = Field(default=None)
energy_pv_percent_roof_area: Optional[int] = Field(default=None)
energy_pv_battery_capacity: Optional[float] = Field(default=None)
energy_wind_turbine_hub_height: Optional[float] = Field(default=None)
energy_wind_turbine_rotor_diameter: Optional[float] = Field(default=None)
# Heating config
heating_cylinder_size: Optional[str] = Field(default=None)
heating_water_heating_code: Optional[int] = Field(default=None)
heating_water_heating_fuel: Optional[int] = Field(default=None)
heating_immersion_heating_type: Optional[str] = Field(default=None)
heating_cylinder_insulation_type: Optional[str] = Field(default=None)
heating_cylinder_thermostat: Optional[str] = Field(default=None)
heating_secondary_fuel_type: Optional[int] = Field(default=None)
heating_secondary_heating_type: Optional[str] = Field(default=None)
heating_cylinder_insulation_thickness_mm: Optional[int] = Field(default=None)
heating_wwhrs_index_number_1: Optional[int] = Field(default=None)
heating_wwhrs_index_number_2: Optional[int] = Field(default=None)
heating_shower_outlet_type: Optional[str] = Field(default=None)
heating_shower_wwhrs: Optional[int] = Field(default=None)
# Ventilation
ventilation_type: Optional[str] = Field(default=None)
ventilation_draught_lobby: Optional[bool] = Field(default=None)
ventilation_pressure_test: Optional[str] = Field(default=None)
ventilation_open_flues_count: Optional[int] = Field(default=None)
ventilation_closed_flues_count: Optional[int] = Field(default=None)
ventilation_boiler_flues_count: Optional[int] = Field(default=None)
ventilation_other_flues_count: Optional[int] = Field(default=None)
ventilation_extract_fans_count: Optional[int] = Field(default=None)
ventilation_passive_vents_count: Optional[int] = Field(default=None)
ventilation_flueless_gas_fires_count: Optional[int] = Field(default=None)
ventilation_in_pcdf_database: Optional[bool] = Field(default=None)
mechanical_ventilation: Optional[int] = Field(default=None)
mechanical_vent_duct_type: Optional[int] = Field(default=None)
mechanical_vent_duct_placement: Optional[int] = Field(default=None)
mechanical_vent_duct_insulation: Optional[int] = Field(default=None)
mechanical_ventilation_index_number: Optional[int] = Field(default=None)
mechanical_vent_measured_installation: Optional[str] = Field(default=None)
@classmethod
def from_epc_property_data(
cls,
data: EpcPropertyData,
property_id: Optional[int] = None,
portfolio_id: Optional[int] = None,
) -> EpcPropertyModel:
es = data.sap_energy_source
h = data.sap_heating
v = data.sap_ventilation
shower = h.shower_outlets.shower_outlet if h.shower_outlets else None
pv = es.photovoltaic_supply
wt = es.wind_turbine_details
pvb = es.pv_batteries
return cls(
property_id=property_id,
portfolio_id=portfolio_id,
uprn=data.uprn,
uprn_source=data.uprn_source,
report_reference=data.report_reference,
report_type=data.report_type,
assessment_type=data.assessment_type,
sap_version=data.sap_version,
schema_type=data.schema_type,
schema_versions_original=data.schema_versions_original,
status=data.status,
calculation_software_version=data.calculation_software_version,
address_line_1=data.address_line_1,
address_line_2=data.address_line_2,
post_town=data.post_town,
postcode=data.postcode,
region_code=data.region_code,
country_code=data.country_code,
language_code=data.language_code,
dwelling_type=data.dwelling_type,
property_type=data.property_type,
built_form=data.built_form,
tenure=data.tenure,
transaction_type=data.transaction_type,
inspection_date=data.inspection_date.isoformat(),
completion_date=(
data.completion_date.isoformat() if data.completion_date else None
),
registration_date=(
data.registration_date.isoformat() if data.registration_date else None
),
total_floor_area_m2=data.total_floor_area_m2,
measurement_type=data.measurement_type,
solar_water_heating=data.solar_water_heating,
has_hot_water_cylinder=data.has_hot_water_cylinder,
has_fixed_air_conditioning=data.has_fixed_air_conditioning,
has_conservatory=data.has_conservatory,
has_heated_separate_conservatory=data.has_heated_separate_conservatory,
conservatory_type=data.conservatory_type,
door_count=data.door_count,
wet_rooms_count=data.wet_rooms_count,
extensions_count=data.extensions_count,
heated_rooms_count=data.heated_rooms_count,
open_chimneys_count=data.open_chimneys_count,
habitable_rooms_count=data.habitable_rooms_count,
insulated_door_count=data.insulated_door_count,
cfl_fixed_lighting_bulbs_count=data.cfl_fixed_lighting_bulbs_count,
led_fixed_lighting_bulbs_count=data.led_fixed_lighting_bulbs_count,
incandescent_fixed_lighting_bulbs_count=data.incandescent_fixed_lighting_bulbs_count,
blocked_chimneys_count=data.blocked_chimneys_count,
draughtproofed_door_count=data.draughtproofed_door_count,
energy_rating_average=data.energy_rating_average,
low_energy_fixed_lighting_bulbs_count=data.low_energy_fixed_lighting_bulbs_count,
fixed_lighting_outlets_count=data.fixed_lighting_outlets_count,
low_energy_fixed_lighting_outlets_count=data.low_energy_fixed_lighting_outlets_count,
number_of_storeys=data.number_of_storeys,
any_unheated_rooms=data.any_unheated_rooms,
hydro=data.hydro,
photovoltaic_array=data.photovoltaic_array,
waste_water_heat_recovery=data.waste_water_heat_recovery,
pressure_test=data.pressure_test,
pressure_test_certificate_number=data.pressure_test_certificate_number,
percent_draughtproofed=data.percent_draughtproofed,
insulated_door_u_value=data.insulated_door_u_value,
multiple_glazed_proportion=data.multiple_glazed_proportion,
windows_transmission_u_value=(
data.windows_transmission_details.u_value
if data.windows_transmission_details
else None
),
windows_transmission_data_source=(
data.windows_transmission_details.data_source
if data.windows_transmission_details
else None
),
windows_transmission_solar_transmittance=(
data.windows_transmission_details.solar_transmittance
if data.windows_transmission_details
else None
),
energy_mains_gas=es.mains_gas,
energy_meter_type=str(es.meter_type),
energy_pv_battery_count=es.pv_battery_count,
energy_wind_turbines_count=es.wind_turbines_count,
energy_gas_smart_meter_present=es.gas_smart_meter_present,
energy_is_dwelling_export_capable=es.is_dwelling_export_capable,
energy_wind_turbines_terrain_type=str(es.wind_turbines_terrain_type),
energy_electricity_smart_meter_present=es.electricity_smart_meter_present,
energy_pv_connection=(
str(es.pv_connection) if es.pv_connection is not None else None
),
energy_pv_percent_roof_area=(
pv.none_or_no_details.percent_roof_area if pv else None
),
energy_pv_battery_capacity=pvb.pv_battery.battery_capacity if pvb else None,
energy_wind_turbine_hub_height=wt.hub_height if wt else None,
energy_wind_turbine_rotor_diameter=wt.rotor_diameter if wt else None,
heating_cylinder_size=(
str(h.cylinder_size) if h.cylinder_size is not None else None
),
heating_water_heating_code=h.water_heating_code,
heating_water_heating_fuel=h.water_heating_fuel,
heating_immersion_heating_type=(
str(h.immersion_heating_type)
if h.immersion_heating_type is not None
else None
),
heating_cylinder_insulation_type=(
str(h.cylinder_insulation_type)
if h.cylinder_insulation_type is not None
else None
),
heating_cylinder_thermostat=h.cylinder_thermostat,
heating_secondary_fuel_type=h.secondary_fuel_type,
heating_secondary_heating_type=(
str(h.secondary_heating_type)
if h.secondary_heating_type is not None
else None
),
heating_cylinder_insulation_thickness_mm=h.cylinder_insulation_thickness_mm,
heating_wwhrs_index_number_1=h.instantaneous_wwhrs.wwhrs_index_number1,
heating_wwhrs_index_number_2=h.instantaneous_wwhrs.wwhrs_index_number2,
heating_shower_outlet_type=(
str(shower.shower_outlet_type) if shower else None
),
heating_shower_wwhrs=shower.shower_wwhrs if shower else None,
ventilation_type=v.ventilation_type if v else None,
ventilation_draught_lobby=v.draught_lobby if v else None,
ventilation_pressure_test=v.pressure_test if v else None,
ventilation_open_flues_count=v.open_flues_count if v else None,
ventilation_closed_flues_count=v.closed_flues_count if v else None,
ventilation_boiler_flues_count=v.boiler_flues_count if v else None,
ventilation_other_flues_count=v.other_flues_count if v else None,
ventilation_extract_fans_count=v.extract_fans_count if v else None,
ventilation_passive_vents_count=v.passive_vents_count if v else None,
ventilation_flueless_gas_fires_count=(
v.flueless_gas_fires_count if v else None
),
ventilation_in_pcdf_database=v.ventilation_in_pcdf_database if v else None,
mechanical_ventilation=data.mechanical_ventilation,
mechanical_vent_duct_type=data.mechanical_vent_duct_type,
mechanical_vent_duct_placement=data.mechanical_vent_duct_placement,
mechanical_vent_duct_insulation=data.mechanical_vent_duct_insulation,
mechanical_ventilation_index_number=data.mechanical_ventilation_index_number,
mechanical_vent_measured_installation=data.mechanical_vent_measured_installation,
)
class EpcPropertyEnergyPerformanceModel(SQLModel, table=True):
__tablename__ = "epc_property_energy_performance"
id: Optional[int] = Field(default=None, primary_key=True)
epc_property_id: int = Field(
foreign_key="epc_property.id", nullable=False, unique=True
)
energy_rating_current: Optional[int] = Field(default=None)
energy_consumption_current: Optional[int] = Field(default=None)
environmental_impact_current: Optional[int] = Field(default=None)
heating_cost_current: Optional[float] = Field(default=None)
lighting_cost_current: Optional[float] = Field(default=None)
hot_water_cost_current: Optional[float] = Field(default=None)
co2_emissions_current: Optional[float] = Field(default=None)
co2_emissions_current_per_floor_area: Optional[int] = Field(default=None)
current_energy_efficiency_band: Optional[str] = Field(default=None)
energy_rating_potential: Optional[float] = Field(default=None)
energy_consumption_potential: Optional[int] = Field(default=None)
environmental_impact_potential: Optional[int] = Field(default=None)
heating_cost_potential: Optional[float] = Field(default=None)
lighting_cost_potential: Optional[float] = Field(default=None)
hot_water_cost_potential: Optional[float] = Field(default=None)
co2_emissions_potential: Optional[float] = Field(default=None)
potential_energy_efficiency_band: Optional[str] = Field(default=None)
@classmethod
def from_epc_property_data(
cls, data: EpcPropertyData, epc_property_id: int
) -> EpcPropertyEnergyPerformanceModel:
return cls(
epc_property_id=epc_property_id,
energy_rating_current=data.energy_rating_current,
energy_consumption_current=data.energy_consumption_current,
environmental_impact_current=data.environmental_impact_current,
heating_cost_current=data.heating_cost_current,
lighting_cost_current=data.lighting_cost_current,
hot_water_cost_current=data.hot_water_cost_current,
co2_emissions_current=data.co2_emissions_current,
co2_emissions_current_per_floor_area=data.co2_emissions_current_per_floor_area,
current_energy_efficiency_band=(
data.current_energy_efficiency_band.value
if data.current_energy_efficiency_band
else None
),
energy_rating_potential=data.energy_rating_potential,
energy_consumption_potential=data.energy_consumption_potential,
environmental_impact_potential=data.environmental_impact_potential,
heating_cost_potential=data.heating_cost_potential,
lighting_cost_potential=data.lighting_cost_potential,
hot_water_cost_potential=data.hot_water_cost_potential,
co2_emissions_potential=data.co2_emissions_potential,
potential_energy_efficiency_band=(
data.potential_energy_efficiency_band.value
if data.potential_energy_efficiency_band
else None
),
)
class EpcFlatDetailsModel(SQLModel, table=True):
__tablename__ = "epc_flat_details"
id: Optional[int] = Field(default=None, primary_key=True)
epc_property_id: int = Field(
foreign_key="epc_property.id", nullable=False, unique=True
)
level: int
top_storey: str
flat_location: int
heat_loss_corridor: int
storey_count: Optional[int] = Field(default=None)
unheated_corridor_length_m: Optional[int] = Field(default=None)
@classmethod
def from_domain(
cls, flat: SapFlatDetails, epc_property_id: int
) -> EpcFlatDetailsModel:
return cls(
epc_property_id=epc_property_id,
level=flat.level,
top_storey=flat.top_storey,
flat_location=flat.flat_location,
heat_loss_corridor=flat.heat_loss_corridor,
storey_count=flat.storey_count,
unheated_corridor_length_m=flat.unheated_corridor_length_m,
)


class EpcMainHeatingDetailModel(SQLModel, table=True):
    __tablename__ = "epc_main_heating_detail"
    id: Optional[int] = Field(default=None, primary_key=True)
    epc_property_id: int = Field(foreign_key="epc_property.id", nullable=False)
    has_fghrs: bool
    main_fuel_type: str
    heat_emitter_type: str
    emitter_temperature: str
    main_heating_control: str
    fan_flue_present: Optional[bool] = Field(default=None)
    boiler_flue_type: Optional[int] = Field(default=None)
    boiler_ignition_type: Optional[int] = Field(default=None)
    central_heating_pump_age: Optional[int] = Field(default=None)
    central_heating_pump_age_str: Optional[str] = Field(default=None)
    main_heating_index_number: Optional[int] = Field(default=None)
    sap_main_heating_code: Optional[int] = Field(default=None)
    main_heating_number: Optional[int] = Field(default=None)
    main_heating_category: Optional[int] = Field(default=None)
    main_heating_fraction: Optional[int] = Field(default=None)
    main_heating_data_source: Optional[int] = Field(default=None)
    condensing: Optional[bool] = Field(default=None)
    weather_compensator: Optional[bool] = Field(default=None)

    @classmethod
    def from_domain(
        cls, detail: MainHeatingDetail, epc_property_id: int
    ) -> EpcMainHeatingDetailModel:
        return cls(
            epc_property_id=epc_property_id,
            has_fghrs=detail.has_fghrs,
            main_fuel_type=str(detail.main_fuel_type),
            heat_emitter_type=str(detail.heat_emitter_type),
            emitter_temperature=str(detail.emitter_temperature),
            main_heating_control=str(detail.main_heating_control),
            fan_flue_present=detail.fan_flue_present,
            boiler_flue_type=detail.boiler_flue_type,
            boiler_ignition_type=detail.boiler_ignition_type,
            central_heating_pump_age=detail.central_heating_pump_age,
            central_heating_pump_age_str=detail.central_heating_pump_age_str,
            main_heating_index_number=detail.main_heating_index_number,
            sap_main_heating_code=detail.sap_main_heating_code,
            main_heating_number=detail.main_heating_number,
            main_heating_category=detail.main_heating_category,
            main_heating_fraction=detail.main_heating_fraction,
            main_heating_data_source=detail.main_heating_data_source,
            condensing=detail.condensing,
            weather_compensator=detail.weather_compensator,
        )


class EpcBuildingPartModel(SQLModel, table=True):
    __tablename__ = "epc_building_part"
    id: Optional[int] = Field(default=None, primary_key=True)
    epc_property_id: int = Field(foreign_key="epc_property.id", nullable=False)
    identifier: str
    construction_age_band: str
    wall_construction: str
    wall_insulation_type: str
    wall_thickness_measured: bool
    party_wall_construction: str
    building_part_number: Optional[int] = Field(default=None)
    wall_dry_lined: Optional[bool] = Field(default=None)
    wall_thickness_mm: Optional[int] = Field(default=None)
    wall_insulation_thickness: Optional[str] = Field(default=None)
    floor_heat_loss: Optional[int] = Field(default=None)
    floor_insulation_thickness: Optional[str] = Field(default=None)
    flat_roof_insulation_thickness: Optional[str] = Field(default=None)
    floor_type: Optional[str] = Field(default=None)
    floor_construction_type: Optional[str] = Field(default=None)
    floor_insulation_type_str: Optional[str] = Field(default=None)
    floor_u_value_known: Optional[bool] = Field(default=None)
    roof_construction: Optional[int] = Field(default=None)
    roof_insulation_location: Optional[str] = Field(default=None)
    roof_insulation_thickness: Optional[str] = Field(default=None)
    room_in_roof_floor_area: Optional[float] = Field(default=None)
    room_in_roof_construction_age_band: Optional[str] = Field(default=None)
    alt_wall_1_area: Optional[float] = Field(default=None)
    alt_wall_1_dry_lined: Optional[str] = Field(default=None)
    alt_wall_1_construction: Optional[int] = Field(default=None)
    alt_wall_1_insulation_type: Optional[int] = Field(default=None)
    alt_wall_1_thickness_measured: Optional[str] = Field(default=None)
    alt_wall_1_insulation_thickness: Optional[str] = Field(default=None)
    alt_wall_2_area: Optional[float] = Field(default=None)
    alt_wall_2_dry_lined: Optional[str] = Field(default=None)
    alt_wall_2_construction: Optional[int] = Field(default=None)
    alt_wall_2_insulation_type: Optional[int] = Field(default=None)
    alt_wall_2_thickness_measured: Optional[str] = Field(default=None)
    alt_wall_2_insulation_thickness: Optional[str] = Field(default=None)

    @classmethod
    def from_domain(
        cls, part: SapBuildingPart, epc_property_id: int
    ) -> EpcBuildingPartModel:
        rir = part.sap_room_in_roof
        aw1 = part.sap_alternative_wall_1
        aw2 = part.sap_alternative_wall_2
        return cls(
            epc_property_id=epc_property_id,
            identifier=part.identifier.value,
            construction_age_band=part.construction_age_band,
            wall_construction=str(part.wall_construction),
            wall_insulation_type=str(part.wall_insulation_type),
            wall_thickness_measured=part.wall_thickness_measured,
            party_wall_construction=str(part.party_wall_construction),
            building_part_number=part.building_part_number,
            wall_dry_lined=part.wall_dry_lined,
            wall_thickness_mm=part.wall_thickness_mm,
            wall_insulation_thickness=part.wall_insulation_thickness,
            floor_heat_loss=part.floor_heat_loss,
            floor_insulation_thickness=part.floor_insulation_thickness,
            flat_roof_insulation_thickness=(
                str(part.flat_roof_insulation_thickness)
                if part.flat_roof_insulation_thickness is not None
                else None
            ),
            floor_type=part.floor_type,
            floor_construction_type=part.floor_construction_type,
            floor_insulation_type_str=part.floor_insulation_type_str,
            floor_u_value_known=part.floor_u_value_known,
            roof_construction=part.roof_construction,
            roof_insulation_location=(
                str(part.roof_insulation_location)
                if part.roof_insulation_location is not None
                else None
            ),
            roof_insulation_thickness=(
                str(part.roof_insulation_thickness)
                if part.roof_insulation_thickness is not None
                else None
            ),
            room_in_roof_floor_area=float(rir.floor_area) if rir else None,
            room_in_roof_construction_age_band=(
                rir.construction_age_band if rir else None
            ),
            alt_wall_1_area=aw1.wall_area if aw1 else None,
            alt_wall_1_dry_lined=aw1.wall_dry_lined if aw1 else None,
            alt_wall_1_construction=aw1.wall_construction if aw1 else None,
            alt_wall_1_insulation_type=aw1.wall_insulation_type if aw1 else None,
            alt_wall_1_thickness_measured=aw1.wall_thickness_measured if aw1 else None,
            alt_wall_1_insulation_thickness=(
                aw1.wall_insulation_thickness if aw1 else None
            ),
            alt_wall_2_area=aw2.wall_area if aw2 else None,
            alt_wall_2_dry_lined=aw2.wall_dry_lined if aw2 else None,
            alt_wall_2_construction=aw2.wall_construction if aw2 else None,
            alt_wall_2_insulation_type=aw2.wall_insulation_type if aw2 else None,
            alt_wall_2_thickness_measured=aw2.wall_thickness_measured if aw2 else None,
            alt_wall_2_insulation_thickness=(
                aw2.wall_insulation_thickness if aw2 else None
            ),
        )


class EpcFloorDimensionModel(SQLModel, table=True):
    __tablename__ = "epc_floor_dimension"
    id: Optional[int] = Field(default=None, primary_key=True)
    epc_building_part_id: int = Field(
        foreign_key="epc_building_part.id", nullable=False
    )
    floor: Optional[int] = Field(default=None)
    room_height_m: float
    total_floor_area_m2: float
    party_wall_length_m: float
    heat_loss_perimeter_m: float
    floor_insulation: Optional[int] = Field(default=None)
    floor_construction: Optional[int] = Field(default=None)

    @classmethod
    def from_domain(
        cls, dim: SapFloorDimension, epc_building_part_id: int
    ) -> EpcFloorDimensionModel:
        return cls(
            epc_building_part_id=epc_building_part_id,
            floor=dim.floor,
            room_height_m=dim.room_height_m,
            total_floor_area_m2=dim.total_floor_area_m2,
            party_wall_length_m=dim.party_wall_length_m,
            heat_loss_perimeter_m=dim.heat_loss_perimeter_m,
            floor_insulation=dim.floor_insulation,
            floor_construction=dim.floor_construction,
        )


class EpcWindowModel(SQLModel, table=True):
    __tablename__ = "epc_window"
    id: Optional[int] = Field(default=None, primary_key=True)
    epc_property_id: int = Field(foreign_key="epc_property.id", nullable=False)
    frame_material: Optional[str] = Field(default=None)
    glazing_gap: str
    orientation: str
    window_type: str
    glazing_type: str
    window_width: float
    window_height: float
    draught_proofed: bool
    window_location: str
    window_wall_type: str
    permanent_shutters_present: bool
    frame_factor: Optional[float] = Field(default=None)
    permanent_shutters_insulated: Optional[str] = Field(default=None)
    transmission_u_value: Optional[float] = Field(default=None)
    transmission_data_source: Optional[str] = Field(default=None)
    transmission_solar_transmittance: Optional[float] = Field(default=None)

    @classmethod
    def from_domain(cls, window: SapWindow, epc_property_id: int) -> EpcWindowModel:
        td = window.window_transmission_details
        return cls(
            epc_property_id=epc_property_id,
            frame_material=window.frame_material,
            glazing_gap=str(window.glazing_gap),
            orientation=str(window.orientation),
            window_type=str(window.window_type),
            glazing_type=str(window.glazing_type),
            window_width=window.window_width,
            window_height=window.window_height,
            draught_proofed=bool(window.draught_proofed),
            window_location=str(window.window_location),
            window_wall_type=str(window.window_wall_type),
            permanent_shutters_present=bool(window.permanent_shutters_present),
            frame_factor=window.frame_factor,
            permanent_shutters_insulated=window.permanent_shutters_insulated,
            transmission_u_value=td.u_value if td else None,
            transmission_data_source=td.data_source if td else None,
            transmission_solar_transmittance=td.solar_transmittance if td else None,
        )


class EpcEnergyElementModel(SQLModel, table=True):
    __tablename__ = "epc_energy_element"
    id: Optional[int] = Field(default=None, primary_key=True)
    epc_property_id: int = Field(foreign_key="epc_property.id", nullable=False)
    element_type: str  # roof | wall | floor | main_heating | window | lighting | hot_water | secondary_heating | main_heating_controls
    description: str
    energy_efficiency_rating: int
    environmental_efficiency_rating: int

    @classmethod
    def from_domain(
        cls, element: EnergyElement, element_type: str, epc_property_id: int
    ) -> EpcEnergyElementModel:
        return cls(
            epc_property_id=epc_property_id,
            element_type=element_type,
            description=element.description,
            energy_efficiency_rating=element.energy_efficiency_rating,
            environmental_efficiency_rating=element.environmental_efficiency_rating,
        )


__all__ = [
    "EpcBuildingPartModel",
    "EpcEnergyElementModel",
    "EpcFlatDetailsModel",
    "EpcFloorDimensionModel",
    "EpcMainHeatingDetailModel",
    "EpcPropertyEnergyPerformanceModel",
    "EpcPropertyModel",
    "EpcWindowModel",
]


@ -1,3 +0,0 @@
from backend.epc_client.epc_client_service import EpcClientService
__all__ = ["EpcClientService"]


@ -1,5 +1,8 @@
# Strict separation between Ingestion and Modelling
**Status: Accepted, refined by [ADR-0011](0011-composable-stage-orchestrators.md).** The one-way flow below stands. ADR-0011 generalises the chaining rule: it is no longer "only a `RefreshOrchestrator` may chain" — it is *"only a top-level use-case pipeline orchestrator (e.g. `FirstRunPipeline`) may chain across the Ingestion→Modelling seam; the stage orchestrators communicate through repos and never call across it."*
Data flows one way only: **Ingestion → Repos → Modelling**. Modelling services never make external HTTP calls; Ingestion services never run business logic. If Modelling needs fresh data, it sees a stale record in a repo and returns; the caller (a refresh orchestrator or the FE) decides whether to ingest first. We considered allowing modelling services to call fetchers directly on cache miss — convenient — and rejected it.
The trade-off is that modelling cannot "self-heal" by going to the gov EPC API when it finds stale data. The benefit is that modelling becomes a deterministic function of repository state: same Property in the repos, same modelling output. That is the property that makes modelling unit-testable against fakes (no DB, no network, no ML lambda), reproducible, and debuggable. It also enables a per-property UI flow where fetched data is shown to the user for review and possible override **before** modelling runs.
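
A minimal sketch of what "modelling is a deterministic function of repository state" can look like in a unit test. Everything here is hypothetical and for illustration only: `EpcRecord`, `FakeEpcRepo`, `ModellingResult`, `model_property`, and the 90-day staleness window are assumed names and values, not the real services.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Protocol


@dataclass(frozen=True)
class EpcRecord:
    """Hypothetical stored record; the real domain object is far richer."""
    property_id: int
    sap_score: int
    fetched_at: datetime


class EpcRepo(Protocol):
    def get(self, property_id: int) -> Optional[EpcRecord]: ...


class FakeEpcRepo:
    """In-memory stand-in for a repo: no DB, no network."""

    def __init__(self, records: List[EpcRecord]) -> None:
        self._by_id: Dict[int, EpcRecord] = {r.property_id: r for r in records}

    def get(self, property_id: int) -> Optional[EpcRecord]:
        return self._by_id.get(property_id)


@dataclass(frozen=True)
class ModellingResult:
    status: str  # "modelled" | "stale" | "missing"
    sap_score: Optional[int] = None


def model_property(
    repo: EpcRepo, property_id: int, now: datetime, max_age: timedelta
) -> ModellingResult:
    record = repo.get(property_id)
    if record is None:
        return ModellingResult(status="missing")
    if now - record.fetched_at > max_age:
        # Modelling never fetches: it reports staleness and returns,
        # leaving the decision to ingest with the caller.
        return ModellingResult(status="stale")
    return ModellingResult(status="modelled", sap_score=record.sap_score)


now = datetime(2026, 6, 1)
repo = FakeEpcRepo(
    [
        EpcRecord(1, 62, now - timedelta(days=3)),
        EpcRecord(2, 48, now - timedelta(days=400)),
    ]
)
fresh = model_property(repo, 1, now, timedelta(days=90))
stale = model_property(repo, 2, now, timedelta(days=90))
```

Passing `now` in as an argument rather than reading the clock inside the function is what keeps the result reproducible for a given repo state.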


@ -1,13 +1,41 @@
# `BaselinePerformance` stores both lodged and effective values
# `PropertyBaselinePerformance` stores both lodged and effective values
A Property's current performance has two states we care about: the rating that was lodged on the government register (the "lodged" SAP / band / carbon / heat) and the rating produced by the modelling pipeline against the current Effective EPC (the "effective" values, which may have been rebaselined by ML when the EPC was pre-SAP10 or when Landlord Overrides / Site Notes changed physical state). We considered storing a single set of values — the rebaselined-if-needed-otherwise-lodged figures — and rejected that. Both are stored as a pair on every `BaselinePerformance`, equal when no rebaselining trigger fires.
A Property's current performance has two states we care about: the rating that was lodged on the government register (the "lodged" SAP / band / carbon / heat) and the rating produced by the modelling pipeline against the current Effective EPC (the "effective" values, which may have been rebaselined by ML when the EPC was pre-SAP10 or when Landlord Overrides / Site Notes changed physical state). We considered storing a single set of values — the rebaselined-if-needed-otherwise-lodged figures — and rejected that. Both are stored as a pair on every `PropertyBaselinePerformance`, equal when no rebaselining trigger fires.
The pair lets the FE show "this is what the gov register says vs this is the SAP10-equivalent we modelled against" side by side without a second query, and keeps the audit trail clean: a user looking at a property's plan can see exactly which figure drove the recommendation pipeline. Storing only one set forces a downstream consumer to recompute the missing one from raw EPC fields when it needs both, which is the kind of derivation creep we want to keep out of the FE.
The cost is a wider row + the discipline that **every** `BaselinePerformance` populates both halves, even when they're equal. Annual kWh, fuel split and bills are not paired — they are always derived deterministically by `EpcEnergyDerivationService` against the Effective state, because the EPC's recorded cost fields use fuel rates pinned to the inspection date and the UCL correction depends on the modelled band.
The cost is a wider row + the discipline that **every** `PropertyBaselinePerformance` populates both halves, even when they're equal. Annual kWh, fuel split and bills are not paired — they are always derived deterministically by `EpcEnergyDerivationService` against the Effective state, because the EPC's recorded cost fields use fuel rates pinned to the inspection date and the UCL correction depends on the modelled band.
## Consequences
- Schema migration: `property_details_epc` (or its successor) carries 8 fields instead of 4 for the SAP-equivalent block.
- Reversing this means rewriting every consumer that has learned to read both values. Hard to roll back once the FE depends on the pair.
- The rebaseline trigger can fire for two reasons, `pre_sap10` and `physical_state_changed`, so the stored value is one of those or `both`. Store the reason alongside the pair so we know *why* a property was rebaselined when debugging.
### Amendment (2026-05-30, #1135): standalone `property_baseline_performance` table
The original consequence read *"`property_details_epc` (or its successor) carries 8 fields
instead of 4 for the SAP-equivalent block"* — i.e. the pair as columns on the EPC-details table.
That is superseded. `property_details_epc` is being **retired**: it is too tightly coupled to the
schema of the legacy EPC API, which the Ara rebuild is moving off. So the pair has no home there.
`PropertyBaselinePerformance` instead persists as its **own standalone `property_baseline_performance` table, one
row per Property**, behind a dedicated `PropertyBaselineRepository` port (`save` / `get_for_property`),
mirroring the EPC slice's repo shape. This is the cleaner model regardless of the retirement:
`PropertyBaselinePerformance` is its own aggregate (a Property's current performance), not a detail of any
single EPC.
The row is **flat typed columns**, not a JSONB blob, because the FE both surfaces the block and
queries the lodged-vs-effective pair: `lodged_{sap_score, epc_band, co2_emissions,
primary_energy_intensity}`, the four `effective_*` mirrors, `rebaseline_reason`, and (for the part
of the energy block that needs no derivation) `space_heating_kwh` / `water_heating_kwh`. The
fourth paired quantity is **Primary Energy Intensity**, not "heat demand" — see CONTEXT.md
(the prose above predates that term being sharpened).
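
As a rough sketch of the flat row described above, the pair can be modelled as two value objects flattened into `lodged_*` / `effective_*` columns. The column names and the `save` / `get_for_property` port come from this amendment; the dataclass layout, the method signatures, `to_row`, the in-memory repository, the use of `None` for "not rebaselined", and all sample figures are assumptions made for illustration.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional, Protocol


@dataclass(frozen=True)
class PerformanceValues:
    sap_score: int
    epc_band: str
    co2_emissions: float
    primary_energy_intensity: float


@dataclass(frozen=True)
class PropertyBaselinePerformance:
    property_id: int
    lodged: PerformanceValues
    effective: PerformanceValues
    # Assumption: None means no rebaselining trigger fired.
    rebaseline_reason: Optional[str]
    space_heating_kwh: float
    water_heating_kwh: float

    def to_row(self) -> Dict[str, object]:
        """Flatten into typed columns: lodged_* and effective_* mirrors."""
        row: Dict[str, object] = {
            "property_id": self.property_id,
            "rebaseline_reason": self.rebaseline_reason,
            "space_heating_kwh": self.space_heating_kwh,
            "water_heating_kwh": self.water_heating_kwh,
        }
        for prefix, values in (("lodged", self.lodged), ("effective", self.effective)):
            for name, value in asdict(values).items():
                row[f"{prefix}_{name}"] = value
        return row


class PropertyBaselineRepository(Protocol):
    def save(self, baseline: PropertyBaselinePerformance) -> None: ...
    def get_for_property(
        self, property_id: int
    ) -> Optional[PropertyBaselinePerformance]: ...


class InMemoryBaselineRepository:
    """Fake adapter: keyed by property_id, so there is one row per Property."""

    def __init__(self) -> None:
        self._rows: Dict[int, PropertyBaselinePerformance] = {}

    def save(self, baseline: PropertyBaselinePerformance) -> None:
        self._rows[baseline.property_id] = baseline

    def get_for_property(
        self, property_id: int
    ) -> Optional[PropertyBaselinePerformance]:
        return self._rows.get(property_id)


# Illustrative figures only. No trigger fired, so both halves are equal.
lodged = PerformanceValues(
    sap_score=58, epc_band="D", co2_emissions=3.4, primary_energy_intensity=210.0
)
baseline = PropertyBaselinePerformance(
    property_id=7,
    lodged=lodged,
    effective=lodged,
    rebaseline_reason=None,
    space_heating_kwh=9000.0,
    water_heating_kwh=2200.0,
)
repo = InMemoryBaselineRepository()
repo.save(baseline)
row = repo.get_for_property(7).to_row()
```

Both halves are always populated, so a consumer reading `row` never has to special-case the "not rebaselined" state to find the effective values.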
Fuel split and bills — the rest of the EPC Energy Derivation block — are **deferred to a
follow-up**: bills require a current Fuel Rates source (Ofgem-cap ETL) that does not yet exist, and
fuel split is produced by the same `EpcEnergyDerivationService`, so the two land together rather
than churning the table twice.
The SQLModel row is defined in `infrastructure/postgres/` so the ephemeral-Postgres tests build it
via `create_all`; the production migration is FE-owned (Drizzle ORM) and tracked in
`docs/migrations/`.


@ -0,0 +1,41 @@
# Composable stage orchestrators; one lambda per use case; stages communicate through repos
**Status: Accepted.** Refines [ADR-0003](0003-strict-ingestion-modelling-separation.md) (Ingestion→Repos→Modelling one-way flow) for the concrete shape of the rebuilt backend. Decided in a `/grill-with-docs` session (2026-05-30) before the first `ara_first_run` slice. Replaces the stale §4 / §9 / §11 architecture of `ara_backend_design.md`, which predates this thinking.
## Context
The pipeline must serve three use cases from the *same building blocks*:
- **First Run** (batch) — a property has only a row in the property table; run everything end-to-end.
- **Refresh** (batch) — re-check for new data and re-model if it changed.
- **Single-property interactive** (a new front end) — fetch, **pause** for the user to validate/override, re-score, **pause** again, then model on demand.
The single-property flow is the forcing function: it must be able to stop *between* establishing baseline data and producing recommendations. The legacy `model_engine` (one 1331-line function) cannot be re-entered partway, which is why it cannot serve this flow.
## Decision
**Three independently-invocable stage orchestrators**, in `orchestration/`:
| Stage | Reads | Writes | Role |
|---|---|---|---|
| `IngestionOrchestrator` | Fetchers (EPC, Solar) + reference Repos (Geospatial) | source Repos | acquire + persist external source data |
| `BaselineOrchestrator` | source Repos | `Property` + Baseline Performance | hydrate the aggregate; resolve Effective EPC; re-score on override |
| `ModellingOrchestrator` | baselined Repos + Scenario/Materials Repos | Plans / Recommendations Repos | scenarios → recommendations → optimise → plans |
**One lambda per use case** composes these via a thin pipeline object. `applications/ara_first_run/` is the first: a `handler.py` that only wires dependencies and delegates to a `FirstRunPipeline` (`Ingestion → Baseline → Modelling`). `refresh` and the single-property app are later siblings composing the *same three* stages differently.
**Stages communicate through the repos, not in-memory.** The pipeline threads only identifiers (`property_ids`) between stages; each stage reads what it needs from repos and writes its outputs back. Baseline is therefore byte-identical whether ingestion ran 50 ms ago (First Run) or last week (single-property review) — there is no second entry mode.
**Data-source taxonomy: "external" does not mean "Fetcher."** A **Fetcher** hits a *live, per-entity* API and returns raw data (infra client, no DB): the New EPC API, Google Solar. A **Repo** reads *stored data by key* — ours *or* a hosted reference dataset — and returns domain objects (no HTTP): Ordnance Survey Open-UPRN coordinates (`GeospatialRepo`), cost data (`MaterialsRepo`). When a fetch needs reference data (Solar needs lat/long), the **orchestrator** reads the repo and threads the value into the fetcher; fetchers never call each other.
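
The composition above can be sketched as follows. `FirstRunPipeline` and the three stage roles are named in this ADR; the `Stage` protocol, the fake stages, and the plain dicts standing in for repos are hypothetical and only illustrate that the pipeline threads identifiers while data travels through shared stores.

```python
from typing import Dict, Protocol, Sequence


class Stage(Protocol):
    def run(self, property_ids: Sequence[int]) -> None: ...


class FirstRunPipeline:
    """Thin composition: Ingestion -> Baseline -> Modelling."""

    def __init__(self, ingestion: Stage, baseline: Stage, modelling: Stage) -> None:
        self._stages = (ingestion, baseline, modelling)

    def run(self, property_ids: Sequence[int]) -> None:
        # Only identifiers cross stage boundaries; no stage returns data.
        for stage in self._stages:
            stage.run(property_ids)


class FakeIngestion:
    """Writes stand-in source data into the source repo."""

    def __init__(self, source_repo: Dict[int, Dict[str, int]]) -> None:
        self._source_repo = source_repo

    def run(self, property_ids: Sequence[int]) -> None:
        for pid in property_ids:
            self._source_repo[pid] = {"sap_score": 60 + pid}


class FakeBaseline:
    """Reads the source repo and writes the baseline repo."""

    def __init__(
        self, source_repo: Dict[int, Dict[str, int]], baseline_repo: Dict[int, int]
    ) -> None:
        self._source_repo = source_repo
        self._baseline_repo = baseline_repo

    def run(self, property_ids: Sequence[int]) -> None:
        for pid in property_ids:
            self._baseline_repo[pid] = self._source_repo[pid]["sap_score"]


class NoOpModelling:
    def run(self, property_ids: Sequence[int]) -> None:
        return None


source_repo: Dict[int, Dict[str, int]] = {}
baseline_repo: Dict[int, int] = {}
pipeline = FirstRunPipeline(
    FakeIngestion(source_repo),
    FakeBaseline(source_repo, baseline_repo),
    NoOpModelling(),
)
pipeline.run([1, 2])
```

Because `FakeBaseline` only ever reads `source_repo`, it behaves the same whether ingestion ran a moment ago in this process or the store was populated separately, which is the "no second entry mode" property.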
## Considered options
- **One lambda per stage, coordinated by AWS Step Functions** — rejected. Step Functions buys cross-lambda completion signalling we don't need when the three stages are cheap to keep warm in one process and a batch is bite-size (≤~100 properties). Promoting a stage to its own lambda later is cheap *because* it is already a separate class.
- **In-memory hand-off between stages in First Run** — rejected as the default. It gives `BaselineOrchestrator` two entry modes (fresh object vs repo read) and hides EPC persistence loss until a later Refresh reads the data back. Going through repos surfaces that loss inside First Run on day one. May be added later as an opt-in fast path where a profiler justifies it.
## Consequences
- A few redundant reads of rows just written, within one process — negligible at batch scale, and the price of each stage being a pure function of repo state.
- Each stage is unit-testable against fake repos with no upstream stage present.
- No HTTP library may appear in the `BaselineOrchestrator` / `ModellingOrchestrator` import graph (ADR-0003 holds per-stage).
- Because stages round-trip `EpcPropertyData` through persistence in First Run, a **persistence round-trip fidelity test** (fetch EPCs across schema versions → map → save → load → map back → assert deep-equality) is a prerequisite deliverable: it is what proves `epc_property` + child tables actually cover the domain object, and surfaces any required FE-owned migration early.
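
A toy version of the round-trip fidelity check, under heavy simplification: `MiniEpc`, `MiniWindow`, and `InMemoryEpcStore` are hypothetical stand-ins for `EpcPropertyData`, its child objects, and the `epc_property` tables. The point is only the shape of the assertion (save, reload, compare by deep equality), which also catches type drift such as `5` coming back as `"5"`.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class MiniWindow:
    orientation: Union[int, str]  # int from the API, str from Site Notes
    window_width: float


@dataclass
class MiniEpc:
    uprn: int
    windows: List[MiniWindow] = field(default_factory=list)


class InMemoryEpcStore:
    """Parent table plus one child table, held as plain Python structures."""

    def __init__(self) -> None:
        self.epc_property: Dict[int, Dict[str, object]] = {}
        self.epc_window: List[Dict[str, object]] = []
        self._next_id = 1

    def save(self, epc: MiniEpc) -> int:
        epc_property_id = self._next_id
        self._next_id += 1
        self.epc_property[epc_property_id] = {"uprn": epc.uprn}
        for window in epc.windows:
            self.epc_window.append(
                {
                    "epc_property_id": epc_property_id,
                    "orientation": window.orientation,
                    "window_width": window.window_width,
                }
            )
        return epc_property_id

    def load(self, epc_property_id: int) -> MiniEpc:
        row = self.epc_property[epc_property_id]
        windows = [
            MiniWindow(r["orientation"], r["window_width"])
            for r in self.epc_window
            if r["epc_property_id"] == epc_property_id
        ]
        return MiniEpc(uprn=row["uprn"], windows=windows)


def assert_round_trips(store: InMemoryEpcStore, epc: MiniEpc) -> None:
    reloaded = store.load(store.save(epc))
    # Dataclass equality is field-by-field, so 5 != "5" fails here.
    assert reloaded == epc, f"round-trip mismatch: {reloaded!r} != {epc!r}"


store = InMemoryEpcStore()
original = MiniEpc(uprn=100, windows=[MiniWindow(5, 1.2), MiniWindow("5", 0.9)])
assert_round_trips(store, original)
```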


@ -0,0 +1,31 @@
# Each stage commits its batch once, through a Unit of Work
**Status: Accepted.** Refines [ADR-0011](0011-composable-stage-orchestrators.md) (composable stage orchestrators, stages communicate through repos) with the persistence/transaction mechanics for batch processing. Decided in a `/grill-with-docs` session (2026-05-31) after the First Run spine (#1136) landed, prompted by reviewing the handler's session lifecycle.
## Context
A First Run trigger carries a **batch** of ~30 `property_ids`. The pipeline runs that batch through Ingestion → Baseline → Modelling. The first cut (#1136) wrapped **all three stages in one `Session` and one final `commit()`** in the handler. That has three problems:
1. **A connection is pinned for the whole long-running pipeline.** SQLAlchemy checks out a pooled connection on the first statement and holds it until commit. Ingestion is the only IO-heavy stage (per property: EPC HTTP, Google-Solar HTTP, geospatial S3), so the connection sits checked-out-but-idle across all that external IO — the RDS-Proxy/pgbouncer "transaction-pinned connection" anti-pattern.
2. **One giant transaction** for the batch: long-held locks, identity-map growth, all-or-nothing across stages.
3. **Cross-stage hand-off through an *uncommitted* transaction.** Baseline reads Ingestion's writes only because they share one open transaction — which contradicts ADR-0011/0003's "stages hand off through *persisted* state." If a stage ever moves to its own lambda, this breaks.
A tempting fix — commit per property — is **rejected**: per-property commits are a commit storm that has overloaded the database before. The unit of commit must be the **batch**, not the property.
## Decision
- **Transaction boundary = one stage = one Unit of Work = one commit.** A batch yields ~3 commits (Ingestion, Baseline, Modelling), never N. No per-property commits.
- **All-or-nothing per batch, fail noisily.** Any property failing aborts that stage's unit (rollback); the exception propagates so `@subtask_handler` marks the subtask FAILED on the task table. Operators debug and re-run the batch. There is no per-property partial success.
- **Re-runs are idempotent.** Because stages commit independently, a re-run after a mid-pipeline failure re-executes already-committed earlier stages. So each stage's batch write **replaces** the rows for the batch's `property_ids` (delete-for-these-ids then bulk insert, or upsert) inside its unit. This is also what the future re-score-on-override path needs (re-baselining overwrites, never duplicates).
- **Bulk reads, load-whole (ADR-0002).** Repos expose `get_many(property_ids) -> Properties` returning fully-hydrated aggregates, implemented as one IN-filtered query per table composed in memory — a handful of round-trips per batch, not 30 × tables. No lean stage-specific read path.
- **Ingestion splits fetch from write.** Phase 1 fetches the whole batch (EPC / coordinates / solar) over HTTP/S3 with **no DB unit open**; phase 2 opens a unit and writes the batch, committing once. The connection is therefore held only for the short batch write, never across external IO. This sharpens the Fetcher-vs-Repo taxonomy of ADR-0011: Fetchers do IO outside any unit; Repos do DB inside the committed unit.
- **Mechanism: a `UnitOfWork`.** A `UnitOfWork` port + a `PostgresUnitOfWork` adapter (built on a module-scoped engine + sessionmaker) owns the session and constructs the DB-backed repos on it (`uow.property`, `uow.epc`, `uow.solar`, `uow.baseline`). It commits on explicit `commit()` and rolls back on any exception. Orchestrators take a `unit_of_work` factory plus their **non-DB** dependencies, injected separately: the EPC/Solar fetchers, the geospatial **S3** repo (reference data — read outside the transaction), and the Rebaseliner. Baseline uses one unit for the batch; Ingestion uses two (read uprns → fetch outside any unit → write batch).
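
The commit and idempotency rules above can be sketched with an in-memory unit of work. This is not the `PostgresUnitOfWork` adapter: `InMemoryUnitOfWork`, `write_baseline_batch`, and the `rows` attribute are hypothetical, and a Python list stands in for the table. It shows one commit per batch, rollback on any exception, and a replace-for-these-ids write that makes re-runs safe.

```python
from typing import Callable, Dict, List, Sequence


class InMemoryUnitOfWork:
    """Works on a copy of the table; only commit() publishes the changes."""

    def __init__(self, table: List[Dict[str, int]]) -> None:
        self._table = table  # committed rows
        self.rows: List[Dict[str, int]] = []  # working copy inside the unit

    def __enter__(self) -> "InMemoryUnitOfWork":
        self.rows = [dict(row) for row in self._table]
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Leaving without commit(), including on an exception, discards
        # the working copy. Returning False lets the exception propagate.
        self.rows = []
        return False

    def commit(self) -> None:
        self._table[:] = self.rows


def write_baseline_batch(
    unit_of_work: Callable[[], InMemoryUnitOfWork],
    property_ids: Sequence[int],
    score: Callable[[int], int],
) -> None:
    ids = set(property_ids)
    with unit_of_work() as uow:
        # Delete-for-these-ids, then bulk insert: a re-run replaces rows.
        uow.rows = [row for row in uow.rows if row["property_id"] not in ids]
        uow.rows.extend(
            {"property_id": pid, "sap_score": score(pid)} for pid in property_ids
        )
        uow.commit()  # one commit for the whole batch, never per property


table: List[Dict[str, int]] = []


def factory() -> InMemoryUnitOfWork:
    return InMemoryUnitOfWork(table)


write_baseline_batch(factory, [1, 2, 3], lambda pid: 60 + pid)
write_baseline_batch(factory, [1, 2, 3], lambda pid: 60 + pid)  # re-run
```

If `score` raises for any property, `commit()` is never reached, the working copy is dropped, and the caller sees the exception, which matches the all-or-nothing, fail-noisily rule.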
## Consequences
- The orchestrators' dependency shape changes from "individual session-bound repos" to "a `unit_of_work` factory + non-DB deps". The #1134 Ingestion and #1135 Baseline orchestrators are refactored accordingly; `FirstRunPipeline` is unchanged (it still composes the three stages and threads only `property_ids`).
- Hard to reverse once every stage depends on the UoW — hence this ADR.
- Atomicity is **stage-level**, not per-property; correctness of the re-run workflow depends on the idempotent batch writes above.
- The engine + sessionmaker move to module scope so the pool is reused across warm Lambda invocations, rather than rebuilt per invocation (the existing `default_orchestrator` has the same per-invocation smell and should follow).
- EPC writes span child tables, so the idempotent "replace for these `property_ids`" must delete child rows too (cascade) before re-insert.
- The Modelling stub is left untouched this slice — its `run` is a no-op that touches no DB, so giving it a `unit_of_work` now would be an unused dependency. It takes a unit when its scoring body is built (the per-service Modelling grills).


@ -0,0 +1,170 @@
# EPC persistence schema gaps — migrations for round-trip fidelity
**Context:** Slice 1 (Hestia-Homes/Model#1129) of the `ara_first_run` rebuild. The round-trip
fidelity test (`EpcPropertyData → epc_property tables → reload → EpcPropertyData`, deep-equality)
surfaced that the current `epc_property` schema stores only a **partial, partly type-lossy
projection** of the `EpcPropertyData` domain object. This document lists every gap and the
migration needed to close it, so the schema (FE-owned for some tables) can be updated.
We can make the column/table changes on the **SQLModel definitions** in
`infrastructure/postgres/epc_property_table.py` directly — tests build their schema from those
models via `SQLModel.metadata.create_all`, so they don't need the live DB. The live migrations
listed here are what must be applied wherever the physical tables are owned.
**`epc_cache` relationship:** the raw gov-API JSON response is retained in the `epc_cache` table,
so the *source* is always recoverable even where the structured `epc_property` projection is
lossy. That makes these gaps "the structured store is incomplete" rather than "data is lost
forever" — but the modelling pipeline reads the structured `epc_property`, not the raw cache, so
the gaps below still block faithful modelling and must be closed.
Priority key: **P0** modelling needs it now · **P1** needed soon · **P2** completeness.
---
## Status after Slice 1 (#1129)
The round-trip test passes over the persisted projection for RdSAP-Schema-21.0.0 and 21.0.1.
The following were **applied on the SQLModel** (`infrastructure/postgres/epc_property_table.py`)
and **still require the matching DB migration** wherever the physical tables live:
- **§1 JSONB** — all `Union` code columns converted (`epc_property`: `heating_cylinder_size`,
`heating_immersion_heating_type`, `heating_cylinder_insulation_type`,
`heating_secondary_heating_type`, `heating_shower_outlet_type`, `energy_pv_connection`;
`epc_main_heating_detail`: `main_fuel_type`, `heat_emitter_type`, `emitter_temperature`,
`main_heating_control`; `epc_building_part`: `wall_construction`, `wall_insulation_type`,
`party_wall_construction`, `flat_roof_insulation_thickness`, `roof_insulation_location`,
`roof_insulation_thickness`; `epc_window`: `glazing_gap`, `orientation`, `window_type`,
`glazing_type`, `window_location`, `window_wall_type`, `draught_proofed`,
`permanent_shutters_present`, `transmission_data_source`).
- **New scalar columns.** `epc_property`: `heating_number_baths`, `heating_number_baths_wwhrs`,
`heating_electric_shower_count`, `heating_mixer_shower_count`,
`mechanical_vent_duct_insulation_level`, `addendum_stone_walls`, `addendum_system_build`,
`addendum_numbers` (JSONB), `ventilation_present`, `ventilation_sheltered_sides`,
`ventilation_has_suspended_timber_floor`, `ventilation_suspended_timber_floor_sealed`,
`ventilation_has_draught_lobby`, `ventilation_air_permeability_ap4_m3_h_m2`,
`ventilation_mechanical_ventilation_kind`; `epc_building_part`: `roof_construction_type`,
`curtain_wall_age`.
- **§2.1 `epc_renewable_heat_incentive` table** (#1137) — now created on the SQLModel and wired
into save/get; the round-trip test asserts **full deep-equality** (no exclusion). DB migration
still required.
**Still open (follow-up issues):** the remaining §2 structural tables (room-in-roof detail, PV
arrays, roof windows) + §3 nested-wall fields (`SapAlternativeWall.u_value`/`wall_thickness_mm`) +
`SapFloorDimension` exposed-floor flags — none populated in the 21.0.0/21.0.1 fixtures, so latent
until a richer fixture exercises them.
---
## 1. Type fidelity — convert `Union[int, str]` code columns to JSONB
These columns hold SAP/RdSAP categorical codes that are **`int` from the gov API** and **`str`
from Site Notes** (`Union[int, str]` in the domain). The forward mapper currently coerces them
with `str(...)` (and `bool(...)` for two window flags), so an API `int` of `26` is stored as
`"26"` and cannot be recovered. Convert each to **JSONB** and drop the `str()`/`bool()` coercion
in the forward mapper so the Python type round-trips exactly (JSON scalars preserve `int` vs
`str` vs `bool` vs `null`). **P0** — these feed the SAP10 calculator's int-keyed dispatch.
| Table | Columns |
|---|---|
| `epc_property` | `heating_cylinder_size`, `heating_immersion_heating_type`, `heating_cylinder_insulation_type`, `heating_secondary_heating_type`, `heating_shower_outlet_type`, `energy_pv_connection` |
| `epc_main_heating_detail` | `main_fuel_type`, `heat_emitter_type`, `emitter_temperature`, `main_heating_control` |
| `epc_building_part` | `wall_construction`, `wall_insulation_type`, `party_wall_construction`, `flat_roof_insulation_thickness`, `roof_insulation_location`, `roof_insulation_thickness` |
| `epc_window` | `glazing_gap`, `orientation`, `window_type`, `glazing_type`, `window_location`, `window_wall_type`, `draught_proofed`, `permanent_shutters_present` |
(`energy_meter_type` and `energy_wind_turbines_terrain_type` are `str` in the domain — leave as
`TEXT`.)
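
A small illustration of why `str(...)` coercion is lossy while JSON scalars are not. The standard-library `json` module stands in for a JSONB column here, which is an assumption made for the sake of a runnable example; the helper names are hypothetical.

```python
import json
from typing import Union

Code = Union[int, str, bool, None]


def store_as_text(code: Code) -> str:
    """What the current forward mapper effectively does."""
    return str(code)


def store_as_json(code: Code) -> str:
    """Serialise as a JSON scalar, as a JSONB column would hold it."""
    return json.dumps(code)


def load_from_json(raw: str) -> Code:
    return json.loads(raw)


# TEXT storage collapses the API int and the Site Notes str into one value.
text_collision = store_as_text(26) == store_as_text("26")

# JSON keeps them distinct and restores the original Python type.
round_tripped = [load_from_json(store_as_json(c)) for c in (26, "26", True, None)]
```

Here `text_collision` is `True`, which is exactly the ambiguity that breaks an int-keyed dispatch, while every element of `round_tripped` has the same type it started with.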
---
## 2. Not stored at all — new tables
### 2.1 `epc_renewable_heat_incentive` — **P0**
Maps `EpcPropertyData.renewable_heat_incentive` (`RenewableHeatIncentive`). Carries the **baseline
space-heating and hot-water kWh** that EPC Energy Derivation consumes — the single most important
gap. One row per `epc_property`.
| Column | Type | Source |
|---|---|---|
| `epc_property_id` | FK → `epc_property.id`, unique | |
| `space_heating_kwh` | float | `space_heating_kwh` |
| `water_heating_kwh` | float | `water_heating_kwh` |
| `impact_of_loft_insulation_kwh` | float, null | `impact_of_loft_insulation_kwh` |
| `impact_of_cavity_insulation_kwh` | float, null | `impact_of_cavity_insulation_kwh` |
| `impact_of_solid_wall_insulation_kwh` | float, null | `impact_of_solid_wall_insulation_kwh` |
### 2.2 `epc_room_in_roof` (+ `epc_room_in_roof_surface`) — **P1**
`SapBuildingPart.sap_room_in_roof` (`SapRoomInRoof`) is currently flattened to just
`room_in_roof_floor_area` + `room_in_roof_construction_age_band` on `epc_building_part`, dropping
the Type-2 geometry and the Detailed-measurement surfaces. Replace with a child table of
`epc_building_part`:
`epc_room_in_roof`: `epc_building_part_id` (FK, unique), `floor_area`, `construction_age_band`,
`common_wall_length_m`, `common_wall_height_m`, `gable_1_length_m`, `gable_1_height_m`,
`gable_2_length_m`, `gable_2_height_m`.
`epc_room_in_roof_surface` (0..n per RIR, from `detailed_surfaces: List[SapRoomInRoofSurface]`):
`epc_room_in_roof_id` (FK), `kind`, `area_m2`, `insulation_thickness_mm` (null),
`insulation_type` (null), `u_value` (null).
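
The parent/child split can be sketched as a pure function that turns one room-in-roof into a parent row plus zero or more surface rows. This is a hypothetical illustration only: `Surface` and `RoomInRoof` are simplified stand-ins for `SapRoomInRoofSurface` and `SapRoomInRoof`, their attribute names are assumed to match the column names above, the `insulation_type` type and the `kind` values are guesses, and the gable and common-wall fields are omitted for brevity.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Surface:
    """Simplified stand-in for SapRoomInRoofSurface (attribute names assumed)."""
    kind: str
    area_m2: float
    insulation_thickness_mm: Optional[int] = None
    insulation_type: Optional[str] = None  # real type unknown; str is a guess
    u_value: Optional[float] = None


@dataclass
class RoomInRoof:
    """Simplified stand-in for SapRoomInRoof (geometry fields omitted)."""
    floor_area: float
    construction_age_band: Optional[str] = None
    detailed_surfaces: List[Surface] = field(default_factory=list)


def to_rows(
    building_part_id: int, room_in_roof_id: int, rir: RoomInRoof
) -> Tuple[dict, List[dict]]:
    """Flatten one room-in-roof into a parent row and 0..n surface rows."""
    parent = {
        "id": room_in_roof_id,
        "epc_building_part_id": building_part_id,
        "floor_area": rir.floor_area,
        "construction_age_band": rir.construction_age_band,
    }
    children = [
        {"epc_room_in_roof_id": room_in_roof_id, **asdict(s)}
        for s in rir.detailed_surfaces
    ]
    return parent, children


# Illustrative values only.
parent, children = to_rows(
    building_part_id=7,
    room_in_roof_id=1,
    rir=RoomInRoof(
        floor_area=18.5,
        detailed_surfaces=[
            Surface("gable_wall", 6.0),
            Surface("slope", 12.0, insulation_thickness_mm=100),
        ],
    ),
)
print(parent)
print(children)
```
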
### 2.3 `epc_photovoltaic_array` — **P1**
`SapEnergySource.photovoltaic_arrays: List[PhotovoltaicArray]` (measured PV) is not stored at all
— only the `percent_roof_area` fallback is. One row per array: `epc_property_id` (FK),
`peak_power`, `pitch`, `orientation`, `overshading`.
### 2.4 `epc_roof_window` — **P2**
`EpcPropertyData.sap_roof_windows: List[SapRoofWindow]` is not stored. One row per roof window:
`epc_property_id` (FK), `area_m2`, `u_value_raw`, `orientation`, `pitch_deg`, `g_perpendicular`,
`frame_factor`.
---
## 3. Not stored at all — new columns
### 3.1 `epc_property` additions
| Column | Type | Source | Pri |
|---|---|---|---|
| `addendum_stone_walls` | bool, null | `addendum.stone_walls` | P2 |
| `addendum_system_build` | bool, null | `addendum.system_build` | P2 |
| `addendum_numbers` | JSONB, null | `addendum.addendum_numbers` (`List[int]`) | P2 |
| `lzc_energy_sources` | JSONB, null | `lzc_energy_sources` (`List[int]`) | P2 |
| `solar_hw_collector_orientation` | text, null | `solar_hw_collector_orientation` | P1 |
| `solar_hw_collector_pitch_deg` | int, null | `solar_hw_collector_pitch_deg` | P1 |
| `solar_hw_overshading` | text, null | `solar_hw_overshading` | P1 |
| `extract_fans_count` | int, null | top-level `extract_fans_count` (distinct from the `ventilation_*` one) | P2 |
| `mechanical_vent_duct_insulation_level` | int, null | `mechanical_vent_duct_insulation_level` | P2 |
### 3.2 `epc_building_part` additions
| Column | Type | Source | Pri |
|---|---|---|---|
| `roof_construction_type` | text, null | `roof_construction_type` (Site-Notes str) | P1 |
| `curtain_wall_age` | text, null | `curtain_wall_age` (RdSAP §5.18) | P1 |
| `alt_wall_1_u_value` | float, null | `sap_alternative_wall_1.u_value` | P1 |
| `alt_wall_1_thickness_mm` | int, null | `sap_alternative_wall_1.wall_thickness_mm` | P1 |
| `alt_wall_2_u_value` | float, null | `sap_alternative_wall_2.u_value` | P1 |
| `alt_wall_2_thickness_mm` | int, null | `sap_alternative_wall_2.wall_thickness_mm` | P1 |
### 3.3 `epc_floor_dimension` additions
| Column | Type | Source | Pri |
|---|---|---|---|
| `is_exposed_floor` | bool, default false | `SapFloorDimension.is_exposed_floor` | P1 |
| `is_above_partially_heated_space` | bool, default false | `SapFloorDimension.is_above_partially_heated_space` | P1 |
---
## 4. Mapper-only gaps (no schema change required)
The table can already hold these; the **save mapper** simply doesn't write them. Fix in the
forward mapper, not the DB:
- **`air_tightness`** (`EnergyElement`) — `epc_energy_element.element_type` is a free string, so add
an `"air_tightness"` element type to the save loop. **P1.**
---
## 5. Scope note
Slice 1 (#1129) asserts faithful round-trip over the **projection the schema is meant to store**,
after applying §1 (JSONB) and the straightforward §3/§4 additions on the SQLModel. The structural
new tables in §2 (RHI, room-in-roof, PV arrays, roof windows) are tracked as their own follow-up
issues — `epc_renewable_heat_incentive` (§2.1) first, as it unblocks EPC Energy Derivation. Each
gap above should become a checkbox on the relevant issue so nothing is silently dropped.


@ -0,0 +1,43 @@
# `property_baseline_performance` table — FE-owned migration
**Context:** Slice 6 (Hestia-Homes/Model#1135) of the `ara_first_run` rebuild. The
`PropertyBaselineOrchestrator` establishes a Property's **Baseline Performance** (ADR-0004) and persists it
via a new `PropertyBaselineRepository` port. This is a brand-new table — no predecessor.
Per ADR-0004's amendment, the lodged/effective pair does **not** land on `property_details_epc`
(which is being retired as too coupled to the legacy EPC-API schema). It lands here, as its own
aggregate's table.
The SQLModel row is defined in `infrastructure/postgres/` so the ephemeral-Postgres tests build it
via `SQLModel.metadata.create_all`. The **production migration is FE-owned (Drizzle ORM)** — a
straight lift-and-shift of the columns below.
## `property_baseline_performance` — one row per Property
| Column | Type | Notes |
|---|---|---|
| `id` | serial PK | |
| `property_id` | int, FK → `property.id`, **unique** | one Baseline Performance per Property |
| `lodged_sap_score` | int | Lodged Performance — gov register, off the Effective EPC |
| `lodged_epc_band` | text | the `Epc` enum, stored as its string value (e.g. `"C"`) |
| `lodged_co2_emissions_t_per_yr` | float | tonnes CO₂/yr (whole dwelling) |
| `lodged_primary_energy_intensity_kwh_per_m2_yr` | int | PEUI (kWh/m²/yr); **not** "heat demand" — see CONTEXT.md |
| `effective_sap_score` | int | Effective Performance — what modelling scored against |
| `effective_epc_band` | text | |
| `effective_co2_emissions_t_per_yr` | float | tonnes CO₂/yr (whole dwelling) |
| `effective_primary_energy_intensity_kwh_per_m2_yr` | int | kWh/m²/yr |
| `rebaseline_reason` | text | `none` \| `pre_sap10` \| `physical_state_changed` \| `both` |
| `space_heating_kwh` | float | off `renewable_heat_incentive`; deterministic (ADR-0006) |
| `water_heating_kwh` | float | off `renewable_heat_incentive` |

This slice has no ML rebaselining, so `effective_* == lodged_*` and `rebaseline_reason = 'none'`
for every row written (a pre-SAP10 cert raises rather than persisting a wrong-but-plausible row;
see #1135). The `effective_*` columns exist now so the table shape is stable when ML lands.
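
How the lodged/effective pair flattens into the columns above can be sketched as follows. This is an illustration under stated assumptions, not the repository adapter: `Performance` here is a local stand-in with `epc_band` as a plain string, `performance_columns` and `to_row` are hypothetical helpers, and the sample figures are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Performance:
    """Local stand-in for the domain Performance (epc_band simplified to str)."""
    sap_score: int
    epc_band: str
    co2_emissions: float
    primary_energy_intensity: int


def performance_columns(prefix: str, perf: Performance) -> dict:
    """Map one half of the baseline onto its `lodged_*` or `effective_*` columns."""
    return {
        f"{prefix}_sap_score": perf.sap_score,
        f"{prefix}_epc_band": perf.epc_band,
        f"{prefix}_co2_emissions_t_per_yr": perf.co2_emissions,
        f"{prefix}_primary_energy_intensity_kwh_per_m2_yr": perf.primary_energy_intensity,
    }


def to_row(
    property_id: int,
    lodged: Performance,
    effective: Performance,
    rebaseline_reason: str,
    space_heating_kwh: float,
    water_heating_kwh: float,
) -> dict:
    """Build the column dict for one `property_baseline_performance` row."""
    return {
        "property_id": property_id,
        **performance_columns("lodged", lodged),
        **performance_columns("effective", effective),
        "rebaseline_reason": rebaseline_reason,
        "space_heating_kwh": space_heating_kwh,
        "water_heating_kwh": water_heating_kwh,
    }


# In this slice both halves are the same object and the reason is "none".
lodged = Performance(sap_score=68, epc_band="D", co2_emissions=3.2, primary_energy_intensity=245)
row = to_row(42, lodged, lodged, "none", 8500.0, 2100.0)
print(row)
```
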
## Deferred (follow-up — EPC Energy Derivation + Fuel Rates)
`fuel_split` and `bills` are **not** in this table yet. They are produced by
`EpcEnergyDerivationService`, which needs a current **Fuel Rates** source (Ofgem-cap ETL) that does
not exist yet. They land together in the follow-up so this table is not migrated twice. Likely
shape: a `bills`-style block (per-fuel kWh + standing charge + SEG) — to be specified in that
slice's migration note.


@ -0,0 +1,15 @@
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Coordinates:
    """A WGS84 point for a Property: longitude/latitude in decimal degrees.

    Resolved from the Ordnance Survey Open-UPRN reference data and fed to the
    Google Solar fetcher by the Ingestion orchestrator.
    """

    longitude: float
    latitude: float


@ -0,0 +1,25 @@
from __future__ import annotations

from collections.abc import Callable, Iterator
from dataclasses import dataclass

from domain.property.property import Property


@dataclass
class Properties:
    """A first-class collection of Property objects: the unit of bulk operation
    in services (CONTEXT.md: Properties). Services take and return `Properties`
    rather than bare lists so batch operations read clearly.
    """

    items: list[Property]

    def __iter__(self) -> Iterator[Property]:
        return iter(self.items)

    def __len__(self) -> int:
        return len(self.items)

    def filter(self, predicate: Callable[[Property], bool]) -> "Properties":
        return Properties([p for p in self.items if predicate(p)])


@ -0,0 +1,73 @@
from __future__ import annotations

from dataclasses import dataclass
from typing import Literal, Optional

from datatypes.epc.domain.epc_property_data import EpcPropertyData
from domain.property.site_notes import SiteNotes

SourcePath = Literal["site_notes", "epc_with_overlay"]


@dataclass(frozen=True)
class PropertyIdentity:
    """Identifies a single Property within a portfolio.

    Keyed by `(portfolio_id, uprn)` or `(portfolio_id, landlord_property_id)`:
    a UPRN is permanent, but each portfolio gets its own Property against it
    (CONTEXT.md: UPRN).
    """

    portfolio_id: int
    postcode: str
    address: str
    uprn: Optional[int] = None
    landlord_property_id: Optional[str] = None


@dataclass
class Property:
    """The Ara modelling aggregate root for a single dwelling (ADR-0002).

    Holds identity plus the source data the pipeline reasons about. Enrichments
    (geospatial, solar) and modelling outputs (baseline performance, plans) are
    added by later slices; this is the minimal-and-growing shape for First Run.
    """

    identity: PropertyIdentity
    epc: Optional[EpcPropertyData] = None
    site_notes: Optional[SiteNotes] = None

    @property
    def source_path(self) -> SourcePath:
        """Which of the two disjoint source paths models this Property (ADR-0001).

        Site Notes alone, or the public EPC (with Landlord Overrides, once that
        slice lands). When both exist the newer wins (Recency Tie-Break); on an
        equal date the survey wins, as it reflects on-site observation.
        """
        if self.site_notes is not None and self.epc is not None:
            epc_date = self.epc.registration_date or self.epc.inspection_date
            if self.site_notes.surveyed_at >= epc_date:
                return "site_notes"
            return "epc_with_overlay"
        if self.site_notes is not None:
            return "site_notes"
        if self.epc is not None:
            return "epc_with_overlay"
        raise ValueError(
            "Property has neither Site Notes nor an EPC; no source path to model from"
        )

    @property
    def effective_epc(self) -> EpcPropertyData:
        """The EpcPropertyData the modelling pipeline scores against.

        Path 1: the Site Notes' surveyed data. Path 2: the public EPC (the
        Landlord Overrides overlay is a later slice; returned as-is for now).
        """
        if self.source_path == "site_notes":
            assert self.site_notes is not None
            return self.site_notes.to_epc_property_data()
        assert self.epc is not None
        return self.epc


@ -0,0 +1,23 @@
from __future__ import annotations

from dataclasses import dataclass
from datetime import date

from datatypes.epc.domain.epc_property_data import EpcPropertyData


@dataclass
class SiteNotes:
    """A Domna survey of a single Property (CONTEXT.md: Site Notes).

    Committed by the domain to being full-coverage: it carries every EPC field
    the modelling pipeline needs, expressed as an `EpcPropertyData`. When present
    (and not older than the public EPC) it is the complete source of truth for
    the Property; the public EPC is then irrelevant (ADR-0001).
    """

    surveyed_at: date
    epc: EpcPropertyData

    def to_epc_property_data(self) -> EpcPropertyData:
        return self.epc


@ -0,0 +1,53 @@
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, TypeVar

from datatypes.epc.domain.epc import Epc
from datatypes.epc.domain.epc_property_data import EpcPropertyData

_T = TypeVar("_T")


@dataclass(frozen=True)
class Performance:
    """One half of a Baseline Performance: a single set of SAP10 figures.

    The four quantities a Property is rated on (CONTEXT.md: Lodged / Effective
    Performance): SAP score, EPC Band, carbon emissions, and Primary Energy
    Intensity. Used for both the Lodged half (off the gov register) and the
    Effective half (what the modelling pipeline scored against).
    """

    sap_score: int
    epc_band: Epc
    co2_emissions: float
    primary_energy_intensity: int


def _require(value: Optional[_T], field: str) -> _T:
    if value is None:
        raise ValueError(
            f"EPC is missing recorded performance field {field!r}; "
            "cannot establish Lodged Performance"
        )
    return value


def lodged_performance(epc: EpcPropertyData) -> Performance:
    """The Lodged Performance recorded on an EPC: what the gov register says.

    Reads the four rated quantities straight off the EPC's recorded fields
    (CONTEXT.md: Primary Energy Intensity is recorded as `energy_consumption_current`).
    Unmodified by modelling.
    """
    return Performance(
        sap_score=_require(epc.energy_rating_current, "energy_rating_current"),
        epc_band=_require(
            epc.current_energy_efficiency_band, "current_energy_efficiency_band"
        ),
        co2_emissions=_require(epc.co2_emissions_current, "co2_emissions_current"),
        primary_energy_intensity=_require(
            epc.energy_consumption_current, "energy_consumption_current"
        ),
    )


@ -0,0 +1,28 @@
from __future__ import annotations

from dataclasses import dataclass

from domain.property_baseline.performance import Performance
from domain.property_baseline.rebaseliner import RebaselineReason


@dataclass(frozen=True)
class PropertyBaselinePerformance:
    """A Property's current performance aggregate (CONTEXT.md, ADR-0004).

    Holds both halves, ``lodged`` (what the gov register says) and
    ``effective`` (what the modelling pipeline scored against), plus the
    ``rebaseline_reason`` recording *why* they differ (``"none"`` when equal).
    Both halves are always populated, even when equal.

    Carries the part of the energy block that needs no derivation: annual
    ``space_heating_kwh`` / ``water_heating_kwh`` read off the EPC's RHI.
    Fuel split and bills (the rest of EPC Energy Derivation) land in a
    follow-up once a Fuel Rates source exists.
    """

    lodged: Performance
    effective: Performance
    rebaseline_reason: RebaselineReason
    space_heating_kwh: float
    water_heating_kwh: float


@ -0,0 +1,60 @@
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Literal

from datatypes.epc.domain.epc_property_data import EpcPropertyData
from domain.property_baseline.performance import Performance

RebaselineReason = Literal["none", "pre_sap10", "physical_state_changed", "both"]

# The SAP spec version below which a cert's recorded scores reflect a superseded
# methodology and must be ML-rebaselined (CONTEXT.md: Rebaselining).
_SAP10_FLOOR = 10.0


class RebaselineNotImplemented(Exception):
    """A Property needs Rebaselining, but the ML adapter is not wired yet.

    Raised rather than silently recording ``reason="none"`` for a property that
    genuinely needs rebaselining: a plausible-but-wrong baseline is expensive to
    discover downstream. Surfaces how much of a First Run cohort the pipeline can
    handle today (#1135).
    """


class Rebaseliner(ABC):
    """Produces a Property's Effective Performance from its Effective EPC.

    Rebaselining (CONTEXT.md) re-predicts the rated quantities via ML when the
    EPC was lodged pre-SAP10 or its physical state diverged from the lodged EPC;
    otherwise Effective Performance equals Lodged. Injected into the
    PropertyBaselineOrchestrator (ADR-0011) so the ML adapter can swap in without
    touching the orchestrator, and so the single-property re-score-on-override
    flow reuses the same port.
    """

    @abstractmethod
    def rebaseline(
        self, effective_epc: EpcPropertyData, lodged: Performance
    ) -> tuple[Performance, RebaselineReason]: ...


class StubRebaseliner(Rebaseliner):
    """The no-ML stub for the validation phase.

    SAP10 certs pass through untouched: Effective Performance equals Lodged,
    reason ``"none"``. A pre-SAP10 cert genuinely needs ML rebaselining, which is
    not implemented yet (#1135), so it raises rather than fabricating a "none".
    """

    def rebaseline(
        self, effective_epc: EpcPropertyData, lodged: Performance
    ) -> tuple[Performance, RebaselineReason]:
        sap_version = effective_epc.sap_version
        if sap_version is not None and sap_version < _SAP10_FLOOR:
            raise RebaselineNotImplemented(
                f"Property needs rebaselining (pre-SAP10 cert, sap_version="
                f"{sap_version}); ML rebaselining is not implemented yet"
            )
        return lodged, "none"


@ -0,0 +1,3 @@
from infrastructure.epc_client.epc_client_service import EpcClientService
__all__ = ["EpcClientService"]


@ -1,7 +1,7 @@
import time
from typing import Callable, TypeVar
from backend.epc_client.exceptions import EpcRateLimitError
from infrastructure.epc_client.exceptions import EpcRateLimitError
T = TypeVar("T")


@ -5,12 +5,12 @@ from typing import Any, Optional
import httpx
from backend.epc_client.exceptions import (
from infrastructure.epc_client.exceptions import (
EpcApiError,
EpcNotFoundError,
EpcRateLimitError,
)
from backend.epc_client._retry import call_with_retry
from infrastructure.epc_client._retry import call_with_retry
from datatypes.epc.domain.epc_property_data import EpcPropertyData
from datatypes.epc.domain.mapper import EpcPropertyDataMapper
from datatypes.epc.search import EpcSearchResult


@ -2,7 +2,7 @@ import json
import pathlib
import pytest
from backend.epc_client.epc_client_service import EpcClientService
from infrastructure.epc_client.epc_client_service import EpcClientService
SAMPLES_DIR = pathlib.Path("backend/epc_api/json_samples")


@ -1,11 +1,11 @@
from unittest.mock import MagicMock, patch, call
import pytest
from backend.epc_client.epc_client_service import EpcClientService
from infrastructure.epc_client.epc_client_service import EpcClientService
from datatypes.epc.search import EpcSearchResult
from backend.epc_client.exceptions import EpcNotFoundError, EpcRateLimitError
from infrastructure.epc_client.exceptions import EpcNotFoundError, EpcRateLimitError
from datatypes.epc.domain.epc_property_data import EpcPropertyData
from backend.epc_client.tests.conftest import make_search_row
from infrastructure.epc_client.tests.conftest import make_search_row
def _mock_response(status_code=200, json_data=None, headers=None):
@ -78,7 +78,7 @@ def test_429_retry_after_header_drives_sleep_duration(
_mock_response(200, cert_response),
]
with patch("httpx.get", side_effect=responses), patch(
"backend.epc_client._retry.time.sleep"
"infrastructure.epc_client._retry.time.sleep"
) as mock_sleep:
epc_service.get_by_certificate_number("CERT-001")
@ -100,7 +100,7 @@ def test_429_without_retry_after_uses_exponential_backoff(
_mock_response(200, cert_response),
]
with patch("httpx.get", side_effect=responses), patch(
"backend.epc_client._retry.time.sleep"
"infrastructure.epc_client._retry.time.sleep"
) as mock_sleep:
epc_service.get_by_certificate_number("CERT-001")
@ -121,7 +121,7 @@ def test_429_malformed_retry_after_falls_back_to_backoff(
_mock_response(200, cert_response),
]
with patch("httpx.get", side_effect=responses), patch(
"backend.epc_client._retry.time.sleep"
"infrastructure.epc_client._retry.time.sleep"
) as mock_sleep:
epc_service.get_by_certificate_number("CERT-001")
@ -140,7 +140,7 @@ def test_429_retry_after_capped_by_max_backoff(epc_service, rdsap_21_0_1_cert):
_mock_response(200, cert_response),
]
with patch("httpx.get", side_effect=responses), patch(
"backend.epc_client._retry.time.sleep"
"infrastructure.epc_client._retry.time.sleep"
) as mock_sleep:
epc_service.get_by_certificate_number("CERT-001")


@ -0,0 +1,745 @@
from __future__ import annotations

from typing import ClassVar, Optional, Union

from sqlalchemy import Column
from sqlalchemy.dialects.postgresql import JSONB
from sqlmodel import SQLModel, Field

from datatypes.epc.domain.epc_property_data import (
    EpcPropertyData,
    EnergyElement,
    MainHeatingDetail,
    RenewableHeatIncentive,
    SapBuildingPart,
    SapFloorDimension,
    SapFlatDetails,
    SapWindow,
)


class EpcPropertyModel(SQLModel, table=True):
    __tablename__: ClassVar[str] = "epc_property"  # pyright: ignore[reportIncompatibleVariableOverride]

    id: Optional[int] = Field(default=None, primary_key=True)
    property_id: Optional[int] = Field(default=None)
    portfolio_id: Optional[int] = Field(default=None)
    uploaded_file_id: Optional[int] = Field(default=None)

    # Identity / admin
    uprn: Optional[int] = Field(default=None)
    uprn_source: Optional[str] = Field(default=None)
    report_reference: Optional[str] = Field(default=None)
    report_type: Optional[str] = Field(default=None)
    assessment_type: Optional[str] = Field(default=None)
    sap_version: Optional[float] = Field(default=None)
    schema_type: Optional[str] = Field(default=None)
    schema_versions_original: Optional[str] = Field(default=None)
    status: Optional[str] = Field(default=None)
    calculation_software_version: Optional[str] = Field(default=None)

    # Address
    address_line_1: Optional[str] = Field(default=None)
    address_line_2: Optional[str] = Field(default=None)
    post_town: Optional[str] = Field(default=None)
    postcode: Optional[str] = Field(default=None)
    region_code: Optional[str] = Field(default=None)
    country_code: Optional[str] = Field(default=None)
    language_code: Optional[str] = Field(default=None)

    # Property description
    dwelling_type: str
    property_type: Optional[str] = Field(default=None)
    built_form: Optional[str] = Field(default=None)
    tenure: str
    transaction_type: str
    inspection_date: str  # store as ISO string; cast on read if needed
    completion_date: Optional[str] = Field(default=None)
    registration_date: Optional[str] = Field(default=None)
    total_floor_area_m2: float
    measurement_type: Optional[int] = Field(default=None)

    # Flags
    solar_water_heating: bool
    has_hot_water_cylinder: bool
    has_fixed_air_conditioning: bool
    has_conservatory: Optional[bool] = Field(default=None)
    has_heated_separate_conservatory: Optional[bool] = Field(default=None)
    conservatory_type: Optional[int] = Field(default=None)

    # Counts
    door_count: int
    wet_rooms_count: int
    extensions_count: int
    heated_rooms_count: int
    open_chimneys_count: int
    habitable_rooms_count: int
    insulated_door_count: int
    cfl_fixed_lighting_bulbs_count: int
    led_fixed_lighting_bulbs_count: int
    incandescent_fixed_lighting_bulbs_count: int
    blocked_chimneys_count: Optional[int] = Field(default=None)
    draughtproofed_door_count: Optional[int] = Field(default=None)
    energy_rating_average: Optional[int] = Field(default=None)
    low_energy_fixed_lighting_bulbs_count: Optional[int] = Field(default=None)
    fixed_lighting_outlets_count: Optional[int] = Field(default=None)
    low_energy_fixed_lighting_outlets_count: Optional[int] = Field(default=None)
    number_of_storeys: Optional[int] = Field(default=None)
    any_unheated_rooms: Optional[bool] = Field(default=None)
    mechanical_vent_duct_insulation_level: Optional[int] = Field(default=None)

    # Addendum (cert-level construction flags)
    addendum_stone_walls: Optional[bool] = Field(default=None)
    addendum_system_build: Optional[bool] = Field(default=None)
    addendum_numbers: Optional[list[int]] = Field(
        default=None, sa_column=Column(JSONB, nullable=True)
    )

    # Misc
    hydro: Optional[bool] = Field(default=None)
    photovoltaic_array: Optional[bool] = Field(default=None)
    waste_water_heat_recovery: Optional[str] = Field(default=None)
    pressure_test: Optional[int] = Field(default=None)
    pressure_test_certificate_number: Optional[int] = Field(default=None)
    percent_draughtproofed: Optional[int] = Field(default=None)
    insulated_door_u_value: Optional[float] = Field(default=None)
    multiple_glazed_proportion: Optional[int] = Field(default=None)
    windows_transmission_u_value: Optional[float] = Field(default=None)
    windows_transmission_data_source: Optional[int] = Field(default=None)
    windows_transmission_solar_transmittance: Optional[float] = Field(default=None)

    # Energy source
    energy_mains_gas: bool
    energy_meter_type: str
    energy_pv_battery_count: int
    energy_wind_turbines_count: int
    energy_gas_smart_meter_present: bool
    energy_is_dwelling_export_capable: bool
    energy_wind_turbines_terrain_type: str
    energy_electricity_smart_meter_present: bool
    energy_pv_connection: Optional[Union[int, str]] = Field(
        default=None, sa_column=Column(JSONB, nullable=True)
    )
    energy_pv_percent_roof_area: Optional[int] = Field(default=None)
    energy_pv_battery_capacity: Optional[float] = Field(default=None)
    energy_wind_turbine_hub_height: Optional[float] = Field(default=None)
    energy_wind_turbine_rotor_diameter: Optional[float] = Field(default=None)

    # Heating config
    # Union[int, str] code fields stored as JSONB to preserve the int (API) vs
    # str (Site Notes) distinction on round-trip (see docs/migrations/epc-property-round-trip-fidelity.md §1).
    heating_cylinder_size: Optional[Union[int, str]] = Field(
        default=None, sa_column=Column(JSONB, nullable=True)
    )
    heating_water_heating_code: Optional[int] = Field(default=None)
    heating_water_heating_fuel: Optional[int] = Field(default=None)
    heating_immersion_heating_type: Optional[Union[int, str]] = Field(
        default=None, sa_column=Column(JSONB, nullable=True)
    )
    heating_cylinder_insulation_type: Optional[Union[int, str]] = Field(
        default=None, sa_column=Column(JSONB, nullable=True)
    )
    heating_cylinder_thermostat: Optional[str] = Field(default=None)
    heating_secondary_fuel_type: Optional[int] = Field(default=None)
    heating_secondary_heating_type: Optional[Union[int, str]] = Field(
        default=None, sa_column=Column(JSONB, nullable=True)
    )
    heating_cylinder_insulation_thickness_mm: Optional[int] = Field(default=None)
    heating_wwhrs_index_number_1: Optional[int] = Field(default=None)
    heating_wwhrs_index_number_2: Optional[int] = Field(default=None)
    heating_shower_outlet_type: Optional[Union[int, str]] = Field(
        default=None, sa_column=Column(JSONB, nullable=True)
    )
    heating_shower_wwhrs: Optional[int] = Field(default=None)
    heating_number_baths: Optional[int] = Field(default=None)
    heating_number_baths_wwhrs: Optional[int] = Field(default=None)
    heating_electric_shower_count: Optional[int] = Field(default=None)
    heating_mixer_shower_count: Optional[int] = Field(default=None)

    # Ventilation
    ventilation_type: Optional[str] = Field(default=None)
    ventilation_draught_lobby: Optional[bool] = Field(default=None)
    ventilation_pressure_test: Optional[str] = Field(default=None)
    ventilation_open_flues_count: Optional[int] = Field(default=None)
    ventilation_closed_flues_count: Optional[int] = Field(default=None)
    ventilation_boiler_flues_count: Optional[int] = Field(default=None)
    ventilation_other_flues_count: Optional[int] = Field(default=None)
    ventilation_extract_fans_count: Optional[int] = Field(default=None)
    ventilation_passive_vents_count: Optional[int] = Field(default=None)
    ventilation_flueless_gas_fires_count: Optional[int] = Field(default=None)
    ventilation_in_pcdf_database: Optional[bool] = Field(default=None)
    # SAP 10.2 §2 lodgements + a presence flag so an all-None SapVentilation
    # round-trips as present (not collapsed to None).
    ventilation_present: bool = Field(default=False)
    ventilation_sheltered_sides: Optional[int] = Field(default=None)
    ventilation_has_suspended_timber_floor: Optional[bool] = Field(default=None)
    ventilation_suspended_timber_floor_sealed: Optional[bool] = Field(default=None)
    ventilation_has_draught_lobby: Optional[bool] = Field(default=None)
    ventilation_air_permeability_ap4_m3_h_m2: Optional[float] = Field(default=None)
    ventilation_mechanical_ventilation_kind: Optional[str] = Field(default=None)
    mechanical_ventilation: Optional[int] = Field(default=None)
    mechanical_vent_duct_type: Optional[int] = Field(default=None)
    mechanical_vent_duct_placement: Optional[int] = Field(default=None)
    mechanical_vent_duct_insulation: Optional[int] = Field(default=None)
    mechanical_ventilation_index_number: Optional[int] = Field(default=None)
    mechanical_vent_measured_installation: Optional[str] = Field(default=None)

    @classmethod
    def from_epc_property_data(
        cls,
        data: EpcPropertyData,
        property_id: Optional[int] = None,
        portfolio_id: Optional[int] = None,
    ) -> EpcPropertyModel:
        es = data.sap_energy_source
        h = data.sap_heating
        v = data.sap_ventilation
        shower = h.shower_outlets.shower_outlet if h.shower_outlets else None
        pv = es.photovoltaic_supply
        wt = es.wind_turbine_details
        pvb = es.pv_batteries
        return cls(
            property_id=property_id,
            portfolio_id=portfolio_id,
            uprn=data.uprn,
            uprn_source=data.uprn_source,
            report_reference=data.report_reference,
            report_type=data.report_type,
            assessment_type=data.assessment_type,
            sap_version=data.sap_version,
            schema_type=data.schema_type,
            schema_versions_original=data.schema_versions_original,
            status=data.status,
            calculation_software_version=data.calculation_software_version,
            address_line_1=data.address_line_1,
            address_line_2=data.address_line_2,
            post_town=data.post_town,
            postcode=data.postcode,
            region_code=data.region_code,
            country_code=data.country_code,
            language_code=data.language_code,
            dwelling_type=data.dwelling_type,
            property_type=data.property_type,
            built_form=data.built_form,
            tenure=data.tenure,
            transaction_type=data.transaction_type,
            inspection_date=data.inspection_date.isoformat(),
            completion_date=(
                data.completion_date.isoformat() if data.completion_date else None
            ),
            registration_date=(
                data.registration_date.isoformat() if data.registration_date else None
            ),
            total_floor_area_m2=data.total_floor_area_m2,
            measurement_type=data.measurement_type,
            solar_water_heating=data.solar_water_heating,
            has_hot_water_cylinder=data.has_hot_water_cylinder,
            has_fixed_air_conditioning=data.has_fixed_air_conditioning,
            has_conservatory=data.has_conservatory,
            has_heated_separate_conservatory=data.has_heated_separate_conservatory,
            conservatory_type=data.conservatory_type,
            door_count=data.door_count,
            wet_rooms_count=data.wet_rooms_count,
            extensions_count=data.extensions_count,
            heated_rooms_count=data.heated_rooms_count,
            open_chimneys_count=data.open_chimneys_count,
            habitable_rooms_count=data.habitable_rooms_count,
            insulated_door_count=data.insulated_door_count,
            cfl_fixed_lighting_bulbs_count=data.cfl_fixed_lighting_bulbs_count,
            led_fixed_lighting_bulbs_count=data.led_fixed_lighting_bulbs_count,
            incandescent_fixed_lighting_bulbs_count=data.incandescent_fixed_lighting_bulbs_count,
            blocked_chimneys_count=data.blocked_chimneys_count,
            draughtproofed_door_count=data.draughtproofed_door_count,
            energy_rating_average=data.energy_rating_average,
            low_energy_fixed_lighting_bulbs_count=data.low_energy_fixed_lighting_bulbs_count,
            fixed_lighting_outlets_count=data.fixed_lighting_outlets_count,
            low_energy_fixed_lighting_outlets_count=data.low_energy_fixed_lighting_outlets_count,
            number_of_storeys=data.number_of_storeys,
            any_unheated_rooms=data.any_unheated_rooms,
            mechanical_vent_duct_insulation_level=data.mechanical_vent_duct_insulation_level,
            addendum_stone_walls=data.addendum.stone_walls if data.addendum else None,
            addendum_system_build=(
                data.addendum.system_build if data.addendum else None
            ),
            addendum_numbers=data.addendum.addendum_numbers if data.addendum else None,
            hydro=data.hydro,
            photovoltaic_array=data.photovoltaic_array,
            waste_water_heat_recovery=data.waste_water_heat_recovery,
            pressure_test=data.pressure_test,
            pressure_test_certificate_number=data.pressure_test_certificate_number,
            percent_draughtproofed=data.percent_draughtproofed,
            insulated_door_u_value=data.insulated_door_u_value,
            multiple_glazed_proportion=data.multiple_glazed_proportion,
            windows_transmission_u_value=(
                data.windows_transmission_details.u_value
                if data.windows_transmission_details
                else None
            ),
            windows_transmission_data_source=(
                data.windows_transmission_details.data_source
                if data.windows_transmission_details
                else None
            ),
            windows_transmission_solar_transmittance=(
                data.windows_transmission_details.solar_transmittance
                if data.windows_transmission_details
                else None
            ),
            energy_mains_gas=es.mains_gas,
            energy_meter_type=str(es.meter_type),
            energy_pv_battery_count=es.pv_battery_count,
            energy_wind_turbines_count=es.wind_turbines_count,
            energy_gas_smart_meter_present=es.gas_smart_meter_present,
            energy_is_dwelling_export_capable=es.is_dwelling_export_capable,
            energy_wind_turbines_terrain_type=str(es.wind_turbines_terrain_type),
            energy_electricity_smart_meter_present=es.electricity_smart_meter_present,
            energy_pv_connection=es.pv_connection,
            energy_pv_percent_roof_area=(
                pv.none_or_no_details.percent_roof_area if pv else None
            ),
            energy_pv_battery_capacity=pvb.pv_battery.battery_capacity if pvb else None,
            energy_wind_turbine_hub_height=wt.hub_height if wt else None,
            energy_wind_turbine_rotor_diameter=wt.rotor_diameter if wt else None,
            heating_cylinder_size=h.cylinder_size,
            heating_water_heating_code=h.water_heating_code,
            heating_water_heating_fuel=h.water_heating_fuel,
            heating_immersion_heating_type=h.immersion_heating_type,
            heating_cylinder_insulation_type=h.cylinder_insulation_type,
            heating_cylinder_thermostat=h.cylinder_thermostat,
            heating_secondary_fuel_type=h.secondary_fuel_type,
            heating_secondary_heating_type=h.secondary_heating_type,
            heating_cylinder_insulation_thickness_mm=h.cylinder_insulation_thickness_mm,
            heating_wwhrs_index_number_1=h.instantaneous_wwhrs.wwhrs_index_number1,
            heating_wwhrs_index_number_2=h.instantaneous_wwhrs.wwhrs_index_number2,
            heating_shower_outlet_type=shower.shower_outlet_type if shower else None,
            heating_shower_wwhrs=shower.shower_wwhrs if shower else None,
            heating_number_baths=h.number_baths,
            heating_number_baths_wwhrs=h.number_baths_wwhrs,
            heating_electric_shower_count=h.electric_shower_count,
            heating_mixer_shower_count=h.mixer_shower_count,
            ventilation_type=v.ventilation_type if v else None,
            ventilation_draught_lobby=v.draught_lobby if v else None,
            ventilation_pressure_test=v.pressure_test if v else None,
            ventilation_open_flues_count=v.open_flues_count if v else None,
            ventilation_closed_flues_count=v.closed_flues_count if v else None,
            ventilation_boiler_flues_count=v.boiler_flues_count if v else None,
            ventilation_other_flues_count=v.other_flues_count if v else None,
            ventilation_extract_fans_count=v.extract_fans_count if v else None,
            ventilation_passive_vents_count=v.passive_vents_count if v else None,
            ventilation_flueless_gas_fires_count=(
                v.flueless_gas_fires_count if v else None
            ),
            ventilation_in_pcdf_database=v.ventilation_in_pcdf_database if v else None,
            ventilation_present=v is not None,
            ventilation_sheltered_sides=v.sheltered_sides if v else None,
            ventilation_has_suspended_timber_floor=(
                v.has_suspended_timber_floor if v else None
            ),
            ventilation_suspended_timber_floor_sealed=(
                v.suspended_timber_floor_sealed if v else None
            ),
            ventilation_has_draught_lobby=v.has_draught_lobby if v else None,
            ventilation_air_permeability_ap4_m3_h_m2=(
                v.air_permeability_ap4_m3_h_m2 if v else None
            ),
            ventilation_mechanical_ventilation_kind=(
                v.mechanical_ventilation_kind if v else None
            ),
            mechanical_ventilation=data.mechanical_ventilation,
            mechanical_vent_duct_type=data.mechanical_vent_duct_type,
            mechanical_vent_duct_placement=data.mechanical_vent_duct_placement,
            mechanical_vent_duct_insulation=data.mechanical_vent_duct_insulation,
            mechanical_ventilation_index_number=data.mechanical_ventilation_index_number,
mechanical_vent_measured_installation=data.mechanical_vent_measured_installation,
)


class EpcPropertyEnergyPerformanceModel(SQLModel, table=True):
    __tablename__: ClassVar[str] = "epc_property_energy_performance"  # pyright: ignore[reportIncompatibleVariableOverride]

    id: Optional[int] = Field(default=None, primary_key=True)
    epc_property_id: int = Field(
        foreign_key="epc_property.id", nullable=False, unique=True
    )
    energy_rating_current: Optional[int] = Field(default=None)
    energy_consumption_current: Optional[int] = Field(default=None)
    environmental_impact_current: Optional[int] = Field(default=None)
    heating_cost_current: Optional[float] = Field(default=None)
    lighting_cost_current: Optional[float] = Field(default=None)
    hot_water_cost_current: Optional[float] = Field(default=None)
    co2_emissions_current: Optional[float] = Field(default=None)
    co2_emissions_current_per_floor_area: Optional[int] = Field(default=None)
    current_energy_efficiency_band: Optional[str] = Field(default=None)
    energy_rating_potential: Optional[float] = Field(default=None)
    energy_consumption_potential: Optional[int] = Field(default=None)
    environmental_impact_potential: Optional[int] = Field(default=None)
    heating_cost_potential: Optional[float] = Field(default=None)
    lighting_cost_potential: Optional[float] = Field(default=None)
    hot_water_cost_potential: Optional[float] = Field(default=None)
    co2_emissions_potential: Optional[float] = Field(default=None)
    potential_energy_efficiency_band: Optional[str] = Field(default=None)

    @classmethod
    def from_epc_property_data(
        cls, data: EpcPropertyData, epc_property_id: int
    ) -> EpcPropertyEnergyPerformanceModel:
        return cls(
            epc_property_id=epc_property_id,
            energy_rating_current=data.energy_rating_current,
            energy_consumption_current=data.energy_consumption_current,
            environmental_impact_current=data.environmental_impact_current,
            heating_cost_current=data.heating_cost_current,
            lighting_cost_current=data.lighting_cost_current,
            hot_water_cost_current=data.hot_water_cost_current,
            co2_emissions_current=data.co2_emissions_current,
            co2_emissions_current_per_floor_area=data.co2_emissions_current_per_floor_area,
            current_energy_efficiency_band=(
                data.current_energy_efficiency_band.value
                if data.current_energy_efficiency_band
                else None
            ),
            energy_rating_potential=data.energy_rating_potential,
            energy_consumption_potential=data.energy_consumption_potential,
            environmental_impact_potential=data.environmental_impact_potential,
            heating_cost_potential=data.heating_cost_potential,
            lighting_cost_potential=data.lighting_cost_potential,
            hot_water_cost_potential=data.hot_water_cost_potential,
            co2_emissions_potential=data.co2_emissions_potential,
            potential_energy_efficiency_band=(
                data.potential_energy_efficiency_band.value
                if data.potential_energy_efficiency_band
                else None
            ),
        )


class EpcRenewableHeatIncentiveModel(SQLModel, table=True):
    __tablename__: ClassVar[str] = "epc_renewable_heat_incentive"  # pyright: ignore[reportIncompatibleVariableOverride]

    id: Optional[int] = Field(default=None, primary_key=True)
    epc_property_id: int = Field(
        foreign_key="epc_property.id", nullable=False, unique=True
    )
    space_heating_kwh: float
    water_heating_kwh: float
    impact_of_loft_insulation_kwh: Optional[float] = Field(default=None)
    impact_of_cavity_insulation_kwh: Optional[float] = Field(default=None)
    impact_of_solid_wall_insulation_kwh: Optional[float] = Field(default=None)

    @classmethod
    def from_domain(
        cls, rhi: RenewableHeatIncentive, epc_property_id: int
    ) -> EpcRenewableHeatIncentiveModel:
        return cls(
            epc_property_id=epc_property_id,
            space_heating_kwh=rhi.space_heating_kwh,
            water_heating_kwh=rhi.water_heating_kwh,
            impact_of_loft_insulation_kwh=rhi.impact_of_loft_insulation_kwh,
            impact_of_cavity_insulation_kwh=rhi.impact_of_cavity_insulation_kwh,
            impact_of_solid_wall_insulation_kwh=rhi.impact_of_solid_wall_insulation_kwh,
        )


class EpcFlatDetailsModel(SQLModel, table=True):
    __tablename__: ClassVar[str] = "epc_flat_details"  # pyright: ignore[reportIncompatibleVariableOverride]

    id: Optional[int] = Field(default=None, primary_key=True)
    epc_property_id: int = Field(
        foreign_key="epc_property.id", nullable=False, unique=True
    )
    level: int
    top_storey: str
    flat_location: int
    heat_loss_corridor: int
    storey_count: Optional[int] = Field(default=None)
    unheated_corridor_length_m: Optional[int] = Field(default=None)

    @classmethod
    def from_domain(
        cls, flat: SapFlatDetails, epc_property_id: int
    ) -> EpcFlatDetailsModel:
        return cls(
            epc_property_id=epc_property_id,
            level=flat.level,
            top_storey=flat.top_storey,
            flat_location=flat.flat_location,
            heat_loss_corridor=flat.heat_loss_corridor,
            storey_count=flat.storey_count,
            unheated_corridor_length_m=flat.unheated_corridor_length_m,
        )


class EpcMainHeatingDetailModel(SQLModel, table=True):
    __tablename__: ClassVar[str] = "epc_main_heating_detail"  # pyright: ignore[reportIncompatibleVariableOverride]

    id: Optional[int] = Field(default=None, primary_key=True)
    epc_property_id: int = Field(foreign_key="epc_property.id", nullable=False)
    has_fghrs: bool
    # Union[int, str] code fields — JSONB to preserve int/str on round-trip.
    main_fuel_type: Union[int, str] = Field(sa_column=Column(JSONB, nullable=False))
    heat_emitter_type: Union[int, str] = Field(sa_column=Column(JSONB, nullable=False))
    emitter_temperature: Union[int, str] = Field(
        sa_column=Column(JSONB, nullable=False)
    )
    main_heating_control: Union[int, str] = Field(
        sa_column=Column(JSONB, nullable=False)
    )
    fan_flue_present: Optional[bool] = Field(default=None)
    boiler_flue_type: Optional[int] = Field(default=None)
    boiler_ignition_type: Optional[int] = Field(default=None)
    central_heating_pump_age: Optional[int] = Field(default=None)
    central_heating_pump_age_str: Optional[str] = Field(default=None)
    main_heating_index_number: Optional[int] = Field(default=None)
    sap_main_heating_code: Optional[int] = Field(default=None)
    main_heating_number: Optional[int] = Field(default=None)
    main_heating_category: Optional[int] = Field(default=None)
    main_heating_fraction: Optional[int] = Field(default=None)
    main_heating_data_source: Optional[int] = Field(default=None)
    condensing: Optional[bool] = Field(default=None)
    weather_compensator: Optional[bool] = Field(default=None)

    @classmethod
    def from_domain(
        cls, detail: MainHeatingDetail, epc_property_id: int
    ) -> EpcMainHeatingDetailModel:
        return cls(
            epc_property_id=epc_property_id,
            has_fghrs=detail.has_fghrs,
            main_fuel_type=detail.main_fuel_type,
            heat_emitter_type=detail.heat_emitter_type,
            emitter_temperature=detail.emitter_temperature,
            main_heating_control=detail.main_heating_control,
            fan_flue_present=detail.fan_flue_present,
            boiler_flue_type=detail.boiler_flue_type,
            boiler_ignition_type=detail.boiler_ignition_type,
            central_heating_pump_age=detail.central_heating_pump_age,
            central_heating_pump_age_str=detail.central_heating_pump_age_str,
            main_heating_index_number=detail.main_heating_index_number,
            sap_main_heating_code=detail.sap_main_heating_code,
            main_heating_number=detail.main_heating_number,
            main_heating_category=detail.main_heating_category,
            main_heating_fraction=detail.main_heating_fraction,
            main_heating_data_source=detail.main_heating_data_source,
            condensing=detail.condensing,
            weather_compensator=detail.weather_compensator,
        )


class EpcBuildingPartModel(SQLModel, table=True):
    __tablename__: ClassVar[str] = "epc_building_part"  # pyright: ignore[reportIncompatibleVariableOverride]

    id: Optional[int] = Field(default=None, primary_key=True)
    epc_property_id: int = Field(foreign_key="epc_property.id", nullable=False)
    identifier: str
    construction_age_band: str
    # Union[int, str] code fields — JSONB to preserve int/str on round-trip.
    wall_construction: Union[int, str] = Field(sa_column=Column(JSONB, nullable=False))
    wall_insulation_type: Union[int, str] = Field(
        sa_column=Column(JSONB, nullable=False)
    )
    wall_thickness_measured: bool
    party_wall_construction: Optional[Union[int, str]] = Field(
        default=None, sa_column=Column(JSONB, nullable=True)
    )
    building_part_number: Optional[int] = Field(default=None)
    wall_dry_lined: Optional[bool] = Field(default=None)
    wall_thickness_mm: Optional[int] = Field(default=None)
    wall_insulation_thickness: Optional[str] = Field(default=None)
    floor_heat_loss: Optional[int] = Field(default=None)
    floor_insulation_thickness: Optional[str] = Field(default=None)
    flat_roof_insulation_thickness: Optional[Union[str, int]] = Field(
        default=None, sa_column=Column(JSONB, nullable=True)
    )
    floor_type: Optional[str] = Field(default=None)
    floor_construction_type: Optional[str] = Field(default=None)
    floor_insulation_type_str: Optional[str] = Field(default=None)
    floor_u_value_known: Optional[bool] = Field(default=None)
    roof_construction: Optional[int] = Field(default=None)
    roof_construction_type: Optional[str] = Field(default=None)
    curtain_wall_age: Optional[str] = Field(default=None)
    roof_insulation_location: Optional[Union[int, str]] = Field(
        default=None, sa_column=Column(JSONB, nullable=True)
    )
    roof_insulation_thickness: Optional[Union[str, int]] = Field(
        default=None, sa_column=Column(JSONB, nullable=True)
    )
    room_in_roof_floor_area: Optional[float] = Field(default=None)
    room_in_roof_construction_age_band: Optional[str] = Field(default=None)
    alt_wall_1_area: Optional[float] = Field(default=None)
    alt_wall_1_dry_lined: Optional[str] = Field(default=None)
    alt_wall_1_construction: Optional[int] = Field(default=None)
    alt_wall_1_insulation_type: Optional[int] = Field(default=None)
    alt_wall_1_thickness_measured: Optional[str] = Field(default=None)
    alt_wall_1_insulation_thickness: Optional[str] = Field(default=None)
    alt_wall_2_area: Optional[float] = Field(default=None)
    alt_wall_2_dry_lined: Optional[str] = Field(default=None)
    alt_wall_2_construction: Optional[int] = Field(default=None)
    alt_wall_2_insulation_type: Optional[int] = Field(default=None)
    alt_wall_2_thickness_measured: Optional[str] = Field(default=None)
    alt_wall_2_insulation_thickness: Optional[str] = Field(default=None)

    @classmethod
    def from_domain(
        cls, part: SapBuildingPart, epc_property_id: int
    ) -> EpcBuildingPartModel:
        rir = part.sap_room_in_roof
        aw1 = part.sap_alternative_wall_1
        aw2 = part.sap_alternative_wall_2
        return cls(
            epc_property_id=epc_property_id,
            identifier=part.identifier.value,
            construction_age_band=part.construction_age_band,
            wall_construction=part.wall_construction,
            wall_insulation_type=part.wall_insulation_type,
            wall_thickness_measured=part.wall_thickness_measured,
            party_wall_construction=part.party_wall_construction,
            building_part_number=part.building_part_number,
            wall_dry_lined=part.wall_dry_lined,
            wall_thickness_mm=part.wall_thickness_mm,
            wall_insulation_thickness=part.wall_insulation_thickness,
            floor_heat_loss=part.floor_heat_loss,
            floor_insulation_thickness=part.floor_insulation_thickness,
            flat_roof_insulation_thickness=part.flat_roof_insulation_thickness,
            floor_type=part.floor_type,
            floor_construction_type=part.floor_construction_type,
            floor_insulation_type_str=part.floor_insulation_type_str,
            floor_u_value_known=part.floor_u_value_known,
            roof_construction=part.roof_construction,
            roof_construction_type=part.roof_construction_type,
            curtain_wall_age=part.curtain_wall_age,
            roof_insulation_location=part.roof_insulation_location,
            roof_insulation_thickness=part.roof_insulation_thickness,
            room_in_roof_floor_area=float(rir.floor_area) if rir else None,
            room_in_roof_construction_age_band=(
                rir.construction_age_band if rir else None
            ),
            alt_wall_1_area=aw1.wall_area if aw1 else None,
            alt_wall_1_dry_lined=aw1.wall_dry_lined if aw1 else None,
            alt_wall_1_construction=aw1.wall_construction if aw1 else None,
            alt_wall_1_insulation_type=aw1.wall_insulation_type if aw1 else None,
            alt_wall_1_thickness_measured=aw1.wall_thickness_measured if aw1 else None,
            alt_wall_1_insulation_thickness=(
                aw1.wall_insulation_thickness if aw1 else None
            ),
            alt_wall_2_area=aw2.wall_area if aw2 else None,
            alt_wall_2_dry_lined=aw2.wall_dry_lined if aw2 else None,
            alt_wall_2_construction=aw2.wall_construction if aw2 else None,
            alt_wall_2_insulation_type=aw2.wall_insulation_type if aw2 else None,
            alt_wall_2_thickness_measured=aw2.wall_thickness_measured if aw2 else None,
            alt_wall_2_insulation_thickness=(
                aw2.wall_insulation_thickness if aw2 else None
            ),
        )


class EpcFloorDimensionModel(SQLModel, table=True):
    __tablename__: ClassVar[str] = "epc_floor_dimension"  # pyright: ignore[reportIncompatibleVariableOverride]

    id: Optional[int] = Field(default=None, primary_key=True)
    epc_building_part_id: int = Field(
        foreign_key="epc_building_part.id", nullable=False
    )
    floor: Optional[int] = Field(default=None)
    room_height_m: float
    total_floor_area_m2: float
    party_wall_length_m: float
    heat_loss_perimeter_m: float
    floor_insulation: Optional[int] = Field(default=None)
    floor_construction: Optional[int] = Field(default=None)

    @classmethod
    def from_domain(
        cls, dim: SapFloorDimension, epc_building_part_id: int
    ) -> EpcFloorDimensionModel:
        return cls(
            epc_building_part_id=epc_building_part_id,
            floor=dim.floor,
            room_height_m=dim.room_height_m,
            total_floor_area_m2=dim.total_floor_area_m2,
            party_wall_length_m=dim.party_wall_length_m,
            heat_loss_perimeter_m=dim.heat_loss_perimeter_m,
            floor_insulation=dim.floor_insulation,
            floor_construction=dim.floor_construction,
        )


class EpcWindowModel(SQLModel, table=True):
    __tablename__: ClassVar[str] = "epc_window"  # pyright: ignore[reportIncompatibleVariableOverride]

    id: Optional[int] = Field(default=None, primary_key=True)
    epc_property_id: int = Field(foreign_key="epc_property.id", nullable=False)
    frame_material: Optional[str] = Field(default=None)
    # Union[int, str] / Union[bool, str] code fields — JSONB to preserve type on round-trip.
    glazing_gap: Union[int, str] = Field(sa_column=Column(JSONB, nullable=False))
    orientation: Union[int, str] = Field(sa_column=Column(JSONB, nullable=False))
    window_type: Union[int, str] = Field(sa_column=Column(JSONB, nullable=False))
    glazing_type: Union[int, str] = Field(sa_column=Column(JSONB, nullable=False))
    window_width: float
    window_height: float
    draught_proofed: Union[bool, str] = Field(sa_column=Column(JSONB, nullable=False))
    window_location: Union[int, str] = Field(sa_column=Column(JSONB, nullable=False))
    window_wall_type: Union[int, str] = Field(sa_column=Column(JSONB, nullable=False))
    permanent_shutters_present: Union[bool, str] = Field(
        sa_column=Column(JSONB, nullable=False)
    )
    frame_factor: Optional[float] = Field(default=None)
    permanent_shutters_insulated: Optional[str] = Field(default=None)
    transmission_u_value: Optional[float] = Field(default=None)
    transmission_data_source: Optional[Union[int, str]] = Field(
        default=None, sa_column=Column(JSONB, nullable=True)
    )
    transmission_solar_transmittance: Optional[float] = Field(default=None)

    @classmethod
    def from_domain(cls, window: SapWindow, epc_property_id: int) -> EpcWindowModel:
        td = window.window_transmission_details
        return cls(
            epc_property_id=epc_property_id,
            frame_material=window.frame_material,
            glazing_gap=window.glazing_gap,
            orientation=window.orientation,
            window_type=window.window_type,
            glazing_type=window.glazing_type,
            window_width=window.window_width,
            window_height=window.window_height,
            draught_proofed=window.draught_proofed,
            window_location=window.window_location,
            window_wall_type=window.window_wall_type,
            permanent_shutters_present=window.permanent_shutters_present,
            frame_factor=window.frame_factor,
            permanent_shutters_insulated=window.permanent_shutters_insulated,
            transmission_u_value=td.u_value if td else None,
            transmission_data_source=td.data_source if td else None,
            transmission_solar_transmittance=td.solar_transmittance if td else None,
        )


class EpcEnergyElementModel(SQLModel, table=True):
    __tablename__: ClassVar[str] = "epc_energy_element"  # pyright: ignore[reportIncompatibleVariableOverride]

    id: Optional[int] = Field(default=None, primary_key=True)
    epc_property_id: int = Field(foreign_key="epc_property.id", nullable=False)
    element_type: str  # roof | wall | floor | main_heating | window | lighting | hot_water | secondary_heating | main_heating_controls
    description: str
    energy_efficiency_rating: int
    environmental_efficiency_rating: int

    @classmethod
    def from_domain(
        cls, element: EnergyElement, element_type: str, epc_property_id: int
    ) -> EpcEnergyElementModel:
        return cls(
            epc_property_id=epc_property_id,
            element_type=element_type,
            description=element.description,
            energy_efficiency_rating=element.energy_efficiency_rating,
            environmental_efficiency_rating=element.environmental_efficiency_rating,
        )
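
The comments on the `Union[int, str]` code fields say JSONB is used to preserve int/str on round-trip. A minimal sketch of why that matters, using the standard `json` module as a stand-in for a JSONB column; the helper names `via_json` and `via_text` are hypothetical and not part of this repo:

```python
import json
from typing import Union

Code = Union[int, str, bool]


def via_json(value: Code) -> Code:
    """Round-trip through JSON, roughly as a JSONB column would store it."""
    return json.loads(json.dumps(value))


def via_text(value: Code) -> str:
    """Round-trip through a plain text column: the original type is lost."""
    return str(value)


# JSON keeps 3 and "3" distinct; a text column collapses both to "3".
int_code = via_json(3)
str_code = via_json("3")
print(type(int_code).__name__, type(str_code).__name__)
print(via_text(3) == via_text("3"))
```

The same reasoning would apply to the `Union[bool, str]` window fields, since JSON also distinguishes `true` from the string `"true"`.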


@ -0,0 +1,77 @@
from __future__ import annotations

from typing import ClassVar, Optional, cast

from sqlmodel import Field, SQLModel

from datatypes.epc.domain.epc import Epc
from domain.property_baseline.property_baseline_performance import PropertyBaselinePerformance
from domain.property_baseline.performance import Performance
from domain.property_baseline.rebaseliner import RebaselineReason


class PropertyBaselinePerformanceModel(SQLModel, table=True):
    """The ``property_baseline_performance`` row — one per Property (ADR-0004).

    Flat typed columns (not a JSONB blob) so the FE can both surface the block
    and query the lodged-vs-effective pair. The production migration is FE-owned
    (Drizzle); see docs/migrations/property-baseline-performance-table.md.
    """

    __tablename__: ClassVar[str] = "property_baseline_performance"  # pyright: ignore[reportIncompatibleVariableOverride]

    id: Optional[int] = Field(default=None, primary_key=True)
    property_id: int = Field(unique=True, index=True)
    lodged_sap_score: int
    lodged_epc_band: str
    lodged_co2_emissions_t_per_yr: float
    lodged_primary_energy_intensity_kwh_per_m2_yr: int
    effective_sap_score: int
    effective_epc_band: str
    effective_co2_emissions_t_per_yr: float
    effective_primary_energy_intensity_kwh_per_m2_yr: int
    rebaseline_reason: str
    space_heating_kwh: float
    water_heating_kwh: float

    @classmethod
    def from_domain(
        cls, baseline: PropertyBaselinePerformance, property_id: int
    ) -> "PropertyBaselinePerformanceModel":
        return cls(
            property_id=property_id,
            lodged_sap_score=baseline.lodged.sap_score,
            lodged_epc_band=baseline.lodged.epc_band.value,
            lodged_co2_emissions_t_per_yr=baseline.lodged.co2_emissions,
            lodged_primary_energy_intensity_kwh_per_m2_yr=baseline.lodged.primary_energy_intensity,
            effective_sap_score=baseline.effective.sap_score,
            effective_epc_band=baseline.effective.epc_band.value,
            effective_co2_emissions_t_per_yr=baseline.effective.co2_emissions,
            effective_primary_energy_intensity_kwh_per_m2_yr=baseline.effective.primary_energy_intensity,
            rebaseline_reason=baseline.rebaseline_reason,
            space_heating_kwh=baseline.space_heating_kwh,
            water_heating_kwh=baseline.water_heating_kwh,
        )

    def to_domain(self) -> PropertyBaselinePerformance:
        return PropertyBaselinePerformance(
            lodged=Performance(
                sap_score=self.lodged_sap_score,
                epc_band=Epc(self.lodged_epc_band),
                co2_emissions=self.lodged_co2_emissions_t_per_yr,
                primary_energy_intensity=self.lodged_primary_energy_intensity_kwh_per_m2_yr,
            ),
            effective=Performance(
                sap_score=self.effective_sap_score,
                epc_band=Epc(self.effective_epc_band),
                co2_emissions=self.effective_co2_emissions_t_per_yr,
                primary_energy_intensity=self.effective_primary_energy_intensity_kwh_per_m2_yr,
            ),
            rebaseline_reason=cast(RebaselineReason, self.rebaseline_reason),
            space_heating_kwh=self.space_heating_kwh,
            water_heating_kwh=self.water_heating_kwh,
        )
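
The `from_domain` / `to_domain` pair above flattens a nested lodged/effective value into prefixed columns and rebuilds it. A reduced sketch of that round-trip with plain dataclasses; `Perf`, `Baseline`, `to_row` and `from_row` are hypothetical stand-ins, and only two of the real columns are modelled:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Perf:
    """Hypothetical stand-in for the domain ``Performance`` value."""

    sap_score: int
    epc_band: str


@dataclass(frozen=True)
class Baseline:
    """Hypothetical stand-in for ``PropertyBaselinePerformance``."""

    lodged: Perf
    effective: Perf


def to_row(baseline: Baseline) -> dict[str, int | str]:
    # Flatten the nested pair into prefixed columns a query can filter on.
    return {
        "lodged_sap_score": baseline.lodged.sap_score,
        "lodged_epc_band": baseline.lodged.epc_band,
        "effective_sap_score": baseline.effective.sap_score,
        "effective_epc_band": baseline.effective.epc_band,
    }


def from_row(row: dict[str, int | str]) -> Baseline:
    # Rebuild the nested domain shape from the flat columns.
    return Baseline(
        lodged=Perf(int(row["lodged_sap_score"]), str(row["lodged_epc_band"])),
        effective=Perf(
            int(row["effective_sap_score"]), str(row["effective_epc_band"])
        ),
    )


baseline = Baseline(lodged=Perf(54, "E"), effective=Perf(61, "D"))
row = to_row(baseline)
```

With flat columns, a lodged-vs-effective comparison can be a simple column predicate rather than a JSON path expression, which appears to be the motivation stated in the docstring.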


@ -0,0 +1,25 @@
from __future__ import annotations

from typing import ClassVar, Optional

from sqlmodel import Field, SQLModel


class PropertyRow(SQLModel, table=True):
    """Defensive view of the FE-owned ``property`` table.

    The schema and migrations for ``property`` are owned by the front-end
    Next.js repo; this declares only the identity columns the modelling backend
    reads/writes, so FE-owned migrations to other columns don't ripple into us.
    """

    __tablename__: ClassVar[str] = "property"  # pyright: ignore[reportIncompatibleVariableOverride]

    # Non-Optional: this is a read-only defensive view of the FE-owned ``property``
    # table — the backend never inserts rows, so every row read carries an id.
    id: int = Field(primary_key=True)
    portfolio_id: int
    postcode: str
    address: str
    uprn: Optional[int] = Field(default=None)
    landlord_property_id: Optional[str] = Field(default=None)


@ -0,0 +1,22 @@
from __future__ import annotations

from typing import Any, ClassVar, Optional

from sqlalchemy import Column
from sqlalchemy.dialects.postgresql import JSONB
from sqlmodel import Field, SQLModel


class SolarBuildingInsightsRow(SQLModel, table=True):
    """Persisted Google Solar `buildingInsights` response for one Property.

    Stored as JSONB: the raw fetched insights are retained whole so the
    structured projection a future SolarPotential type needs can be derived
    without re-fetching. One row per Property.
    """

    __tablename__: ClassVar[str] = "solar_building_insights"  # pyright: ignore[reportIncompatibleVariableOverride]

    id: Optional[int] = Field(default=None, primary_key=True)
    property_id: int = Field(index=True, unique=True)
    insights: dict[str, Any] = Field(sa_column=Column(JSONB, nullable=False))


@ -0,0 +1,70 @@
from __future__ import annotations

from typing import Protocol


class AraFirstRunCommand(Protocol):
    """The slice of the trigger the pipeline threads downstream.

    Carries only the business fields; UPRNs and Scenario definitions are read
    from their source-of-truth tables, not carried here. ``task_id``/``sub_task_id``
    are deliberately absent: the SubTask lifecycle is the decorator's concern,
    not the pipeline's. ``AraFirstRunTriggerBody`` satisfies this structurally,
    so ``orchestration`` need not import the application-layer event type.
    """

    @property
    def portfolio_id(self) -> int: ...

    @property
    def property_ids(self) -> list[int]: ...

    @property
    def scenario_ids(self) -> list[int]: ...


class IngestionStage(Protocol):
    """Stage 1 — acquires and persists each Property's external source data."""

    def run(self, property_ids: list[int]) -> None: ...


class PropertyBaselineStage(Protocol):
    """Stage 2 — establishes each Property's Baseline Performance."""

    def run(self, property_ids: list[int]) -> None: ...


class ModellingStage(Protocol):
    """Stage 3 — scores each Property against its Scenarios into Plans."""

    def run(self, property_ids: list[int], scenario_ids: list[int]) -> None: ...


class AraFirstRunPipeline:
    """Composes the First Run stages end-to-end: Ingestion -> Baseline ->
    Modelling.

    Threads **only** ``property_ids`` between stages (and ``scenario_ids`` into
    Modelling, taken off the command, not from a prior stage). The stages
    communicate through repos, never via in-memory hand-off, which is what makes
    each stage independently runnable for the single-property review flow
    (ADR-0011, ADR-0003). Stage orchestrators are injected so the handler owns
    wiring and tests substitute fakes.
    """

    def __init__(
        self,
        *,
        ingestion: IngestionStage,
        baseline: PropertyBaselineStage,
        modelling: ModellingStage,
    ) -> None:
        self._ingestion = ingestion
        self._baseline = baseline
        self._modelling = modelling

    def run(self, command: AraFirstRunCommand) -> None:
        self._ingestion.run(command.property_ids)
        self._baseline.run(command.property_ids)
        self._modelling.run(command.property_ids, command.scenario_ids)
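
The pipeline docstring says stage orchestrators are injected so tests can substitute fakes, and that the stage types are `Protocol`s satisfied structurally. A small self-contained sketch of that idea; `Stage`, `RecordingStage` and `run_in_order` are hypothetical names used for illustration, not the classes in this file:

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


class Stage(Protocol):
    """Anything with a matching ``run`` method counts as a stage."""

    def run(self, property_ids: list[int]) -> None: ...


@dataclass
class RecordingStage:
    """Fake stage: satisfies ``Stage`` structurally, with no inheritance."""

    name: str
    log: list[tuple[str, list[int]]]

    def run(self, property_ids: list[int]) -> None:
        # Record the call instead of touching a database or an API.
        self.log.append((self.name, list(property_ids)))


def run_in_order(stages: list[Stage], property_ids: list[int]) -> None:
    # Only the ids are threaded between stages, never in-memory results.
    for stage in stages:
        stage.run(property_ids)


log: list[tuple[str, list[int]]] = []
run_in_order(
    [RecordingStage("ingestion", log), RecordingStage("baseline", log)],
    [1, 2],
)
```

A test written this way can assert on call order and arguments without any real repository, which is presumably the point of the injection seam.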


@ -0,0 +1,97 @@
from __future__ import annotations

from collections.abc import Callable
from dataclasses import dataclass
from typing import Any, Optional, Protocol

from datatypes.epc.domain.epc_property_data import EpcPropertyData
from repositories.geospatial.geospatial_repository import GeospatialRepository
from repositories.unit_of_work import UnitOfWork


class EpcFetcher(Protocol):
    """The slice of the New-EPC-API client Ingestion needs (e.g. EpcClientService)."""

    def get_by_uprn(self, uprn: int) -> Optional[EpcPropertyData]: ...


class SolarFetcher(Protocol):
    """The slice of the Google Solar client Ingestion needs (e.g. GoogleSolarApiClient)."""

    def get_building_insights(
        self, longitude: float, latitude: float
    ) -> dict[str, Any]: ...


@dataclass
class _Fetched:
    """One property's externally-fetched source data, awaiting the write phase."""

    property_id: int
    epc: Optional[EpcPropertyData]
    solar_insights: Optional[dict[str, Any]]


class IngestionOrchestrator:
    """Stage 1: acquire a batch's external source data and persist it.

    Runs in two phases so a DB connection is never held during external IO
    (ADR-0012): **fetch** the whole batch (read each UPRN, fetch its EPC,
    resolve coordinates from the Geospatial reference Repo, thread those into
    the Solar fetcher) with *no unit open* during the external calls; then
    **write** the batch in one Unit of Work and commit once. Fetchers never
    call each other (ADR-0011); the orchestrator threads the coordinate.
    Coordinates are reference data (deterministic from UPRN), resolved
    transiently to drive the Solar fetch, never persisted.

    The geospatial repo reads S3 reference data, not the transactional store, so
    it is injected separately rather than taken from the unit.
    """

    def __init__(
        self,
        *,
        unit_of_work: Callable[[], UnitOfWork],
        epc_fetcher: EpcFetcher,
        geospatial_repo: GeospatialRepository,
        solar_fetcher: SolarFetcher,
    ) -> None:
        self._unit_of_work = unit_of_work
        self._epc_fetcher = epc_fetcher
        self._geospatial_repo = geospatial_repo
        self._solar_fetcher = solar_fetcher

    def run(self, property_ids: list[int]) -> None:
        uprns = self._uprns_for(property_ids)
        fetched = [self._fetch(property_id, uprn) for property_id, uprn in uprns]
        self._persist(fetched)

    def _uprns_for(self, property_ids: list[int]) -> list[tuple[int, int]]:
        # A short read unit; properties with no UPRN (e.g. landlord_property_id
        # only) are skipped — a later Site-Notes path covers them.
        with self._unit_of_work() as uow:
            properties = uow.property.get_many(property_ids)
            return [
                (property_id, prop.identity.uprn)
                for property_id, prop in zip(property_ids, properties, strict=True)
                if prop.identity.uprn is not None
            ]

    def _fetch(self, property_id: int, uprn: int) -> _Fetched:
        # No unit open here — this is the external-IO phase.
        epc = self._epc_fetcher.get_by_uprn(uprn)
        solar_insights: Optional[dict[str, Any]] = None
        coordinates = self._geospatial_repo.coordinates_for(uprn)
        if coordinates is not None:
            solar_insights = self._solar_fetcher.get_building_insights(
                coordinates.longitude, coordinates.latitude
            )
        return _Fetched(property_id, epc, solar_insights)

    def _persist(self, fetched: list[_Fetched]) -> None:
        with self._unit_of_work() as uow:
            for item in fetched:
                if item.epc is not None:
                    uow.epc.save(item.epc, property_id=item.property_id)
                if item.solar_insights is not None:
                    uow.solar.save(item.property_id, item.solar_insights)
            uow.commit()
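
The two-phase shape described in the `IngestionOrchestrator` docstring (fetch with no unit open, then write once) can be sketched with a fake unit that counts how many units are open. Everything here (`FakeUnitOfWork`, `fetch`, `ingest`) is a hypothetical illustration under simplified assumptions, not the repo's `UnitOfWork`:

```python
from __future__ import annotations


class FakeUnitOfWork:
    """Hypothetical unit: tracks open units, saved items and commit state."""

    open_units = 0

    def __init__(self) -> None:
        self.saved: list[tuple[int, str]] = []
        self.committed = False

    def __enter__(self) -> FakeUnitOfWork:
        FakeUnitOfWork.open_units += 1
        return self

    def __exit__(self, *exc: object) -> None:
        FakeUnitOfWork.open_units -= 1

    def save(self, property_id: int, payload: str) -> None:
        self.saved.append((property_id, payload))

    def commit(self) -> None:
        self.committed = True


def fetch(property_id: int) -> str:
    # External-IO phase: fail loudly if a unit is open while we "call out".
    if FakeUnitOfWork.open_units != 0:
        raise RuntimeError("unit held open during external IO")
    return f"epc-for-{property_id}"


def ingest(property_ids: list[int], uow: FakeUnitOfWork) -> None:
    # Phase 1: fetch the whole batch with no unit open.
    fetched = [(pid, fetch(pid)) for pid in property_ids]
    # Phase 2: write the batch in one unit and commit once.
    with uow:
        for pid, payload in fetched:
            uow.save(pid, payload)
        uow.commit()


uow = FakeUnitOfWork()
ingest([1, 2], uow)
```

Moving the `fetch` call inside the `with` block would trip the guard, which is one way a test could pin the "no connection held during external IO" property.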


@ -0,0 +1,29 @@
from __future__ import annotations

from repositories.materials.materials_repository import MaterialsRepository
from repositories.scenario.scenario_repository import ScenarioRepository


class ModellingOrchestrator:
    """Stage 3 — scores each baselined Property against its Scenarios, producing
    Recommendations -> an Optimised Package per Scenario Phase -> Plans
    (CONTEXT.md: Modelling).

    Stub at this stage (#1136): ``run`` reads its inputs through repos (it takes
    only ``property_ids`` + ``scenario_ids``, never an in-memory hand-off from
    Baseline) but does no scoring yet. Full Modelling lands via later TDD slices
    + per-service grills. The Scenario / Materials repos are injected now so the
    composition and wiring are real even while the body is empty.
    """

    def __init__(
        self,
        *,
        scenario_repo: ScenarioRepository,
        materials_repo: MaterialsRepository,
    ) -> None:
        self._scenario_repo = scenario_repo
        self._materials_repo = materials_repo

    def run(self, property_ids: list[int], scenario_ids: list[int]) -> None:
        return None


@ -0,0 +1,67 @@
from __future__ import annotations

from collections.abc import Callable

from datatypes.epc.domain.epc_property_data import (
    EpcPropertyData,
    RenewableHeatIncentive,
)
from domain.property_baseline.property_baseline_performance import PropertyBaselinePerformance
from domain.property_baseline.performance import lodged_performance
from domain.property_baseline.rebaseliner import Rebaseliner
from repositories.unit_of_work import UnitOfWork


class PropertyBaselineOrchestrator:
    """Stage 2: establish each Property's Baseline Performance and persist it.

    Runs the whole batch in **one** Unit of Work and commits once (ADR-0012):
    for each property it hydrates the Property via the unit's PropertyRepo,
    resolves the Effective EPC, reads Lodged Performance off it, runs the
    Rebaseliner to produce Effective Performance, and persists the pair plus the
    deterministic kWh. Any property raising aborts the batch: the unit is left
    uncommitted, so nothing persists and the subtask fails noisily.

    Reads only from repos, never a Fetcher or HTTP (ADR-0003), so it is
    byte-identical whether Ingestion ran milliseconds ago (First Run) or last
    week. The injected Rebaseliner is the re-score-on-override seam (ADR-0011).
    """

    def __init__(
        self,
        *,
        unit_of_work: Callable[[], UnitOfWork],
        rebaseliner: Rebaseliner,
    ) -> None:
        self._unit_of_work = unit_of_work
        self._rebaseliner = rebaseliner

    def run(self, property_ids: list[int]) -> None:
        with self._unit_of_work() as uow:
            properties = uow.property.get_many(property_ids)
            for property_id, prop in zip(property_ids, properties, strict=True):
                effective_epc = prop.effective_epc
                lodged = lodged_performance(effective_epc)
                effective, reason = self._rebaseliner.rebaseline(
                    effective_epc, lodged
                )
                rhi = _require_rhi(effective_epc)
                baseline = PropertyBaselinePerformance(
                    lodged=lodged,
                    effective=effective,
                    rebaseline_reason=reason,
                    space_heating_kwh=rhi.space_heating_kwh,
                    water_heating_kwh=rhi.water_heating_kwh,
                )
                uow.property_baseline.save(baseline, property_id)
            uow.commit()


def _require_rhi(epc: EpcPropertyData) -> RenewableHeatIncentive:
    rhi = epc.renewable_heat_incentive
    if rhi is None:
        raise ValueError(
            "Effective EPC is missing renewable_heat_incentive; cannot read "
            "baseline space-heating / hot-water kWh"
        )
    return rhi
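
The docstring's claim that one failing property aborts the whole batch follows from committing only once, after the loop. A reduced sketch of that behaviour; `FakeUnit` and `baseline_batch` are hypothetical, and the rollback-on-exit behaviour is an assumption about how a unit of work typically behaves, not a statement about this repo's `UnitOfWork`:

```python
from __future__ import annotations

from typing import Optional


class FakeUnit:
    """Hypothetical unit: pending writes only persist on ``commit``."""

    def __init__(self) -> None:
        self.pending: list[tuple[int, float]] = []
        self.persisted: list[tuple[int, float]] = []

    def __enter__(self) -> FakeUnit:
        return self

    def __exit__(self, exc_type: object, exc: object, tb: object) -> bool:
        # Assumed behaviour: leaving the unit discards uncommitted work.
        self.pending.clear()
        return False  # never swallow the exception

    def save(self, property_id: int, kwh: float) -> None:
        self.pending.append((property_id, kwh))

    def commit(self) -> None:
        self.persisted.extend(self.pending)
        self.pending.clear()


def baseline_batch(unit: FakeUnit, kwh_by_property: dict[int, Optional[float]]) -> None:
    with unit:
        for property_id, kwh in kwh_by_property.items():
            if kwh is None:
                # Mirrors the missing-RHI guard: fail the whole batch.
                raise ValueError(f"property {property_id} has no kWh figure")
            unit.save(property_id, kwh)
        unit.commit()  # reached only if every property succeeded


bad_unit = FakeUnit()
try:
    baseline_batch(bad_unit, {1: 9000.0, 2: None})
    failed = False
except ValueError:
    failed = True

good_unit = FakeUnit()
baseline_batch(good_unit, {1: 9000.0})
```

Here property 1 is saved before property 2 raises, yet nothing is persisted for the failing batch, which matches the "nothing persists and the subtask fails noisily" wording.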


@ -0,0 +1,846 @@
from __future__ import annotations

from collections.abc import Sequence
from datetime import date
from typing import Optional, Protocol, TypeVar

from sqlmodel import Session, col, delete, select

from datatypes.epc.domain.epc import Epc
from datatypes.epc.domain.epc_property_data import (
    Addendum,
    BuildingPartIdentifier,
    EnergyElement,
    EpcPropertyData,
    InstantaneousWwhrs,
    MainHeatingDetail,
    PhotovoltaicSupply,
    PhotovoltaicSupplyNoneOrNoDetails,
    PvBatteries,
    PvBattery,
    RenewableHeatIncentive,
    SapAlternativeWall,
    SapBuildingPart,
    SapEnergySource,
    SapFlatDetails,
    SapFloorDimension,
    SapHeating,
    SapRoomInRoof,
    SapVentilation,
    SapWindow,
    ShowerOutlet,
    ShowerOutlets,
    WindowsTransmissionDetails,
    WindowTransmissionDetails,
    WindTurbineDetails,
)
from infrastructure.postgres.epc_property_table import (
    EpcBuildingPartModel,
    EpcEnergyElementModel,
    EpcFlatDetailsModel,
    EpcFloorDimensionModel,
    EpcMainHeatingDetailModel,
    EpcPropertyEnergyPerformanceModel,
    EpcPropertyModel,
    EpcRenewableHeatIncentiveModel,
    EpcWindowModel,
)
from repositories.epc.epc_repository import EpcRepository
from utilities.private import private

_T = TypeVar("_T")


def _require(value: Optional[_T], field: str) -> _T:
    if value is None:
        raise ValueError(f"epc_property row is missing required field {field!r}")
    return value


class _HasEpcPropertyId(Protocol):
    epc_property_id: int


_RowT = TypeVar("_RowT", bound=_HasEpcPropertyId)


def _group_by_epc(rows: Sequence[_RowT]) -> dict[int, list[_RowT]]:
    grouped: dict[int, list[_RowT]] = {}
    for row in rows:
        grouped.setdefault(row.epc_property_id, []).append(row)
    return grouped


class EpcPostgresRepository(EpcRepository):
    """Maps EpcPropertyData to/from the epc_property parent row + child tables.

    Round-trip fidelity over the persisted projection is pinned by the Slice-1
    round-trip test (Hestia-Homes/Model#1129). Fields the schema does not yet
    store (see docs/migrations/epc-property-round-trip-fidelity.md §2) reconstruct
    as their dataclass defaults; these are tracked as follow-up migrations.
    """

    def __init__(self, session: Session) -> None:
        self._session = session

    def save(
        self,
        data: EpcPropertyData,
        property_id: Optional[int] = None,
        portfolio_id: Optional[int] = None,
    ) -> int:
        # Idempotent on property_id: a re-run replaces the property's EPC graph
        # rather than duplicating it (ADR-0012). Anonymous saves (no property_id)
        # always insert.
        if property_id is not None:
            self._delete_for_property(property_id)
        parent = EpcPropertyModel.from_epc_property_data(
            data, property_id=property_id, portfolio_id=portfolio_id
        )
        self._session.add(parent)
        # Flush so the parent's primary key is assigned before children use it.
        self._session.flush()
        epc_property_id = _require(parent.id, "id")
        self._session.add(
            EpcPropertyEnergyPerformanceModel.from_epc_property_data(
                data, epc_property_id=epc_property_id
            )
        )
        for detail in data.sap_heating.main_heating_details:
            self._session.add(
                EpcMainHeatingDetailModel.from_domain(detail, epc_property_id)
            )
        for part in data.sap_building_parts:
            bp = EpcBuildingPartModel.from_domain(part, epc_property_id)
            self._session.add(bp)
            # Flush per part: floor dimensions reference the building part's id.
            self._session.flush()
            bp_id = _require(bp.id, "epc_building_part.id")
            for dim in part.sap_floor_dimensions:
                self._session.add(EpcFloorDimensionModel.from_domain(dim, bp_id))
        for window in data.sap_windows:
            self._session.add(EpcWindowModel.from_domain(window, epc_property_id))
        # Multi-valued energy elements.
        for element_type, elements in (
            ("roof", data.roofs),
            ("wall", data.walls),
            ("floor", data.floors),
            ("main_heating", data.main_heating),
        ):
            for el in elements:
                self._session.add(
                    EpcEnergyElementModel.from_domain(el, element_type, epc_property_id)
                )
        # Single-valued, optional energy elements.
        for el, element_type in (
            (data.window, "window"),
            (data.lighting, "lighting"),
            (data.hot_water, "hot_water"),
            (data.secondary_heating, "secondary_heating"),
            (data.main_heating_controls, "main_heating_controls"),
        ):
            if el is not None:
                self._session.add(
                    EpcEnergyElementModel.from_domain(el, element_type, epc_property_id)
                )
        if data.sap_flat_details is not None:
            self._session.add(
                EpcFlatDetailsModel.from_domain(data.sap_flat_details, epc_property_id)
            )
        if data.renewable_heat_incentive is not None:
            self._session.add(
                EpcRenewableHeatIncentiveModel.from_domain(
                    data.renewable_heat_incentive, epc_property_id
                )
            )
        return epc_property_id

    def _delete_for_property(self, property_id: int) -> None:
        """Remove the property's existing EPC graph (parent + child tables) so a
        re-save replaces rather than duplicates (ADR-0012)."""
        epc_ids = [
            i
            for i in self._session.exec(
                select(EpcPropertyModel.id).where(
                    EpcPropertyModel.property_id == property_id
                )
            ).all()
            if i is not None
        ]
        if not epc_ids:
            return
        part_ids = [
            i
            for i in self._session.exec(
                select(EpcBuildingPartModel.id).where(
                    col(EpcBuildingPartModel.epc_property_id).in_(epc_ids)
                )
            ).all()
            if i is not None
        ]
        if part_ids:
            self._session.exec(  # type: ignore[call-overload]
                delete(EpcFloorDimensionModel).where(
                    col(EpcFloorDimensionModel.epc_building_part_id).in_(part_ids)
                )
            )
        for child in (
            EpcPropertyEnergyPerformanceModel,
            EpcEnergyElementModel,
            EpcMainHeatingDetailModel,
            EpcBuildingPartModel,
            EpcWindowModel,
            EpcFlatDetailsModel,
            EpcRenewableHeatIncentiveModel,
        ):
            self._session.exec(  # type: ignore[call-overload]
                delete(child).where(col(child.epc_property_id).in_(epc_ids))
            )
        self._session.exec(  # type: ignore[call-overload]
            delete(EpcPropertyModel).where(col(EpcPropertyModel.id).in_(epc_ids))
        )

    def get_for_property(self, property_id: int) -> Optional[EpcPropertyData]:
        row = self._session.exec(
            select(EpcPropertyModel)
            .where(EpcPropertyModel.property_id == property_id)
            .order_by(EpcPropertyModel.id)  # type: ignore[arg-type]
        ).first()
        if row is None or row.id is None:
            return None
        return self.get(row.id)

    def get_for_properties(
        self, property_ids: list[int]
    ) -> dict[int, EpcPropertyData]:
        """Bulk-hydrate a batch's EPCs in a handful of per-table IN queries
        (ADR-0012), not N x per-property. Load-whole per ADR-0002."""
        if not property_ids:
            return {}
        parents = self._session.exec(
            select(EpcPropertyModel)
            .where(col(EpcPropertyModel.property_id).in_(property_ids))
            .order_by(EpcPropertyModel.id)  # type: ignore[arg-type]
        ).all()
        parent_by_property: dict[int, EpcPropertyModel] = {}
        for parent in parents:
            if parent.property_id is not None and parent.id is not None:
                # Rows are ordered by id, so setdefault keeps the lowest-id EPC
                # per property, matching get_for_property.
                parent_by_property.setdefault(parent.property_id, parent)
        epc_ids = [p.id for p in parent_by_property.values() if p.id is not None]
        if not epc_ids:
            return {}
        perf_by = {
            r.epc_property_id: r
            for r in self._session.exec(
                select(EpcPropertyEnergyPerformanceModel).where(
                    col(EpcPropertyEnergyPerformanceModel.epc_property_id).in_(epc_ids)
                )
            ).all()
        }
        flat_by = {
            r.epc_property_id: r
            for r in self._session.exec(
                select(EpcFlatDetailsModel).where(
                    col(EpcFlatDetailsModel.epc_property_id).in_(epc_ids)
                )
            ).all()
        }
        rhi_by = {
            r.epc_property_id: r
            for r in self._session.exec(
                select(EpcRenewableHeatIncentiveModel).where(
                    col(EpcRenewableHeatIncentiveModel.epc_property_id).in_(epc_ids)
                )
            ).all()
        }
        elements_by = _group_by_epc(
            self._session.exec(
                select(EpcEnergyElementModel)
                .where(col(EpcEnergyElementModel.epc_property_id).in_(epc_ids))
                .order_by(EpcEnergyElementModel.id)  # type: ignore[arg-type]
            ).all()
        )
        heating_by = _group_by_epc(
            self._session.exec(
                select(EpcMainHeatingDetailModel)
                .where(col(EpcMainHeatingDetailModel.epc_property_id).in_(epc_ids))
                .order_by(EpcMainHeatingDetailModel.id)  # type: ignore[arg-type]
            ).all()
        )
        parts_by = _group_by_epc(
            self._session.exec(
                select(EpcBuildingPartModel)
                .where(col(EpcBuildingPartModel.epc_property_id).in_(epc_ids))
                .order_by(EpcBuildingPartModel.id)  # type: ignore[arg-type]
            ).all()
        )
        windows_by = _group_by_epc(
            self._session.exec(
                select(EpcWindowModel)
                .where(col(EpcWindowModel.epc_property_id).in_(epc_ids))
                .order_by(EpcWindowModel.id)  # type: ignore[arg-type]
            ).all()
        )
        part_ids = [
            bp.id
            for parts in parts_by.values()
            for bp in parts
            if bp.id is not None
        ]
        floor_dims_by_part = self._floor_dims_by_part(part_ids)
        result: dict[int, EpcPropertyData] = {}
        for property_id, parent in parent_by_property.items():
            epc_id = _require(parent.id, "id")
            result[property_id] = self._compose(
                p=parent,
                perf=perf_by.get(epc_id),
                elements=elements_by.get(epc_id, []),
                heating_rows=heating_by.get(epc_id, []),
                part_rows=parts_by.get(epc_id, []),
                floor_dims_by_part=floor_dims_by_part,
                window_rows=windows_by.get(epc_id, []),
                flat_row=flat_by.get(epc_id),
                rhi_row=rhi_by.get(epc_id),
            )
        return result

    def _floor_dims_by_part(
        self, part_ids: list[int]
    ) -> dict[int, list[EpcFloorDimensionModel]]:
        if not part_ids:
            return {}
        rows = self._session.exec(
            select(EpcFloorDimensionModel)
            .where(col(EpcFloorDimensionModel.epc_building_part_id).in_(part_ids))
            .order_by(EpcFloorDimensionModel.id)  # type: ignore[arg-type]
        ).all()
        grouped: dict[int, list[EpcFloorDimensionModel]] = {}
        for row in rows:
            grouped.setdefault(row.epc_building_part_id, []).append(row)
        return grouped

    def get(self, epc_property_id: int) -> EpcPropertyData:
        p = self._session.get(EpcPropertyModel, epc_property_id)
        if p is None:
            raise ValueError(f"epc_property {epc_property_id} not found")
        perf = self._session.exec(
            select(EpcPropertyEnergyPerformanceModel).where(
                EpcPropertyEnergyPerformanceModel.epc_property_id == epc_property_id
            )
        ).first()
        elements = list(
            self._session.exec(
                select(EpcEnergyElementModel)
                .where(EpcEnergyElementModel.epc_property_id == epc_property_id)
                .order_by(EpcEnergyElementModel.id)  # type: ignore[arg-type]
            ).all()
        )
        heating_rows = list(
            self._session.exec(
                select(EpcMainHeatingDetailModel)
                .where(EpcMainHeatingDetailModel.epc_property_id == epc_property_id)
                .order_by(EpcMainHeatingDetailModel.id)  # type: ignore[arg-type]
            ).all()
        )
        part_rows = list(
            self._session.exec(
                select(EpcBuildingPartModel)
                .where(EpcBuildingPartModel.epc_property_id == epc_property_id)
                .order_by(EpcBuildingPartModel.id)  # type: ignore[arg-type]
            ).all()
        )
        flat_row = self._session.exec(
            select(EpcFlatDetailsModel).where(
                EpcFlatDetailsModel.epc_property_id == epc_property_id
            )
        ).first()
        rhi_row = self._session.exec(
            select(EpcRenewableHeatIncentiveModel).where(
                EpcRenewableHeatIncentiveModel.epc_property_id == epc_property_id
            )
        ).first()
        window_rows = self._windows(epc_property_id)
        floor_dims_by_part = self._floor_dims_by_part(
            [bp.id for bp in part_rows if bp.id is not None]
        )
        return self._compose(
            p=p,
            perf=perf,
            elements=elements,
            heating_rows=heating_rows,
            part_rows=part_rows,
            floor_dims_by_part=floor_dims_by_part,
            window_rows=window_rows,
            flat_row=flat_row,
            rhi_row=rhi_row,
        )

    def _compose(
        self,
        *,
        p: EpcPropertyModel,
        perf: Optional[EpcPropertyEnergyPerformanceModel],
        elements: list[EpcEnergyElementModel],
        heating_rows: list[EpcMainHeatingDetailModel],
        part_rows: list[EpcBuildingPartModel],
        floor_dims_by_part: dict[int, list[EpcFloorDimensionModel]],
        window_rows: list[EpcWindowModel],
        flat_row: Optional[EpcFlatDetailsModel],
        rhi_row: Optional[EpcRenewableHeatIncentiveModel],
    ) -> EpcPropertyData:
        def _elements(element_type: str) -> list[EnergyElement]:
            return [self._to_energy_element(e) for e in elements if e.element_type == element_type]

        def _single(element_type: str) -> Optional[EnergyElement]:
            found = _elements(element_type)
            return found[0] if found else None

        return EpcPropertyData(
            dwelling_type=p.dwelling_type,
            inspection_date=date.fromisoformat(p.inspection_date),
            tenure=p.tenure,
            transaction_type=p.transaction_type,
            address_line_1=_require(p.address_line_1, "address_line_1"),
            postcode=_require(p.postcode, "postcode"),
            post_town=_require(p.post_town, "post_town"),
            roofs=_elements("roof"),
            walls=_elements("wall"),
            floors=_elements("floor"),
            main_heating=_elements("main_heating"),
            door_count=p.door_count,
            sap_heating=self._to_sap_heating(p, heating_rows),
            sap_windows=[self._to_window(w) for w in window_rows],
            sap_energy_source=self._to_energy_source(p),
            sap_building_parts=[
                self._to_building_part(
                    bp, floor_dims_by_part.get(bp.id, []) if bp.id is not None else []
                )
                for bp in part_rows
            ],
            solar_water_heating=p.solar_water_heating,
            has_hot_water_cylinder=p.has_hot_water_cylinder,
            has_fixed_air_conditioning=p.has_fixed_air_conditioning,
            wet_rooms_count=p.wet_rooms_count,
            extensions_count=p.extensions_count,
            heated_rooms_count=p.heated_rooms_count,
            open_chimneys_count=p.open_chimneys_count,
            habitable_rooms_count=p.habitable_rooms_count,
            insulated_door_count=p.insulated_door_count,
            cfl_fixed_lighting_bulbs_count=p.cfl_fixed_lighting_bulbs_count,
            led_fixed_lighting_bulbs_count=p.led_fixed_lighting_bulbs_count,
            incandescent_fixed_lighting_bulbs_count=p.incandescent_fixed_lighting_bulbs_count,
            total_floor_area_m2=p.total_floor_area_m2,
            assessment_type=p.assessment_type,
            sap_version=p.sap_version,
            uprn=p.uprn,
            status=p.status,
            window=_single("window"),
            lighting=_single("lighting"),
            hot_water=_single("hot_water"),
            secondary_heating=_single("secondary_heating"),
            main_heating_controls=_single("main_heating_controls"),
            schema_type=p.schema_type,
            schema_versions_original=p.schema_versions_original,
            report_type=p.report_type,
            report_reference=p.report_reference,
            uprn_source=p.uprn_source,
            address_line_2=p.address_line_2,
            region_code=p.region_code,
            country_code=p.country_code,
            built_form=p.built_form,
            property_type=p.property_type,
            pressure_test=p.pressure_test,
            language_code=p.language_code,
            completion_date=(
                date.fromisoformat(p.completion_date) if p.completion_date else None
            ),
            registration_date=(
                date.fromisoformat(p.registration_date)
                if p.registration_date
                else None
            ),
            measurement_type=p.measurement_type,
            conservatory_type=p.conservatory_type,
            has_conservatory=p.has_conservatory,
            has_heated_separate_conservatory=p.has_heated_separate_conservatory,
            blocked_chimneys_count=p.blocked_chimneys_count,
            energy_rating_average=p.energy_rating_average,
            current_energy_efficiency_band=(
                Epc(perf.current_energy_efficiency_band)
                if perf and perf.current_energy_efficiency_band
                else None
            ),
            environmental_impact_current=(
                perf.environmental_impact_current if perf else None
            ),
            heating_cost_current=perf.heating_cost_current if perf else None,
            co2_emissions_current=perf.co2_emissions_current if perf else None,
            energy_consumption_current=(
                perf.energy_consumption_current if perf else None
            ),
            energy_rating_current=perf.energy_rating_current if perf else None,
            lighting_cost_current=perf.lighting_cost_current if perf else None,
            hot_water_cost_current=perf.hot_water_cost_current if perf else None,
            insulated_door_u_value=p.insulated_door_u_value,
            mechanical_ventilation=p.mechanical_ventilation,
            percent_draughtproofed=p.percent_draughtproofed,
            heating_cost_potential=perf.heating_cost_potential if perf else None,
            co2_emissions_potential=perf.co2_emissions_potential if perf else None,
            energy_consumption_potential=(
                perf.energy_consumption_potential if perf else None
            ),
            energy_rating_potential=perf.energy_rating_potential if perf else None,
            lighting_cost_potential=perf.lighting_cost_potential if perf else None,
            hot_water_cost_potential=perf.hot_water_cost_potential if perf else None,
            environmental_impact_potential=(
                perf.environmental_impact_potential if perf else None
            ),
            potential_energy_efficiency_band=(
                Epc(perf.potential_energy_efficiency_band)
                if perf and perf.potential_energy_efficiency_band
                else None
            ),
            draughtproofed_door_count=p.draughtproofed_door_count,
            mechanical_vent_duct_type=p.mechanical_vent_duct_type,
            windows_transmission_details=(
                WindowsTransmissionDetails(
                    u_value=p.windows_transmission_u_value,
                    data_source=_require(
                        p.windows_transmission_data_source,
                        "windows_transmission_data_source",
                    ),
                    solar_transmittance=_require(
                        p.windows_transmission_solar_transmittance,
                        "windows_transmission_solar_transmittance",
                    ),
                )
                if p.windows_transmission_u_value is not None
                else None
            ),
            multiple_glazed_proportion=p.multiple_glazed_proportion,
            calculation_software_version=p.calculation_software_version,
            mechanical_vent_duct_placement=p.mechanical_vent_duct_placement,
            mechanical_vent_duct_insulation=p.mechanical_vent_duct_insulation,
            pressure_test_certificate_number=p.pressure_test_certificate_number,
            mechanical_ventilation_index_number=p.mechanical_ventilation_index_number,
            mechanical_vent_measured_installation=p.mechanical_vent_measured_installation,
            co2_emissions_current_per_floor_area=(
                perf.co2_emissions_current_per_floor_area if perf else None
            ),
            low_energy_fixed_lighting_bulbs_count=p.low_energy_fixed_lighting_bulbs_count,
            sap_flat_details=(
                self._to_flat_details(flat_row) if flat_row is not None else None
            ),
            fixed_lighting_outlets_count=p.fixed_lighting_outlets_count,
            low_energy_fixed_lighting_outlets_count=p.low_energy_fixed_lighting_outlets_count,
            sap_ventilation=self._to_ventilation(p),
            number_of_storeys=p.number_of_storeys,
            any_unheated_rooms=p.any_unheated_rooms,
            waste_water_heat_recovery=p.waste_water_heat_recovery,
            hydro=p.hydro,
            photovoltaic_array=p.photovoltaic_array,
            renewable_heat_incentive=(
                RenewableHeatIncentive(
                    space_heating_kwh=rhi_row.space_heating_kwh,
                    water_heating_kwh=rhi_row.water_heating_kwh,
                    impact_of_loft_insulation_kwh=rhi_row.impact_of_loft_insulation_kwh,
                    impact_of_cavity_insulation_kwh=rhi_row.impact_of_cavity_insulation_kwh,
                    impact_of_solid_wall_insulation_kwh=rhi_row.impact_of_solid_wall_insulation_kwh,
                )
                if rhi_row is not None
                else None
            ),
            mechanical_vent_duct_insulation_level=p.mechanical_vent_duct_insulation_level,
            addendum=(
                Addendum(
                    stone_walls=p.addendum_stone_walls,
                    system_build=p.addendum_system_build,
                    addendum_numbers=p.addendum_numbers,
                )
                if (
                    p.addendum_stone_walls is not None
                    or p.addendum_system_build is not None
                    or p.addendum_numbers is not None
                )
                else None
            ),
        )

    @private
    def _windows(self, epc_property_id: int) -> list[EpcWindowModel]:
        return list(
            self._session.exec(
                select(EpcWindowModel)
                .where(EpcWindowModel.epc_property_id == epc_property_id)
                .order_by(EpcWindowModel.id)  # type: ignore[arg-type]
            ).all()
        )

    @private
    def _to_energy_element(self, e: EpcEnergyElementModel) -> EnergyElement:
        return EnergyElement(
            description=e.description,
            energy_efficiency_rating=e.energy_efficiency_rating,
            environmental_efficiency_rating=e.environmental_efficiency_rating,
        )

    @private
    def _to_sap_heating(
        self, p: EpcPropertyModel, heating_rows: list[EpcMainHeatingDetailModel]
    ) -> SapHeating:
        shower_outlets = (
            ShowerOutlets(
                shower_outlet=ShowerOutlet(
                    shower_outlet_type=p.heating_shower_outlet_type,
                    shower_wwhrs=p.heating_shower_wwhrs,
                )
            )
            if p.heating_shower_outlet_type is not None
            else None
        )
        return SapHeating(
            instantaneous_wwhrs=InstantaneousWwhrs(
                wwhrs_index_number1=p.heating_wwhrs_index_number_1,
                wwhrs_index_number2=p.heating_wwhrs_index_number_2,
            ),
            main_heating_details=[self._to_main_heating(m) for m in heating_rows],
            has_fixed_air_conditioning=p.has_fixed_air_conditioning,
            cylinder_size=p.heating_cylinder_size,
            water_heating_code=p.heating_water_heating_code,
            water_heating_fuel=p.heating_water_heating_fuel,
            immersion_heating_type=p.heating_immersion_heating_type,
            shower_outlets=shower_outlets,
            cylinder_insulation_type=p.heating_cylinder_insulation_type,
            cylinder_thermostat=p.heating_cylinder_thermostat,
            secondary_fuel_type=p.heating_secondary_fuel_type,
            secondary_heating_type=p.heating_secondary_heating_type,
            cylinder_insulation_thickness_mm=p.heating_cylinder_insulation_thickness_mm,
            number_baths=p.heating_number_baths,
            number_baths_wwhrs=p.heating_number_baths_wwhrs,
            electric_shower_count=p.heating_electric_shower_count,
            mixer_shower_count=p.heating_mixer_shower_count,
        )

    @private
    def _to_main_heating(self, m: EpcMainHeatingDetailModel) -> MainHeatingDetail:
        return MainHeatingDetail(
            has_fghrs=m.has_fghrs,
            main_fuel_type=m.main_fuel_type,
            heat_emitter_type=m.heat_emitter_type,
            emitter_temperature=m.emitter_temperature,
            main_heating_control=m.main_heating_control,
            fan_flue_present=m.fan_flue_present,
            boiler_flue_type=m.boiler_flue_type,
            boiler_ignition_type=m.boiler_ignition_type,
            central_heating_pump_age=m.central_heating_pump_age,
            central_heating_pump_age_str=m.central_heating_pump_age_str,
            main_heating_index_number=m.main_heating_index_number,
            sap_main_heating_code=m.sap_main_heating_code,
            main_heating_number=m.main_heating_number,
            main_heating_category=m.main_heating_category,
            main_heating_fraction=m.main_heating_fraction,
            main_heating_data_source=m.main_heating_data_source,
            condensing=m.condensing,
            weather_compensator=m.weather_compensator,
        )

    @private
    def _to_window(self, w: EpcWindowModel) -> SapWindow:
        return SapWindow(
            frame_material=w.frame_material,
            glazing_gap=w.glazing_gap,
            orientation=w.orientation,
            window_type=w.window_type,
            glazing_type=w.glazing_type,
            window_width=w.window_width,
            window_height=w.window_height,
            draught_proofed=w.draught_proofed,
            window_location=w.window_location,
            window_wall_type=w.window_wall_type,
            permanent_shutters_present=w.permanent_shutters_present,
            frame_factor=w.frame_factor,
            window_transmission_details=(
                WindowTransmissionDetails(
                    u_value=w.transmission_u_value,
                    data_source=_require(
                        w.transmission_data_source, "window.transmission_data_source"
                    ),
                    solar_transmittance=_require(
                        w.transmission_solar_transmittance,
                        "window.transmission_solar_transmittance",
                    ),
                )
                if w.transmission_u_value is not None
                else None
            ),
            permanent_shutters_insulated=w.permanent_shutters_insulated,
        )

    @private
    def _to_building_part(
        self, bp: EpcBuildingPartModel, floor_rows: list[EpcFloorDimensionModel]
    ) -> SapBuildingPart:
        return SapBuildingPart(
            identifier=BuildingPartIdentifier(bp.identifier),
            construction_age_band=bp.construction_age_band,
            wall_construction=bp.wall_construction,
            wall_insulation_type=bp.wall_insulation_type,
            wall_thickness_measured=bp.wall_thickness_measured,
            party_wall_construction=bp.party_wall_construction,
            sap_floor_dimensions=[self._to_floor_dimension(f) for f in floor_rows],
            building_part_number=bp.building_part_number,
            wall_dry_lined=bp.wall_dry_lined,
            wall_thickness_mm=bp.wall_thickness_mm,
            wall_insulation_thickness=bp.wall_insulation_thickness,
            sap_alternative_wall_1=self._to_alt_wall(bp, 1),
            sap_alternative_wall_2=self._to_alt_wall(bp, 2),
            floor_heat_loss=bp.floor_heat_loss,
            floor_insulation_thickness=bp.floor_insulation_thickness,
            flat_roof_insulation_thickness=bp.flat_roof_insulation_thickness,
            floor_type=bp.floor_type,
            floor_construction_type=bp.floor_construction_type,
            floor_insulation_type_str=bp.floor_insulation_type_str,
            floor_u_value_known=bp.floor_u_value_known,
            roof_construction=bp.roof_construction,
            roof_construction_type=bp.roof_construction_type,
            curtain_wall_age=bp.curtain_wall_age,
            roof_insulation_location=bp.roof_insulation_location,
            roof_insulation_thickness=bp.roof_insulation_thickness,
            sap_room_in_roof=(
                SapRoomInRoof(
                    floor_area=bp.room_in_roof_floor_area,
                    construction_age_band=_require(
                        bp.room_in_roof_construction_age_band,
                        "room_in_roof_construction_age_band",
                    ),
                )
                if bp.room_in_roof_floor_area is not None
                else None
            ),
        )

    @private
    def _to_alt_wall(
        self, bp: EpcBuildingPartModel, n: int
    ) -> Optional[SapAlternativeWall]:
        area = bp.alt_wall_1_area if n == 1 else bp.alt_wall_2_area
        if area is None:
            return None
        dry_lined = bp.alt_wall_1_dry_lined if n == 1 else bp.alt_wall_2_dry_lined
        construction = (
            bp.alt_wall_1_construction if n == 1 else bp.alt_wall_2_construction
        )
        insulation_type = (
            bp.alt_wall_1_insulation_type if n == 1 else bp.alt_wall_2_insulation_type
        )
        thickness_measured = (
            bp.alt_wall_1_thickness_measured
            if n == 1
            else bp.alt_wall_2_thickness_measured
        )
        insulation_thickness = (
            bp.alt_wall_1_insulation_thickness
            if n == 1
            else bp.alt_wall_2_insulation_thickness
        )
        return SapAlternativeWall(
            wall_area=area,
            wall_dry_lined=_require(dry_lined, f"alt_wall_{n}_dry_lined"),
            wall_construction=_require(construction, f"alt_wall_{n}_construction"),
            wall_insulation_type=_require(
                insulation_type, f"alt_wall_{n}_insulation_type"
            ),
            wall_thickness_measured=_require(
                thickness_measured, f"alt_wall_{n}_thickness_measured"
            ),
            wall_insulation_thickness=insulation_thickness,
        )

    @private
    def _to_floor_dimension(self, f: EpcFloorDimensionModel) -> SapFloorDimension:
        return SapFloorDimension(
            room_height_m=f.room_height_m,
            total_floor_area_m2=f.total_floor_area_m2,
            party_wall_length_m=f.party_wall_length_m,
            heat_loss_perimeter_m=f.heat_loss_perimeter_m,
            floor=f.floor,
            floor_insulation=f.floor_insulation,
            floor_construction=f.floor_construction,
        )

    @private
    def _to_energy_source(self, p: EpcPropertyModel) -> SapEnergySource:
        return SapEnergySource(
            mains_gas=p.energy_mains_gas,
            meter_type=p.energy_meter_type,
            pv_battery_count=p.energy_pv_battery_count,
            wind_turbines_count=p.energy_wind_turbines_count,
            gas_smart_meter_present=p.energy_gas_smart_meter_present,
            is_dwelling_export_capable=p.energy_is_dwelling_export_capable,
            wind_turbines_terrain_type=p.energy_wind_turbines_terrain_type,
            electricity_smart_meter_present=p.energy_electricity_smart_meter_present,
            pv_connection=p.energy_pv_connection,
            photovoltaic_supply=(
                PhotovoltaicSupply(
                    none_or_no_details=PhotovoltaicSupplyNoneOrNoDetails(
                        percent_roof_area=p.energy_pv_percent_roof_area
                    )
                )
                if p.energy_pv_percent_roof_area is not None
                else None
            ),
            wind_turbine_details=(
                WindTurbineDetails(
                    hub_height=p.energy_wind_turbine_hub_height,
                    rotor_diameter=_require(
                        p.energy_wind_turbine_rotor_diameter,
                        "energy_wind_turbine_rotor_diameter",
                    ),
                )
                if p.energy_wind_turbine_hub_height is not None
                else None
            ),
            pv_batteries=(
                PvBatteries(
                    pv_battery=PvBattery(battery_capacity=p.energy_pv_battery_capacity)
                )
                if p.energy_pv_battery_capacity is not None
                else None
            ),
        )

    @private
    def _to_ventilation(self, p: EpcPropertyModel) -> Optional[SapVentilation]:
        if not p.ventilation_present:
            return None
        return SapVentilation(
            ventilation_type=p.ventilation_type,
            draught_lobby=p.ventilation_draught_lobby,
            pressure_test=p.ventilation_pressure_test,
            open_flues_count=p.ventilation_open_flues_count,
            closed_flues_count=p.ventilation_closed_flues_count,
            boiler_flues_count=p.ventilation_boiler_flues_count,
            other_flues_count=p.ventilation_other_flues_count,
            extract_fans_count=p.ventilation_extract_fans_count,
            passive_vents_count=p.ventilation_passive_vents_count,
            flueless_gas_fires_count=p.ventilation_flueless_gas_fires_count,
            ventilation_in_pcdf_database=p.ventilation_in_pcdf_database,
            sheltered_sides=p.ventilation_sheltered_sides,
            has_suspended_timber_floor=p.ventilation_has_suspended_timber_floor,
            suspended_timber_floor_sealed=p.ventilation_suspended_timber_floor_sealed,
            has_draught_lobby=p.ventilation_has_draught_lobby,
            air_permeability_ap4_m3_h_m2=p.ventilation_air_permeability_ap4_m3_h_m2,
            mechanical_ventilation_kind=p.ventilation_mechanical_ventilation_kind,
        )

    @private
    def _to_flat_details(self, f: EpcFlatDetailsModel) -> SapFlatDetails:
        return SapFlatDetails(
            level=f.level,
            top_storey=f.top_storey,
            flat_location=f.flat_location,
            heat_loss_corridor=f.heat_loss_corridor,
            storey_count=f.storey_count,
            unheated_corridor_length_m=f.unheated_corridor_length_m,
        )
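
Bulk hydration relies on `_group_by_epc` to fan child rows back out to their parents after one IN query per table. The following is a minimal sketch of that step using plain dataclasses. `WindowRow` and the sample ids are made up for illustration and stand in for the SQLModel rows; nothing here touches a database.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class WindowRow:
    """Hypothetical stand-in for a child-table row such as EpcWindowModel."""

    id: int
    epc_property_id: int


def group_by_epc(rows: list[WindowRow]) -> dict[int, list[WindowRow]]:
    # Same idea as _group_by_epc: one pass, preserving the rows' incoming order.
    grouped: dict[int, list[WindowRow]] = {}
    for row in rows:
        grouped.setdefault(row.epc_property_id, []).append(row)
    return grouped


# Pretend this came from one "WHERE epc_property_id IN (...) ORDER BY id" query.
rows = [WindowRow(1, 10), WindowRow(2, 11), WindowRow(3, 10)]
windows_by = group_by_epc(rows)

# Parents with no child rows are simply absent, hence .get(epc_id, []) at the call site.
windows_for_10 = [w.id for w in windows_by.get(10, [])]
windows_for_12 = windows_by.get(12, [])
```

Because the query is ordered by id and the grouping is a single pass, each parent's children keep their insertion order, which is what lets the bulk path reconstruct the same lists as the per-EPC `get`.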

@ -0,0 +1,38 @@
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Optional

from datatypes.epc.domain.epc_property_data import EpcPropertyData


class EpcRepository(ABC):
    """Persists and loads the structured EPC Property Data slice.

    `save` writes the `EpcPropertyData` to the `epc_property` parent row and its
    child tables; `get` reconstructs the persisted projection back into an
    `EpcPropertyData`. Round-trip fidelity over that projection is pinned by the
    Slice-1 round-trip test (Hestia-Homes/Model#1129).
    """

    @abstractmethod
    def save(
        self,
        data: EpcPropertyData,
        property_id: int | None = None,
        portfolio_id: int | None = None,
    ) -> int: ...

    @abstractmethod
    def get(self, epc_property_id: int) -> EpcPropertyData: ...

    @abstractmethod
    def get_for_property(self, property_id: int) -> Optional[EpcPropertyData]: ...

    @abstractmethod
    def get_for_properties(
        self, property_ids: list[int]
    ) -> dict[int, EpcPropertyData]:
        """Bulk-hydrate a batch's EPCs, keyed by property_id (only those with an
        EPC are present). A handful of per-table queries, not N per property."""
        ...
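
For tests, the contract can be exercised with an in-memory stand-in. The sketch below is an assumption about how such a fake might look, not something taken from the repository: `InMemoryEpcStore` is a hypothetical name, and it stores opaque strings instead of `EpcPropertyData` so that it runs without the project's types. It illustrates two documented behaviours: a save keyed by `property_id` replaces while an anonymous save always inserts, and `get_for_properties` returns only the properties that have an EPC.

```python
from __future__ import annotations

from itertools import count
from typing import Optional


class InMemoryEpcStore:
    """Hypothetical fake; payloads are plain strings rather than EpcPropertyData."""

    def __init__(self) -> None:
        self._ids = count(1)
        self._data: dict[int, str] = {}  # epc_property_id -> payload
        self._owner: dict[int, Optional[int]] = {}  # epc_property_id -> property_id

    def save(self, data: str, property_id: Optional[int] = None) -> int:
        if property_id is not None:
            # Replace: drop any EPC already stored for this property.
            for epc_id in [e for e, p in self._owner.items() if p == property_id]:
                del self._data[epc_id]
                del self._owner[epc_id]
        epc_id = next(self._ids)
        self._data[epc_id] = data
        self._owner[epc_id] = property_id
        return epc_id

    def get_for_properties(self, property_ids: list[int]) -> dict[int, str]:
        wanted = set(property_ids)
        return {
            p: self._data[e]
            for e, p in self._owner.items()
            if p is not None and p in wanted
        }


store = InMemoryEpcStore()
store.save("first lodgement", property_id=7)
store.save("re-run", property_id=7)  # replaces, does not duplicate
store.save("anonymous")  # always inserts
store.save("anonymous")
found = store.get_for_properties([7, 8])
row_count = len(store._data)
```

Property 8 has no EPC, so it is absent from `found` rather than mapped to `None`; callers use `.get(property_id)` to tolerate that.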

@ -0,0 +1,17 @@
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Optional

from domain.geospatial.coordinates import Coordinates


class GeospatialRepository(ABC):
    """Resolves a Property's coordinates from hosted reference data by UPRN.

    A Repo, not a Fetcher (ADR-0011): it reads stored Ordnance Survey Open-UPRN
    data, with no live API call. Returns None when the UPRN is not covered.
    """

    @abstractmethod
    def coordinates_for(self, uprn: int) -> Optional[Coordinates]: ...

@ -0,0 +1,43 @@
from __future__ import annotations

from collections.abc import Callable
from typing import Optional

import pandas as pd

from domain.geospatial.coordinates import Coordinates
from repositories.geospatial.geospatial_repository import GeospatialRepository

ParquetReader = Callable[[str], pd.DataFrame]

_META_KEY = "spatial/filename_meta.parquet"


class GeospatialS3Repository(GeospatialRepository):
    """Reads the partitioned Ordnance Survey Open-UPRN parquet dataset.

    `spatial/filename_meta.parquet` maps a UPRN range (lower/upper) to a
    partition file; that partition carries `UPRN`/`LATITUDE`/`LONGITUDE`. The
    parquet reader is injected so the dataset can be sourced from S3 in
    production or a fixture directory in tests; the Repo holds no S3/HTTP code.
    """

    def __init__(self, read_parquet: ParquetReader) -> None:
        self._read_parquet = read_parquet

    def coordinates_for(self, uprn: int) -> Optional[Coordinates]:
        # Step 1: find the partition whose [lower, upper] range covers the UPRN.
        meta = self._read_parquet(_META_KEY)
        covering = meta[(meta["lower"] <= uprn) & (meta["upper"] >= uprn)]
        if covering.empty:
            return None
        filename = str(covering["filenames"].iloc[0])
        # Step 2: look the UPRN up inside that partition.
        partition = self._read_parquet(f"spatial/{filename}")
        rows = partition[partition["UPRN"] == uprn]
        if rows.empty:
            return None
        row = rows.iloc[0]
        return Coordinates(
            longitude=float(row["LONGITUDE"]),
            latitude=float(row["LATITUDE"]),
        )
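
The two-step lookup (range metadata first, then the partition) and the injected reader can be tried without S3 or pandas. The sketch below is a pandas-free stand-in, which is a deliberate swap and not how the repository itself reads data: the reader returns lists of dicts, `FIXTURES` and the `part-*.parquet` names are invented, and the result is a plain `(longitude, latitude)` tuple instead of `Coordinates`.

```python
from __future__ import annotations

from typing import Any, Callable, Optional

Row = dict[str, Any]
RowReader = Callable[[str], list[Row]]

# Hypothetical fixture data keyed the way the real reader is keyed (by object key).
FIXTURES: dict[str, list[Row]] = {
    "spatial/filename_meta.parquet": [
        {"lower": 1, "upper": 100, "filenames": "part-0.parquet"},
        {"lower": 101, "upper": 200, "filenames": "part-1.parquet"},
    ],
    "spatial/part-0.parquet": [{"UPRN": 42, "LATITUDE": 51.5, "LONGITUDE": -0.12}],
    "spatial/part-1.parquet": [],
}


def coordinates_for(read: RowReader, uprn: int) -> Optional[tuple[float, float]]:
    """Return (longitude, latitude), or None when the UPRN is not covered."""
    meta = read("spatial/filename_meta.parquet")
    covering = [m for m in meta if m["lower"] <= uprn <= m["upper"]]
    if not covering:
        return None  # no partition range covers this UPRN
    partition = read(f"spatial/{covering[0]['filenames']}")
    matches = [r for r in partition if r["UPRN"] == uprn]
    if not matches:
        return None  # range matched, but the UPRN itself is not listed
    return float(matches[0]["LONGITUDE"]), float(matches[0]["LATITUDE"])


reader: RowReader = FIXTURES.__getitem__
hit = coordinates_for(reader, 42)
in_range_but_absent = coordinates_for(reader, 150)
out_of_range = coordinates_for(reader, 999)
```

Both miss cases return `None`, matching the "not covered" contract of `GeospatialRepository.coordinates_for`.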

@ -0,0 +1,13 @@
from __future__ import annotations

from abc import ABC


class MaterialsRepository(ABC):
    """Loads the retrofit Materials catalogue the Modelling stage draws measures
    and costs from.

    Seam only at this stage (#1136): the method shape is deferred to the
    Modelling per-service grill. Declared now so the pipeline can be composed
    end-to-end with Modelling stubbed.
    """

@ -0,0 +1,56 @@
from __future__ import annotations

from collections.abc import Callable
from types import TracebackType
from typing import Optional

from sqlmodel import Session

from repositories.property_baseline.property_baseline_postgres_repository import (
    PropertyBaselinePostgresRepository,
)
from repositories.epc.epc_postgres_repository import EpcPostgresRepository
from repositories.property.property_postgres_repository import (
    PropertyPostgresRepository,
)
from repositories.solar.solar_postgres_repository import SolarPostgresRepository
from repositories.unit_of_work import UnitOfWork


class PostgresUnitOfWork(UnitOfWork):
    """Postgres-backed Unit of Work: one ``Session``, all repos bound to it.

    Built from a session factory (a module-scoped engine + sessionmaker in
    production, ADR-0012) so the connection pool is reused across warm Lambda
    invocations. The session is opened on ``__enter__`` and closed on
    ``__exit__``; a fresh instance is one single-use unit.
    """

    def __init__(self, session_factory: Callable[[], Session]) -> None:
        self._session_factory = session_factory

    def __enter__(self) -> "PostgresUnitOfWork":
        self._session = self._session_factory()
        epc_repo = EpcPostgresRepository(self._session)
        self.property = PropertyPostgresRepository(self._session, epc_repo)
        self.epc = epc_repo
        self.solar = SolarPostgresRepository(self._session)
        self.property_baseline = PropertyBaselinePostgresRepository(self._session)
        return self

    def __exit__(
        self,
        exc_type: Optional[type[BaseException]],
        exc: Optional[BaseException],
        tb: Optional[TracebackType],
    ) -> None:
        # Always roll back: nothing to undo after commit(), and any uncommitted
        # work (including after an exception) is discarded.
        try:
            self._session.rollback()
        finally:
            self._session.close()

    def commit(self) -> None:
        self._session.commit()

    def rollback(self) -> None:
        self._session.rollback()
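
The enter/exit contract (open on entry, always roll back and close on exit, persist only via an explicit `commit()`) can be demonstrated with the standard library. This is a sketch under stated assumptions: `SqliteUnitOfWork` is a hypothetical class built on `sqlite3`, not the SQLModel session the real unit uses, and the `baseline` table is invented for the demo.

```python
from __future__ import annotations

import sqlite3
import tempfile
from pathlib import Path


class SqliteUnitOfWork:
    """Hypothetical single-use unit over sqlite3, mirroring the enter/exit shape."""

    def __init__(self, path: str) -> None:
        self._path = path

    def __enter__(self) -> "SqliteUnitOfWork":
        self.conn = sqlite3.connect(self._path)
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        try:
            # Nothing to undo after commit(); discards uncommitted work otherwise.
            self.conn.rollback()
        finally:
            self.conn.close()

    def commit(self) -> None:
        self.conn.commit()


db = str(Path(tempfile.mkdtemp()) / "uow.db")

with SqliteUnitOfWork(db) as uow:
    uow.conn.execute("CREATE TABLE baseline (property_id INTEGER PRIMARY KEY)")
    uow.conn.execute("INSERT INTO baseline VALUES (1)")
    uow.commit()

with SqliteUnitOfWork(db) as uow:
    uow.conn.execute("INSERT INTO baseline VALUES (2)")
    # No commit: the insert is rolled back when the block exits.

with SqliteUnitOfWork(db) as uow:
    persisted = [row[0] for row in uow.conn.execute("SELECT property_id FROM baseline")]
```

Only the committed row survives. Forgetting `commit()` therefore loses the work silently, which is why the orchestrator calls it exactly once at the end of the batch.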

@ -0,0 +1,64 @@
from __future__ import annotations

from sqlmodel import Session, col, select

from domain.property.properties import Properties
from domain.property.property import Property, PropertyIdentity
from infrastructure.postgres.property_table import PropertyRow
from repositories.epc.epc_repository import EpcRepository
from repositories.property.property_repository import PropertyRepository


class PropertyPostgresRepository(PropertyRepository):
    """Hydrates the Property aggregate from the FE-owned ``property`` row plus the
    EPC slice (via an injected `EpcRepository`). Reads only from repos, with no
    external IO, so a hydrated Property is a pure function of repository state
    (ADR-0003).
    """

    def __init__(self, session: Session, epc_repo: EpcRepository) -> None:
        self._session = session
        self._epc_repo = epc_repo

    def get(self, property_id: int) -> Property:
        row = self._session.get(PropertyRow, property_id)
        if row is None:
            raise ValueError(f"property {property_id} not found")
        identity = PropertyIdentity(
            portfolio_id=row.portfolio_id,
            postcode=row.postcode,
            address=row.address,
            uprn=row.uprn,
            landlord_property_id=row.landlord_property_id,
        )
        return Property(
            identity=identity,
            epc=self._epc_repo.get_for_property(property_id),
        )

    def get_many(self, property_ids: list[int]) -> Properties:
        if not property_ids:
            return Properties([])
        rows = self._session.exec(
            select(PropertyRow).where(col(PropertyRow.id).in_(property_ids))
        ).all()
        row_by_id = {row.id: row for row in rows}
        epcs = self._epc_repo.get_for_properties(property_ids)
        items: list[Property] = []
        # Iterate the requested ids so the result follows the caller's order,
        # and fail on any id the IN query did not return.
        for property_id in property_ids:
            row = row_by_id.get(property_id)
            if row is None:
                raise ValueError(f"property {property_id} not found")
            items.append(
                Property(
                    identity=PropertyIdentity(
                        portfolio_id=row.portfolio_id,
                        postcode=row.postcode,
                        address=row.address,
                        uprn=row.uprn,
                        landlord_property_id=row.landlord_property_id,
                    ),
                    epc=epcs.get(property_id),
                )
            )
        return Properties(items)

@ -0,0 +1,25 @@
from __future__ import annotations

from abc import ABC, abstractmethod

from domain.property.properties import Properties
from domain.property.property import Property


class PropertyRepository(ABC):
    """Loads and saves the Property aggregate.

    Composes the aggregate whole from the FE-owned ``property`` identity row plus
    its source-data slices (EPC today; Site Notes / enrichments as later slices
    land). Aggregates load whole, never half a Property (ADR-0002).
    """

    @abstractmethod
    def get(self, property_id: int) -> Property: ...

    @abstractmethod
    def get_many(self, property_ids: list[int]) -> Properties:
        """Load a batch of Properties whole, in a handful of per-table queries
        rather than one round-trip per property (ADR-0012). Order follows the
        input ids."""
        ...
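
The `get_many` contract above (batch load, result order follows the input ids) can be illustrated with a minimal, dependency-free sketch. `Row` and `hydrate_in_input_order` are hypothetical names invented for this illustration; they are not part of the repository code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """Hypothetical stand-in for a ``property`` table row."""

    id: int
    postcode: str


def hydrate_in_input_order(property_ids: list[int], rows: list[Row]) -> list[Row]:
    """Return rows ordered like ``property_ids``; fail loudly on a missing id."""
    row_by_id = {row.id: row for row in rows}  # index the batch result once
    ordered: list[Row] = []
    for property_id in property_ids:
        row = row_by_id.get(property_id)
        if row is None:
            raise ValueError(f"property {property_id} not found")
        ordered.append(row)
    return ordered


# A database may return rows in any order; the caller's order is what counts.
unordered = [Row(3, "C3 3CC"), Row(1, "A1 1AA"), Row(2, "B2 2BB")]
result = hydrate_in_input_order([2, 3, 1], unordered)
print([row.id for row in result])
```

Indexing the batch result by id keeps the lookup cost linear in the batch size, and iterating the input list (rather than the query result) is what fixes the output order.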

@ -0,0 +1,43 @@
from __future__ import annotations

from typing import Optional

from sqlmodel import Session, col, delete, select

from domain.property_baseline.property_baseline_performance import PropertyBaselinePerformance
from infrastructure.postgres.property_baseline_performance_table import (
    PropertyBaselinePerformanceModel,
)
from repositories.property_baseline.property_baseline_repository import PropertyBaselineRepository


class PropertyBaselinePostgresRepository(PropertyBaselineRepository):
    """Maps PropertyBaselinePerformance to/from the ``property_baseline_performance`` table."""

    def __init__(self, session: Session) -> None:
        self._session = session

    def save(self, baseline: PropertyBaselinePerformance, property_id: int) -> int:
        # Idempotent on property_id: a re-run (or re-score) replaces the row
        # rather than hitting the unique constraint (ADR-0012).
        self._session.exec(  # type: ignore[call-overload]
            delete(PropertyBaselinePerformanceModel).where(
                col(PropertyBaselinePerformanceModel.property_id) == property_id
            )
        )
        row = PropertyBaselinePerformanceModel.from_domain(baseline, property_id)
        self._session.add(row)
        self._session.flush()
        if row.id is None:
            raise ValueError("property_baseline_performance row did not receive an id")
        return row.id

    def get_for_property(
        self, property_id: int
    ) -> Optional[PropertyBaselinePerformance]:
        row = self._session.exec(
            select(PropertyBaselinePerformanceModel).where(
                PropertyBaselinePerformanceModel.property_id == property_id
            )
        ).first()
        return row.to_domain() if row is not None else None
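
The delete-then-insert idempotency used by `save` can be sketched with the standard-library `sqlite3` module. This is an illustration under assumptions, not the Postgres adapter: the `baseline` table, its columns, and `save_baseline` are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE baseline ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "property_id INTEGER NOT NULL UNIQUE, "
    "sap_score INTEGER NOT NULL)"
)


def save_baseline(conn: sqlite3.Connection, property_id: int, sap_score: int) -> int:
    """Replace any existing row for ``property_id`` and return the new row id."""
    # Deleting first means a re-run never trips the UNIQUE constraint.
    conn.execute("DELETE FROM baseline WHERE property_id = ?", (property_id,))
    cursor = conn.execute(
        "INSERT INTO baseline (property_id, sap_score) VALUES (?, ?)",
        (property_id, sap_score),
    )
    if cursor.lastrowid is None:
        raise ValueError("baseline row did not receive an id")
    return cursor.lastrowid


save_baseline(conn, 10, 72)
save_baseline(conn, 10, 75)  # re-run over the same property
conn.commit()

rows = conn.execute("SELECT property_id, sap_score FROM baseline").fetchall()
print(rows)
```

A database-native upsert would be another way to get the same effect; the sketch mirrors the delete-then-insert shape only because that is what the adapter above does.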

@ -0,0 +1,23 @@
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Optional

from domain.property_baseline.property_baseline_performance import PropertyBaselinePerformance


class PropertyBaselineRepository(ABC):
    """Persists and loads a Property's Baseline Performance.

    One Baseline Performance per Property (ADR-0004: persisted as one row). The
    Postgres adapter writes the standalone ``property_baseline_performance`` table,
    not columns on the retiring ``property_details_epc``.
    """

    @abstractmethod
    def save(self, baseline: PropertyBaselinePerformance, property_id: int) -> int: ...

    @abstractmethod
    def get_for_property(
        self, property_id: int
    ) -> Optional[PropertyBaselinePerformance]: ...

@ -0,0 +1,14 @@
from __future__ import annotations

from abc import ABC


class ScenarioRepository(ABC):
    """Loads the Scenarios (and Scenario Snapshots) the Modelling stage scores
    a Property against.

    Seam only at this stage (#1136): the method shape is deferred to the
    Modelling per-service grill, where Scenario / Scenario Phase / Scenario
    Snapshot are designed (CONTEXT.md). Declared now so the pipeline can be
    composed end-to-end with Modelling stubbed.
    """

@ -0,0 +1,35 @@
from __future__ import annotations

from typing import Any, Optional

from sqlmodel import Session, select

from infrastructure.postgres.solar_table import SolarBuildingInsightsRow
from repositories.solar.solar_repository import SolarRepository


class SolarPostgresRepository(SolarRepository):
    def __init__(self, session: Session) -> None:
        self._session = session

    def save(self, property_id: int, insights: dict[str, Any]) -> None:
        # Upsert by property_id: update the existing row if there is one.
        existing = self._session.exec(
            select(SolarBuildingInsightsRow).where(
                SolarBuildingInsightsRow.property_id == property_id
            )
        ).first()
        if existing is None:
            self._session.add(
                SolarBuildingInsightsRow(property_id=property_id, insights=insights)
            )
        else:
            existing.insights = insights
            self._session.add(existing)

    def get(self, property_id: int) -> Optional[dict[str, Any]]:
        row = self._session.exec(
            select(SolarBuildingInsightsRow).where(
                SolarBuildingInsightsRow.property_id == property_id
            )
        ).first()
        return row.insights if row is not None else None

@ -0,0 +1,19 @@
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Any, Optional


class SolarRepository(ABC):
    """Persists and loads a Property's Google Solar building insights.

    Thin save/get over the raw fetched insights (a future SolarPotential domain
    type will derive its fields from these). Written by Ingestion, read by
    Baseline/Modelling; never re-fetched downstream (ADR-0003).
    """

    @abstractmethod
    def save(self, property_id: int, insights: dict[str, Any]) -> None: ...

    @abstractmethod
    def get(self, property_id: int) -> Optional[dict[str, Any]]: ...

@ -0,0 +1,47 @@
from __future__ import annotations

from abc import ABC, abstractmethod
from types import TracebackType
from typing import Optional

from repositories.property_baseline.property_baseline_repository import PropertyBaselineRepository
from repositories.epc.epc_repository import EpcRepository
from repositories.property.property_repository import PropertyRepository
from repositories.solar.solar_repository import SolarRepository


class UnitOfWork(ABC):
    """A single batch transaction across the DB-backed repos (ADR-0012).

    A context manager that exposes the repos bound to one session. A stage runs
    its whole batch inside one unit and calls ``commit()`` once; leaving the
    block without committing (including via an exception) rolls back, so a
    failed batch persists nothing and the subtask fails noisily.

    The non-DB dependencies (EPC/Solar fetchers, the geospatial S3 repo, the
    Rebaseliner) are *not* part of the unit; only transactional DB work is.
    """

    property: PropertyRepository
    epc: EpcRepository
    solar: SolarRepository
    property_baseline: PropertyBaselineRepository

    @abstractmethod
    def commit(self) -> None: ...

    @abstractmethod
    def rollback(self) -> None: ...

    def __enter__(self) -> "UnitOfWork":
        return self

    def __exit__(
        self,
        exc_type: Optional[type[BaseException]],
        exc: Optional[BaseException],
        tb: Optional[TracebackType],
    ) -> None:
        # Roll back whatever was not explicitly committed (a no-op after a
        # successful commit). All-or-nothing per batch.
        self.rollback()
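
The all-or-nothing behaviour described in the docstring (commit once, roll back on any other exit) can be demonstrated with a small sketch over the standard-library `sqlite3` module. `SqliteUnitOfWork`, `add_epc`, and the `epc` table are hypothetical stand-ins for illustration; the real unit binds SQLModel repositories to a session.

```python
import sqlite3
from types import TracebackType
from typing import Optional


class SqliteUnitOfWork:
    """Hypothetical sqlite3-backed unit: commit explicitly, roll back on exit."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def __enter__(self) -> "SqliteUnitOfWork":
        return self

    def __exit__(
        self,
        exc_type: Optional[type[BaseException]],
        exc: Optional[BaseException],
        tb: Optional[TracebackType],
    ) -> None:
        # Discards anything uncommitted; a no-op after a successful commit.
        # Returning None lets any exception propagate to the caller.
        self._conn.rollback()

    def commit(self) -> None:
        self._conn.commit()

    def add_epc(self, property_id: int) -> None:
        self._conn.execute("INSERT INTO epc (property_id) VALUES (?)", (property_id,))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE epc (property_id INTEGER PRIMARY KEY)")
conn.commit()


def count_rows() -> int:
    return conn.execute("SELECT COUNT(*) FROM epc").fetchone()[0]


# A batch that fails part-way persists nothing.
try:
    with SqliteUnitOfWork(conn) as uow:
        uow.add_epc(10)
        raise RuntimeError("simulated failure mid-batch")
except RuntimeError:
    pass
after_failure = count_rows()

# A batch that commits once persists everything.
with SqliteUnitOfWork(conn) as uow:
    uow.add_epc(10)
    uow.add_epc(11)
    uow.commit()
after_success = count_rows()

print(after_failure, after_success)
```

Rolling back unconditionally in `__exit__` keeps the exit path simple: there is no branch on whether an exception occurred, and forgetting to call `commit()` fails safe.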

@ -18,9 +18,9 @@ from dotenv import load_dotenv
REPO_ROOT = Path(__file__).resolve().parents[1]
sys.path.insert(0, str(REPO_ROOT))
-from backend.epc_client._retry import call_with_retry
-from backend.epc_client.epc_client_service import EpcClientService
-from backend.epc_client.exceptions import (
+from infrastructure.epc_client._retry import call_with_retry
+from infrastructure.epc_client.epc_client_service import EpcClientService
+from infrastructure.epc_client.exceptions import (
EpcApiError,
EpcNotFoundError,
EpcRateLimitError,

@ -0,0 +1,97 @@
from __future__ import annotations
from uuid import UUID
import pytest
from pydantic import ValidationError
from applications.ara_first_run.ara_first_run_trigger_body import (
AraFirstRunTriggerBody,
)
def test_validates_well_formed_body_into_typed_fields() -> None:
# Arrange
body = {
"task_id": "e295d89b-a7c5-4a9a-8b4e-b405fab1f298",
"sub_task_id": "f4a9944f-41f0-4a33-8669-5016ec574068",
"portfolio_id": 42,
"property_ids": [101, 102, 103],
"scenario_ids": [7, 8],
}
# Act
trigger = AraFirstRunTriggerBody.model_validate(body)
# Assert
assert trigger.task_id == UUID("e295d89b-a7c5-4a9a-8b4e-b405fab1f298")
assert trigger.sub_task_id == UUID("f4a9944f-41f0-4a33-8669-5016ec574068")
assert trigger.portfolio_id == 42
assert trigger.property_ids == [101, 102, 103]
assert trigger.scenario_ids == [7, 8]
def test_ignores_unknown_extra_fields() -> None:
# Arrange — the FastAPI backend may add fields the deployed lambda predates.
body = {
"task_id": "e295d89b-a7c5-4a9a-8b4e-b405fab1f298",
"sub_task_id": "f4a9944f-41f0-4a33-8669-5016ec574068",
"portfolio_id": 42,
"property_ids": [101],
"scenario_ids": [7],
"a_field_added_later_by_the_backend": "ignore me",
}
# Act
trigger = AraFirstRunTriggerBody.model_validate(body)
# Assert — the unknown field is dropped, not retained or rejected.
assert not hasattr(trigger, "a_field_added_later_by_the_backend")
assert trigger.portfolio_id == 42
def test_rejects_body_missing_a_required_field() -> None:
# Arrange — scenario_ids omitted.
body = {
"task_id": "e295d89b-a7c5-4a9a-8b4e-b405fab1f298",
"sub_task_id": "f4a9944f-41f0-4a33-8669-5016ec574068",
"portfolio_id": 42,
"property_ids": [101],
}
# Act / Assert
with pytest.raises(ValidationError) as exc_info:
AraFirstRunTriggerBody.model_validate(body)
assert "scenario_ids" in str(exc_info.value)
def test_rejects_non_uuid_task_id() -> None:
# Arrange
body = {
"task_id": "not-a-uuid",
"sub_task_id": "f4a9944f-41f0-4a33-8669-5016ec574068",
"portfolio_id": 42,
"property_ids": [101],
"scenario_ids": [7],
}
# Act / Assert
with pytest.raises(ValidationError) as exc_info:
AraFirstRunTriggerBody.model_validate(body)
assert "task_id" in str(exc_info.value)
def test_rejects_non_int_portfolio_id() -> None:
# Arrange — business IDs are integers, not strings.
body = {
"task_id": "e295d89b-a7c5-4a9a-8b4e-b405fab1f298",
"sub_task_id": "f4a9944f-41f0-4a33-8669-5016ec574068",
"portfolio_id": "not-an-int",
"property_ids": [101],
"scenario_ids": [7],
}
# Act / Assert
with pytest.raises(ValidationError) as exc_info:
AraFirstRunTriggerBody.model_validate(body)
assert "portfolio_id" in str(exc_info.value)
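
The behaviour these tests pin down (typed fields, unknown keys dropped, errors that name the offending field) can be approximated without pydantic. The sketch below is a standard-library stand-in written for illustration only: `TriggerBody` and `validate_trigger` are hypothetical names, it raises `ValueError` rather than pydantic's `ValidationError`, and it is stricter than pydantic's default coercion (for example it rejects the string `"42"` for an integer field).

```python
from dataclasses import dataclass
from typing import Any
from uuid import UUID

_REQUIRED = ("task_id", "sub_task_id", "portfolio_id", "property_ids", "scenario_ids")


@dataclass(frozen=True)
class TriggerBody:
    task_id: UUID
    sub_task_id: UUID
    portfolio_id: int
    property_ids: list[int]
    scenario_ids: list[int]


def _is_int(value: Any) -> bool:
    # bool is a subclass of int in Python; exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_trigger(body: dict[str, Any]) -> TriggerBody:
    """Validate a raw body into typed fields; unknown keys are ignored."""
    for field in _REQUIRED:
        if field not in body:
            raise ValueError(f"{field}: field required")
    uuids: dict[str, UUID] = {}
    for field in ("task_id", "sub_task_id"):
        try:
            uuids[field] = UUID(str(body[field]))
        except ValueError:
            raise ValueError(f"{field}: not a valid UUID") from None
    if not _is_int(body["portfolio_id"]):
        raise ValueError("portfolio_id: not an integer")
    for field in ("property_ids", "scenario_ids"):
        value = body[field]
        if not isinstance(value, list) or not all(_is_int(v) for v in value):
            raise ValueError(f"{field}: not a list of integers")
    return TriggerBody(
        task_id=uuids["task_id"],
        sub_task_id=uuids["sub_task_id"],
        portfolio_id=body["portfolio_id"],
        property_ids=list(body["property_ids"]),
        scenario_ids=list(body["scenario_ids"]),
    )


example = {
    "task_id": "e295d89b-a7c5-4a9a-8b4e-b405fab1f298",
    "sub_task_id": "f4a9944f-41f0-4a33-8669-5016ec574068",
    "portfolio_id": 42,
    "property_ids": [101],
    "scenario_ids": [7],
    "a_field_added_later_by_the_backend": "ignore me",
}
trigger = validate_trigger(example)
print(trigger.portfolio_id, trigger.property_ids)
```

Ignoring unknown keys is what lets the producer add fields before the consumer is redeployed, which is the scenario the second test describes.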

@ -0,0 +1,44 @@
from __future__ import annotations
from typing import Optional
from uuid import UUID
from applications.ara_first_run.ara_first_run_trigger_body import (
AraFirstRunTriggerBody,
)
from applications.ara_first_run.handler import dispatch_first_run
from orchestration.ara_first_run_pipeline import AraFirstRunCommand
class _SpyPipeline:
"""Records the command it is asked to run, instead of composing stages."""
def __init__(self) -> None:
self.received: Optional[AraFirstRunCommand] = None
def run(self, command: AraFirstRunCommand) -> None:
self.received = command
def test_validates_the_event_body_and_delegates_the_command_to_the_pipeline() -> None:
# Arrange — a raw SQS body, as the decorator hands it to the handler.
body = {
"task_id": "e295d89b-a7c5-4a9a-8b4e-b405fab1f298",
"sub_task_id": "f4a9944f-41f0-4a33-8669-5016ec574068",
"portfolio_id": 42,
"property_ids": [101, 102],
"scenario_ids": [7],
}
pipeline = _SpyPipeline()
# Act
dispatch_first_run(body, pipeline=pipeline)
# Assert — the raw body was validated into the typed trigger and handed
# straight on, untouched.
received = pipeline.received
assert isinstance(received, AraFirstRunTriggerBody)
assert received.task_id == UUID("e295d89b-a7c5-4a9a-8b4e-b405fab1f298")
assert received.portfolio_id == 42
assert received.property_ids == [101, 102]
assert received.scenario_ids == [7]

@ -0,0 +1,127 @@
"""Property aggregate — source-path precedence and Effective EPC resolution.
The two disjoint source paths (ADR-0001): a Property is modelled either from its
Site Notes alone, or from the public EPC (with Landlord Overrides, once that slice
lands). When both exist, the newer wins (Recency Tie-Break).
"""
from __future__ import annotations
import json
from datetime import date
from pathlib import Path
from typing import Any
from datatypes.epc.domain.epc_property_data import EpcPropertyData
from datatypes.epc.domain.mapper import EpcPropertyDataMapper
from domain.property.properties import Properties
from domain.property.property import Property, PropertyIdentity
from domain.property.site_notes import SiteNotes
_JSON_SAMPLES = Path(__file__).resolve().parents[3] / "backend/epc_api/json_samples"
def _epc(inspection: str = "2023-12-01") -> EpcPropertyData:
raw: dict[str, Any] = json.loads(
(_JSON_SAMPLES / "RdSAP-Schema-21.0.0" / "epc.json").read_text()
)
return EpcPropertyDataMapper.from_api_response(raw)
def _identity() -> PropertyIdentity:
return PropertyIdentity(
portfolio_id=1, postcode="A0 0AA", address="1 Some Street", uprn=12345
)
def test_source_path_is_epc_with_overlay_when_only_epc_present() -> None:
# Arrange
prop = Property(identity=_identity(), epc=_epc())
# Act
path = prop.source_path
# Assert
assert path == "epc_with_overlay"
def test_source_path_is_site_notes_when_only_site_notes_present() -> None:
# Arrange
prop = Property(
identity=_identity(),
site_notes=SiteNotes(surveyed_at=date(2024, 6, 1), epc=_epc()),
)
# Act
path = prop.source_path
# Assert
assert path == "site_notes"
def test_recency_tie_break_newer_site_notes_win_over_older_epc() -> None:
# Arrange — EPC inspected 2023-12-01; survey is newer
prop = Property(
identity=_identity(),
epc=_epc(),
site_notes=SiteNotes(surveyed_at=date(2025, 1, 1), epc=_epc()),
)
# Act / Assert
assert prop.source_path == "site_notes"
def test_recency_tie_break_older_site_notes_lose_to_newer_epc() -> None:
# Arrange — survey predates the EPC's inspection date
prop = Property(
identity=_identity(),
epc=_epc(),
site_notes=SiteNotes(surveyed_at=date(2020, 1, 1), epc=_epc()),
)
# Act / Assert
assert prop.source_path == "epc_with_overlay"
def test_effective_epc_follows_the_selected_source_path() -> None:
# Arrange
survey_epc = _epc()
public_epc = _epc()
site_notes_property = Property(
identity=_identity(),
site_notes=SiteNotes(surveyed_at=date(2025, 1, 1), epc=survey_epc),
)
epc_property = Property(identity=_identity(), epc=public_epc)
# Act / Assert
assert site_notes_property.effective_epc is survey_epc
assert epc_property.effective_epc is public_epc
def test_property_with_no_source_raises() -> None:
# Arrange
prop = Property(identity=_identity())
# Act / Assert
try:
_ = prop.source_path
except ValueError:
pass
else: # pragma: no cover
raise AssertionError("expected ValueError when no source is present")
def test_properties_collection_iterates_and_filters() -> None:
# Arrange
with_epc = Property(identity=_identity(), epc=_epc())
without = Property(identity=_identity())
properties = Properties([with_epc, without])
# Act
with_source = properties.filter(lambda p: p.epc is not None)
# Assert
assert len(properties) == 2
assert list(properties) == [with_epc, without]
assert len(with_source) == 1
assert list(with_source) == [with_epc]
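
The source-path rules these tests exercise can be sketched as follows. The `Property` implementation itself is not part of this diff, so everything here is an assumption: `SketchEpc`, `SketchSiteNotes`, and `SketchProperty` are hypothetical types, the `inspection_date` field name is invented, and the behaviour on exactly equal dates (the sketch keeps the EPC path) is an arbitrary choice that the tests above do not cover.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class SketchEpc:
    inspection_date: date  # hypothetical field name


@dataclass(frozen=True)
class SketchSiteNotes:
    surveyed_at: date
    epc: SketchEpc


@dataclass(frozen=True)
class SketchProperty:
    epc: Optional[SketchEpc] = None
    site_notes: Optional[SketchSiteNotes] = None

    @property
    def source_path(self) -> str:
        if self.epc is None and self.site_notes is None:
            raise ValueError("property has no source data")
        if self.site_notes is None:
            return "epc_with_overlay"
        if self.epc is None:
            return "site_notes"
        # Recency Tie-Break: both sources exist, so the newer one wins.
        # Equal dates fall through to the EPC path (an arbitrary assumption).
        if self.site_notes.surveyed_at > self.epc.inspection_date:
            return "site_notes"
        return "epc_with_overlay"

    @property
    def effective_epc(self) -> SketchEpc:
        if self.source_path == "site_notes" and self.site_notes is not None:
            return self.site_notes.epc
        if self.epc is None:
            raise ValueError("property has no public EPC")
        return self.epc


public = SketchEpc(inspection_date=date(2023, 12, 1))
surveyed = SketchEpc(inspection_date=date(2025, 1, 1))
newer_survey = SketchProperty(
    epc=public, site_notes=SketchSiteNotes(date(2025, 1, 1), surveyed)
)
older_survey = SketchProperty(
    epc=public, site_notes=SketchSiteNotes(date(2020, 1, 1), surveyed)
)
print(newer_survey.source_path, older_survey.source_path)
```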

@ -0,0 +1,34 @@
from __future__ import annotations

from datatypes.epc.domain.epc import Epc
from datatypes.epc.domain.epc_property_data import EpcPropertyData
from domain.property_baseline.performance import Performance, lodged_performance


def _epc_with_recorded_performance(
    *, sap: int, band: Epc, co2: float, peui: int
) -> EpcPropertyData:
    # A bare instance with only the recorded-performance fields the reader
    # touches; mirrors the opaque-EPC idiom used in the ingestion tests.
    epc = object.__new__(EpcPropertyData)
    epc.energy_rating_current = sap
    epc.current_energy_efficiency_band = band
    epc.co2_emissions_current = co2
    epc.energy_consumption_current = peui
    return epc


def test_lodged_performance_reads_the_four_recorded_quantities_off_the_epc() -> None:
    # Arrange
    epc = _epc_with_recorded_performance(sap=72, band=Epc.C, co2=1.8, peui=180)

    # Act
    performance = lodged_performance(epc)

    # Assert
    assert performance == Performance(
        sap_score=72,
        epc_band=Epc.C,
        co2_emissions=1.8,
        primary_energy_intensity=180,
    )

@ -0,0 +1,48 @@
from __future__ import annotations

from typing import Optional

import pytest

from datatypes.epc.domain.epc import Epc
from datatypes.epc.domain.epc_property_data import EpcPropertyData
from domain.property_baseline.performance import Performance
from domain.property_baseline.rebaseliner import RebaselineNotImplemented, StubRebaseliner


def _epc(*, sap_version: Optional[float]) -> EpcPropertyData:
    epc = object.__new__(EpcPropertyData)
    epc.sap_version = sap_version
    return epc


def _lodged() -> Performance:
    return Performance(
        sap_score=72, epc_band=Epc.C, co2_emissions=1.8, primary_energy_intensity=180
    )


def test_sap10_epc_is_not_rebaselined_so_effective_equals_lodged() -> None:
    # Arrange: a SAP 10.2 cert, so no rebaselining trigger fires.
    epc = _epc(sap_version=10.2)
    lodged = _lodged()
    rebaseliner = StubRebaseliner()

    # Act
    effective, reason = rebaseliner.rebaseline(epc, lodged)

    # Assert: Effective Performance equals Lodged, reason "none".
    assert effective == lodged
    assert reason == "none"


def test_pre_sap10_epc_raises_because_rebaselining_is_not_implemented() -> None:
    # Arrange: a cert lodged under a pre-SAP10 schema genuinely needs ML
    # rebaselining, which does not exist yet; the stub must not fabricate a
    # "none" answer for it.
    epc = _epc(sap_version=9.94)
    rebaseliner = StubRebaseliner()

    # Act / Assert
    with pytest.raises(RebaselineNotImplemented):
        rebaseliner.rebaseline(epc, _lodged())
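
One way a stub could satisfy both tests is sketched below. The real `StubRebaseliner` is not shown in this diff, so this is a guess at its shape: `SketchStubRebaseliner` and `SketchPerformance` are hypothetical names, the method takes a bare `sap_version` instead of an EPC, the `>= 10.0` threshold is inferred only from the two test values (10.2 and 9.94), and treating a missing version as "needs rebaselining" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


class RebaselineNotImplemented(Exception):
    """Raised when a cert needs ML rebaselining that does not exist yet."""


@dataclass(frozen=True)
class SketchPerformance:
    sap_score: int
    co2_emissions: float
    primary_energy_intensity: int


class SketchStubRebaseliner:
    def rebaseline(
        self, sap_version: Optional[float], lodged: SketchPerformance
    ) -> tuple[SketchPerformance, str]:
        # Assumed trigger: anything lodged under SAP 10 or later passes through.
        if sap_version is not None and sap_version >= 10.0:
            return lodged, "none"
        # Refuse rather than fabricate a "none" answer for older certs.
        raise RebaselineNotImplemented(
            f"rebaselining required for sap_version={sap_version!r}"
        )


lodged = SketchPerformance(sap_score=72, co2_emissions=1.8, primary_energy_intensity=180)
effective, reason = SketchStubRebaseliner().rebaseline(10.2, lodged)
print(effective == lodged, reason)
```

Raising for the unimplemented branch keeps the stub honest: a pre-SAP10 cert fails the batch visibly instead of being scored as if no trigger had fired.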

@ -0,0 +1,123 @@
"""In-memory fakes for orchestrator unit tests (no DB, no network).
A `FakeUnitOfWork` exposes dict-backed fake repos and records commits, so a
test can drive an orchestrator and then assert what was persisted and that the
batch committed exactly once (ADR-0012)."""
from __future__ import annotations
from types import TracebackType
from typing import Any, Optional
from datatypes.epc.domain.epc_property_data import EpcPropertyData
from domain.property_baseline.property_baseline_performance import PropertyBaselinePerformance
from domain.property.properties import Properties
from domain.property.property import Property
from repositories.property_baseline.property_baseline_repository import PropertyBaselineRepository
from repositories.epc.epc_repository import EpcRepository
from repositories.property.property_repository import PropertyRepository
from repositories.solar.solar_repository import SolarRepository
from repositories.unit_of_work import UnitOfWork
class FakePropertyRepo(PropertyRepository):
def __init__(self, by_id: dict[int, Property]) -> None:
self._by_id = by_id
def get(self, property_id: int) -> Property:
return self._by_id[property_id]
def get_many(self, property_ids: list[int]) -> Properties:
return Properties([self._by_id[property_id] for property_id in property_ids])
class FakeEpcRepo(EpcRepository):
def __init__(self, by_property: Optional[dict[int, EpcPropertyData]] = None) -> None:
self.saved: list[tuple[EpcPropertyData, Optional[int]]] = []
self._by_property = by_property or {}
def save(
self,
data: EpcPropertyData,
property_id: Optional[int] = None,
portfolio_id: Optional[int] = None,
) -> int:
self.saved.append((data, property_id))
if property_id is not None:
self._by_property[property_id] = data
return len(self.saved)
def get(self, epc_property_id: int) -> EpcPropertyData: # pragma: no cover
raise NotImplementedError
def get_for_property(self, property_id: int) -> Optional[EpcPropertyData]:
return self._by_property.get(property_id)
def get_for_properties(
self, property_ids: list[int]
) -> dict[int, EpcPropertyData]:
return {
property_id: self._by_property[property_id]
for property_id in property_ids
if property_id in self._by_property
}
class FakeSolarRepo(SolarRepository):
def __init__(self) -> None:
self.saved: list[tuple[int, dict[str, Any]]] = []
def save(self, property_id: int, insights: dict[str, Any]) -> None:
self.saved.append((property_id, insights))
def get(self, property_id: int) -> Optional[dict[str, Any]]: # pragma: no cover
raise NotImplementedError
class FakePropertyBaselineRepo(PropertyBaselineRepository):
def __init__(self) -> None:
self.saved: list[tuple[PropertyBaselinePerformance, int]] = []
def save(self, baseline: PropertyBaselinePerformance, property_id: int) -> int:
self.saved.append((baseline, property_id))
return len(self.saved)
def get_for_property(
self, property_id: int
) -> Optional[PropertyBaselinePerformance]: # pragma: no cover
raise NotImplementedError
class FakeUnitOfWork(UnitOfWork):
"""A unit that holds in-memory repos and counts commits."""
def __init__(
self,
*,
property: FakePropertyRepo,
epc: Optional[FakeEpcRepo] = None,
solar: Optional[FakeSolarRepo] = None,
property_baseline: Optional[FakePropertyBaselineRepo] = None,
) -> None:
self.property = property
self.epc = epc or FakeEpcRepo()
self.solar = solar or FakeSolarRepo()
self.property_baseline = property_baseline or FakePropertyBaselineRepo()
self.commits = 0
def __enter__(self) -> "FakeUnitOfWork":
return self
def __exit__(
self,
exc_type: Optional[type[BaseException]],
exc: Optional[BaseException],
tb: Optional[TracebackType],
) -> None:
return None
def commit(self) -> None:
self.commits += 1
def rollback(self) -> None:
return None

@ -0,0 +1,64 @@
from __future__ import annotations

from dataclasses import dataclass

from orchestration.ara_first_run_pipeline import AraFirstRunCommand, AraFirstRunPipeline


@dataclass
class _FakeCommand:
    """A stand-in for AraFirstRunTriggerBody; structurally an AraFirstRunCommand."""

    portfolio_id: int
    property_ids: list[int]
    scenario_ids: list[int]


class _SpyIngestion:
    def __init__(self, log: list[tuple[object, ...]]) -> None:
        self._log = log

    def run(self, property_ids: list[int]) -> None:
        self._log.append(("ingestion", property_ids))


class _SpyBaseline:
    def __init__(self, log: list[tuple[object, ...]]) -> None:
        self._log = log

    def run(self, property_ids: list[int]) -> None:
        self._log.append(("baseline", property_ids))


class _SpyModelling:
    def __init__(self, log: list[tuple[object, ...]]) -> None:
        self._log = log

    def run(self, property_ids: list[int], scenario_ids: list[int]) -> None:
        self._log.append(("modelling", property_ids, scenario_ids))


def test_run_sequences_the_three_stages_threading_only_property_ids() -> None:
    # Arrange
    log: list[tuple[object, ...]] = []
    command: AraFirstRunCommand = _FakeCommand(
        portfolio_id=1, property_ids=[10, 11], scenario_ids=[7]
    )
    pipeline = AraFirstRunPipeline(
        ingestion=_SpyIngestion(log),
        baseline=_SpyBaseline(log),
        modelling=_SpyModelling(log),
    )

    # Act
    pipeline.run(command)

    # Assert: Ingestion -> Baseline -> Modelling, in order. Ingestion and
    # Baseline receive only property_ids; Modelling additionally gets the
    # scenario_ids (off the command, not a prior stage). Nothing else is
    # threaded between stages; they communicate through repos (ADR-0011).
    assert log == [
        ("ingestion", [10, 11]),
        ("baseline", [10, 11]),
        ("modelling", [10, 11], [7]),
    ]
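
A pipeline that would pass this test could look roughly like the sketch below. The real `AraFirstRunPipeline` is not included in this diff, so the structure is inferred from the test alone; `SketchPipeline`, the two stage protocols, and the recording helpers are hypothetical names.

```python
from dataclasses import dataclass
from typing import Protocol


class Command(Protocol):
    property_ids: list[int]
    scenario_ids: list[int]


class PropertyStage(Protocol):
    def run(self, property_ids: list[int]) -> None: ...


class ModellingStage(Protocol):
    def run(self, property_ids: list[int], scenario_ids: list[int]) -> None: ...


class SketchPipeline:
    """Runs the three stages in order, passing ids only (no stage outputs)."""

    def __init__(
        self,
        ingestion: PropertyStage,
        baseline: PropertyStage,
        modelling: ModellingStage,
    ) -> None:
        self._ingestion = ingestion
        self._baseline = baseline
        self._modelling = modelling

    def run(self, command: Command) -> None:
        self._ingestion.run(command.property_ids)
        self._baseline.run(command.property_ids)
        # scenario_ids come off the command, not from an earlier stage.
        self._modelling.run(command.property_ids, command.scenario_ids)


@dataclass
class DemoCommand:
    property_ids: list[int]
    scenario_ids: list[int]


class RecordingStage:
    def __init__(self, name: str, log: list[tuple[object, ...]]) -> None:
        self._name = name
        self._log = log

    def run(self, property_ids: list[int]) -> None:
        self._log.append((self._name, property_ids))


class RecordingModelling:
    def __init__(self, log: list[tuple[object, ...]]) -> None:
        self._log = log

    def run(self, property_ids: list[int], scenario_ids: list[int]) -> None:
        self._log.append(("modelling", property_ids, scenario_ids))


log: list[tuple[object, ...]] = []
SketchPipeline(
    ingestion=RecordingStage("ingestion", log),
    baseline=RecordingStage("baseline", log),
    modelling=RecordingModelling(log),
).run(DemoCommand(property_ids=[10, 11], scenario_ids=[7]))
print(log)
```

Because each stage receives only ids, any stage can be re-run or swapped for a stub without the others noticing, provided the repositories hold what it needs.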

@ -0,0 +1,145 @@
"""End-to-end through-repos integration for First Run (ADR-0012, #1138).
Real PostgresUnitOfWork over an ephemeral DB: Ingestion writes the EPC, Baseline
reads it back *through the repo* (not in memory), and a re-run replaces rather
than duplicates. Stub Modelling. The source clients are faked (no IO)."""
from __future__ import annotations
import dataclasses
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Optional
from sqlalchemy import Engine
from sqlmodel import Session, select
from datatypes.epc.domain.epc import Epc
from datatypes.epc.domain.epc_property_data import EpcPropertyData
from datatypes.epc.domain.mapper import EpcPropertyDataMapper
from domain.property_baseline.rebaseliner import StubRebaseliner
from domain.geospatial.coordinates import Coordinates
from infrastructure.postgres.property_baseline_performance_table import (
PropertyBaselinePerformanceModel,
)
from infrastructure.postgres.epc_property_table import EpcPropertyModel
from infrastructure.postgres.property_table import PropertyRow
from orchestration.property_baseline_orchestrator import PropertyBaselineOrchestrator
from orchestration.ara_first_run_pipeline import AraFirstRunPipeline
from orchestration.ingestion_orchestrator import IngestionOrchestrator
from orchestration.modelling_orchestrator import ModellingOrchestrator
from repositories.property_baseline.property_baseline_postgres_repository import (
PropertyBaselinePostgresRepository,
)
from repositories.geospatial.geospatial_repository import GeospatialRepository
from repositories.materials.materials_repository import MaterialsRepository
from repositories.postgres_unit_of_work import PostgresUnitOfWork
from repositories.scenario.scenario_repository import ScenarioRepository
_JSON_SAMPLES = Path(__file__).resolve().parents[2] / "backend/epc_api/json_samples"
@dataclass
class _FakeCommand:
portfolio_id: int
property_ids: list[int]
scenario_ids: list[int]
class _FetcherReturning:
def __init__(self, epc: EpcPropertyData) -> None:
self._epc = epc
def get_by_uprn(self, uprn: int) -> Optional[EpcPropertyData]:
return self._epc
class _NoCoordinates(GeospatialRepository):
def coordinates_for(self, uprn: int) -> Optional[Coordinates]:
return None # skip the solar leg — not under test here
class _UnusedSolarFetcher:
def get_building_insights(
self, longitude: float, latitude: float
) -> dict[str, Any]: # pragma: no cover
return {}
def _lodged_epc() -> EpcPropertyData:
# A real, persistable EPC (so it round-trips through the EPC repo), with the
# recorded-performance fields the sample leaves blank filled in so Baseline
# can read its Lodged Performance.
raw: dict[str, Any] = json.loads(
(_JSON_SAMPLES / "RdSAP-Schema-21.0.0" / "epc.json").read_text()
)
epc = EpcPropertyDataMapper.from_api_response(raw)
return dataclasses.replace(
epc,
energy_rating_current=72,
current_energy_efficiency_band=Epc.C,
co2_emissions_current=1.8,
energy_consumption_current=180,
)
def test_first_run_baselines_through_repos_and_is_idempotent_on_rerun(
db_engine: Engine,
) -> None:
# Arrange — a property row to ingest against, and the EPC its fetcher returns.
with Session(db_engine) as session:
session.add(
PropertyRow(
id=10,
portfolio_id=1,
postcode="A0 0AA",
address="1 Some Street",
uprn=12345,
)
)
session.commit()
def unit_of_work() -> PostgresUnitOfWork:
return PostgresUnitOfWork(lambda: Session(db_engine))
pipeline = AraFirstRunPipeline(
ingestion=IngestionOrchestrator(
unit_of_work=unit_of_work,
epc_fetcher=_FetcherReturning(_lodged_epc()),
geospatial_repo=_NoCoordinates(),
solar_fetcher=_UnusedSolarFetcher(),
),
baseline=PropertyBaselineOrchestrator(
unit_of_work=unit_of_work, rebaseliner=StubRebaseliner()
),
modelling=ModellingOrchestrator(
scenario_repo=ScenarioRepository(),
materials_repo=MaterialsRepository(),
),
)
command = _FakeCommand(portfolio_id=1, property_ids=[10], scenario_ids=[7])
# Act — First Run, then a re-run over the same batch.
pipeline.run(command)
pipeline.run(command)
# Assert — Baseline read the EPC Ingestion persisted (through the repo, only
# property_ids crossed the stage boundary), and the re-run replaced rather
# than duplicated either row.
with Session(db_engine) as session:
baseline = PropertyBaselinePostgresRepository(session).get_for_property(10)
epc_rows = session.exec(
select(EpcPropertyModel).where(EpcPropertyModel.property_id == 10)
).all()
baseline_rows = session.exec(
select(PropertyBaselinePerformanceModel).where(
PropertyBaselinePerformanceModel.property_id == 10
)
).all()
assert baseline is not None
assert baseline.lodged.sap_score == 72
assert baseline.space_heating_kwh == 13120.0
assert len(epc_rows) == 1
assert len(baseline_rows) == 1

@ -0,0 +1,141 @@
"""IngestionOrchestrator fetches the batch (no DB unit open), then writes it in
one Unit of Work and commits once (ADR-0012). Tested against fakes, with no IO."""
from __future__ import annotations
from typing import Any, Optional
from datatypes.epc.domain.epc_property_data import EpcPropertyData
from domain.geospatial.coordinates import Coordinates
from domain.property.property import Property, PropertyIdentity
from orchestration.ingestion_orchestrator import IngestionOrchestrator
from repositories.geospatial.geospatial_repository import GeospatialRepository
from tests.orchestration.fakes import (
FakeEpcRepo,
FakePropertyRepo,
FakeSolarRepo,
FakeUnitOfWork,
)
class _FakeEpcFetcher:
def __init__(self, epc: Optional[EpcPropertyData]) -> None:
self.epc = epc
self.uprns: list[int] = []
def get_by_uprn(self, uprn: int) -> Optional[EpcPropertyData]:
self.uprns.append(uprn)
return self.epc
class _FakeGeospatialRepo(GeospatialRepository):
def __init__(self, coordinates: Optional[Coordinates]) -> None:
self._coordinates = coordinates
    def coordinates_for(self, uprn: int) -> Optional[Coordinates]:
        return self._coordinates


class _FakeSolarFetcher:
    def __init__(self, insights: dict[str, Any]) -> None:
        self.insights = insights
        self.calls: list[tuple[float, float]] = []

    def get_building_insights(
        self, longitude: float, latitude: float
    ) -> dict[str, Any]:
        self.calls.append((longitude, latitude))
        return self.insights


def _property(uprn: Optional[int]) -> Property:
    return Property(
        identity=PropertyIdentity(
            portfolio_id=1, postcode="A0 0AA", address="1 Some Street", uprn=uprn
        )
    )


def test_ingestion_persists_epc_and_threads_coords_into_solar() -> None:
    # Arrange
    epc = object.__new__(EpcPropertyData)
    insights = {"name": "buildings/X"}
    epc_repo = FakeEpcRepo()
    solar_repo = FakeSolarRepo()
    solar_fetcher = _FakeSolarFetcher(insights)
    uow = FakeUnitOfWork(
        property=FakePropertyRepo({10: _property(uprn=12345)}),
        epc=epc_repo,
        solar=solar_repo,
    )
    orchestrator = IngestionOrchestrator(
        unit_of_work=lambda: uow,
        epc_fetcher=_FakeEpcFetcher(epc),
        geospatial_repo=_FakeGeospatialRepo(
            Coordinates(longitude=-0.1278, latitude=51.5074)
        ),
        solar_fetcher=solar_fetcher,
    )

    # Act
    orchestrator.run([10])

    # Assert: EPC persisted, coords threaded from the repo into the solar
    # fetcher, solar persisted, batch committed once.
    assert epc_repo.saved == [(epc, 10)]
    assert solar_fetcher.calls == [(-0.1278, 51.5074)]
    assert solar_repo.saved == [(10, insights)]
    assert uow.commits == 1


def test_ingestion_skips_property_without_uprn() -> None:
    # Arrange
    epc_repo = FakeEpcRepo()
    solar_repo = FakeSolarRepo()
    solar_fetcher = _FakeSolarFetcher({})
    uow = FakeUnitOfWork(
        property=FakePropertyRepo({10: _property(uprn=None)}),
        epc=epc_repo,
        solar=solar_repo,
    )
    orchestrator = IngestionOrchestrator(
        unit_of_work=lambda: uow,
        epc_fetcher=_FakeEpcFetcher(object.__new__(EpcPropertyData)),
        geospatial_repo=_FakeGeospatialRepo(None),
        solar_fetcher=solar_fetcher,
    )

    # Act
    orchestrator.run([10])

    # Assert: nothing fetched or persisted for a UPRN-less property.
    assert epc_repo.saved == []
    assert solar_repo.saved == []
    assert solar_fetcher.calls == []


def test_ingestion_persists_epc_but_skips_solar_when_no_coordinates() -> None:
    # Arrange
    epc = object.__new__(EpcPropertyData)
    epc_repo = FakeEpcRepo()
    solar_repo = FakeSolarRepo()
    solar_fetcher = _FakeSolarFetcher({})
    uow = FakeUnitOfWork(
        property=FakePropertyRepo({10: _property(uprn=12345)}),
        epc=epc_repo,
        solar=solar_repo,
    )
    orchestrator = IngestionOrchestrator(
        unit_of_work=lambda: uow,
        epc_fetcher=_FakeEpcFetcher(epc),
        geospatial_repo=_FakeGeospatialRepo(None),
        solar_fetcher=solar_fetcher,
    )

    # Act
    orchestrator.run([10])

    # Assert
    assert epc_repo.saved == [(epc, 10)]
    assert solar_repo.saved == []
    assert solar_fetcher.calls == []
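
Read together, the three ingestion tests above imply a per-property flow: skip a Property with no UPRN, persist the EPC, look up coordinates, fetch and persist solar insights only when coordinates exist, and commit the batch once. The following is a minimal sketch of that flow under stated assumptions, not the IngestionOrchestrator itself: `IngestionSketch`, its callable fields, and the choice to key the EPC fetch by UPRN are all hypothetical names and guesses introduced for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class IngestionSketch:
    """Hypothetical stand-in for the flow the tests describe."""

    uprn_for: Callable[[int], Optional[int]]           # property_id -> UPRN
    fetch_epc: Callable[[int], Any]                    # assumed keyed by UPRN
    coordinates_for: Callable[[int], Optional[tuple[float, float]]]
    fetch_solar: Callable[[float, float], dict[str, Any]]
    saved_epcs: list[tuple[Any, int]] = field(default_factory=list)
    saved_solar: list[tuple[int, dict[str, Any]]] = field(default_factory=list)
    commits: int = 0

    def run(self, property_ids: list[int]) -> None:
        for property_id in property_ids:
            uprn = self.uprn_for(property_id)
            if uprn is None:
                continue  # nothing can be fetched without a UPRN
            self.saved_epcs.append((self.fetch_epc(uprn), property_id))
            coords = self.coordinates_for(uprn)
            if coords is None:
                continue  # EPC is kept, solar is skipped
            longitude, latitude = coords
            insights = self.fetch_solar(longitude, latitude)
            self.saved_solar.append((property_id, insights))
        self.commits += 1  # one commit for the whole batch
```

The in-memory lists stand in for the EPC and solar repositories; a real implementation would write through the unit of work instead.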


@ -0,0 +1,90 @@
from __future__ import annotations

import pytest

from datatypes.epc.domain.epc import Epc
from datatypes.epc.domain.epc_property_data import (
    EpcPropertyData,
    RenewableHeatIncentive,
)
from domain.property_baseline.property_baseline_performance import PropertyBaselinePerformance
from domain.property_baseline.performance import Performance
from domain.property_baseline.rebaseliner import RebaselineNotImplemented, StubRebaseliner
from domain.property.property import Property, PropertyIdentity
from orchestration.property_baseline_orchestrator import PropertyBaselineOrchestrator
from tests.orchestration.fakes import (
    FakePropertyBaselineRepo,
    FakePropertyRepo,
    FakeUnitOfWork,
)


def _property(*, sap_version: float) -> Property:
    epc = object.__new__(EpcPropertyData)
    epc.energy_rating_current = 72
    epc.current_energy_efficiency_band = Epc.C
    epc.co2_emissions_current = 1.8
    epc.energy_consumption_current = 180
    epc.sap_version = sap_version
    epc.renewable_heat_incentive = RenewableHeatIncentive(
        space_heating_kwh=5000.0, water_heating_kwh=2000.0
    )
    return Property(
        identity=PropertyIdentity(
            portfolio_id=1, postcode="A0 0AA", address="1 Some Street", uprn=123
        ),
        epc=epc,
    )


def test_run_establishes_persists_and_commits_the_batch_once() -> None:
    # Arrange
    property_baseline_repo = FakePropertyBaselineRepo()
    uow = FakeUnitOfWork(
        property=FakePropertyRepo({10: _property(sap_version=10.2)}),
        property_baseline=property_baseline_repo,
    )
    orchestrator = PropertyBaselineOrchestrator(
        unit_of_work=lambda: uow, rebaseliner=StubRebaseliner()
    )

    # Act
    orchestrator.run([10])

    # Assert: one Baseline Performance persisted (both halves equal, kWh off the
    # RHI), and the batch committed exactly once.
    lodged = Performance(
        sap_score=72, epc_band=Epc.C, co2_emissions=1.8, primary_energy_intensity=180
    )
    assert property_baseline_repo.saved == [
        (
            PropertyBaselinePerformance(
                lodged=lodged,
                effective=lodged,
                rebaseline_reason="none",
                space_heating_kwh=5000.0,
                water_heating_kwh=2000.0,
            ),
            10,
        )
    ]
    assert uow.commits == 1


def test_run_raises_on_a_pre_sap10_property_and_does_not_commit() -> None:
    # Arrange: a pre-SAP10 cert needs ML rebaselining, which is not wired yet.
    property_baseline_repo = FakePropertyBaselineRepo()
    uow = FakeUnitOfWork(
        property=FakePropertyRepo({10: _property(sap_version=9.94)}),
        property_baseline=property_baseline_repo,
    )
    orchestrator = PropertyBaselineOrchestrator(
        unit_of_work=lambda: uow, rebaseliner=StubRebaseliner()
    )

    # Act / Assert: the raise propagates; the batch is neither persisted nor
    # committed (all-or-nothing).
    with pytest.raises(RebaselineNotImplemented):
        orchestrator.run([10])
    assert property_baseline_repo.saved == []
    assert uow.commits == 0


@ -0,0 +1,81 @@
"""Bulk EPC read: get_for_properties hydrates a batch in a handful of per-table
queries, not N x per-property (ADR-0012, #1138)."""
from __future__ import annotations

import json
from collections.abc import Callable
from pathlib import Path
from typing import Any

from sqlalchemy import Engine, event
from sqlmodel import Session

from datatypes.epc.domain.epc_property_data import EpcPropertyData
from datatypes.epc.domain.mapper import EpcPropertyDataMapper
from repositories.epc.epc_postgres_repository import EpcPostgresRepository

_JSON_SAMPLES = Path(__file__).resolve().parents[3] / "backend/epc_api/json_samples"


def _load_epc() -> EpcPropertyData:
    raw: dict[str, Any] = json.loads(
        (_JSON_SAMPLES / "RdSAP-Schema-21.0.0" / "epc.json").read_text()
    )
    return EpcPropertyDataMapper.from_api_response(raw)


def _count_queries(engine: Engine, work: Callable[[], None]) -> int:
    count = 0

    def _before(*_args: Any, **_kwargs: Any) -> None:
        nonlocal count
        count += 1

    event.listen(engine, "before_cursor_execute", _before)
    try:
        work()
    finally:
        event.remove(engine, "before_cursor_execute", _before)
    return count


def test_get_for_properties_hydrates_the_whole_batch(db_engine: Engine) -> None:
    # Arrange: the same sample EPC persisted for two properties.
    epc = _load_epc()
    with Session(db_engine) as session:
        repo = EpcPostgresRepository(session)
        repo.save(epc, property_id=10)
        repo.save(epc, property_id=11)
        session.commit()

    # Act
    with Session(db_engine) as session:
        result = EpcPostgresRepository(session).get_for_properties([10, 11])

    # Assert: both fully hydrated (load-whole, ADR-0002).
    assert result == {10: epc, 11: epc}


def test_get_for_properties_round_trips_do_not_scale_with_batch_size(
    db_engine: Engine,
) -> None:
    # Arrange
    epc = _load_epc()
    with Session(db_engine) as session:
        repo = EpcPostgresRepository(session)
        repo.save(epc, property_id=10)
        repo.save(epc, property_id=11)
        session.commit()

    def _read(property_ids: list[int]) -> None:
        with Session(db_engine) as session:
            EpcPostgresRepository(session).get_for_properties(property_ids)

    # Act: count queries for a 1-property batch vs a 2-property batch.
    one = _count_queries(db_engine, lambda: _read([10]))
    two = _count_queries(db_engine, lambda: _read([10, 11]))

    # Assert: same number of round-trips regardless of batch size (one query
    # per table, not per property).
    assert one == two
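
The round-trip assertion above rests on one idea: a bulk read issues one `IN (...)` query per table, so the statement count stays flat as the batch grows. The sketch below shows the same measurement using only the standard library's `sqlite3` in place of SQLAlchemy and Postgres. The `epc` table, its columns, and both function names are assumptions made for illustration; they are not the repository's schema or API.

```python
import sqlite3
from collections.abc import Callable


def count_statements(conn: sqlite3.Connection, work: Callable[[], object]) -> int:
    """Count the SQL statements sqlite3 runs while `work` executes."""
    count = 0

    def _trace(_statement: str) -> None:
        nonlocal count
        count += 1

    conn.set_trace_callback(_trace)
    try:
        work()
    finally:
        conn.set_trace_callback(None)  # always detach the tracer
    return count


def get_for_properties(
    conn: sqlite3.Connection, property_ids: list[int]
) -> dict[int, str]:
    """Hypothetical bulk read: a single query for the whole batch."""
    if not property_ids:
        return {}
    placeholders = ", ".join("?" for _ in property_ids)
    rows = conn.execute(
        f"SELECT property_id, status FROM epc WHERE property_id IN ({placeholders})",
        property_ids,
    ).fetchall()
    return dict(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE epc (property_id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO epc VALUES (?, ?)", [(10, "a"), (11, "b")])
conn.commit()

one = count_statements(conn, lambda: get_for_properties(conn, [10]))
two = count_statements(conn, lambda: get_for_properties(conn, [10, 11]))
```

A per-property loop would make `two` larger than `one`; the batched query keeps them equal, which is the property the test pins down.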


@ -0,0 +1,52 @@
"""A re-run of First Run re-saves a property's EPC; that must replace the prior
row, not duplicate it (ADR-0012 idempotent batch writes, #1138)."""
from __future__ import annotations

import dataclasses
import json
from pathlib import Path
from typing import Any

from sqlalchemy import Engine
from sqlmodel import Session, select

from datatypes.epc.domain.epc_property_data import EpcPropertyData
from datatypes.epc.domain.mapper import EpcPropertyDataMapper
from infrastructure.postgres.epc_property_table import EpcPropertyModel
from repositories.epc.epc_postgres_repository import EpcPostgresRepository

_JSON_SAMPLES = Path(__file__).resolve().parents[3] / "backend/epc_api/json_samples"


def _load_epc() -> EpcPropertyData:
    raw: dict[str, Any] = json.loads(
        (_JSON_SAMPLES / "RdSAP-Schema-21.0.0" / "epc.json").read_text()
    )
    return EpcPropertyDataMapper.from_api_response(raw)


def test_resaving_an_epc_for_a_property_replaces_rather_than_duplicates(
    db_engine: Engine,
) -> None:
    # Arrange: same property re-ingested with a changed field.
    original = _load_epc()
    updated = dataclasses.replace(original, status="re-run-sentinel")

    # Act: save twice for the same property_id (a re-run).
    with Session(db_engine) as session:
        repo = EpcPostgresRepository(session)
        repo.save(original, property_id=10)
        repo.save(updated, property_id=10)
        session.commit()

    # Assert: exactly one EPC row for the property, holding the latest data.
    with Session(db_engine) as session:
        rows = session.exec(
            select(EpcPropertyModel).where(EpcPropertyModel.property_id == 10)
        ).all()
        reloaded = EpcPostgresRepository(session).get_for_property(10)
    assert len(rows) == 1
    assert reloaded is not None
    assert reloaded.status == "re-run-sentinel"
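
One common way to get the replace-not-duplicate behaviour this test demands is an upsert keyed on `property_id`. The sketch below shows that technique with the standard library's `sqlite3`. It is an assumption about one possible approach, not a description of how `EpcPostgresRepository.save` works (a delete-then-insert would satisfy the test equally well); the `epc` table and `save_epc` are hypothetical.

```python
import sqlite3


def save_epc(conn: sqlite3.Connection, property_id: int, status: str) -> None:
    """Insert the row, or overwrite it when property_id already exists.

    Uses SQLite's ON CONFLICT ... DO UPDATE (available since SQLite 3.24).
    """
    conn.execute(
        "INSERT INTO epc (property_id, status) VALUES (?, ?) "
        "ON CONFLICT(property_id) DO UPDATE SET status = excluded.status",
        (property_id, status),
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE epc (property_id INTEGER PRIMARY KEY, status TEXT)")

# A re-run saves the same property twice within one batch.
save_epc(conn, 10, "first-run")
save_epc(conn, 10, "re-run-sentinel")
conn.commit()

rows = conn.execute("SELECT property_id, status FROM epc").fetchall()
```

After both saves the table holds a single row for property 10 carrying the later value.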


@ -0,0 +1,50 @@
"""Persistence round-trip fidelity for EPC Property Data (Slice 1, #1129).
The load-bearing risk of the ara_first_run rebuild: an EpcPropertyData mapped to
the epc_property tables, saved, reloaded and mapped back must reconstruct the
original object exactly. A failure here is either a missing column (a migration
the FE repo must make) or a mapper gap; either way we want it to fail loudly,
inside First Run, rather than be deferred to a later Refresh.
"""
from __future__ import annotations

import json
from pathlib import Path
from typing import Any

import pytest
from sqlalchemy import Engine
from sqlmodel import Session

from datatypes.epc.domain.epc_property_data import EpcPropertyData
from datatypes.epc.domain.mapper import EpcPropertyDataMapper
from repositories.epc.epc_postgres_repository import EpcPostgresRepository

_JSON_SAMPLES = Path(__file__).resolve().parents[3] / "backend/epc_api/json_samples"


def _load_epc(schema_dir: str) -> EpcPropertyData:
    raw: dict[str, Any] = json.loads(
        (_JSON_SAMPLES / schema_dir / "epc.json").read_text()
    )
    return EpcPropertyDataMapper.from_api_response(raw)


@pytest.mark.parametrize(
    "schema_dir",
    ["RdSAP-Schema-21.0.0", "RdSAP-Schema-21.0.1"],
)
def test_epc_property_data_round_trips(schema_dir: str, db_engine: Engine) -> None:
    # Arrange
    original = _load_epc(schema_dir)

    # Act
    with Session(db_engine) as session:
        epc_property_id = EpcPostgresRepository(session).save(original)
        session.commit()
    with Session(db_engine) as session:
        reloaded = EpcPostgresRepository(session).get(epc_property_id)

    # Assert
    assert reloaded == original


@ -0,0 +1,71 @@
"""GeospatialRepo resolves a Property's coordinates from the OS Open-UPRN data.
A reference-data lookup, not a Fetcher (ADR-0011): no live OS API call. The
adapter reads the partitioned Open-UPRN parquet via an injected reader, so the
test exercises the partition lookup + filter against real fixture parquets with
no network.
"""
from __future__ import annotations

from collections.abc import Callable
from pathlib import Path

import pandas as pd

from domain.geospatial.coordinates import Coordinates
from repositories.geospatial.geospatial_s3_repository import GeospatialS3Repository


def _reader(base: Path) -> Callable[[str], pd.DataFrame]:
    def read(key: str) -> pd.DataFrame:
        return pd.read_parquet(base / key)

    return read


def _write_open_uprn(base: Path) -> None:
    spatial = base / "spatial"
    spatial.mkdir(parents=True, exist_ok=True)
    pd.DataFrame(
        {"lower": [0], "upper": [100000], "filenames": ["0_100000.parquet"]}
    ).to_parquet(spatial / "filename_meta.parquet")
    pd.DataFrame(
        {
            "UPRN": [12345, 12346],
            "LATITUDE": [51.5074, 51.6000],
            "LONGITUDE": [-0.1278, -0.2000],
        }
    ).to_parquet(spatial / "0_100000.parquet")


def test_coordinates_for_returns_lon_lat(tmp_path: Path) -> None:
    # Arrange
    _write_open_uprn(tmp_path)
    repo = GeospatialS3Repository(_reader(tmp_path))

    # Act
    coords = repo.coordinates_for(12345)

    # Assert
    assert coords == Coordinates(longitude=-0.1278, latitude=51.5074)


def test_coordinates_for_returns_none_when_uprn_absent(tmp_path: Path) -> None:
    # Arrange
    _write_open_uprn(tmp_path)
    repo = GeospatialS3Repository(_reader(tmp_path))

    # Act / Assert: uprn inside the partition range but not present in the data
    assert repo.coordinates_for(99999) is None


def test_coordinates_for_returns_none_when_no_partition_covers_uprn(
    tmp_path: Path,
) -> None:
    # Arrange
    _write_open_uprn(tmp_path)
    repo = GeospatialS3Repository(_reader(tmp_path))

    # Act / Assert: uprn beyond every partition's range
    assert repo.coordinates_for(500000) is None
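
The fixture and the three tests above suggest a two-step lookup: find the partition whose `lower`/`upper` range covers the UPRN, then filter that partition for the exact row. The sketch below illustrates that shape with plain Python data in place of parquet files. It is not `GeospatialS3Repository`: the function name, the tuple-based index, and the inclusive treatment of both range bounds are assumptions made for illustration.

```python
from __future__ import annotations

from typing import Callable, Optional

# (lower, upper, filename) rows, mirroring the filename_meta fixture columns.
PartitionIndex = list[tuple[int, int, str]]
Rows = list[dict[str, float]]


def coordinates_for(
    uprn: int,
    index: PartitionIndex,
    read_partition: Callable[[str], Rows],
) -> Optional[tuple[float, float]]:
    """Return (longitude, latitude) for `uprn`, or None when it is unknown."""
    for lower, upper, filename in index:
        if lower <= uprn <= upper:  # assumption: both bounds inclusive
            for row in read_partition(filename):
                if row["UPRN"] == uprn:
                    return (row["LONGITUDE"], row["LATITUDE"])
            return None  # covered by a partition, but absent from its data
    return None  # no partition covers this UPRN


index: PartitionIndex = [(0, 100000, "0_100000.parquet")]
partitions: dict[str, Rows] = {
    "0_100000.parquet": [
        {"UPRN": 12345, "LATITUDE": 51.5074, "LONGITUDE": -0.1278},
        {"UPRN": 12346, "LATITUDE": 51.6000, "LONGITUDE": -0.2000},
    ]
}
```

Passing the reader in as a callable is what lets a test swap S3 for local fixtures, as the injected `_reader` does above; here `partitions.__getitem__` plays that role.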


@ -0,0 +1,49 @@
"""PropertyRepository hydrates the aggregate whole from the property row + EPC slice."""
from __future__ import annotations

import json
from pathlib import Path
from typing import Any

from sqlalchemy import Engine
from sqlmodel import Session

from datatypes.epc.domain.mapper import EpcPropertyDataMapper
from infrastructure.postgres.property_table import PropertyRow
from repositories.epc.epc_postgres_repository import EpcPostgresRepository
from repositories.property.property_postgres_repository import (
    PropertyPostgresRepository,
)

_JSON_SAMPLES = Path(__file__).resolve().parents[3] / "backend/epc_api/json_samples"


def test_get_hydrates_identity_and_epc_slice(db_engine: Engine) -> None:
    # Arrange
    raw: dict[str, Any] = json.loads(
        (_JSON_SAMPLES / "RdSAP-Schema-21.0.0" / "epc.json").read_text()
    )
    epc = EpcPropertyDataMapper.from_api_response(raw)
    with Session(db_engine) as session:
        row = PropertyRow(
            portfolio_id=7, postcode="A0 0AA", address="1 Some Street", uprn=12345
        )
        session.add(row)
        session.commit()
        property_id = row.id
        assert property_id is not None
        EpcPostgresRepository(session).save(epc, property_id=property_id)
        session.commit()

    # Act
    with Session(db_engine) as session:
        repo = PropertyPostgresRepository(session, EpcPostgresRepository(session))
        prop = repo.get(property_id)

    # Assert
    assert prop.identity.portfolio_id == 7
    assert prop.identity.uprn == 12345
    assert prop.epc == epc
    assert prop.source_path == "epc_with_overlay"
    assert prop.effective_epc == epc


@ -0,0 +1,91 @@
from __future__ import annotations

from sqlalchemy import Engine
from sqlmodel import Session

from datatypes.epc.domain.epc import Epc
from domain.property_baseline.property_baseline_performance import PropertyBaselinePerformance
from domain.property_baseline.performance import Performance
from repositories.property_baseline.property_baseline_postgres_repository import (
    PropertyBaselinePostgresRepository,
)


def _baseline() -> PropertyBaselinePerformance:
    lodged = Performance(
        sap_score=72, epc_band=Epc.C, co2_emissions=1.8, primary_energy_intensity=180
    )
    # A rebaselined property: distinct halves so the round-trip proves both are
    # persisted independently (not collapsed to one set).
    effective = Performance(
        sap_score=64, epc_band=Epc.D, co2_emissions=2.4, primary_energy_intensity=210
    )
    return PropertyBaselinePerformance(
        lodged=lodged,
        effective=effective,
        rebaseline_reason="pre_sap10",
        space_heating_kwh=5000.0,
        water_heating_kwh=2000.0,
    )


def test_baseline_performance_round_trips(db_engine: Engine) -> None:
    # Arrange
    baseline = _baseline()
    with Session(db_engine) as session:
        PropertyBaselinePostgresRepository(session).save(baseline, property_id=10)
        session.commit()

    # Act
    with Session(db_engine) as session:
        loaded = PropertyBaselinePostgresRepository(session).get_for_property(10)

    # Assert: the full aggregate reconstructs, both halves intact.
    assert loaded == baseline


def test_resaving_baseline_for_a_property_replaces_rather_than_duplicating(
    db_engine: Engine,
) -> None:
    # Arrange: a re-run re-establishes the same property's baseline with a
    # different rating.
    first = _baseline()
    rerun = PropertyBaselinePerformance(
        lodged=Performance(
            sap_score=80,
            epc_band=Epc.B,
            co2_emissions=1.2,
            primary_energy_intensity=150,
        ),
        effective=Performance(
            sap_score=80,
            epc_band=Epc.B,
            co2_emissions=1.2,
            primary_energy_intensity=150,
        ),
        rebaseline_reason="none",
        space_heating_kwh=4000.0,
        water_heating_kwh=1800.0,
    )

    # Act: save twice for the same property_id (must not hit the unique
    # constraint, must overwrite).
    with Session(db_engine) as session:
        repo = PropertyBaselinePostgresRepository(session)
        repo.save(first, property_id=10)
        repo.save(rerun, property_id=10)
        session.commit()

    # Assert
    with Session(db_engine) as session:
        loaded = PropertyBaselinePostgresRepository(session).get_for_property(10)
    assert loaded == rerun


def test_get_for_property_returns_none_when_absent(db_engine: Engine) -> None:
    # Arrange / Act
    with Session(db_engine) as session:
        loaded = PropertyBaselinePostgresRepository(session).get_for_property(999)

    # Assert
    assert loaded is None


@ -0,0 +1,41 @@
"""SolarRepo round-trips Google Solar building insights for a Property."""
from __future__ import annotations

from typing import Any

from sqlalchemy import Engine
from sqlmodel import Session

from repositories.solar.solar_postgres_repository import SolarPostgresRepository


def test_building_insights_round_trip(db_engine: Engine) -> None:
    # Arrange
    insights: dict[str, Any] = {
        "name": "buildings/ChIJ",
        "solarPotential": {
            "maxArrayPanelsCount": 42,
            "panelCapacityWatts": 250.0,
            "roofSegmentStats": [{"pitchDegrees": 30.0, "azimuthDegrees": 180.0}],
        },
    }

    # Act
    with Session(db_engine) as session:
        SolarPostgresRepository(session).save(property_id=5, insights=insights)
        session.commit()
    with Session(db_engine) as session:
        reloaded = SolarPostgresRepository(session).get(5)

    # Assert
    assert reloaded == insights


def test_get_returns_none_when_no_insights_stored(db_engine: Engine) -> None:
    # Arrange / Act
    with Session(db_engine) as session:
        reloaded = SolarPostgresRepository(session).get(999)

    # Assert
    assert reloaded is None


@ -0,0 +1,73 @@
from __future__ import annotations

from collections.abc import Callable

import pytest
from sqlalchemy import Engine
from sqlmodel import Session

from datatypes.epc.domain.epc import Epc
from domain.property_baseline.property_baseline_performance import PropertyBaselinePerformance
from domain.property_baseline.performance import Performance
from repositories.postgres_unit_of_work import PostgresUnitOfWork


def _session_factory(db_engine: Engine) -> Callable[[], Session]:
    return lambda: Session(db_engine)


def _baseline() -> PropertyBaselinePerformance:
    perf = Performance(
        sap_score=72, epc_band=Epc.C, co2_emissions=1.8, primary_energy_intensity=180
    )
    return PropertyBaselinePerformance(
        lodged=perf,
        effective=perf,
        rebaseline_reason="none",
        space_heating_kwh=5000.0,
        water_heating_kwh=2000.0,
    )


def test_committed_work_is_visible_to_a_later_unit(db_engine: Engine) -> None:
    # Arrange
    new_unit = lambda: PostgresUnitOfWork(_session_factory(db_engine))
    baseline = _baseline()

    # Act
    with new_unit() as uow:
        uow.property_baseline.save(baseline, property_id=10)
        uow.commit()

    # Assert: a fresh unit reads back what the first one committed.
    with new_unit() as uow:
        loaded = uow.property_baseline.get_for_property(10)
    assert loaded == baseline


def test_an_exception_in_the_block_rolls_the_batch_back(db_engine: Engine) -> None:
    # Arrange
    new_unit = lambda: PostgresUnitOfWork(_session_factory(db_engine))

    # Act: a property mid-batch raises after a write but before commit.
    with pytest.raises(RuntimeError, match="boom"):
        with new_unit() as uow:
            uow.property_baseline.save(_baseline(), property_id=10)
            raise RuntimeError("boom")

    # Assert: nothing from the aborted batch is persisted.
    with new_unit() as uow:
        assert uow.property_baseline.get_for_property(10) is None


def test_leaving_the_block_without_commit_persists_nothing(db_engine: Engine) -> None:
    # Arrange
    new_unit = lambda: PostgresUnitOfWork(_session_factory(db_engine))

    # Act: write but never commit.
    with new_unit() as uow:
        uow.property_baseline.save(_baseline(), property_id=10)

    # Assert
    with new_unit() as uow:
        assert uow.property_baseline.get_for_property(10) is None
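
The three unit-of-work tests above describe a contract: writes survive only when `commit()` is called explicitly, and leaving the block, whether normally or through an exception, discards anything uncommitted. A minimal sketch of that contract follows, using the standard library's `sqlite3`. `SqliteUnitOfWork`, its `save`/`get` methods, and the `baseline` table are hypothetical; unlike `PostgresUnitOfWork`, which takes a session factory, this sketch reuses one connection for simplicity.

```python
import sqlite3
from typing import Optional


class SqliteUnitOfWork:
    """Hypothetical unit of work: commit is explicit, exit always rolls back."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def __enter__(self) -> "SqliteUnitOfWork":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Discard anything not committed; this is a no-op after commit().
        # Returning None lets an exception from the block propagate.
        self._conn.rollback()

    def save(self, property_id: int, sap_score: int) -> None:
        self._conn.execute(
            "INSERT INTO baseline (property_id, sap_score) VALUES (?, ?)",
            (property_id, sap_score),
        )

    def get(self, property_id: int) -> Optional[int]:
        row = self._conn.execute(
            "SELECT sap_score FROM baseline WHERE property_id = ?",
            (property_id,),
        ).fetchone()
        return None if row is None else row[0]

    def commit(self) -> None:
        self._conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE baseline (property_id INTEGER PRIMARY KEY, sap_score INTEGER)"
)
conn.commit()
```

Rolling back unconditionally in `__exit__` keeps the all-or-nothing rule in one place: callers cannot persist a half-finished batch by forgetting to handle an error.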