Model/packages
Khalim Conn-Kowlessar f4a8d2a017 tests: golden-fixture regression set — 7 currently-correct corpus certs
Pins 7 certs from a 1000-cert random sample that satisfy:
  |SAP rounded-int residual| ≤ 1
  |PE residual| ≤ 10 kWh/m²
  main_heating_category != 4 OR main_heating_data_source != 1
    (non-PCDB-heat-pump — PCDB lookup is deferred)
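The three criteria above can be read as a single predicate. The following is a minimal sketch only: `CertResult`, its field layout, and `is_golden_candidate` are hypothetical names for illustration, not the ones used in the test suite. It spells out that the OR condition is the De Morgan form of "not (category 4 AND data source 1)", which the commit describes as excluding PCDB heat pumps.

```python
from dataclasses import dataclass

SAP_TOL = 1      # |SAP rounded-int residual| <= 1
PE_TOL = 10.0    # |PE residual| <= 10 kWh/m²


@dataclass(frozen=True)
class CertResult:
    """Hypothetical per-cert record; the real structure may differ."""
    sap_residual: int
    pe_residual: float
    main_heating_category: int
    main_heating_data_source: int


def is_golden_candidate(r: CertResult) -> bool:
    # "category != 4 OR data_source != 1" is equivalent to
    # "not (category == 4 and data_source == 1)".
    is_pcdb_heat_pump = (
        r.main_heating_category == 4 and r.main_heating_data_source == 1
    )
    return (
        abs(r.sap_residual) <= SAP_TOL
        and abs(r.pe_residual) <= PE_TOL
        and not is_pcdb_heat_pump
    )
```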

Cert mix: 6 cat=2 gas/oil boilers (3 PCDB, 3 Table 4b) + 1 cat=6 heat
network. Age bands A, C, D (×3), F, J, L. TFAs 75-526. Mix of
detached / semi-detached / mid-terrace / mid-floor flat. The cleanest
PE match in the set (cert 7536-3827) has PE residual -0.29 kWh/m².

Purpose: regression anchor. Future slices that improve aggregate MAE
can silently break individual certs unless caught here. Each cert's
expected residual is recorded in `_EXPECTATIONS` so the diff is
human-inspectable when a regression fires.

The set is acknowledged to contain compensating-errors cases: some
certs match SAP within ±1 because the cert-calibration prices absorb
multiple structural deviations from spec. A hand-trace of 7536-3827
showed PE matched (-0.29) but cost was £143 (12%) under the cert's
implied cost: a multi-factor gap (price calibration + missing gas
standing charge + lighting over-prediction) that cancels back into
SAP ±1. We accept this as a consequence of the tolerance choice:
tightening to PE ±5 in our sample would have yielded zero fixtures.

Tolerance can tighten over the session as we close the PE bias
(currently +38 kWh/m² systematic).

All 301 domain tests pass; no behaviour changed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 07:06:58 +00:00
| Name | Last commit | Date |
| --- | --- | --- |
| domain | tests: golden-fixture regression set — 7 currently-correct corpus certs | 2026-05-19 07:06:58 +00:00 |
| fetchers | added potential file scaffolding: | 2026-05-15 10:56:53 +00:00 |
| repos | added potential file scaffolding: | 2026-05-15 10:56:53 +00:00 |
| utils | added potential file scaffolding: | 2026-05-15 10:56:53 +00:00 |
| README.md | added potential file scaffolding: | 2026-05-15 10:56:53 +00:00 |

Shared packages

Workspace packages consumed by services/*. Each package is its own Python distribution with its own pyproject.toml; services import them via the workspace dependency mechanism ({ workspace = true }).
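
As an illustration, a service's pyproject.toml might declare its shared-package dependencies roughly as follows. This is a sketch that assumes the workspace is managed by uv (the `{ workspace = true }` syntax matches uv's `[tool.uv.sources]` table). The service name and the distribution names `domain` and `repos` are assumptions, not taken from the repository.

```toml
# services/example-service/pyproject.toml  (hypothetical service)
[project]
name = "example-service"
version = "0.1.0"
# Distribution names below are assumed to match the package directories.
dependencies = ["domain", "repos"]

[tool.uv.sources]
# Resolve these from the workspace rather than from a package index.
domain = { workspace = true }
repos = { workspace = true }
```

Under the same assumption, the workspace root would list the members in a `[tool.uv.workspace]` table, for example `members = ["packages/*", "services/*"]`.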

| Package | Purpose |
| --- | --- |
| domain/ | Shared domain types: Property, BaselinePerformance, Plan, Scenario, EpcPropertyData, etc. No persistence, no IO, no business logic. |
| repos/ | Persistence layer: one repo per aggregate. Owns the SQL. Depends on domain. |
| fetchers/ | External API clients (gov EPC, Ofgem, Google Solar, etc.). Depend on domain for response shapes. |
| utils/ | Cross-cutting infra: logging, S3, CloudWatch URL builders, SQS task helpers. |
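
The split between domain/ and repos/ can be sketched as below. This is a minimal illustration, not the actual code: the fields on `Property`, the `PropertyRepo` class, its table schema, and the use of sqlite3 are all assumptions chosen to keep the example self-contained. The point is only the direction of the dependency: the domain type is plain data with no IO, and the repo owns the SQL and maps rows to domain types.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Property:
    """Domain type: plain data, no persistence or IO (fields are hypothetical)."""
    property_id: str
    postcode: str


class PropertyRepo:
    """Repo for one aggregate: owns the SQL and depends on the domain type."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS property ("
            "property_id TEXT PRIMARY KEY, postcode TEXT NOT NULL)"
        )

    def save(self, prop: Property) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO property (property_id, postcode) VALUES (?, ?)",
            (prop.property_id, prop.postcode),
        )

    def get(self, property_id: str) -> Optional[Property]:
        row = self._conn.execute(
            "SELECT property_id, postcode FROM property WHERE property_id = ?",
            (property_id,),
        ).fetchone()
        return Property(*row) if row else None
```

A service would construct the repo with its own connection and only ever see `Property` values, never rows or SQL.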

Adding a new shared package

Only when a real second consumer materialises. Don't pre-shatter (repos-epc, repos-property, ...) — split when a deployment needs to drop a dep, not before.

See ../ara_backend_design.md §11 for the broader monorepo layout and ../CONTEXT.md for the domain glossary that names the types living in domain/.