Model/services
Khalim Conn-Kowlessar 136f149d46 tooling: widen parity probe sap_score range to (5, 99)
Previous bound (20, 95) excluded full-SAP new-builds (sap_score 90+,
which carry the dramatic wall U-value gap) and deepest-tail heritage
certs (sap_score ≤ 20). Widening so the sample reflects the
populations where the calculator's biggest spec gaps live.

New baseline at 300 certs, seed=7:
  SAP MAE 5.34 → 4.59 (-0.75)
  PE MAE  48.99 → 46.78 (-2.21)
  PE bias 42.07 → 41.78 (-0.29)

Note: the v18a parquet only contains ~0.7% certs with age_band=None,
while the raw bulk zip has 15% full-SAP "Average thermal transmittance"
certs. The parquet is filtering them somewhere upstream — to be chased
in separate work. Until then, parity-probe MAE will under-show the true
corpus impact of slices that target full-SAP certs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 20:38:22 +00:00
ara added potential file scaffolding: 2026-05-15 10:56:53 +00:00
ml_training_data tooling: widen parity probe sap_score range to (5, 99) 2026-05-18 20:38:22 +00:00
README.md added potential file scaffolding: 2026-05-15 10:56:53 +00:00

Services

Each subdirectory is a deployable unit, typically a Lambda image, with its own pyproject.toml, Dockerfile, and dependencies. The Lambda bundle contains only that service's dependencies plus its workspace dependencies.

| Service | Purpose |
| --- | --- |
| ara/ | The Domna retrofit modelling backend: ingestion + modelling pipelines, all 9 services in PRD §9.2. |

Other Domna services (address2uprn, hubspot, pashub, ecmk, magicplan) live in the legacy backend/ and etl/ trees for now; they are slated to migrate here as their owners pick them up — see PRD §11. When that work starts, scaffold the service under services/<name>/ and add it to the workspace members in the root pyproject.toml.
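
When a new service is scaffolded, it is easy to forget the workspace registration step. The sketch below is one possible way to spot that, not part of the repo's tooling: it reports service directories that no workspace member pattern covers. The function name `unregistered_services` is hypothetical, and glob matching via `fnmatch` is an assumption, since how member patterns are interpreted depends on the packaging tool, which this README does not name.

```python
from fnmatch import fnmatch


def unregistered_services(service_dirs: list[str], members: list[str]) -> list[str]:
    """Return service directories that no workspace member pattern covers.

    service_dirs: paths relative to the repo root, e.g. "services/ara".
    members: the workspace member entries from the root pyproject.toml
             (assumed here to be literal paths or shell-style globs).
    """
    return [
        directory
        for directory in service_dirs
        if not any(fnmatch(directory, pattern) for pattern in members)
    ]


if __name__ == "__main__":
    # Hypothetical inputs: "services/hubspot" was scaffolded but not yet registered.
    dirs = ["services/ara", "services/hubspot"]
    print(unregistered_services(dirs, ["services/ara"]))
```

Passing the directory list and member list in as plain arguments keeps the check independent of where the member list is stored in pyproject.toml.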

Service boundary

A service can import domain.*, repos.*, fetchers.*, and utils.* (workspace deps). It cannot import another service's modules: services are separate distributions with no cross-import path. This structurally enforces the modelling/ingestion separation (ADR-0003).
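
The boundary is enforced by packaging, but a lint-style check can surface a violation before a build fails. The following is a minimal sketch, assuming each service's top-level import package shares its directory name (for example `ara`). That naming, and the function `forbidden_imports`, are illustrative assumptions rather than anything defined in this repo.

```python
import ast


def forbidden_imports(source: str, own_service: str, all_services: set[str]) -> list[str]:
    """Return absolute imports in `source` that point at a different service.

    Assumption: a service's top-level package name equals its directory name.
    Imports of workspace deps (domain, repos, fetchers, utils) or third-party
    packages are never flagged, because they are not in `all_services`.
    """
    other_services = all_services - {own_service}
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 means an absolute import; relative imports stay in-package.
            names = [node.module]
        else:
            continue
        for name in names:
            if name.split(".")[0] in other_services:
                violations.append(name)
    return violations


if __name__ == "__main__":
    # Hypothetical module body inside the "ara" service.
    sample = "import domain.models\nimport hubspot.client\n"
    print(forbidden_imports(sample, "ara", {"ara", "hubspot"}))
```

Parsing with `ast` rather than importing the module means the check runs without installing any service's dependencies.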