Two production fixes surfaced by the live run:

- mapper.from_rdsap_schema_21_0_1 now sets the three ML target scalars (energy_rating_current, co2_emissions_current, energy_consumption_current). They were silently None for every cert before, leaving the only labels as the kWh fields from renewable_heat_incentive.
- train_baseline coerces object-dtype columns to numeric (None -> NaN) and drops rows with null target per fit, so LightGBM accepts the frame.

E2E on 500 real certs (~1s):

- sap_score R^2=0.604 MAPE=0.084
- co2_emissions R^2=0.813 MAPE=0.130
- peui_raw R^2=0.979 MAPE=0.026
- space_heating_kwh R^2=0.823 MAPE=0.213
- hot_water_kwh R^2=0.519 MAPE=0.115

peui_ucl excluded: UCL correction still needs wiring.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Services
Each subdirectory is a deployable unit, typically a Lambda image, with its own pyproject.toml, its own Dockerfile, and its own deps. The Lambda bundle contains only that service's deps plus its workspace deps.
| Service | Purpose |
|---|---|
| ara/ | The Domna retrofit modelling backend: ingestion + modelling pipelines, all 9 services in PRD §9.2. |
Other Domna services (address2uprn, hubspot, pashub, ecmk, magicplan) live in the legacy backend/ and etl/ trees for now; they are slated to migrate here as their owners pick them up — see PRD §11. When that work starts, scaffold the service under services/<name>/ and add it to the workspace members in the root pyproject.toml.
Service boundary
A service can import domain.*, repos.*, fetchers.*, and utils.* (its workspace deps). It cannot import another service's modules: services are separate distributions with no cross-import path. This is the structural enforcement of the modelling/ingestion separation (ADR-0003).
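The boundary itself is enforced by packaging, but the rule can also be illustrated as a small static check. The sketch below is not part of this repo: the function name `cross_service_imports`, the module paths in the sample sources, and the treatment of hubspot as an already-migrated service are all hypothetical, chosen only to show what a violation would look like.

```python
import ast


def cross_service_imports(source: str, own_service: str, all_services: set[str]) -> list[str]:
    """Return the names of other services that `source` imports (hypothetical helper)."""
    other_services = all_services - {own_service}
    offenders = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import a.b.c` -> top-level package "a"
            roots = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # `from a.b import c` -> top-level package "a"; relative imports are skipped
            roots = [node.module.split(".")[0]]
        else:
            continue
        offenders.update(root for root in roots if root in other_services)
    return sorted(offenders)


# Assumed service names; module paths are made up for illustration.
SERVICES = {"ara", "hubspot"}
allowed = "import domain.models\nfrom repos import certs\n"
violating = "from hubspot.client import sync\nimport utils.dates\n"

print(cross_service_imports(allowed, "ara", SERVICES))
print(cross_service_imports(violating, "ara", SERVICES))
```

Imports of workspace packages (domain, repos, fetchers, utils) pass through untouched because they are not in the set of service names; only a top-level import of another service is reported.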