Khalim Conn-Kowlessar 6072d8795a slice 16i: MAE + RMSE in metrics; sample_weight_fn + low_sap_tail_weight
train_baseline now returns mae + rmse alongside mape/smape/r2.  MAE is the
user-facing metric ("predicted SAP within N points"); RMSE the quadratic
counterpart.  Both come straight from sklearn.
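
For reference, the two new metrics can be sketched in plain NumPy. The commit takes both values from sklearn; `mae_rmse` below is a hypothetical helper, not part of train_baseline, and only illustrates what the two numbers measure.

```python
import numpy as np


def mae_rmse(y_true, y_pred):
    """Return (MAE, RMSE) for two equal-length sequences.

    Hypothetical helper for illustration only.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_pred - y_true
    # Average absolute miss, in the same units as the target (SAP points).
    mae = float(np.mean(np.abs(errors)))
    # Square before averaging, so large misses count quadratically.
    rmse = float(np.sqrt(np.mean(errors ** 2)))
    return mae, rmse
```

RMSE is never smaller than MAE on the same predictions, so a wide gap between the two suggests that a few large misses dominate the error.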

New sample_weight_fn parameter: callable(y_train) -> per-row weights.
Threads into LGBMRegressor.fit's sample_weight argument.  Default None
preserves existing behaviour.

Default tail strategy exposed as low_sap_tail_weight(y, threshold=58,
weight=3): 3x weight where SAP < 58.  Threshold picked from slice 16h's
per-decile residuals — decile 0 (SAP 1-58) carries 17% MAPE vs <5% body.
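
A minimal sketch of the tail-weighting strategy described above, assuming `y` is array-like and that the weights come back as a NumPy array (the committed implementation may differ). `resolve_sample_weight` is a hypothetical helper showing how a `sample_weight_fn` might be resolved before its result is passed as `sample_weight` to `LGBMRegressor.fit`.

```python
import numpy as np


def low_sap_tail_weight(y, threshold=58, weight=3):
    """Give rows with SAP below `threshold` a weight of `weight`; others get 1."""
    y = np.asarray(y, dtype=float)
    return np.where(y < threshold, float(weight), 1.0)


def resolve_sample_weight(y_train, sample_weight_fn=None):
    """Hypothetical helper: returning None keeps the unweighted default."""
    if sample_weight_fn is None:
        return None
    weights = np.asarray(sample_weight_fn(y_train), dtype=float)
    # Guard against callables that do not return one weight per row.
    if weights.shape != np.shape(y_train):
        raise ValueError("sample_weight_fn must return one weight per row")
    return weights
```

With the defaults, `low_sap_tail_weight([30, 57, 58, 72])` weights the first two rows 3x and leaves the rest at 1, because the comparison is strict (`SAP < 58`).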

Three TDD tracers, all AAA.
2026-05-17 14:48:00 +00:00
ara added potential file scaffolding: 2026-05-15 10:56:53 +00:00
ml_training_data slice 16i: MAE + RMSE in metrics; sample_weight_fn + low_sap_tail_weight 2026-05-17 14:48:00 +00:00
README.md added potential file scaffolding: 2026-05-15 10:56:53 +00:00

Services

Each subdirectory is a deployable unit, typically a Lambda image, with its own pyproject.toml, Dockerfile, and dependencies. The Lambda bundle contains only that service's dependencies plus its workspace dependencies.

| Service | Purpose |
| --- | --- |
| ara/ | The Domna retrofit modelling backend: ingestion and modelling pipelines, all 9 services in PRD §9.2. |

Other Domna services (address2uprn, hubspot, pashub, ecmk, magicplan) live in the legacy backend/ and etl/ trees for now; they are slated to migrate here as their owners pick them up — see PRD §11. When that work starts, scaffold the service under services/<name>/ and add it to the workspace members in the root pyproject.toml.

Service boundary

A service can import domain.*, repos.*, fetchers.*, and utils.* (workspace deps). It cannot import another service's modules: services are separate distributions with no cross-import path. This structurally enforces the modelling/ingestion separation (ADR-0003).