train_baseline now returns mae + rmse alongside mape/smape/r2. MAE is the
user-facing metric ("predicted SAP within N points"); RMSE the quadratic
counterpart. Both come straight from sklearn.
New sample_weight_fn parameter: callable(y_train) -> per-row weights.
Threads into LGBMRegressor.fit's sample_weight argument. Default None
preserves existing behaviour.
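A minimal sketch of how the hook could thread through, not the actual
train_baseline code. The helper name fit_with_optional_weights is
hypothetical; the only real API assumed is that the estimator's fit accepts
a sample_weight keyword, as LGBMRegressor.fit does.

```python
def fit_with_optional_weights(model, X, y, sample_weight_fn=None):
    """Fit `model`, optionally weighting rows via `sample_weight_fn(y)`.

    Hypothetical helper for illustration only. `model` is any estimator
    whose fit() accepts a `sample_weight` keyword.
    """
    # None keeps the unweighted path, so existing callers see no change.
    weights = None if sample_weight_fn is None else sample_weight_fn(y)
    model.fit(X, y, sample_weight=weights)
    return model
```

Passing the weights as a callable of y_train keeps the weighting policy out
of the training loop and lets tests swap in a trivial function.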
Default tail strategy exposed as low_sap_tail_weight(y, threshold=58,
weight=3): 3x weight where SAP < 58. Threshold picked from slice 16h's
per-decile residuals: decile 0 (SAP 1-58) carries 17% MAPE vs <5% in the
body.
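One way the tail weighting could look, assuming numpy and a strict
"SAP < threshold" comparison as described above. The signature matches the
one named in this message; the body is a sketch, not the committed code.

```python
import numpy as np


def low_sap_tail_weight(y, threshold=58, weight=3):
    """Return per-row weights: `weight` where y < threshold, else 1.0."""
    y = np.asarray(y, dtype=float)
    # Rows at or above the threshold keep unit weight.
    return np.where(y < threshold, float(weight), 1.0)
```

For example, low_sap_tail_weight([10, 57, 58, 90]) gives weights 3, 3, 1, 1:
a row at exactly 58 is treated as body, not tail.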
Three TDD tracers, all AAA.
250k retrain showed objective='mape' worsens global sap_score MAPE by
~0.6 percentage points (3.92% with regression vs 4.50% with mape) and
peui_ucl by ~0.7 points. The mape objective over-weights the low-SAP tail
(weight ~1/y) and raises the body MAPE by more than it gains in the tail.
Body MAPE on v16 features is already strong (2.38% on deciles 1-8); the
remaining tail bias at decile 0 (SAP<58, +3.1 bias) needs a different
fix -- sample weights or stratified loss -- queued as slice 16i.
Per ADR-0008: the v15 baseline reports MAPE but optimises MSE, which
under-weights tail rows. Switching to objective='mape' applies gradient
proportional to 1/|y| and lets the model focus where MAPE penalises.
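For reference, the per-row losses and their gradients with respect to the
prediction, written out from the textbook definitions (the library's
internal implementation may differ in detail, e.g. in how it guards
against small |y|):

```latex
\ell_{\mathrm{MAPE}}(y, \hat{y}) = \frac{|y - \hat{y}|}{|y|},
\qquad
\frac{\partial \ell_{\mathrm{MAPE}}}{\partial \hat{y}}
  = \frac{\operatorname{sign}(\hat{y} - y)}{|y|}

\ell_{\mathrm{MSE}}(y, \hat{y}) = \tfrac{1}{2}\,(y - \hat{y})^{2},
\qquad
\frac{\partial \ell_{\mathrm{MSE}}}{\partial \hat{y}} = \hat{y} - y
```

Under the MAPE form a row with SAP 20 contributes a gradient of magnitude
1/20 = 0.05 against 1/80 = 0.0125 for a row with SAP 80, i.e. 4x the pull
regardless of error size. Under MSE the pull depends only on the error.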
Targets co2_emissions, space_heating_kwh, hot_water_kwh, and peui_raw
retain the default 'regression' objective (some rows have ~zero CO2 from
heavy PV; MAPE objective destabilises near zero).
Sample weights deferred to slice 16i if slice 16h's per-decile residuals
still show tail bias after the objective switch.
Adds `_per_decile_residuals` and writes `residuals_<target>.json` next to
metrics.json. Buckets test-set rows by deciles of the true target value;
each bucket carries count + MAPE + MAE + mean residual + true_min/max.
Lets us tell whether errors concentrate in the tails of the true distribution
(e.g. SAP<40 / SAP>85) or in the mid-band, which the global MAPE alone hides.
Baseline for slice 16's MAPE-improvement ablations.
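A rough sketch of the bucketing, assuming numpy. The function name, the
dictionary keys, and the residual sign convention (predicted minus true)
are illustrative assumptions; the real `_per_decile_residuals` may differ.
It also assumes the true target is never zero, otherwise the per-bucket
MAPE is undefined.

```python
import numpy as np


def per_decile_residuals(y_true, y_pred, n_buckets=10):
    """Bucket rows by quantiles of the TRUE target and summarise errors."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Bucket edges are quantiles of the true values, not the predictions.
    edges = np.quantile(y_true, np.linspace(0.0, 1.0, n_buckets + 1))
    # Interior edges only; a value equal to an edge goes to the upper bucket.
    bucket = np.searchsorted(edges[1:-1], y_true, side="right")
    out = []
    for b in range(n_buckets):
        mask = bucket == b
        if not mask.any():
            continue  # ties in y_true can leave a bucket empty
        t, p = y_true[mask], y_pred[mask]
        resid = p - t  # assumed convention: positive means over-prediction
        out.append({
            "decile": b,
            "count": int(mask.sum()),
            "mape": float(np.mean(np.abs(resid) / np.abs(t))),
            "mae": float(np.mean(np.abs(resid))),
            "mean_residual": float(np.mean(resid)),
            "true_min": float(t.min()),
            "true_max": float(t.max()),
        })
    return out
```

The returned list is plain dicts of ints and floats, so it can be passed
directly to json.dump when writing the residuals file.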
Bulk entries are NDJSON of wrapper records, not a JSON array. Each wrapper
carries certificate_number, assessment_type, and a stringified document with
the actual EPC schema payload. Filter to RdSAP, unwrap document, then map.
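The filter-and-unwrap step might look roughly like this, using only the
stdlib. The field names come from the description above; the helper name
iter_rdsap_documents and the exact string "RdSAP" as the assessment_type
value are assumptions, and the mapping step is left out.

```python
import json


def iter_rdsap_documents(lines):
    """Yield (certificate_number, document_dict) for RdSAP wrapper records.

    `lines` is any iterable of NDJSON lines (one wrapper record per line).
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        wrapper = json.loads(line)
        # Assumed literal; adjust if the feed spells the type differently.
        if wrapper.get("assessment_type") != "RdSAP":
            continue
        # The payload is a JSON string inside the wrapper, so parse it again.
        yield wrapper["certificate_number"], json.loads(wrapper["document"])
```

Reading line by line avoids loading the whole bulk entry into memory, which
a JSON-array parser would otherwise require.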
remote_bulk_fetcher: per-entry presigned-URL refresh (30s S3 TTL).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
ijson use_float fixes Decimal/float coercion when streaming JSON.
pyright extraPaths so the new pkg type-checks against domna-domain.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>