Model

mirror of https://github.com/Hestia-Homes/Model.git synced 2026-07-27 23:35:01 +00:00

Author	SHA1	Message	Date
Khalim Conn-Kowlessar	1a6996abbb	slice S-B12: water-heating eff inherits main_heating_category cascade. The legacy water_heating_efficiency(901, main_code) returns 0.80 (gas boiler default) when sap_main_heating_code is None, even if the main system is a heat pump (category=4, efficiency 2.30). For "from main system" water codes (901/902/914), we must inherit through the FULL main-heating cascade including the category fallback. Discovered by hand-tracing cert 0320-2850 (Semi-detached bungalow, heat-pump main with no SAP code lodged, actual SAP 70, predicted 49). HW was being charged at 0.80 eff for a 2.30-eff dwelling, i.e. 2.9× too much HW fuel. 100-cert parity probe: MAE 4.66 → 4.53 (-0.13); RMSE 6.27 → 5.96; bias -0.70 → -0.57; within ±10: 94% → 95%. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 15:48:42 +00:00
Khalim Conn-Kowlessar	737e5d6bf5	slice S-B11: e7_eligible_main_codes on PriceTable; cert calibration adds 191-196 Hand-tracing cert 0800-1364 (Detached bungalow, code 191/direct-electric, actual SAP 71, predicted 37) showed the cert assessor applies off-peak rates to direct-electric main heating despite SAP 10.2 Table 12a specifying 90% high-rate. Adds e7_eligible_main_codes to PriceTable so each price source carries its own rule: - SAP_10_2_SPEC_PRICES: {401-409, 421-425} (storage only, per Table 12a) - CERT_CALIBRATION: {191-196, 401-409, 421-425} (empirically what the cert software does) 100-cert parity probe: MAE 4.99 → 4.66 (recovered to pre-S-B9 best state) bias -1.03 → -0.70 within ±1: 23% → 24% within ±3: 47% → 48% within ±10: 93% → 94% Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 15:43:15 +00:00
Khalim Conn-Kowlessar	92727568a3	slice S-B10: price-table seam for cert-calibration parity validation. Separates the SAP-spec source of truth from the empirical cert-calibration prices. cert_to_inputs() now accepts a `prices: PriceTable` parameter defaulting to SAP_10_2_SPEC_PRICES (3.64 gas, 16.49 elec, 9.40 7h-low; verbatim from SAP 10.2 §12.2 / Table 12). Parity probe passes the empirical cert_calibration_prices() factory from domain.sap.tables.table_12_cert_calibration which carries the lower prices that match the cert assessor software's actual output (3.48, 13.19, 5.50). This split is documented in both table modules: cert calibration is explicitly NOT spec-correct, it just matches observed cert behaviour for parity testing. 100-cert parity probe with cert-calibration prices: MAE 6.66 → 4.99 (recovered from spec-price regression; also -0.41 from absolute baseline thanks to other S-B fixes); RMSE 10.29 → 7.13; bias -4.66 → -1.03; within ±1: 20% → 23%; within ±3: 38% → 47%; within ±5: 63% → 67%; within ±10: 82% → 93%. Session-B progress overall (S-B2 baseline → here): MAE 8.41 → 4.99, within ±1 doubled (10% → 23%). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 15:20:46 +00:00
Khalim Conn-Kowlessar	c74857ac14	slice S-B9: SAP 10.2/10.3 Table 12 spec-correct prices + Table 12a fix Verified against the SAP 10.2 spec (14-03-2025): Table 12 unit prices are IDENTICAL to SAP 10.3 Table 12. Both specs mandate (§12.2): "Fuel costs are calculated using the fuel prices given in Table 12. Other prices must not be used for calculation of SAP ratings." The legacy ML-pipeline prices in domain.ml.sap_efficiencies (3.48 gas, 13.19 elec, 5.50 E7-low) do NOT match either SAP 10.2 or 10.3 and appear to be a pre-2022 holdover. New module domain.sap.tables.table_12 carries the spec-correct values: mains gas: 3.64 (was 3.48 legacy) standard electricity: 16.49 (was 13.19) 7h-low / Economy-7: 9.40 (was 5.50) 24h-heating: 14.04 (was 6.61) Also corrects an S-B4 bug: SAP 10.2 Table 12a shows direct-acting electric heating (codes 191-196) runs at 90% high-rate on 7h tariffs, not 0% — only true storage heaters (401-409, 421-425) bill at the low rate. _E7_SPACE_HEATING_CODES narrowed accordingly. 100-cert parity probe with spec-correct prices: MAE 4.66 → 6.66 (regression vs legacy prices) bias -0.70 → -4.66 (over-counting cost) spec-correctness: SAP 10.2 verbatim The MAE regression confirms the corpus's lodged ratings were NOT calculated against the published SAP 10.2 Table 12 prices. The cert ratings appear to use the legacy lower prices despite reporting sap_version=10.2. Three paths forward documented in next commit's discussion thread. Also adds the SAP 10.2 spec PDF to docs/sap-spec/. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 15:14:11 +00:00
Khalim Conn-Kowlessar	6d256ab2bc	slice S-B8: extend E7 off-peak rate to HW for E7-tariff dwellings. When the main heating is electric storage / direct-electric (codes 191-196, 401-409, 421-425), the cert almost always carries an Economy-7 tariff and the immersion HW cylinder runs on the off-peak timer. Bill HW at the 7h-low rate (5.5 p/kWh) in that case, falling back to the lower of {7h-low, water_heating_fuel rate} so we never over-charge an HW fuel that's already cheaper than off-peak. 100-cert parity probe: MAE 4.90 → 4.66 (-0.24); bias -1.44 → -0.70 (over-correction halved); within ±3: 46% → 48%; within ±5: 67% → 68%; within ±10: 93% → 94%. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 14:59:09 +00:00
Khalim Conn-Kowlessar	aa2c7a9171	slice S-B7: per-end-use fuel cost — HW uses water-fuel, lighting always electric SAP 10.3 §12 charges fuel costs by end-use, not by main heating fuel. For a gas-heated dwelling with an electric immersion hot-water cylinder, HW bills at the electric rate (13.19 p/kWh) not the gas main-heating rate (3.48 p/kWh) — a 3.8× cost difference for HW that propagates straight to ECF. Lighting, central-heating pumps, and fans always electric regardless of main fuel. Discovered by hand-tracing cert 8035-9023 (Detached bungalow, actual SAP 43, predicted 63). Trace showed our hot-water + lighting + pumps lines were charging mains-gas rates throughout, under-counting cost by ~£290/yr. 100-cert parity probe (biggest single Session-B slice so far): MAE 5.70 → 4.90 (-0.80, -14%) RMSE 7.48 → 6.68 (-11%) within ±1: 20% → 24% within ±3: 37% → 46% within ±5: 54% → 67% bias +1.50 → -1.44 (over-corrected by ~3 SAP points) The over-correction (bias now slightly negative) means we're now under-predicting on average. Next slice tackles where we're charging too much electricity — probably HW on dwellings with combi boilers (no immersion, water still on main fuel) and the water_heating_code 901 ("from main system") inheritance path. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 14:54:24 +00:00
Khalim Conn-Kowlessar	29c776bb23	slice S-B6: glazing g_perpendicular + frame_factor lookups (Tables 6b/6c) Replaces the two hardcoded glazing defaults (g⊥=0.63, FF=0.7) in the cert→inputs mapper with spec-driven lookups: - g_perpendicular by glazing_type (Table 6b): single → 0.85, double 2002+ → 0.72, low-E soft → 0.63, secondary → 0.76, triple → 0.68. Default 0.72 when missing. - frame_factor by frame_material (Table 6c): wood/PVC/composite → 0.70, aluminium/steel/metal → 0.83. Measured values from window_transmission_details / SapWindow.frame_factor still take precedence. Overshading factor stays at 0.77 ("average") since RdSAP 10 doesn't lodge a per-window overshading code. 100-cert parity probe: MAE 5.65 → 5.70 (flat) exact-match within ±1: 18% → 20% bias +1.13 → +1.50 Slight bias drift toward over-prediction is expected — bigger solar gains reduce predicted heating demand. Net: the engine is now more spec-correct (more exact matches), but composition of errors elsewhere needs the next slice to bring bias back toward 0. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 14:48:11 +00:00
Khalim Conn-Kowlessar	f3baa51a9b	slice S-B5: main_heating_control code → SAP control type Maps the Table 9 main_heating_control code to SAP control type 1/2/3: codes 2101-2104 = type 1, 2105-2109 = type 2, 2110+ = type 3. Default remains type 2 when code is missing or unrecognised. Two other fixes tried-and-reverted in this slice based on the 100-cert parity probe: - NI-thickness → None (the "wall insulated but thickness unknown, use 50mm row" path): over-corrected in aggregate because many "NI" certs are genuinely uninsulated. Reverted to legacy NI→0 with a note to revisit once wall_insulation_type is used as a stronger signal. - boiler-age efficiency rescue (cat 1/2, A-F → 0.74, K-M → 0.85): same issue — stacked with NI fix it over-shot, on its own it gave marginal MAE without bias improvement. Dropped pending further investigation. 100-cert parity probe: MAE 5.72 → 5.65 (-0.07; control-type-only is a small net win) RMSE 7.58 → 7.48 (-0.10) bias +1.20 → +1.13 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 14:37:44 +00:00
Khalim Conn-Kowlessar	8e1d30c97d	slice S-B4: per-end-use fuel cost (Economy-7 for electric storage). Splits the single CalculatorInputs.fuel_unit_cost_gbp_per_kwh into three end-use lines (space_heating, hot_water, other) to match SAP 10.3 §12, which charges different tariffs per end-use on Economy-7 dwellings. cert→inputs rule: when sap_main_heating_code is in the electric-storage (401-409), high-heat-retention storage (421-425), or direct-electric (191-196) ranges, space heating bills at the 7h-low rate (5.5p/kWh) while hot water + lighting + pumps stay on standard electricity (13.19p/kWh). All other fuels use a single rate across all three end-uses. 100-cert parity probe impact: MAE 7.53 → 5.72 (-1.81, -24%); RMSE 11.60 → 7.58 (-4.02, -35%); worst residual -56 → -25 (Semi-detached bungalow); within ±10: 85% → 91%. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 14:18:56 +00:00
Khalim Conn-Kowlessar	ccdaba5acd	slice S-B3: flat heat-loss surface awareness DwellingExposure flags on heat_transmission_from_cert suppress the floor and/or roof channels when those surfaces are party with a neighbouring dwelling. Cert mapper derives the flags from EpcPropertyData.dwelling_type prefix: - "Mid-floor " → floor=False, roof=False - "Top-floor " → floor=False, roof=True - "Ground-floor *" → floor=True, roof=False - everything else → both exposed 100-cert parity probe impact: MAE 8.41 → 7.53 (-0.88) RMSE 13.98 → 11.60 (-2.38) bias -2.65 → -0.61 (system bias on flats essentially eliminated) Bungalow outliers (-56 worst residual) untouched — different failure mode (full envelope, but cascade U-values too conservative or storey count over-counted). Next slice tackles that. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 14:10:45 +00:00
Khalim Conn-Kowlessar	dde8ae30fa	S-B2: parity probe + first-pass findings (100-cert baseline). Adds services/ml_training_data/src/ml_training_data/sap_parity_probe.py, which samples N certs from the v18a corpus, streams them via BulkZipReader, runs Sap10Calculator, and prints MAE/RMSE/bias + worst-N residuals. Baseline across 100 certs: MAE 8.41, RMSE 13.98, bias -2.65, 0 errors. docs/sap-spec/PARITY_FINDINGS.md captures the dominant failure pattern (flats + bungalows under-predicted, 10 of the worst-15 are flats whose floor/roof are party with neighbouring dwellings) and the priority-ordered Session B iteration backlog (S-B-flat-surfaces first). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 13:59:23 +00:00
Khalim Conn-Kowlessar	57f18a8773	slice S-B1: parity-validation report aggregator Pure-function ParityCase / ParityReport / build_parity_report for the Session B 1000-cert parity check (ADR-0009). Aggregates per-cert (predicted, actual) sap pairs into global + typical-subset MAE, RMSE, bias, and the worst-N residuals for spec-iteration. Cert→case mapping (corpus load, calculator run, actual-sap lookup) sits at a higher layer; this module is trivial to test so the harder integration code inherits its testing. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 13:22:45 +00:00
Khalim Conn-Kowlessar	a243055de7	slice S-A7b: RdSAP cert→inputs mapper + Sap10Calculator.calculate(epc) Adds domain.sap.rdsap.cert_to_inputs.cert_to_inputs(epc) which produces a typed CalculatorInputs from an EpcPropertyData, and a thin Sap10Calculator.calculate(epc) entry point that wraps the mapper + the S-A7a orchestrator. Defaults follow RdSAP 10 (Table 27 for living-area fraction, Table 5 for ventilation, Table 12 for fuel cost + CO2 factor) and SAP 10.3 Tables 4a/4b for heating efficiency via the existing domain.ml.sap_efficiencies cascade. Deferred to Session B: conservatory modes, room-in-roof, secondary heating split (Table 11), multi-fuel weighted cost, thermal-mass parameter from construction type, control-temp adjustment from main_heating_control code. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 09:34:41 +00:00
Khalim Conn-Kowlessar	684e2945ae	slice S-A7a: Sap10Calculator orchestrator (synthetic-input) Wires SAP 10.3 §§5-13 into a 12-month heat-balance loop driven by a typed CalculatorInputs aggregate, returning a typed SapResult with the score, ECF, costs/CO2 totals, and a 12-entry monthly breakdown. Physics assembly only — the cert→inputs mapper lands in S-A7b. η/T_internal solved with two-pass iteration per SAP 10.3 §7.3. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 09:27:28 +00:00
Khalim Conn-Kowlessar	9106621aee	slice S-A6: SAP10.3 rating + EI rating formulas (§13 + §14) Tenth slice of the SAP10 Calculator Session A (ADR-0009). Ships four pure functions under domain.sap.worksheet.rating implementing the SAP 10.3 rating formulas: energy_cost_factor(total_cost_gbp, total_floor_area_m2) -> equation (7): ECF = 0.36 × cost / (TFA + 45) Deflator 0.36 sourced from Table 12 (page 191). sap_rating(ecf) -> equations (8)/(9), continuous (un-rounded) SAP value: ECF ≥ 3.5: 108.8 − 120.5 × log10(ECF) ECF < 3.5: 100 − 16.21 × ECF Naturally rises above 100 for net energy exporters (negative ECF). sap_rating_integer(ecf) -> integer SAP value as published on the EPC: round to nearest, clamp to minimum 1 per §13. environmental_impact_rating(co2_emissions_kg_per_yr, total_floor_area_m2) -> equations (10)-(12), continuous EI rating: CF = CO2 / (TFA + 45) CF ≥ 28.3: 200 − 95 × log10(CF) CF < 28.3: 100 − 1.34 × CF 8 AAA cycles cover: ECF formula hand-computed, SAP linear branch (typical home), SAP log branch (high cost), boundary continuity at ECF=3.5, net-exporter SAP > 100, integer rounding + min-1 clamp, EI linear branch, EI log branch. Orchestrator (S-A7) wires these into Sap10Calculator alongside the monthly heat balance loop from S-A5e.	2026-05-18 09:12:25 +00:00
Khalim Conn-Kowlessar	c0afe3592f	slice S-A5e: monthly space-heating requirement (SAP 10.3 Table 9c step 10) Ninth slice of the SAP10 Calculator Session A (ADR-0009). Ships monthly_heat_requirement_kwh implementing the Table 9c step-10 formula: L_m = H × (T_i,m − T_e,m) (W) Q_heat,m = 0.024 × (L_m − η_m × G_m) × n_m (kWh) with the table's clamp: Q_heat is set to 0 when negative or below 1 kWh per month (summer months and well-insulated dwellings in shoulder months). The orchestrator (S-A6) iterates utilisation factor + mean internal temperature until they converge before calling this function. 5 AAA cycles cover: typical-winter-month hand-computed worked example, summer month with gains exceeding losses clamping to 0, gains-scaling direction check, external-temperature direction check, and the sub-1-kWh clamp per the Table 9c note.	2026-05-18 09:00:05 +00:00
Khalim Conn-Kowlessar	8c21b399c6	slice S-A5d: mean internal temperature (SAP 10.3 Tables 9 + 9b + 9c) Eighth slice of the SAP10 Calculator Session A (ADR-0009). Implements SAP 10.3 mean internal temperature with three public helpers under domain.sap.worksheet.mean_internal_temperature: elsewhere_heating_temperature_c(hlp, control_type) -> Table 9 T_h2 formula: control type 1: T_h2 = 21 − 0.5 × HLP control type 2 or 3: T_h2 = 21 − HLP + HLP² / 12 HLP clamped to 6.0 per Table 9 note (e). off_period_temperature_reduction_c(t_off, T_h, T_e, R, G, H, η, τ) -> Table 9b u value (°C drop below T_h over an off-period): t_c = 4 + 0.25·τ T_sc = (1−R)(T_h−2) + R·(T_e + η·G/H) quadratic branch when t_off ≤ t_c, linear when t_off > t_c. mean_internal_temperature_c(...) -> Table 9c steps 1-8: living-area zone (off 7+8 h, T_h1=21°C) and elsewhere zone (off 7+8 h for control 1/2 or 9+8 h for control 3, T_h2 from above), blended by living_area_fraction, plus the Table 4e control-type temperature adjustment. Step 9 (re-compute utilisation factor with the new T_i) and step 10 (Q_heat = 0.024 × (L − η·G) × n_m) live in the next slice's monthly loop. 7 AAA cycles cover: T_h2 formulas for control types 1 vs 2, HLP > 6 clamp per note (e), off-period u quadratic branch (t_off ≤ t_c), off-period u linear branch (t_off > t_c), full mean_internal_temperature hand-computed worked example, and control-type-3 longer first off-period dropping mean temp slightly below control-type-2.	2026-05-18 08:52:11 +00:00
Khalim Conn-Kowlessar	e403e2302c	slice S-A5c: heating utilisation factor η (SAP 10.3 Table 9a) Seventh slice of the SAP10 Calculator Session A (ADR-0009). Ships utilisation_factor(*, total_gains_w, heat_loss_rate_w, time_constant_h) implementing SAP 10.3 Table 9a: a = 1 + τ / 15 γ = G / L if γ > 0 and γ ≠ 1: η = (1 − γ^a) / (1 − γ^(a+1)) if γ = 1: η = a / (a + 1) if heat_loss_rate ≤ 0: η = 1 (dwelling in net surplus) η caps the contribution of internal + solar gains when they outpace the heat-loss rate. The orchestrator computes time_constant_h = TMP / (3.6 × HLP) and passes it in here; that's a future slice. 5 AAA cycles cover: small γ → η ≈ 1, γ = 1 special-case formula, zero/negative heat loss returning η = 1, large γ dropping η well below 0.5, and higher τ (more thermal mass) raising η for the same γ.	2026-05-18 08:38:03 +00:00
Khalim Conn-Kowlessar	57bf7833a9	slice S-A5b: solar gains (SAP 10.3 §6 + Appendix U §U3.2) Sixth slice of the SAP10 Calculator Session A (ADR-0009). Two layers under domain.sap.worksheet.solar_gains: 1. surface_solar_flux_w_per_m2(orientation, pitch_deg, region, month) — implements Appendix U §U3.2 polynomial that converts the horizontal solar irradiance from Table U3 to per-orientation per-pitch surface flux: S(orient, p, m) = S_h,m × R_h-inc R_h-inc = A cos²(φ-δ) + B cos(φ-δ) + C where A, B, C are cubics in sin(p/2) with coefficients k1-k9 from Table U5. Reads latitude φ from Table U4 and solar declination δ from Table U3 footer (already in domain.sap.climate.appendix_u). 2. window_solar_gain_w(area_m2, surface_flux, g⊥, FF, Z) — implements §6.1 equation (5): G = 0.9 × A × S × g⊥ × FF × Z. Orientation enum maps the 8 SAP cardinal codes to the 5 Table U5 columns: N/S to their own column; NE/NW share; E/W share; SE/SW share. 7 AAA cycles cover: UK average South vertical July hand-computed flux, rooflight pitch=0 collapses to horizontal Table U3 directly, North-vertical summer > winter (diffuse signal), NE/NW share constants symmetry, equation (5) window gain, zero-area edge case, out-of-range region validation. Tables 6b (g⊥), 6c (frame factor), 6d (overshading Z) defaults deferred to the cert→inputs mapper slice — callers pass them explicitly here so the physics stays cert-shape-independent.	2026-05-17 22:59:25 +00:00
Khalim Conn-Kowlessar	c317a72b71	slice S-A5a: internal gains (SAP 10.3 §5 + Appendix L) Fifth slice of the SAP10 Calculator Session A (ADR-0009). Ships internal_gains_w(*, total_floor_area_m2, month, occupancy=None) returning an InternalGainsBreakdown over four named SAP 10.3 components: metabolic_w — 60 W × N (SAP convention; constant year-round) cooking_w — 35 + 7N per Appendix L equation (L18) appliances_w — Appendix L (L13) E_A = 207.8 × (TFA × N)^0.4714 with the (L14) monthly cosine variation, converted to watts via (L16a) lighting_w — Appendix L existing-dwelling fallback chain (L5b, L8c, L9c-d, L10, L12). Default efficacy 21.3 lm/W, no daylight bonus, 85% internal fraction. Occupancy defaults via Appendix J Table 1b when not supplied: N = 1 + 1.76 × (1 - exp(-0.000349 × (TFA - 13.9)²)) + 0.0013 × (TFA - 13.9) for TFA > 13.9 m², else N = 1. Daylight-factor + occupancy override remain caller's responsibility for later slices (solar_gains will populate G_L; cert-to-inputs mapper will choose between RdSAP default and explicit assessor input). 8 AAA cycles cover: cooking constant, metabolic 60W/N, Appendix J occupancy default for typical and tiny TFA, appliances monthly variation, lighting existing-dwelling fallback, total = sum, month-range validation.	2026-05-17 22:42:20 +00:00
Khalim Conn-Kowlessar	732eef6adb	slice S-A4: heat-transmission HLC breakdown (SAP 10.3 §3). Fourth slice of the SAP10 Calculator Session A (ADR-0009). Ports the per-element conduction HLC logic out of domain.ml.envelope into a typed HeatTransmission breakdown under domain.sap.worksheet. Aggregates Σ U×A across walls, roof, floor, party walls, windows, doors, plus thermal-bridging y × total exposed area, summed across every building part. The orchestrator can now read walls_w_per_k / roof_w_per_k / floor_w_per_k etc. directly off the result for audit + monthly-loop wiring, rather than seeing a single envelope_heat_loss scalar. U-value cascade still routes through domain.ml.rdsap_uvalues (migrates to domain.sap.rdsap.cascade_defaults in Session B per ADR-0009 module-layout plan). domain.ml.envelope stays in place to keep the ML transform's physics-feature pipeline running until Session B. 6 AAA cycles cover: per-element breakdown for a baseline age-G cavity mid-terrace, window net-wall subtraction, insulated-door U-value blending, cavity-party-wall contribution per Table 15, thermal-bridging scaling by age band per Table 21, and multi-part (main + extension) aggregation. 192 tests pass across domain.sap + domain.ml with no regressions.	2026-05-17 22:30:56 +00:00
Khalim Conn-Kowlessar	3fcec7ef22	slice S-A3: infiltration worksheet lines (6a)-(16) (SAP 10.3 §2). Third slice of the SAP10 Calculator Session A (ADR-0009). Ports the SAP 10.2 / RdSAP10 §4.1 air-change-rate worksheet for the no-pressure-test path. Returns an InfiltrationBreakdown carrying each named worksheet line so callers can audit per SAP convention: (8) openings_ach: Table 2.1 rate × count / volume; (10) additional_ach: (storey_count − 1) × 0.1; (11) structural_ach: 0.25 steel/timber-frame, 0.35 masonry; (12) floor_ach: 0.2 unsealed timber / 0.1 sealed / 0; (13) draught_lobby_ach: 0.05 absent, 0.0 present; (15) window_ach: 0.25 − 0.2 × (pct_dp / 100); (16) total_ach: sum of all of the above. Table 2.1 rates: open chimney 80, open flue 20, closed-fire chimney 10, solid-fuel-boiler chimney 20, other-heater chimney 35, blocked chimney 20, intermittent fan 10, passive vent 10, flueless gas fire 40 (all m³/hour per opening). 9 AAA cycles cover the baseline calculation, each Table 2.1 opening contribution, frame-vs-masonry structural baseline, suspended-timber floor sealed/unsealed split, draught-lobby presence, window draught-proofing scale, multi-opening aggregation, and volume_m3 ≤ 0 validation. Pressure-test override (worksheet lines 17-21) and mechanical-ventilation adjustments (Table 4g, n_eff formula §2.6.6) are out of scope for this slice; they are separate later slices per ADR-0009.	2026-05-17 22:00:10 +00:00
Khalim Conn-Kowlessar	fa5bdcc26f	slice S-A2: dimensions module (SAP 10.3 §1) Second slice of the SAP10 Calculator Session A (ADR-0009). Ships a frozen Dimensions dataclass + dimensions_from_cert(epc) pure function under domain/sap/worksheet/. Aggregates geometry across every sap_building_parts entry (main dwelling + each extension): total floor area, volume, storey count, area-weighted average storey height, ground/top floor area, ground-floor heat-loss perimeter, gross wall area, party wall area. Top-level epc.total_floor_area_m2 is the authoritative TFA; per-storey sums drive the wall-area calculations. Volume = TFA × avg_storey_height. 5 AAA cycles cover: single-storey single-part, two-storey scaling, main+extension aggregation, empty-cert fallback to default 2.5 m height, and a non-default-height terrace exercising party-wall scaling. Edge cases (porches, conservatories, integral garages, RIR storey treatment) deferred to later slices per ADR-0009 Session A scope.	2026-05-17 21:49:29 +00:00
Khalim Conn-Kowlessar	2661481625	slice S-A1: Appendix U climate tables (U1/U2/U3) First slice of the SAP10 Calculator Session A (ADR-0009). Ships the three SAP 10.3 Appendix U monthly tables across 22 climate regions (region 0 = UK average; 1-21 named per spec) as a pure-data module under the new domain/sap/ package: - Table U1: mean external temperature (°C) - Table U2: wind speed (m/s) - Table U3: mean global solar irradiance on horizontal plane (W/m²) - Table U3 footer: monthly solar declination (°, region-independent) Lookups validate region (0..21) and month (1..12) and raise ValueError on out-of-range inputs. 11 AAA tests cover happy-path lookups across multiple regions/months plus boundary and error cases.	2026-05-17 21:43:09 +00:00
Khalim Conn-Kowlessar	8dbe873daf	ADR-0009: pivot to deterministic SAP 10.3 calculator (Accepted) Promotes ADR-0009 from Proposed to Accepted after the grill-with-docs session resolved all seven open questions. Bundles the SAP 10.3 and RdSAP 10 specifications under docs/sap-spec/ plus a calculator design sketch (module layout, monthly-loop pseudo-code, status table). CONTEXT.md adds three new domain terms parallel to existing performance language: - Calculated SAP10 Performance (parallel to Effective / Lodged) - SAP10 Calculation (process; implemented by Sap10Calculator) - Measure Application (process; implemented by MeasureApplicator) ML pipeline is NOT retired — it stays as the residual head once the calculator reaches parity in Session B. ADR-0009 §"Grill outcomes" carries the seven binding scope decisions plus three Session-A-scope changes discovered during the grill (RdSAP §19 EER formula, SAP 10.2 Appendix A cross-reference, RdSAP Table 29 cascade defaults).	2026-05-17 21:27:21 +00:00
Khalim Conn-Kowlessar	244f4555ac	slice 20a.1: route ventilation through predicted_space_heating_kwh (v2.7.1) v20a added ventilation_heat_loss_w_per_k as a standalone feature but never connected it to the HLC inside predicted_space_heating_kwh, so the downstream physics aggregates (predicted_ecf, predicted_total_fuel_cost, predicted_log10_ecf — the top-10 model features) never saw the infiltration signal. Importance for ventilation_heat_loss_w_per_k was rank 58/196 (importance 30) vs envelope's rank 21 (86). Adds the ventilation column to the envelope-conduction HLC before applying HDH and efficiency, so chimney + draught-proofing signals flow through the physics aggregates the model actually uses. Default 0 keeps backwards compatibility.	2026-05-17 18:48:57 +00:00
Khalim Conn-Kowlessar	4d838bb03c	slice 20a: ventilation_heat_loss_w_per_k feature (v2.7.0) Adds SAP10.2 §C tracer-bullet infiltration model as a new physics-as-feature column alongside envelope_heat_loss_w_per_k. ACH = structural baseline (0.35 masonry / 0.25 timber-or-system-built) + open chimneys at 40 m³/h each minus a draught-proofing reduction scaled by window_pct_draught_proofed, then volumed and converted to W/K. Targets the d0 catastrophic-low-SAP tail where chimney + leakage signals dominate but envelope conduction alone under-counts heat loss. Scope deferred to follow-ups: MVHR/MEV factors (mechanical_ventilation is 100% null in the corpus), pressure-test override (pressure_test also 100% null - slice 18e mapper fix), open flues / passive vents / flueless gas fires (sap_ventilation sparsely populated).	2026-05-17 18:30:02 +00:00
Khalim Conn-Kowlessar	831ebac2ae	slice 18d: seasonal_efficiency category fallback for null SAP code (v2.6.0) Many real certs carry main_heating_category=4 (heat pump) but null sap_main_heating_code, so seasonal_efficiency() was returning the 0.80 gas-boiler default — a 3x COP under-count that dragged the high-SAP heat-pump tail. Adds main_heating_category + main_fuel_type fallbacks: cat=4 -> 2.30, cat=7 -> 1.00, cat=10 routes by fuel (electric=1.00, gas=0.55, oil=0.65), cat=5 warm air -> 0.76. Explicit SAP codes still win.	2026-05-17 18:13:47 +00:00
Khalim Conn-Kowlessar	d11d4df3df	slice 18c: description-aware u_wall material fallback (v2.5.0) When wall_construction integer is missing or WALL_UNKNOWN, u_wall now parses the top-level walls[i].description for material keywords (sandstone/limestone/granite/whinstone/cob/system built/timber frame/ solid brick/cavity) before falling through to the cavity-by-age default. Explicit construction codes still win. Threaded through envelope_heat_loss_w_per_k via a joined wall description string off the top-level walls list.	2026-05-17 17:55:09 +00:00
Khalim Conn-Kowlessar	60eea0f52b	slice 18b: description-aware u_roof for catastrophic roofs (v2.4.0) Table 18 age-band roof defaults assume joist insulation >= 100mm, which mis-rates heritage roofs the surveyor explicitly described as uninsulated. u_roof now reads roofs[i].description and routes "no insulation" / "uninsulated" -> 2.30 W/m^2K and "limited insulation" -> 1.50 W/m^2K, threaded through envelope_heat_loss_w_per_k via a single joined description string off the top-level roofs list. Explicit insulation_thickness_mm still wins over description.	2026-05-17 17:32:57 +00:00
Khalim Conn-Kowlessar	696d43112e	fix: translate gov EPC API fuel codes to SAP10.2 Table 32 (v2.3.0) predicted_total_fuel_cost_gbp was silently mispricing every non-gas property because primary_main_fuel_type / water_heating_fuel store the gov EPC API enum (26=mains gas, 27=LPG, 28=oil, 29=electricity) and our _FUEL_UNIT_PRICE dict is keyed by Table 32 codes (1=gas, 4=oil, 30=elec). Codes 26-29 hit the dict's default 3.48 p/kWh -- silently treating electric immersion as gas. Concrete impact on OX1 5LR Sep 2025 cert (worst-predicted SAP=41, model 84): water_heating_fuel=29 (electric immersion). Real DHW cost 2941 kWh * 13.19p = £388/yr; we computed 2941 * 3.48 = £102 (4x under). Net predicted_total_fuel_cost £292 vs implied real £2513 -- predicted_ecf 0.49 (~SAP 93) vs real ECF 4.24 (SAP 41). Effect: every off-gas property's predicted_ecf was systematically too low, dragging the model's catastrophic-low-SAP predictions toward mid-band. Expected to substantially reduce decile-0 bias on retrain. New _API_TO_TABLE32 map covers codes 0-29. 4 new AAA tests; VERSION 2.2.0 -> 2.3.0 (MINOR; behavioural fix to existing column values).	2026-05-17 17:02:21 +00:00
Khalim Conn-Kowlessar	4df1ee78b7	slice 17b: SAP Appendix J port for predicted_hot_water_kwh (v2.2.0) The 17a-baseline residuals showed cylinder_insulation_thickness_mm, cylinder_size and cylinder_insulation_type at ranks 3/6/9 for hot_water_kwh because the crude 16d formula didn't use them -- the model had to learn storage physics from raw features. Now predicted_hot_water_kwh sums: useful_demand (existing, unchanged) + distribution_loss = useful * 0.15 + storage_loss = volume * insulation_factor * 365 * 0.6 (volume from cylinder_size, factor from cylinder_insulation_thickness_mm or age-default) + primary_circuit_loss = 245 (age A-J) / 60 (age K-M) - wwhrs_credit = useful * 0.12 if number_baths_wwhrs > 0 - solar_hw_credit = 250 if solar_water_heating all / efficiency_water = delivered kWh Same inputs we already extract; just plumbed through. Expected: predicted_hot_water_kwh feature usage jumps from rank 10 to top tier, hot_water_kwh MAPE drops from 7.17%, and predicted_ecf gets tighter for gas-heat + electric-DHW mid-band homes -> SAP MAPE marginally better. 5 new AAA tests; VERSION 2.1.0 -> 2.2.0 (MINOR; column semantics enriched).	2026-05-17 15:54:42 +00:00
Khalim Conn-Kowlessar	06ce3205b1	slice 17a: PV-export credit in predicted_total_fuel_cost (v2.1.0) Closes the high-SAP under-prediction gap diagnosed in 16h. 40% of SAP-85+ properties have PV; predicted_ecf was 1.74 mean at that band -> SAP ~88 via the formula, vs label SAP 90+. Inverse: PV homes had HIGHER predicted_ecf than non-PV at the same band because cost reconstruction had zero export credit. New helper: predicted_pv_generation_kwh(kWp, region) -> kWh/yr from a SAP10.2 Table 6e regional yield factor (UK avg 850 kWh/kWp/yr; Highland 650; Thames 920). predicted_total_fuel_cost_gbp now subtracts pv_kwh * standard electricity price (Table 32 code 30, both self-consumption and export at 13.19 p/kWh). New feature column predicted_pv_generation_kwh exposed alongside the adjusted cost so the model sees both signals. VERSION 2.0.0 -> 2.1.0 (MINOR: column added; existing column semantics shifted but pre-deploy so no consumer break).	2026-05-17 15:28:09 +00:00
Khalim Conn-Kowlessar	6072d8795a	slice 16i: MAE + RMSE in metrics; sample_weight_fn + low_sap_tail_weight train_baseline now returns mae + rmse alongside mape/smape/r2. MAE is the user-facing metric ("predicted SAP within N points"); RMSE the quadratic counterpart. Both come straight from sklearn. New sample_weight_fn parameter: callable(y_train) -> per-row weights. Threads into LGBMRegressor.fit's sample_weight argument. Default None preserves existing behaviour. Default tail strategy exposed as low_sap_tail_weight(y, threshold=58, weight=3): 3x weight where SAP < 58. Threshold picked from slice 16h's per-decile residuals — decile 0 (SAP 1-58) carries 17% MAPE vs <5% body. Three TDD tracers, all AAA.	2026-05-17 14:48:00 +00:00
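The tail-weight strategy from the 16i message is small enough to sketch directly. The signature follows the message; the body is an assumption about how it might be written with NumPy.

```python
import numpy as np


def low_sap_tail_weight(y, threshold=58, weight=3.0):
    """Per-row sample weights: `weight` where SAP < threshold, 1.0 elsewhere."""
    y = np.asarray(y, dtype=float)
    return np.where(y < threshold, weight, 1.0)


# The result can be passed as `sample_weight` to LGBMRegressor.fit.
weights = low_sap_tail_weight([35, 58, 72])
```

The comparison is strict, so a row at exactly SAP 58 keeps weight 1.0, matching the "SAP < 58" wording in the message.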
Khalim Conn-Kowlessar	ece1279475	revert slice 16g: drop mape objective per 16h ablation 250k retrain showed objective='mape' loses ~0.6 percentage points of global sap_score MAPE (3.92% with regression vs 4.50% with mape) and ~0.7 pts on peui_ucl. The mape objective over-weights the low-SAP tail (weight ~1/y) and drags the body MAPE up by more than it gains in the tail. Body MAPE on v16 features is already strong (2.38% on deciles 1-8); the remaining tail bias at decile 0 (SAP<58, +3.1 bias) needs a different fix -- sample weights or stratified loss -- queued as slice 16i.	2026-05-17 14:34:04 +00:00
Khalim Conn-Kowlessar	05ef54bb02	restore transaction_type; keep tenure dropped (v2.0.0 stands) User reverted the transaction_type drop after noting that it doesn't help detect full-SAP assessments (that's `assessment_type` on the bulk-register record, filtered out at build_features.py:37). tenure removal stays; v2.0.0 still MAJOR (a column was removed).	2026-05-17 12:41:14 +00:00
Khalim Conn-Kowlessar	6aa3ddfbf4	drop tenure + transaction_type from features (v2.0.0) Neither field physically affects SAP rating; they're dataset-side metadata (owner-occupied vs rented, sale vs marketed) and any correlation with sap_score is confounded with age/condition that the model already sees through built_form / property_type / construction_age_band. Dropping reduces feature count and removes a source of spurious split-gain. MAJOR per ADR-0007 versioning policy (column removal): 1.0.0 -> 2.0.0.	2026-05-17 12:37:52 +00:00
Khalim Conn-Kowlessar	e8b6f19a3a	fix(16d): predicted_lighting_kwh handles None bulb counts EPC bulb-count fields are Optional[int]; 1k-cert sanity-check from slice 16h hit None + None TypeError. Coerce to 0 before sum.	2026-05-17 12:25:59 +00:00
Khalim Conn-Kowlessar	700ff4640c	slice 16g: LightGBM objective=mape for sap_score + peui_ucl Per ADR-0008: the v15 baseline reports MAPE but optimises MSE, which under-weights tail rows. Switching to objective='mape' applies gradient proportional to 1/|y| and lets the model focus where MAPE penalises. Targets co2_emissions, space_heating_kwh, hot_water_kwh, and peui_raw retain the default 'regression' objective (some rows have ~zero CO2 from heavy PV; MAPE objective destabilises near zero). Sample weights deferred to slice 16i if slice 16h's per-decile residuals still show tail bias after the objective switch.	2026-05-17 12:06:13 +00:00
Khalim Conn-Kowlessar	5c20e323da	slice 16f: rename secondary_dwelling_* -> extension_1_* (v1.0.0 MAJOR bump) 12 columns renamed; extension_2_* not added (88% null on 250k corpus; envelope_heat_loss_w_per_k already sums extension_2+ via part-iterator). ADR-0008. VERSION 0.4.0 -> 1.0.0 (MAJOR per ADR-0007 versioning policy). Coordinated cutover with AutoGluon repo + scoring lambda required at deploy time. features_v16.txt is regenerated from transform.schema() at write-parquet time (data/ml_training is gitignored; not committed).	2026-05-17 12:05:01 +00:00
Khalim Conn-Kowlessar	cda469dd7d	slice 16e: predicted_total_fuel_cost / predicted_ecf / predicted_log10_ecf ECF reconstruction per SAP10 §20.1 (Mid physics, ADR-0008): total_cost_gbp = (space_kwh * p_space + dhw_kwh * p_dhw + light_kwh * p_elec) / 100; ECF = 0.42 * total_cost / (TFA + 45); log10_ecf = log10(ECF) [0 for non-positive]. p_* are Table 32 unit prices via fuel_unit_price_p_per_kwh. Standing charges deliberately omitted (constant fuel-mix offset; ADR-0008). predicted_sap_score is NOT emitted as a feature (ADR-0008 Mid not Deep): the model is left to learn the piecewise log/linear transform from log10_ecf -> SAP itself, keeping the data layer SAP-version-agnostic. VERSION 0.3.0 -> 0.4.0 (MINOR).	2026-05-17 12:00:06 +00:00
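The ECF reconstruction in the 16e message, read with explicit multiplication between each kWh term and its unit price, can be sketched as follows. The function and parameter names are illustrative; prices are in p/kWh, hence the division by 100 to get GBP.

```python
import math


def predicted_ecf_sketch(space_kwh, dhw_kwh, light_kwh, p_space, p_dhw, p_elec, tfa_m2):
    """Return (total_cost_gbp, ecf, log10_ecf); standing charges are omitted."""
    total_cost_gbp = (space_kwh * p_space + dhw_kwh * p_dhw + light_kwh * p_elec) / 100.0
    ecf = 0.42 * total_cost_gbp / (tfa_m2 + 45.0)
    # The message specifies 0 for a non-positive ECF, where log10 is undefined.
    log10_ecf = math.log10(ecf) if ecf > 0 else 0.0
    return total_cost_gbp, ecf, log10_ecf


# Gas at 3.48 p/kWh for heating and hot water, electricity at 13.19 p/kWh for lighting.
cost, ecf, log10_ecf = predicted_ecf_sketch(8000, 2000, 500, 3.48, 3.48, 13.19, 85)
```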
Khalim Conn-Kowlessar	eee5421112	slice 16d: predicted_space/hot_water/lighting_kwh + seasonal-efficiency features New module domain.ml.demand emits crude annual demand approximations (ADR-0008 "crude annual"): predicted_space_heating_kwh = HLC * HDH_region * 1e-3 / efficiency_main predicted_hot_water_kwh = SAP10.2 J simplified (Vd, dT, +10% losses) predicted_lighting_kwh = 9.3 * TFA reduced by LED/CFL share HDH lookup covers SAP10.2's 22 regions; fallback UK avg = 53,000 K*h/yr. Plus two seasonal-efficiency features straight off the Table 4a/4b lookup from slice 16b (seasonal_efficiency_main_heating / seasonal_efficiency_water_heating). Wired into to_row; VERSION 0.2.0 -> 0.3.0 (MINOR).	2026-05-17 11:57:29 +00:00
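A sketch of the crude space-heating formula from the 16d message. Only the UK-average fallback of 53,000 K*h/yr comes from the message; the function name and the way a regional value is supplied are assumptions.

```python
UK_AVERAGE_HDH_K_H_PER_YEAR = 53_000.0  # fallback quoted in the 16d message


def predicted_space_heating_kwh_sketch(hlc_w_per_k, efficiency_main, hdh_region=None):
    """Annual delivered kWh: HLC (W/K) * HDH (K*h/yr) * 1e-3 / efficiency."""
    hdh = UK_AVERAGE_HDH_K_H_PER_YEAR if hdh_region is None else hdh_region
    return hlc_w_per_k * hdh * 1e-3 / efficiency_main


# ~208 W/K is the reference mid-terrace figure from the slice 16c message.
example_space_kwh = predicted_space_heating_kwh_sketch(208.0, 0.80)
```

The `1e-3` factor converts W*h to kWh, so the example comes to roughly 13,800 kWh/yr for a 0.80-efficiency boiler.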
Khalim Conn-Kowlessar	fca8815991	slice 16c: envelope_heat_loss_w_per_k feature New module domain.ml.envelope sums Sigma(UA) + yA_exposed across every sap_building_part on a cert. U-values come from rdsap_uvalues' cascade defaults, so the feature is never null. Per-part inputs: wall / roof / floor / party-wall / windows / doors. Windows + doors are apportioned to the main part (first in the list) per RdSAP10 convention. Wired into EpcMlTransform.to_row; transform VERSION 0.1.0 -> 0.2.0 (MINOR bump for an additive column per the ADR-0007 policy). 7 envelope unit tests + 2 transform-level tests, all AAA. Reference geometry: 100 m^2 age-G mid-terrace -> ~208 W/K; doubles for two storeys; drops with better insulation; sums across extensions.	2026-05-17 11:53:43 +00:00
Khalim Conn-Kowlessar	67a4f92d53	slice 16b: sap_efficiencies.py with Table 4a/4b/32 lookups Encodes SAP10.2 Table 4a (heating-system code -> space-eff %), Table 4b (gas/oil boiler winter eff %), and Table 32 (fuel-code -> p/kWh). Helpers: - seasonal_efficiency(code) -> decimal; unknown -> 0.80 (gas-boiler typical) - water_heating_efficiency(water_code, main_code) -> decimal; codes 901/914 inherit the main code's efficiency - fuel_unit_price_p_per_kwh(fuel_code) -> p/kWh; unknown -> 3.48 (mains gas) All returns are total. Provides the seasonal-efficiency input to slice 16d and the price multipliers for slice 16e's cost reconstruction.	2026-05-17 11:45:40 +00:00
Khalim Conn-Kowlessar	8bd8f8a622	slice 16a: rdsap_uvalues.py with cascade-defaulting U-value helpers Encodes RdSAP10 Tables 6-9 (walls), 15 (party walls), 16+18 (roofs), 19+BS EN ISO 13370 (floors), 20 (upper floors), 21 (thermal bridging), 24 (windows), 26 (doors). Helpers (u_wall / u_roof / u_floor / u_window / u_door / u_party_wall / thermal_bridging_y) cascade through cert -> age-band default -> country default -> mid-range fallback so the envelope-heat-loss feature is never null. Mirrors the RdSAP "assume as-built if no evidence" rule. Country.from_code collapses EAW/GB/UK/unknown to ENG; SCT/NIR/WAL get explicit K-M overrides where Tables 7-9 diverge from Table 6 (England). 28 tests, all AAA, cover the reference values and the cascade fallbacks.	2026-05-17 11:36:39 +00:00
Khalim Conn-Kowlessar	f61d74a327	docs: ADR-0008 physics-as-feature + v16.0.0 schema bump Captures the slice-16 plan decisions before code lands: - Mid-physics: predicted_ecf + predicted_log10_ecf, NOT predicted_sap_score - Cost scope: heating + DHW + lighting (no PV/pumps/secondary) - Crude annual heat-demand calc (HLC * HDH / efficiency) - Cascade-defaulting U-value imputation - envelope_heat_loss_w_per_k sums all parts; extension_1 only as discrete features (88% null drops extension_2) - v16.0.0 MAJOR bump (rename secondary_dwelling_* -> extension_1_*); coordinated cutover with AutoGluon repo + scoring lambda - LightGBM objective="mape" for sap_score+peui_ucl in 16g; sample weights deferred	2026-05-17 11:20:40 +00:00
Khalim Conn-Kowlessar	fd8d71eb05	slice 15e: per-decile residuals reporting in train_baseline Adds `_per_decile_residuals` and writes `residuals_<target>.json` next to metrics.json. Buckets test-set rows by deciles of the true target value; each bucket carries count + MAPE + MAE + mean residual + true_min/max. Lets us tell whether errors concentrate in the tails of the true distribution (e.g. SAP<40 / SAP>85) vs the mid-band — which the global MAPE alone hides. Baseline for slice 16's MAPE-improvement ablations.	2026-05-17 11:18:40 +00:00
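The per-decile report from the 15e message can be sketched as below. This is an assumption-laden illustration rather than the actual `_per_decile_residuals`: it uses equal-count buckets by rank, defines the residual as predicted minus true, and assumes no true value is zero (MAPE is undefined there).

```python
import numpy as np


def per_decile_residuals_sketch(y_true, y_pred, n_buckets=10):
    """Bucket rows by rank of the true value and summarise errors per bucket."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    order = np.argsort(y_true, kind="stable")
    buckets = []
    for decile, idx in enumerate(np.array_split(order, n_buckets)):
        if idx.size == 0:
            continue  # fewer rows than buckets
        true_vals, pred_vals = y_true[idx], y_pred[idx]
        resid = pred_vals - true_vals
        buckets.append({
            "decile": decile,
            "count": int(idx.size),
            "mape": float(np.mean(np.abs(resid) / np.abs(true_vals))),
            "mae": float(np.mean(np.abs(resid))),
            "mean_residual": float(np.mean(resid)),
            "true_min": float(true_vals.min()),
            "true_max": float(true_vals.max()),
        })
    return buckets


report = per_decile_residuals_sketch(np.arange(1, 21), np.arange(1, 21) + 1)
```

The returned list is plain dicts of floats and ints, so it can be written out with `json.dump` alongside the global metrics.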
Khalim Conn-Kowlessar	195336b7e1	slice 15d: +50 features (gap fill + secondary building part); drop 2 derived Removes: - environmental_impact_current (SAP-derived rating, leaks into co2 target) - energy_rating_average (average of sap_score + potential, direct leak) Adds: Doors draughtproofed_door_count, insulated_door_u_value Hot water cylinder_insulation_type, cylinder_thermostat, secondary_heating_type Ventilation mechanical_vent_duct_placement, _duct_insulation, _duct_insulation_level, _measured_installation Lighting low_energy_fixed_lighting_bulbs_count, fixed_lighting_outlets_count, low_energy_fixed_lighting_outlets_count Windows window_avg_glazing_gap_mm, window_avg_frame_factor, window_pct_permanent_shutters_insulated Main dwelling room_in_roof_floor_area_m2, alternative_wall_count, alternative_wall_area_m2, flat_roof_insulation_thickness_mm, wall_thickness_measured Element counts wall_count, roof_count, floor_count, main_heating_count_elements, main_heating_controls_present Wind wind_turbine_hub_height_m, wind_turbine_rotor_diameter_m Flat flat_unheated_corridor_length_m Addendum addendum_stone_walls, addendum_system_build, addendum_numbers_count LZC lzc_energy_sources_count Secondary part secondary_dwelling_present + 11 fabric features (wall/roof/floor construction + insulation + thickness + area + heat-loss perimeter) + other_building_parts_count Wires through schema -> domain -> mapper: adds Addendum dataclass, lzc_energy_sources, mechanical_vent_duct_insulation_level. Also fixes _measurement_value to accept raw dicts (from_dict left some Measurement fields as dict when they weren't typed as a dataclass). 
Results at N=25,000 2026 RdSAP certs: sap_score MAPE=0.043 sMAPE=0.036 R^2=0.891 co2_emissions sMAPE=0.106 R^2=0.929 peui_raw MAPE=0.087 sMAPE=0.084 R^2=0.860 peui_ucl MAPE=0.079 sMAPE=0.076 R^2=0.866 space_heating_kwh MAPE=0.112 sMAPE=0.108 R^2=0.947 hot_water_kwh MAPE=0.071 sMAPE=0.069 R^2=0.854 (+0.082 R^2 vs 15b) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-17 10:13:03 +00:00
Khalim Conn-Kowlessar	a1f89b6033	slice 15c: stream build_features so 500k+ cert runs fit memory Previously kept the full list of EpcPropertyData in memory before calling EpcMlTransform.to_rows. For the 25k slice that's ~30 MB; for the 580k full-2026 corpus it OOM-killed the process silently. Now: parse cert -> to_row -> append dict -> drop EpcPropertyData reference, so memory is O(row-dict * n) instead of O(EpcPropertyData * n). Same end-of-frame post-processing (categorical casts, column-order pin). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-17 00:36:53 +00:00
Khalim Conn-Kowlessar	9f6f7608b9	slice 15b: +18 features — heating type code, hot water, windows, flat, supply Heating: primary_sap_main_heating_code (the SAP10 heating-system enum was the single biggest missing input), primary_emitter_temperature, primary_main_heating_fraction. Hot water: immersion_heating_type, shower_outlet_count. Windows: window_pct_living, window_pct_external, window_pct_permanent_shutters (area-weighted shares parallel to existing window aggregates). Dwelling: conservatory_type, has_heated_separate_conservatory. Flat-only block (sap_flat_details): flat_level, flat_top_storey, flat_storey_count, flat_location, flat_heat_loss_corridor (int sentinels like '20+' coerce to None for the categorical features). Energy supply: meter_type, pv_connection, wind_turbines_terrain_type. Also plumbs `air_tightness` EnergyElement, `sap_flat_details` and `has_heated_separate_conservatory` through the 21.0.1 mapper path (they were silently None before). Results at N=25,000 2026 RdSAP certs: sap_score MAPE=0.044 sMAPE=0.038 R^2=0.884 (+0.045 R^2 vs 15a) co2_emissions sMAPE=0.108 R^2=0.925 peui_raw MAPE=0.092 sMAPE=0.088 R^2=0.849 peui_ucl MAPE=0.081 sMAPE=0.078 R^2=0.860 space_heating_kwh MAPE=0.111 sMAPE=0.108 R^2=0.945 hot_water_kwh MAPE=0.081 sMAPE=0.079 R^2=0.772 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-17 00:08:11 +00:00

4832 commits