Khalim Conn-Kowlessar
195336b7e1
slice 15d: +50 features (gap fill + secondary building part); drop 2 derived
...
Removes:
- environmental_impact_current (SAP-derived rating, leaks into co2 target)
- energy_rating_average (average of sap_score + potential, direct leak)
Adds:
Doors: draughtproofed_door_count, insulated_door_u_value
Hot water: cylinder_insulation_type, cylinder_thermostat,
  secondary_heating_type
Ventilation: mechanical_vent_duct_placement, _duct_insulation,
  _duct_insulation_level, _measured_installation
Lighting: low_energy_fixed_lighting_bulbs_count,
  fixed_lighting_outlets_count,
  low_energy_fixed_lighting_outlets_count
Windows: window_avg_glazing_gap_mm, window_avg_frame_factor,
  window_pct_permanent_shutters_insulated
Main dwelling: room_in_roof_floor_area_m2, alternative_wall_count,
  alternative_wall_area_m2, flat_roof_insulation_thickness_mm,
  wall_thickness_measured
Element counts: wall_count, roof_count, floor_count,
  main_heating_count_elements, main_heating_controls_present
Wind: wind_turbine_hub_height_m, wind_turbine_rotor_diameter_m
Flat: flat_unheated_corridor_length_m
Addendum: addendum_stone_walls, addendum_system_build,
  addendum_numbers_count
LZC: lzc_energy_sources_count
Secondary part: secondary_dwelling_present + 11 fabric features
  (wall/roof/floor construction + insulation + thickness
  + area + heat-loss perimeter) + other_building_parts_count
Wires through schema -> domain -> mapper: adds Addendum dataclass,
lzc_energy_sources, mechanical_vent_duct_insulation_level. Also fixes
_measurement_value to accept raw dicts (from_dict left some Measurement
fields as dict when they weren't typed as a dataclass).
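The raw-dict handling described above can be sketched roughly as follows. This is an illustration, not the mapper's actual `_measurement_value`: the `Measurement` stand-in, the function name `measurement_value`, and the assumption that the dict form carries its number under a `value` key are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Measurement:
    # Hypothetical stand-in for the schema's Measurement dataclass.
    value: float
    quantity: Optional[str] = None


def measurement_value(
    raw: Union[Measurement, dict, int, float, None],
) -> Optional[float]:
    """Return a plain float from any shape a measurement field may arrive in."""
    if raw is None:
        return None
    if isinstance(raw, Measurement):
        return float(raw.value)
    if isinstance(raw, dict):
        # Untyped fields can survive loading as raw dicts.
        # The "value" key is an assumption made for this sketch.
        inner = raw.get("value")
        return float(inner) if inner is not None else None
    if isinstance(raw, bool):
        # bool is a subclass of int; do not treat it as a measurement.
        return None
    if isinstance(raw, (int, float)):
        return float(raw)
    return None
```

Keeping the coercion in one helper means each mapper call site does not need to know which shape the loader produced.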
Results at N=25,000 2026 RdSAP certs:
sap_score MAPE=0.043 sMAPE=0.036 R^2=0.891
co2_emissions sMAPE=0.106 R^2=0.929
peui_raw MAPE=0.087 sMAPE=0.084 R^2=0.860
peui_ucl MAPE=0.079 sMAPE=0.076 R^2=0.866
space_heating_kwh MAPE=0.112 sMAPE=0.108 R^2=0.947
hot_water_kwh MAPE=0.071 sMAPE=0.069 R^2=0.854 (+0.082 R^2 vs 15b)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 10:13:03 +00:00
Khalim Conn-Kowlessar
9f6f7608b9
slice 15b: +18 features — heating type code, hot water, windows, flat, supply
...
Heating: primary_sap_main_heating_code (the SAP10 heating-system enum was the
single biggest missing input), primary_emitter_temperature,
primary_main_heating_fraction.
Hot water: immersion_heating_type, shower_outlet_count.
Windows: window_pct_living, window_pct_external, window_pct_permanent_shutters
(area-weighted shares parallel to existing window aggregates).
Dwelling: conservatory_type, has_heated_separate_conservatory.
Flat-only block (sap_flat_details): flat_level, flat_top_storey,
flat_storey_count, flat_location, flat_heat_loss_corridor (int sentinels
like '20+' coerce to None for the categorical features).
Energy supply: meter_type, pv_connection, wind_turbines_terrain_type.
Also plumbs `air_tightness` EnergyElement, `sap_flat_details` and
`has_heated_separate_conservatory` through the 21.0.1 mapper path (they were
silently None before).
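For illustration, an area-weighted window share of the kind listed under Windows above might be computed as below. The `Window` stand-in, its field names, and the choice to return a 0..1 fraction rather than a 0..100 percentage are assumptions; the real transform may differ.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Window:
    # Hypothetical stand-in; the real window dataclass has more fields.
    area_m2: float
    in_living_area: bool = False


def area_weighted_share(
    windows: Iterable[Window],
    predicate: Callable[[Window], bool],
) -> Optional[float]:
    """Share of total window area for which `predicate` holds."""
    items = list(windows)
    total_area = sum(w.area_m2 for w in items)
    if total_area <= 0:
        # No windows (or no usable area): report missing, not zero.
        return None
    matching_area = sum(w.area_m2 for w in items if predicate(w))
    return matching_area / total_area


windows = [Window(2.0, in_living_area=True), Window(6.0)]
living_share = area_weighted_share(windows, lambda w: w.in_living_area)
```

Returning None when there is no window area keeps "no windows" distinct from "windows, none of them in the living area".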
Results at N=25,000 2026 RdSAP certs:
sap_score MAPE=0.044 sMAPE=0.038 R^2=0.884 (+0.045 R^2 vs 15a)
co2_emissions sMAPE=0.108 R^2=0.925
peui_raw MAPE=0.092 sMAPE=0.088 R^2=0.849
peui_ucl MAPE=0.081 sMAPE=0.078 R^2=0.860
space_heating_kwh MAPE=0.111 sMAPE=0.108 R^2=0.945
hot_water_kwh MAPE=0.081 sMAPE=0.079 R^2=0.772
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 00:08:11 +00:00
Khalim Conn-Kowlessar
0ffda529ec
slice 15a: add wall/floor/roof + demand scalar features for retrofit simulation
...
15 new features wired through schema -> domain -> mapper -> transform:
Main Dwelling fabric (11):
- wall_insulation_type, wall_insulation_thickness_mm, wall_dry_lined,
wall_thickness_mm, party_wall_construction
- roof_insulation_location, roof_insulation_thickness_mm
- floor_construction, floor_insulation, floor_insulation_thickness_mm,
floor_heat_loss
Dwelling-level scalars (4):
- multiple_glazed_proportion, number_baths, number_baths_wwhrs,
extract_fans_count
Thickness strings like '50mm'/'NI'/'ND' parsed via _parse_thickness_mm; NI
(no insulation) lands as 0mm so the model sees the physical zero rather than
a missing value. Categorical sentinels ('NA'/'NI'/'ND') become None.
Also fixed long-standing typo `multiple_glazed_propertion` -> `_proportion`
in domain dataclass + its lone DB-model usage.
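A minimal sketch of the sentinel handling described above, not the actual `_parse_thickness_mm`. Mapping 'NI' to 0 and categorical sentinels to None follows the description; treating 'ND' and 'NA' thickness values as None, and rejecting anything that is not a plain number with an optional 'mm' suffix, are assumptions.

```python
import re
from typing import Optional

# Plain number with an optional "mm" suffix, e.g. "50mm" or "12.5".
_THICKNESS_RE = re.compile(r"^(\d+(?:\.\d+)?)\s*(?:mm)?$", re.IGNORECASE)

_CATEGORICAL_SENTINELS = {"NA", "NI", "ND"}


def parse_thickness_mm(raw: Optional[str]) -> Optional[float]:
    """Parse a thickness string into millimetres."""
    if raw is None:
        return None
    text = str(raw).strip().upper()
    if text == "NI":
        # No insulation is a physical zero, not a missing value.
        return 0.0
    if text in {"ND", "NA", ""}:
        # Assumption: not-determined / not-applicable mean "unknown".
        return None
    match = _THICKNESS_RE.match(text)
    return float(match.group(1)) if match else None


def clean_categorical(raw: Optional[str]) -> Optional[str]:
    """Map categorical sentinels to None and keep real categories as-is."""
    if raw is None:
        return None
    text = str(raw).strip()
    if text == "" or text.upper() in _CATEGORICAL_SENTINELS:
        return None
    return text
```

The asymmetry is deliberate: for a thickness, 'NI' carries numeric information (zero), while for a categorical feature it carries none.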
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 22:08:27 +00:00
Khalim Conn-Kowlessar
c496f345f8
slice 14l: bigger-run fixes — UCL guard, PV Measurement coercion, sMAPE
...
Three changes surfaced by the 25k 2026 run:
- transform._peui_ucl returns None for non-positive raw PEUI (net-exporters).
apply_ucl_correction would otherwise raise ValueError on negative input.
- PhotovoltaicArray scalars (peak_power, pitch, orientation, overshading)
now accept Measurement | int | float in the schema; mapper coerces via
_measurement_value.
- train_baseline reports sMAPE alongside MAPE — handles zero-actual rows
(e.g. co2_emissions for net-zero certs) where MAPE explodes.
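To illustrate why sMAPE tolerates zero-actual rows, here is a sketch using one common definition, mean(2|y - p| / (|y| + |p|)). Whether train_baseline uses this exact variant is an assumption; other scalings exist.

```python
from typing import Sequence


def smape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Symmetric MAPE: mean of 2|a - p| / (|a| + |p|) over all rows."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("need at least one row")
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = abs(a) + abs(p)
        if denom == 0:
            # Both zero: a perfect prediction, contributes nothing.
            continue
        total += 2.0 * abs(a - p) / denom
    return total / len(actual)
```

A row with actual 0 and prediction 5 would divide by zero under MAPE, whereas here it contributes a bounded term of 2.0, the maximum any single row can contribute.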
Results at N=25,000 RdSAP 2026 certs (~32s end-to-end):
sap_score MAPE=0.064 sMAPE=0.054 R^2=0.762
co2_emissions sMAPE=0.140 R^2=0.890
peui_raw MAPE=0.126 sMAPE=0.120 R^2=0.714
peui_ucl MAPE=0.114 sMAPE=0.108 R^2=0.736
space_heating_kwh MAPE=0.167 sMAPE=0.157 R^2=0.915
hot_water_kwh MAPE=0.089 sMAPE=0.086 R^2=0.737
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 21:15:37 +00:00
Khalim Conn-Kowlessar
6697a6c76e
slice 14j: Optional sweep across schema 21.0.1 + mapper guards
...
Across 500 real RdSAP-21.0.1 certs from 2026, mapper goes 0% -> 100% success.
Schema-loading + ml-transform + ml_training_data: 146 tests pass.
Mainly affected fields:
- SapHeating: instantaneous_wwhrs, shower_outlets (now Union with List shape)
- SapWindow: glazing_gap, frame_factor, pvc_frame, window_transmission_details
- SapEnergySource: pv_battery_count, wind_turbine_details, pv_batteries (List form)
- SapBuildingPart: all 13 sub-fields now Optional
- SapFloorDimension: Measurement | int | float fallback
- RdSapSchema21_0_1: 16 top-level fields (mechanical_vent_*, lighting counts, ...)
Mapper helpers added: _measurement_value, _first_pv_battery, _first_shower_outlet.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 20:35:28 +00:00
Khalim Conn-Kowlessar
ccb654c230
slice 14i: pin real RdSAP cert as fixture + RED regression test
...
Currently fails on SapWindow.glazing_gap (first of ~30 fields the dataclass
incorrectly treats as required). Will go GREEN once 14j sweeps Optional.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 20:23:29 +00:00
Khalim Conn-Kowlessar
b050348927
slice 10.5: PhotovoltaicArray on SAP10 schema + EpcPropertyData
...
SAP10 EPCs with measured PV carry photovoltaic_supply as a nested
list of arrays (peak_power, pitch, orientation, overshading) rather
than the legacy unmeasured wrapper {none_or_no_details:
{percent_roof_area: N}}. The schema-21 dataclasses now accept both
shapes via Union[PhotovoltaicSupply, List[List[PhotovoltaicArray]]],
and from_dict._coerce now dispatches list values onto list type
variants of multi-type Unions.
EpcPropertyData.SapEnergySource gains
photovoltaic_arrays: Optional[List[PhotovoltaicArray]] — populated
when the measured shape is present, otherwise None. The legacy
photovoltaic_supply field is preserved for the fallback case.
Both schema-21.0.0 and 21.0.1 mappers dispatch via the new
_map_schema_21_pv helper.
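The two-shape dispatch described above could look roughly like this. It is a sketch, not the real `_map_schema_21_pv`: the dataclasses are simplified stand-ins (the legacy wrapper is flattened to a single field, and the field types are guesses), and `map_pv` is a hypothetical name.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass
class PhotovoltaicArray:
    # Field names come from the commit message; the types are assumptions.
    peak_power: float
    pitch: int
    orientation: int
    overshading: int


@dataclass
class PhotovoltaicSupply:
    # Simplified stand-in for the legacy unmeasured wrapper.
    percent_roof_area: Optional[float] = None


PvField = Union[PhotovoltaicSupply, List[List[PhotovoltaicArray]], None]


def map_pv(
    raw: PvField,
) -> Tuple[Optional[PhotovoltaicSupply], Optional[List[PhotovoltaicArray]]]:
    """Return (legacy_supply, measured_arrays); at most one is populated."""
    if raw is None:
        return None, None
    if isinstance(raw, list):
        # Measured shape: flatten the nested list of arrays.
        arrays = [array for group in raw for array in group]
        return None, arrays or None
    # Legacy shape: pass the wrapper through unchanged.
    return raw, None
```

Returning both slots lets the domain object keep `photovoltaic_supply` for the fallback case while `photovoltaic_arrays` stays None unless measured data is present.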
Unblocks Slice 11 (PV feature aggregation in EpcMlTransform).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 16:00:25 +00:00
Jun-te Kim
35d191c70e
merged from main and resolved pytest.ini conflict
2026-05-12 12:54:28 +00:00
Jun-te Kim
197e9a0e00
added histroci_epc.csv
2026-05-11 15:21:16 +00:00
Jun-te Kim
6504785e7c
merged from main
2026-05-11 12:30:29 +00:00
Jun-te Kim
c9c43f178c
demo generated for use in address2uprn
2026-05-08 14:48:15 +00:00
Jun-te Kim
a39c3a0772
added historic epc data class with shape
2026-05-08 12:03:35 +00:00
Jun-te Kim
74b7b87de6
load historic epc from csv 🟥
2026-05-07 16:22:41 +00:00
Jun-te Kim
02fb3afbe4
defined historical epc data shape from csv
2026-05-07 16:04:01 +00:00
Khalim Conn-Kowlessar
d338be867b
added missing files
2026-04-25 22:41:57 +00:00
Khalim Conn-Kowlessar
3ed25030d4
added new api call for new epc api
2026-04-25 22:17:38 +00:00
Daniel Roth
968f025bc3
inspection date is date not str
2026-04-20 08:17:01 +00:00
Daniel Roth
cf088c36fe
Map to domain from 21.0.1 schema 🟩
2026-04-14 13:20:09 +00:00
Daniel Roth
d95be6ce65
first draft dataclasses with loading tests
2026-04-10 11:33:17 +00:00