Model/scripts
Khalim Conn-Kowlessar afabfa0147 feat(modelling): sample a year from the EPC bulk export, offline-ready
fetch_epc_bulk_sample streams certificates-<year>.json out of the bulk ZIP via
range requests, keeps the first N SAP-version matches, and writes each cert's
inner document to <out>/<cert>.json for run_property_report. Stops after N, so
only the member prefix transfers, not the 15.7 GB archive (RangeFile.bytes_read
reports the true transfer vs the absolute ZIP offset). Verified on 2026: 100
SAP-10.2 certs -> report ran 81 scorable (MAE 2.03), 46 flagged, 19 raises
(11 full-SAP schema 19.1.0, 7 unmapped floor_construction 0/3, 1 missing
post_town) — real shadow-validation signal vs the curated golden 57.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 12:20:57 +00:00
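The commit message above describes reading one ZIP member through range requests and stopping after N matches. The following is a minimal sketch of that idea, not the code in fetch_epc_bulk_sample.py. Assumptions: the member holds one JSON object per line; the keys `certificate_number`, `sap_version` and `document` are hypothetical; the server honours HTTP `Range` headers; and `http_fetch` and `sample_member` are illustrative names. `RangeFile` and `bytes_read` borrow the names from the commit message, but this implementation is a guess at their behaviour.

```python
import io
import json
import urllib.request
import zipfile
from pathlib import Path
from typing import Callable


class RangeFile(io.RawIOBase):
    """Read-only, seekable view of a remote file, fetched one byte range at a time."""

    def __init__(self, size: int, fetch: Callable[[int, int], bytes]):
        super().__init__()
        self._size = size
        self._fetch = fetch      # fetch(start, end_exclusive) -> bytes (assumed signature)
        self._pos = 0            # absolute offset inside the archive
        self.bytes_read = 0      # bytes actually transferred so far

    def readable(self) -> bool:
        return True

    def seekable(self) -> bool:
        return True

    def tell(self) -> int:
        return self._pos

    def seek(self, offset: int, whence: int = io.SEEK_SET) -> int:
        base = {io.SEEK_SET: 0, io.SEEK_CUR: self._pos, io.SEEK_END: self._size}[whence]
        self._pos = max(0, base + offset)
        return self._pos

    def readinto(self, buf) -> int:
        end = min(self._pos + len(buf), self._size)
        if end <= self._pos:
            return 0             # at or past end of file
        data = self._fetch(self._pos, end)
        buf[: len(data)] = data
        self._pos += len(data)
        self.bytes_read += len(data)
        return len(data)


def http_fetch(url: str) -> Callable[[int, int], bytes]:
    """Build a fetch callable that issues HTTP range requests (the end is inclusive on the wire)."""
    def fetch(start: int, end: int) -> bytes:
        req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end - 1}"})
        with urllib.request.urlopen(req) as resp:
            return resp.read()
    return fetch


def sample_member(remote, member: str, sap_version: str, limit: int, out_dir: Path) -> int:
    """Write the first `limit` matching certs to out_dir/<cert>.json and return the count."""
    out_dir.mkdir(parents=True, exist_ok=True)
    kept = 0
    with zipfile.ZipFile(remote) as archive, archive.open(member) as stream:
        for line in stream:              # the member is decompressed lazily, chunk by chunk
            if kept >= limit:
                break                    # stop early: the rest of the member is never fetched
            if not line.strip():
                continue
            cert = json.loads(line)
            if cert.get("sap_version") != sap_version:
                continue
            target = out_dir / f"{cert['certificate_number']}.json"
            target.write_text(json.dumps(cert["document"]))
            kept += 1
    return kept
```

Because `zipfile` only reads the central directory and then the requested member from its start, `bytes_read` stays close to the size of the member prefix consumed before the loop stops, whereas `tell()` reports an absolute position in the archive. Against a real URL the pieces would combine as `RangeFile(total_size, http_fetch(url))`, with `total_size` taken from the server's `Content-Length`.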
download_cotality_evidence.py new methods for downloading all core files for pashub URL. Download currently not being authorised 2026-03-24 08:47:59 +00:00
fetch_cohort2_api_jsons.py Move sap10_calculator tests to tests/domain/sap10_calculator/ for CI 2026-06-02 16:58:00 +00:00
fetch_epc_bulk_sample.py feat(modelling): sample a year from the EPC bulk export, offline-ready 2026-06-04 12:20:57 +00:00
fetch_epc_dump.py feat(modelling): CLI to fetch an EPC dump + build the inspection report 2026-06-04 11:26:17 +00:00
historic_epc_demo.py added type hinting to uprn 2026-05-12 09:40:12 +00:00
init_db.py init db 2026-03-31 11:45:59 +00:00
rename_sharepoint_files.py rename files in sharepoint to desired structure 2026-05-20 16:26:07 +00:00
run_modelling_cohort.py feat(modelling): turnkey offline cohort script (tables + CSV) 2026-06-04 09:30:53 +00:00
run_property_report.py feat(modelling): CLI to fetch an EPC dump + build the inspection report 2026-06-04 11:26:17 +00:00
sero_address_list.csv add address list 2026-05-21 15:30:03 +00:00