chore(epc-prediction): grow validation corpus to 40 postcodes (ADR-0029)

Set N_POSTCODES 150 -> 40 as the gradual-growth step from the 3-postcode
smoke. 40 postcodes / 1113 certs / 578 leave-one-out predictions is enough
for stable, trustworthy metrics (the smoke's 2 usable postcodes were
dominated by oddball flats: floor_area mean|.| 52.6 there vs 12.7 here).
Resumable + reproducible (random.seed(2026)); raise again to scale up.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Khalim Conn-Kowlessar 2026-06-14 01:52:44 +00:00
parent fa11df56c2
commit c3d56b00dd

@@ -62,7 +62,7 @@ CACHE.mkdir(parents=True, exist_ok=True)
 WINDOW = {"date_start": "2026-01-01", "date_end": "2026-05-31"}
 TOTAL_PAGES = 7402
 SEED_PAGES = 20 # random search pages → postcode seeds
-N_POSTCODES = 150 # distinct postcodes to pull full cohorts for
+N_POSTCODES = 40 # distinct postcodes to pull full cohorts for
 random.seed(2026) # reproducible draw
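
A minimal sketch of the kind of reproducible draw the seed is there to support. This is not the script's actual code: `draw_postcodes` and the `candidates` pool are hypothetical names, and the sketch uses a local `random.Random` instance where the script seeds the global generator with `random.seed(2026)`.

```python
import random

N_POSTCODES = 40  # value taken from the diff above


def draw_postcodes(candidates, n=N_POSTCODES, seed=2026):
    """Return up to n distinct postcodes; the same seed always gives the same draw.

    Hypothetical helper for illustration, not part of the real script.
    """
    rng = random.Random(seed)  # local RNG, so the global random state is untouched
    pool = sorted(set(candidates))  # dedupe, then sort so the input order is deterministic
    return rng.sample(pool, min(n, len(pool)))


# Hypothetical candidate pool standing in for the seeds gathered from search pages.
candidates = [f"PC{i:03d}" for i in range(500)]
first = draw_postcodes(candidates)
second = draw_postcodes(candidates)
```

Because the draw depends only on the seed and the sorted pool, a resumed run selects the same postcodes as the interrupted one, provided the candidate pool has not changed.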