fetch_epc_bulk_sample streams certificates-<year>.json out of the bulk ZIP via
range requests, keeps the first N SAP-version matches, and writes each cert's
inner document to <out>/<cert>.json for run_property_report. Stops after N, so
only the member prefix transfers, not the 15.7 GB archive (RangeFile.bytes_read
reports the true transfer vs the absolute ZIP offset). Verified on 2026: 100
SAP-10.2 certs -> report ran 81 scorable (MAE 2.03), 46 flagged, 19 raises
(11 full-SAP schema 19.1.0, 7 unmapped floor_construction 0/3, 1 missing
post_town) — real shadow-validation signal vs the curated golden 57.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The bulk endpoint 302-redirects to a 15.7 GB S3 ZIP with one NDJSON member per
year; each line wraps the per-cert payload in a stringified 'document' that
parses to the same RdSAP-Schema-21.0.1 shape from_api_response already handles.
parse_bulk_line unwraps a record; is_sap_version filters to SAP 10.2; RangeFile
exposes the S3 object as a seekable file so zipfile streams a single year's
member (and a sampler stops early) without downloading the whole archive.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
format_report_csv emits one comma-safe row per property: the calculator-error
fields (lodged/calculated/Δ/flag), the Plan headline figures (baseline+post
SAP/band, measures, cost+contingency, bill & CO2 savings, valuation %), the
flattened measure triggers, and any captured error — sortable in a spreadsheet
for a large dump.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
format_report_markdown emits: (1) cohort parity stats + a per-property
lodged-vs-calculated table flagging |Δ| > 0.5 (errors shown inline),
(2) Plans + costings (SAP/band jump, cost + contingency, bill & CO2 savings,
valuation uplift), (3) each fired measure with the EPC attributes that
triggered it.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
build_property_reports models a dump in order (errors captured per-cert);
parity_report_for aggregates the lodged-vs-calculated SAP across the cohort
into the existing ParityReport (MAE/RMSE/bias/worst-N), excluding certs that
couldn't be mapped or scored. Residual convention is the calculator's own
(predicted - actual), the negative of PropertyReport.sap_error.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Section 3 of the report: build_property_report now runs the Modelling stage
and, for every Plan Measure, records the EPC attribute(s) that caused its
generator to fire (MeasureTrigger) — wall_construction/insulation for cavity
fill, roof thickness for loft, floor thickness/construction for floors, the
absent mechanical kind for ventilation. Modelling raises are captured as
plan_error, independent of the calculator-error capture.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Section 1 of the property inspection report: PropertyReport compares the
cert's lodged energy_rating_current to Sap10Calculator's un-rounded SAP and
flags |Δ| > 0.5 (the ADR-0010/0013 shadow-validation design target). A
mapping/scoring raise is captured per-cert as calculator_error, never
propagated, so one bad cert can't abort the sweep.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
CertResult now carries its Plan (with flat baseline/post-SAP/measures
properties), and `format_cohort_csv` renders one browsable row per cert
(SAP transition, band, measures, cost, bill saving, valuation %, error).
`scripts/run_modelling_cohort.py` is turnkey: no args runs the committed
golden cohort, prints a sense-check table for the first measure-bearing
certs (a capped preview so a large dump doesn't flood the terminal), the
summary, and writes modelling_cohort.csv (gitignored). Point it at the
EPC dump when it lands.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
`harness.cohort.run_cohort(paths)` parses each API-shaped EPC JSON with
from_api_response and models it via run_modelling — no database, no
network — capturing per-cert errors instead of aborting the sweep, plus
`format_cohort_summary`. A thin `scripts/run_modelling_cohort.py` CLI
points it at a directory. Proven over the 57 golden API certs: 56 ran
offline, 15 produced measures, 1 errored (COAL has no Fuel Rates entry —
a BillDerivation coverage gap, not a harness one). Ready for the EPC dump.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Two fixes that unblock offline, no-database inspection over an arbitrary
EPC dump:
- Complete the harness sample catalogue with loft_insulation and
solid_floor_insulation — the four fabric generators can emit five
Measure Types, but the catalogue priced only three, so an offline run
on a property with an uninsulated loft or solid floor raised mid-run.
A new test pins the catalogue to cover every generator Measure Type.
- Add `run_modelling(epc, ...)` — runs ONLY the Modelling stage (no
Ingestion / Baseline), so it needs no lodged recorded-performance / RHI
and inspects recommendations on any calculator-scorable EPC. `run_one`
(full pipeline) stays for when you want Baseline too.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The Plan derives its Valuation Uplift (ADR-0018) from its baseline -> post
band jump and works+contingency cost, given one external input — the
Property's current market value (a Property Valuation, mostly absent).
`Plan.valuation` / `Plan.baseline_epc_rating` are derived like the other
headline figures; `PlanModel.from_domain` maps the £ forms to the live
plan.valuation_* columns (NULL when no value — the percentage is not
persisted on those columns). `Property.current_market_value` is the new
optional source; the orchestrator threads it onto the Plan. `run_one`
takes a `current_market_value` so the harness can value the uplift, and
the sense-check table shows the average % (always) plus the £ forms when
known.
Sourcing the current market value (upload / default) remains deferred
(ADR-0018); it is None throughout until that lands, so the columns stay
NULL at scale.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Slice 3. `harness.console.run_one(epc, goal_band=...)` wires the full
AraFirstRunPipeline against in-memory fakes — no Postgres, no network —
runs one property, prints the sense-check table, and returns the Plan
for interactive poking from a REPL at the worktree root. Defaults to the
committed harness sample catalogue.
Refactors the slice-1 integration test to delegate to run_one (dropping
~70 lines of duplicated wiring + the now-unused test catalogue fixture),
so it exercises the shipped entrypoint rather than a parallel copy. The
new console test covers run_one's print/return contract.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Slice 2. `harness.plan_table.format_plan_table(plan)` renders a Plan as a
plain-text table — one package summary line (baseline SAP/band -> post
SAP/band, CO2 saved, cost of works + contingency, bill saved) and one
line per Plan Measure (signed SAP points, cost, delivered kWh + £
savings). Pure presentation: reads the Plan, computes nothing. The
DB-less First Run test now prints it (visible under `pytest -s`) so the
modelled package can be eyeballed and debugged by hand.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>