Model

mirror of https://github.com/Hestia-Homes/Model.git synced 2026-07-27 23:35:01 +00:00

Author	SHA1	Message	Date
Khalim Conn-Kowlessar	67af2e9b43	docs: handover for §8c Space cooling + 000490 SAP-score diagnostic Two tickets in order for the next agent: 1. Ticket A — Investigate the 000490 +3 SAP overshoot. Corrects the previous agent's claim that "wiring water_heating_from_cert is the easy win"; that's already done. Real driver is the boiler efficiency cascade selecting 0.80 instead of the PDF Manufacturer-declared 0.882 (Vaillant Ecotec Pro). Time-boxed diagnostic; flag and defer if expensive. 2. Ticket B — §8c Space cooling (xlsx rows 435-466, lines (100)..(108)). All 6 Elmhurst fixtures = 0 cooling. Small slice; mirror §8 pattern. Includes spec anchors (Qcool formula sign, Jun-Aug inclusion rule), codebase pointers, slice plan, and the standard "do not" list. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-20 23:03:15 +00:00
Khalim Conn-Kowlessar	bb827803ac	docs: SPEC_COVERAGE §8 row flip to Full + slice progress table §8 Space heating requirement: Partial → Full. Six Elmhurst fixtures conform end-to-end on (95)..(99) at 5e-2..1e-1 kWh per month; tolerances reflect 4-d.p. fixture pin propagation, not physics drift. Spec inclusion rule (Jun..Sep summer clamp) now applied; 000490 SAP-score gap to PDF=57 documented (currently 60 — closes incrementally as §3 / §4 / §5 upstream precision tightens). Also renumbers the §9 row to "Energy requirements per heating system" (its SAP10.2 worksheet title) — the previous "§9 Space heating" entry conflated §8 and §9. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-20 22:55:17 +00:00
Khalim Conn-Kowlessar	eec8fb6f4f	docs: SPEC_COVERAGE §7 row flip to Full + slice progress table §7 Mean internal temperature: Partial → Full. Six Elmhurst fixtures conform end-to-end on (85)..(94) to ≤5e-3 °C / unitless on every per-zone line ref every month (588 monthly assertions GREEN). Slice progress table records the chain from per-zone η fix through legacy deletion. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-20 21:48:29 +00:00
Khalim Conn-Kowlessar	34f4fa8bef	docs: SPEC_COVERAGE §6 row flip to Full + slice progress table §6 Solar gains: Partial → Full. Six Elmhurst fixtures conform end-to-end on (83) total solar gains and (84) total gains to ≤5e-3 W on every month (144 monthly assertions GREEN). Slice progress table records the chain from tracer Z-solar lookup through legacy deletion. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-20 21:03:06 +00:00
Khalim Conn-Kowlessar	29feee7869	docs: handover for §6 Solar gains agent Captures the §5 implementation pattern (slice-per-test/impl/commit, ALL_FIXTURES e2e conformance, frozen Result dataclass, calculator.py wiring) and the SAP10.2 / Table 6d gotchas that cost time during §5 (Z_solar vs Z_L columns, rooflight Z=1.0, existing modules untrusted). Hard constraints documented for the next agent: - 6-fixture conformance ≤5e-3 W on every line (do not loosen tests). - Stop and ask the user after ~15 min of unsuccessful reconciliation or before scanning more than ~50 lines of spec PDF. - Don't touch the untracked `sap worksheets/` folder. Surfaces the pre-grilling unknowns the §6 agent should propose recommended answers for during `/grill-me`. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-20 19:29:30 +00:00
Khalim Conn-Kowlessar	52a11f5e74	docs: SPEC_COVERAGE — rooflight Z_L=1.0 closed, §5 to ≤5e-3 W everywhere Slice 13 (`380115e2`) closed the only remaining §5 conformance bias. Promote that item from "remaining" → "done" in the §5 slice progress table, tighten the conformance summary to "every line ≤5e-3 W", and shift "rooflight derivation from cert" up as a forward-looking item (orchestrator accepts the arg but cert_to_inputs always passes 0). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-20 19:26:28 +00:00
Khalim Conn-Kowlessar	2d4fa24de9	docs: §5 close — SPEC_COVERAGE flip from "Full" stub to actual full The pre-§5-rebuild SPEC_COVERAGE row optimistically marked §5 as Full when only 4 of 8 worksheet lines were implemented and the lighting path used the L5b/L8c fallback (≈22 W/month bias for typical cert lodgings). Updates the §5 row with the actual coverage post-rebuild: worksheet-driven (66)..(73), Table 5 Column A throughout, Table 5a 9-row dispatch with heating-season mask, Appendix L L1-L12 lighting including RdSAP §12-1 per-lamp-type defaults + Table 6d Z_L light access factor, and orchestrator wired into cert_to_inputs + calculator. Adds a §5 slice progress table mirroring §4's format, with the 12-slice commit chain and the remaining work (rooflight Z_L=1.0, cert-driven fan/PIV/HIU dispatch, frame/glazing string parsing, Column B reduced-gain forms for new-build). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-20 19:15:43 +00:00
Khalim Conn-Kowlessar	6a3552a50d	docs: §4 slice progress — happy path closes both non-RR fixtures Updates SPEC_COVERAGE.md with the 9 §4 slices landed since the last doc sweep, and lays out the remaining work in priority order: 1. §4 orchestrator (water_heating_from_cert) 2. Wire calculator.py to the new worksheet module 3. End-to-end SAP score validation against Elmhurst worksheets 4. Cylinder + solar + renewables branches (population coverage) 5. PCDB-backed Table 3b/3c combi loss (000474 sits here) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-20 16:15:18 +00:00
Khalim Conn-Kowlessar	d90827446a	docs: sweep stale handover, mark §3 Full, scaffold §4 slice plan §3 close (LINE_31/33/36/37 exact for both non-RR Elmhurst worksheets) is now landed across slices 344a9c9d..cf244762. HANDOVER_S3_CLOSE.md was written as a mid-stream working brief; with §3 done it now creates doc rot, so it's removed in favour of SPEC_COVERAGE.md as the single source of truth. SPEC_COVERAGE.md updates: - §3 marked Full (non-RR); RR sub-area deferral noted - §4 carries the ordered slice plan for the worksheet-driven rewrite (xlsx rows 207–304, line refs (42)..(65)) - Hierarchy callout: the canonical SAP10.2 algorithm lives in the repo-root xlsx, not in any handover doc Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-20 15:18:46 +00:00
Khalim Conn-Kowlessar	49e8c65ae8	Handover: replace stale docs with focused §3-close + Table-11 brief Delete HANDOVER_FRESH_REVIEW (22-slice, MAE-5.34 era) and HANDOVER_SYSTEMATIC_REVIEW (pre-Elmhurst-conformance). Both described a state the Elmhurst worksheet work has since superseded. Add HANDOVER_S3_CLOSE.md with: - Accurate §3 status: §1/§2 fully done; LINE_31/LINE_36 exact for non-RR fixtures; LINE_33 gap diagnosed as missing floor_construction codes (not a window-area problem as previously assumed) - Concrete investigation steps to close LINE_33 for 000474 + 000490 - Table 11 Secondary Heating framed as next slice after §3 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 13:03:09 +00:00
Khalim Conn-Kowlessar	a1c9d2a14d	Record post-P5 parity-probe baseline (2026-05-19) 100-cert probe, seed=7, sap_score window 5..99. MAE 4.29 (vs 8.41 on 2026-05-18 with the older 20..95 window — the delta blends calculator improvements with sample-window change, so this is logged as the post-P5 reference, not as "P5 reduced MAE".) P5 itself was pure trace exposure; the calculator's SAP output should be numerically unchanged. The headline finding from this run is primary-energy over-prediction: PE MAE 44.40 kWh/m², bias +39.66 — now the dominant signal with SAP residuals halved. Each end-use PE contribution surfaces on SapResult.intermediate per P5.12, so the next session can localise the bias without re-instrumenting.	2026-05-19 16:19:01 +00:00
Khalim Conn-Kowlessar	411c477d09	P5.14: SAP 10.2 worksheet trace + RdSAP10 deflator drift note Closes the second half of P5 (HANDOVER_SYSTEMATIC_REVIEW §2.5): - Adds test_bre_worked_examples.py — one comprehensive test that locks every published SapResult.intermediate key against its SAP 10.2 worksheet item number ((4) TFA, (33) fabric heat loss, (39) HTC, (40) HLP, (73) gains, (93) mean internal temp, (98c) space heating, (240e/247/250) costs, (252) PV credit, (256) deflator, (257) ECF, (261-272) per-end-use CO2, (275-287) primary energy per m²). All formulas derived independently from the worksheet pages 131-148; passes against the synthetic 100 m² baseline. - Explicit caveat in module docstring: BRE-published worked examples don't exist in any of the three SAP-spec PDFs we have (rdSAP10, SAP10.2, SAP10.3 — all grepped). The test is spec-formula-derived, not BRE-validated. Structure stays if BRE numbers surface later; only expected values change. Also surfaces and documents an RdSAP10 spec drift in PARITY_FINDINGS.md: Table 32 (page 95 of rdSAP10) gives Energy Cost Deflator = 0.42, vs the code's 0.36 (SAP10.2 Table 12, worksheet item (256)). Not changed in P5 — needs ADR-level resolution on whether the calculator targets SAP10.2 (0.36) or RdSAP10 (0.42) ratings. P5 (SapResult.intermediate population + BRE worked-example fixtures) is now complete on this branch.	2026-05-19 15:32:42 +00:00
Khalim Conn-Kowlessar	bb9c5ac017	docs: ADR-0010 retargets calculator to SAP 10.2; rewrite handover Adds ADR-0010 superseding ADR-0009's spec-version target, PCDB sequencing, and cert-calibration layer. Captures the conclusions of a grill-with-docs session: 1. Active spec target is SAP 10.2 (14-03-2025), not SAP 10.3 — no SAP-10.3-lodged certs exist in the corpus to validate against. 2. table_12_cert_calibration is deleted (not "re-derived at the end"). It was pre-March-2025 spec prices fit against a mixture distribution of two spec-version regimes, with downstream-component bugs absorbed into the fit — not Elmhurst deviation. 3. Validation Cohort: filter the corpus to inspection_date ≥ 2025-07-01 so every cert in the probe was lodged on SAP 10.2 (14-03-2025) prices. One spec, one signal. 4. PCDB integration is promoted from "Session C deferred" to prerequisite P4 — dominates residual variance on heat pumps and the 78% of gas-boiler certs lodging main_heating_data_source=1. 5. Trace mode (SapResult.intermediate) and BRE worked-example fixtures replace the 7 cert-based golden fixtures, which contained compensating errors. 6. Strict-type EpcPropertyData via codes.csv-derived canonical enums (P6) — the in-source motivation lives at dimensions.py:74-82 (Khalim's comment, included in this commit). 7. Worksheet-faithful structure is a sweep-time principle: each worksheet module mirrors SAP 10.2 worksheet line numbering. CONTEXT.md additions: - Refined "Calculated SAP10 Performance" and "SAP10 Calculation" to reference SAP 10.2 + ADR-0010. - New term "SAP Spec Version" — domain-meaningful because the same EpcPropertyData yields different sap_score under different spec revisions. - New term "Validation Cohort" — the version-locked sub-corpus. HANDOVER_SYSTEMATIC_REVIEW.md is rewritten section-by-section to reflect ADR-0010: §1 framing, §2 status pointer, new §2.5 with the six prerequisites P1–P6 in dependency order, §3 diagnosis (cert-cal was stale prices, not Elmhurst deviation), §4 scope (PCDB IN, SAP 10.3 stays OUT), §5 approach (worksheet-faithful principle as §5.5), §7 tension dissolved, §7b findings re-framed, §8 dead-ends re-classified as conditional, §9 cohort filter, §10 fixture strategy, §11 trace mode as prerequisite, §12 prereqs-first, §13 Phase 0/Phase 1 workflow, §14 ADR-0010 reference, §15 final note. P2.1 (commit `ac1aa56a`) already lands the first ADR-0010 slice (probe swap to spec prices). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-19 09:54:24 +00:00
Khalim Conn-Kowlessar	377962f8bd	docs: strengthen handover with §7b outstanding findings + PCDB roadmap §7b "Outstanding findings to pick up during the systematic pass" collects spec-correct fixes that were reverted because they regressed SAP MAE against the corpus — but the spec basis is unambiguous and they WILL be the right answer once cert-calibration is re-derived. Treat as TODOs, not dead-ends. Documents: Finding 1 — HW cylinder zero-loss for combi (PE MAE -6.64 measured) Finding 2 — Standing charges Table 12 note (a) Finding 3 — Cat=10 room-heater Table 12a fractional blending Finding 4 — Lighting Appendix L proper (L1-L12 cascade) Finding 5 — Internal-gains Table 5 water-heating + losses rows Finding 6 — Storage-loss-factor table values 3× off spec Finding 7 — Heat-pump fallback (needs PCDB) Finding 8 — Smaller gaps carried forward Each documents the spec section/page reference, the current code bug, empirical impact where measured, and when to pick up during the section-by-section sweep. PCDB section strengthened from "deferred to Session C" to an explicit roadmap: data source URL, lookup key (main_heating_index_number), fields needed, recommended sequencing (after spec sweep so cert-cal is re-derivable), and why-not-now (cert-cal currently masks PCDB gaps). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-19 07:35:19 +00:00
Khalim Conn-Kowlessar	3363f63f5e	docs: handover for systematic section-by-section RdSAP 10 review The slice-by-slice "fix the biggest residual" approach has hit a ceiling at SAP MAE ~4.6 because the cert-calibration prices absorb multiple structural deviations from spec. Any spec-correct fix in one component breaks the calibration for others. Three failed slices this session (standing charges, cat=10 routing, combi zero-loss) made the pattern unambiguous. Pivot: systematic section-by-section spec verification. Read the RdSAP 10 + SAP 10.2 spec in order, check each table / formula / footnote against the corresponding code, fix gaps one at a time. Build the spec-correct engine first; re-derive cert-cal calibration once at the end as a thin Elmhurst-compatibility layer. Handover doc covers: - Critical framing (deterministic, not assessor judgement) - Current state (SAP MAE 4.61, PE MAE 43.32 at `f4a8d2a0`) - Why the slice-by-slice approach won't converge - Scope decisions (RdSAP 10 + SAP 10.2 only; park full-SAP + PCDB) - Section-to-code mapping - Known dead-ends to skip - Cert-calibration vs spec-correctness tension and how to resolve it - The 7 golden fixtures and their compensating-error caveats - Trace mode recommendation (ADR-0009's `intermediate` field) - Specific §1-3 starting tasks - Workflow recap Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-19 07:30:27 +00:00
Khalim Conn-Kowlessar	743f77d54c	docs: handover for fresh-context SAP calculator review Per user suggestion: the iteration history in this chat has likely accreted blind spots that a long context window can't shed (e.g. I spent slices comparing our delivered kWh to the cert's primary kWh without noticing the apples-to-oranges error). A fresh agent reading the SAP 10.2 + RdSAP 10 PDFs cold against the current calculator may spot gaps faster. HANDOVER_FRESH_REVIEW.md gives the fresh agent: - Current state (MAE 5.34, primary-energy bias +51 kWh/m²) - Repo layout pointer - Priority-ordered dig list (PEUI mystery first) - Validated truths - Dead-end list (don't repeat S-B5 NI thickness switch etc.) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 19:03:40 +00:00
Khalim Conn-Kowlessar	3a3f9cacdf	docs: SAP 10.2 / RdSAP 10 spec coverage map Per user suggestion (switch from probe-driven to worksheet-driven iteration), enumerates the §§1-15 worksheet + Appendices A-U state in the calculator with a status grade and a prioritised gap list. Becomes the roadmap for Session B remaining slices. Next slice from this list: Table 11 secondary heating allocation — 10% fraction on most boiler-main certs that we currently model as 0. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 16:55:17 +00:00
Khalim Conn-Kowlessar	c74857ac14	slice S-B9: SAP 10.2/10.3 Table 12 spec-correct prices + Table 12a fix Verified against the SAP 10.2 spec (14-03-2025): Table 12 unit prices are IDENTICAL to SAP 10.3 Table 12. Both specs mandate (§12.2): "Fuel costs are calculated using the fuel prices given in Table 12. Other prices must not be used for calculation of SAP ratings." The legacy ML-pipeline prices in domain.ml.sap_efficiencies (3.48 gas, 13.19 elec, 5.50 E7-low) do NOT match either SAP 10.2 or 10.3 and appear to be a pre-2022 holdover. New module domain.sap.tables.table_12 carries the spec-correct values: mains gas: 3.64 (was 3.48 legacy) standard electricity: 16.49 (was 13.19) 7h-low / Economy-7: 9.40 (was 5.50) 24h-heating: 14.04 (was 6.61) Also corrects an S-B4 bug: SAP 10.2 Table 12a shows direct-acting electric heating (codes 191-196) runs at 90% high-rate on 7h tariffs, not 0% — only true storage heaters (401-409, 421-425) bill at the low rate. _E7_SPACE_HEATING_CODES narrowed accordingly. 100-cert parity probe with spec-correct prices: MAE 4.66 → 6.66 (regression vs legacy prices) bias -0.70 → -4.66 (over-counting cost) spec-correctness: SAP 10.2 verbatim The MAE regression confirms the corpus's lodged ratings were NOT calculated against the published SAP 10.2 Table 12 prices. The cert ratings appear to use the legacy lower prices despite reporting sap_version=10.2. Three paths forward documented in next commit's discussion thread. Also adds the SAP 10.2 spec PDF to docs/sap-spec/. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 15:14:11 +00:00
Khalim Conn-Kowlessar	dde8ae30fa	S-B2: parity probe + first-pass findings (100-cert baseline) Adds services/ml_training_data/src/ml_training_data/sap_parity_probe.py — samples N certs from the v18a corpus, streams them via BulkZipReader, runs Sap10Calculator, prints MAE/RMSE/bias + worst-N residuals. Baseline across 100 certs: MAE 8.41, RMSE 13.98, bias -2.65, 0 errors. docs/sap-spec/PARITY_FINDINGS.md captures the dominant failure pattern (flats + bungalows under-predicted, 10 of the worst-15 are flats whose floor/roof are party with neighbouring dwellings) and the priority-ordered Session B iteration backlog (S-B-flat-surfaces first). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 13:59:23 +00:00
Khalim Conn-Kowlessar	8dbe873daf	ADR-0009: pivot to deterministic SAP 10.3 calculator (Accepted) Promotes ADR-0009 from Proposed to Accepted after the grill-with-docs session resolved all seven open questions. Bundles the SAP 10.3 and RdSAP 10 specifications under docs/sap-spec/ plus a calculator design sketch (module layout, monthly-loop pseudo-code, status table). CONTEXT.md adds three new domain terms parallel to existing performance language: - Calculated SAP10 Performance (parallel to Effective / Lodged) - SAP10 Calculation (process; implemented by Sap10Calculator) - Measure Application (process; implemented by MeasureApplicator) ML pipeline is NOT retired — it stays as the residual head once the calculator reaches parity in Session B. ADR-0009 §"Grill outcomes" carries the seven binding scope decisions plus three Session-A-scope changes discovered during the grill (RdSAP §19 EER formula, SAP 10.2 Appendix A cross-reference, RdSAP Table 29 cascade defaults).	2026-05-17 21:27:21 +00:00
Khalim Conn-Kowlessar	f61d74a327	docs: ADR-0008 physics-as-feature + v16.0.0 schema bump Captures the slice-16 plan decisions before code lands: - Mid-physics: predicted_ecf + predicted_log10_ecf, NOT predicted_sap_score - Cost scope: heating + DHW + lighting (no PV/pumps/secondary) - Crude annual heat-demand calc (HLC * HDH / efficiency) - Cascade-defaulting U-value imputation - envelope_heat_loss_w_per_k sums all parts; extension_1 only as discrete features (88% null drops extension_2) - v16.0.0 MAJOR bump (rename secondary_dwelling_* -> extension_1_*); coordinated cutover with AutoGluon repo + scoring lambda - LightGBM objective="mape" for sap_score+peui_ucl in 16g; sample weights deferred	2026-05-17 11:20:40 +00:00
Khalim Conn-Kowlessar	611ff24eb6	scaffolding for ml pipeline	2026-05-16 14:15:56 +00:00
Khalim Conn-Kowlessar	d9c1696085	added architectural decisions, added to prd	2026-05-13 21:26:18 +00:00

23 commits