Model/docs/HANDOVER_MODELLING.md
Khalim Conn-Kowlessar ed6cd9c11a docs(modelling): handover — parser gate cleared, #1154/#1158/#1159 closed
Records that the Elmhurst recommendation Summaries parse via the
extractor chain (not parse_site_notes_pdf), so the "parser gate" never
blocked the cascade pins. All four pins close at delta 0; loft 270→300
and the suspended-floor insulation-type field were the two gaps fixed.
Remaining: #1157 (HITL schema review) + ProductJsonRepository.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 09:43:24 +00:00


HANDOVER — Modelling stage rebuild

Branch: feature/bill-derivation (worktree /workspaces/home/hestia-worktrees/model-assemble-new-backend). HEAD: a0b6a952. PRD: GitHub Hestia-Homes/Model#1152, sliced into #1153 to #1161.

Issue status

| Issue | What | State |
|---|---|---|
| #1153 | Overlay Applicator + EpcSimulation | closed (350f4c8e) |
| #1154 | Package Scorer | closed; Elmhurst cascade pin landed (4c0a907a) |
| #1155 | wall Recommendation Generator | closed (bb2c0068); cascade-pinned (4c0a907a) |
| #1156 | score Options + attribution | closed (13dd5fe8) |
| #1157 | persist a Plan via ModellingOrchestrator | not started; HITL (persistence-schema review) |
| #1158 | roof (loft) generator | closed; 270→300 mm + cascade pin (44d62c0c) |
| #1159 | floor generator | closed; overlay insulation-type field + solid/suspended pins (a0b6a952) |
| #1160 | Optimiser (knapsack + greedy repair) | not started (blocked by #1157) |
| #1161 | Measure Dependency (ventilation) | not started (blocked by #1160) |

Parser gate — CLEARED (was the blocker; turned out not to be)

The Elmhurst recommendation Summaries route cleanly through the same chain the worksheet e2e fixtures use: pdftotext -layout → ElmhurstSiteNotesExtractor → EpcPropertyDataMapper.from_elmhurst_site_notes. Helper: tests/domain/modelling/_elmhurst_recommendation.py::parse_recommendation_summary. The parse_site_notes_pdf Textract path still throws 'Manufacturer' on cert 001431 (_extract_windows multi-token bug), but the Modelling pins never use it, so it does not block this work. The before/after Summaries are mirrored into tests/domain/modelling/fixtures/ so the pins don't depend on the unstaged /workspaces/model workspace.

Cascade pins (test_elmhurst_cascade_pins.py) — all 4 at delta 0

Each pin: parse the Elmhurst before Summary → drive the matching generator → score its Option's overlay through PackageScorer → assert abs(diff) <= 1e-4 on SAP/CO2/PE against the calculator's score on the parsed after re-lodgement. Two real gaps surfaced and were fixed. Loft: Elmhurst re-lodges at 300 mm, while the generator used 270 (a +0.17 SAP gap); the generator now uses 300. Suspended floor: the overlay also had to set floor_insulation_type_str='Retro-fitted' (the calculator's sealed/unsealed logic, cert_to_inputs.py:4111); without it the gap was +1.40 SAP. Cavity wall and solid floor closed at delta 0 with no generator change.
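The pin check described above can be sketched as follows. This is an illustration only, not the code in test_elmhurst_cascade_pins.py: Score mirrors the three fields named in this handover, while pin_deltas, is_pinned, and any numbers passed to them are hypothetical.

```python
from dataclasses import dataclass

TOLERANCE = 1e-4  # the pin tolerance quoted in this handover


@dataclass(frozen=True)
class Score:
    # Field names follow the Score described in this document.
    sap_continuous: float
    co2_kg_per_yr: float
    primary_energy_kwh_per_yr: float


def pin_deltas(simulated: Score, relodged: Score) -> dict[str, float]:
    """Signed difference (simulated minus re-lodged) for each pinned metric."""
    return {
        "sap": simulated.sap_continuous - relodged.sap_continuous,
        "co2": simulated.co2_kg_per_yr - relodged.co2_kg_per_yr,
        "pe": simulated.primary_energy_kwh_per_yr
        - relodged.primary_energy_kwh_per_yr,
    }


def is_pinned(simulated: Score, relodged: Score, tol: float = TOLERANCE) -> bool:
    """True when every metric agrees within tol, using abs(x - y) <= tol."""
    return all(abs(delta) <= tol for delta in pin_deltas(simulated, relodged).values())
```

Returning the signed deltas separately from the pass/fail result makes a failing pin report which metric drifted and in which direction, which is how gaps such as the loft thickness mismatch show up.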

Design (already recorded — read these)

  • CONTEXT.md terms: Recommendation (a target surface; Recommendations partition the modifiable EPD surface so overlays never collide), Measure Option (bundle-capable; deduped by overlay), Simulation Overlay (EpcSimulation), Product, Cost, Contingency, Measure Dependency. Targeting: building parts by BuildingPartIdentifier; windows by index; systems direct.
  • ADR-0016: the three scoring roles (per-Option signal → whole-package re-score → final-package marginal cascade attribution) + warm-start MILP → dependency injection → package re-score → greedy repair. Resolves ADR-0005 §14.
  • Governing: ADR-0005 (multi-phase scenarios, per-phase recompute vs rolling Effective EPC), ADR-0011 (composable stage orchestrators), ADR-0012 (one Unit of Work per stage, commit once).
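To make the difference between per-Option scoring (role 1) and cascade attribution (role 3) concrete, here is a minimal sketch. It does not use the real PackageScorer: toy_score is a made-up scorer with diminishing returns, and cascade_impacts / independent_impacts are hypothetical names standing in for the functions described in this handover.

```python
from typing import Callable, Mapping, Sequence

Overlay = Mapping[str, float]
Scorer = Callable[[Sequence[Overlay]], float]


def cascade_impacts(score: Scorer, overlays: Sequence[Overlay]) -> list[float]:
    """Telescoping attribution: each overlay is credited with the score change
    it causes on top of all overlays applied before it."""
    impacts: list[float] = []
    previous = score([])  # baseline with nothing applied
    for count in range(1, len(overlays) + 1):
        current = score(overlays[:count])
        impacts.append(current - previous)
        previous = current
    return impacts


def independent_impacts(score: Scorer, overlays: Sequence[Overlay]) -> list[float]:
    """Per-Option signal: each overlay scored alone against the baseline."""
    baseline = score([])
    return [score([overlay]) - baseline for overlay in overlays]


def toy_score(applied: Sequence[Overlay]) -> float:
    # Hypothetical stand-in for a calculator: every extra field set yields
    # half the gain of the previous one, so impacts are not additive.
    fields_set = sum(len(overlay) for overlay in applied)
    return 60.0 + 20.0 * (1.0 - 0.5 ** fields_set)
```

With a non-additive scorer the independent impacts overstate the package gain, whereas the cascade impacts always sum to the whole-package score minus the baseline score, which is why attribution is done on the final package.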

What's built

All in domain/modelling/, domain/building_geometry.py, repositories/product/, infrastructure/postgres/product_table.py. 25 tests green, pyright strict clean, purely additive.

  • simulation.py: EpcSimulation(building_parts: Mapping[BuildingPartIdentifier, BuildingPartOverlay]); BuildingPartOverlay (all-optional: wall_insulation_type, roof_insulation_thickness, floor_insulation_thickness).
  • overlay_applicator.py: apply_simulations(baseline, simulations) -> EpcPropertyData. Generic field-fold (adding overlay fields needs NO change here, as roof/floor proved), sequential (later overlay wins), deep-copies (baseline never mutated), targets parts by identifier, writes the sap_* fields. Returns a throwaway EPD.
  • recommendation.py: Recommendation(surface, options), MeasureOption(measure_type, description, overlay, cost), Cost(total, contingency_rate).
  • product.py / contingencies.py: Product(measure_type, unit_cost_per_m2, contingency_rate); per-type contingency (cavity 0.10, loft 0.10, suspended floor 0.20, solid floor 0.26).
  • package_scorer.py: PackageScorer(calculator: SapCalculator).score(baseline, simulations) -> Score(sap_continuous, co2_kg_per_yr, primary_energy_kwh_per_yr). The reusable scoring primitive (role 2).
  • scoring.py: marginal_impacts(scorer, baseline, overlays) -> list[MeasureImpact] (telescoping cascade, role 3); independent_option_impacts(scorer, baseline, options) (role 1, scores each distinct overlay once). MeasureImpact(sap_points, co2_savings_kg_per_yr, energy_savings_kwh_per_yr).
  • wall_recommendation.py: recommend_cavity_wall(epc, products): detect cavity (wall_construction==4) + uninsulated (wall_insulation_type==4) → overlay sets wall_insulation_type=2 (Table 6 "Filled cavity").
  • roof_recommendation.py: recommend_loft_insulation(epc, products): detect roof_insulation_thickness==0 → overlay roof_insulation_thickness=300 (raised from 270 to match the Elmhurst re-lodgement, see the cascade pins).
  • floor_recommendation.py: recommend_floor_insulation(epc, products): detect uninsulated ground floor + construction (floor_construction_type "Suspended"/"Solid") → overlay floor_insulation_thickness=100.
  • building_geometry.py: gross_heat_loss_wall_area, roof_area, ground_floor_area (per part, by identifier; party walls excluded; areas are heat-loss/§3.8 quantities, not totals).
  • repositories/product/: ProductRepository (ABC port, get(measure_type)->Product); ProductPostgresRepository reads the externally-owned material table (defensive SQLModel view MaterialRow; total_cost → unit_cost_per_m2; joins contingency). A ProductJsonRepository (file source, for ETL-gap costs) is intended behind the same port; it is the one remaining parser-independent AFK task.
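The "generic field-fold" behaviour of the overlay applicator can be illustrated with a small stand-alone sketch. PartSketch, OverlaySketch, and apply_overlays are hypothetical simplifications, not the real EpcPropertyData or apply_simulations; in particular the mapping onto the sap_* fields is omitted here.

```python
import copy
from dataclasses import dataclass, fields
from typing import Mapping, Optional, Sequence


@dataclass
class PartSketch:
    # Simplified stand-in for one building part of the baseline.
    wall_insulation_type: int
    roof_insulation_thickness: int
    floor_insulation_thickness: int


@dataclass(frozen=True)
class OverlaySketch:
    # All-optional: None means "leave the baseline value alone".
    wall_insulation_type: Optional[int] = None
    roof_insulation_thickness: Optional[int] = None
    floor_insulation_thickness: Optional[int] = None


def apply_overlays(
    baseline: Mapping[str, PartSketch],
    simulations: Sequence[Mapping[str, OverlaySketch]],
) -> dict[str, PartSketch]:
    """Fold overlays onto a deep copy of the baseline, in order."""
    result = copy.deepcopy(dict(baseline))  # the baseline is never mutated
    for simulation in simulations:  # sequential: later overlays win
        for part_id, overlay in simulation.items():  # target parts by identifier
            part = result[part_id]
            # Generic fold: iterate over whatever fields the overlay declares,
            # so adding a new overlay field needs no change in this function.
            for field in fields(overlay):
                value = getattr(overlay, field.name)
                if value is not None:
                    setattr(part, field.name, value)
    return result
```

Because the loop is driven by the overlay's declared fields rather than a hard-coded list, extending the overlay (as the roof and floor slices did) leaves the applicator untouched.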

Key facts / gotchas

  • Hand-built baseline fixture (no PDF): tests/domain/sap10_calculator/worksheet/_elmhurst_worksheet_000490.build_epc(). Its MAIN is an uninsulated cavity wall + uninsulated suspended ground floor + 300 mm (insulated) loft. Used as the baseline in every generator/scorer test. MAIN gross heat-loss wall area = 45.93 m², roof area = 14.85 m², ground floor = 14.85 m².
  • Calculator entry: Sap10Calculator().calculate(epc) -> SapResult (sap_score_continuous, co2_kg_per_yr, primary_energy_kwh_per_yr). Depend on the SapCalculator abstraction. Filled-cavity wall code = 2 (domain/sap10_ml/rdsap_uvalues.py::u_wall). Calculator reads wall/roof/floor from SapBuildingPart structured fields, NOT EnergyElement descriptions (those are detection-only).
  • Worktree vs main import trap: python /tmp/foo.py imports the repo from /workspaces/model (editable install), NOT this worktree. Run with PYTHONPATH=<worktree> or via pytest (rootdir handles it). pytest already uses worktree code.
  • Running tests: python -m pytest <path> -q. Do NOT pass -p no:cov (pytest.ini injects --cov args that then error). DB repo tests spin up ephemeral Postgres via the db_engine fixture (tests/conftest.py) — slower; SQLModel tables auto-register on import.
  • Conventions: commit per TDD slice; conventional-commit message ending Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>; stay on feature/bill-derivation (user's choice). Tests use literal # Arrange / # Act / # Assert; assert with abs(x - y) <= tol (not pytest.approx); pyright strict, zero errors; annotate call-return locals.
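For the worktree import trap above, a quick way to confirm which checkout a script is actually importing is to compare the module's file location with the expected root. This is a generic sketch using only the standard library; imported_from is a hypothetical helper, not something in the repo.

```python
import importlib
from pathlib import Path


def imported_from(module_name: str, root: Path) -> bool:
    """Return True when module_name resolves to a file underneath root."""
    module = importlib.import_module(module_name)
    module_file = getattr(module, "__file__", None)
    if module_file is None:
        # Built-in or namespace modules have no single file location.
        return False
    # Path.is_relative_to requires Python 3.9 or newer.
    return Path(module_file).resolve().is_relative_to(Path(root).resolve())
```

Calling it at the top of an ad-hoc script with the worktree path and a package name from the repo fails fast if the editable install in /workspaces/model is being picked up instead.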

What's left

  1. #1157 persist a Plan (HITL). Design-review the Plan / Plan Phase / Recommendation persistence schema + ScenarioRepository method shapes with Khalim, then build ModellingOrchestrator.run(property_ids, scenario_ids) per ADR-0011/0012 (one UoW, commit once, thread only IDs, read via repos). Template: orchestration/property_baseline_orchestrator.py. Unblocks #1160 optimiser + #1161 ventilation dependency.
  2. ProductJsonRepository behind the existing ProductRepository port (file source for ETL-gap costs). Apart from #1157, which is HITL, this is the only parser-independent task remaining, and the only one that can run AFK.
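One possible shape for the file-backed repository is sketched below. The Product fields and the get(measure_type) port follow this handover, but everything else is an assumption: the JSON layout (a list of objects keyed by the three field names), the LookupError on a missing measure type, and loading the whole file eagerly. The real port and Product live in the repo; they are redefined here only so the sketch runs on its own.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Product:
    measure_type: str
    unit_cost_per_m2: float
    contingency_rate: float


class ProductRepository(ABC):
    """Stand-in for the existing port."""

    @abstractmethod
    def get(self, measure_type: str) -> Product: ...


class ProductJsonRepository(ProductRepository):
    """Reads products from a JSON file (assumed layout: list of objects)."""

    def __init__(self, path: Path) -> None:
        rows = json.loads(Path(path).read_text(encoding="utf-8"))
        self._products = {
            row["measure_type"]: Product(
                measure_type=row["measure_type"],
                unit_cost_per_m2=float(row["unit_cost_per_m2"]),
                contingency_rate=float(row["contingency_rate"]),
            )
            for row in rows
        }

    def get(self, measure_type: str) -> Product:
        try:
            return self._products[measure_type]
        except KeyError:
            # Assumed behaviour; the real port may signal a miss differently.
            raise LookupError(f"no product for measure type {measure_type!r}") from None
```

Keeping the class behind the same port means generators and tests that take a ProductRepository need no change when the cost source switches from Postgres to a file.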

Relevant memories (auto-loaded)

  • project_openos_conservation_data_gap — EWI eligibility needs listed/conservation status, not ingested; blocks the solid-wall EWI slice (later), NOT the fabric tracers.
  • project_calculator_geometry_extraction — the calculator holds reusable geometry; building_geometry.py is the start; DRY the calculator onto it later (coordinate with the calculator branch); don't edit heat_transmission.py now.