Model

mirror of https://github.com/Hestia-Homes/Model.git synced 2026-06-08 11:17:27 +00:00

Author	SHA1	Message	Date
Khalim Conn-Kowlessar	b3f4609c2d	feat(modelling): wire Valuation Uplift onto the Plan The Plan derives its Valuation Uplift (ADR-0018) from its baseline -> post band jump and works+contingency cost, given one external input — the Property's current market value (a Property Valuation, mostly absent). `Plan.valuation` / `Plan.baseline_epc_rating` are derived like the other headline figures; `PlanModel.from_domain` maps the £ forms to the live plan.valuation_* columns (NULL when no value — the percentage is not persisted on those columns). `Property.current_market_value` is the new optional source; the orchestrator threads it onto the Plan. `run_one` takes a `current_market_value` so the harness can value the uplift, and the sense-check table shows the average % (always) plus the £ forms when known. Sourcing the current market value (upload / default) remains deferred (ADR-0018); it is None throughout until that lands, so the columns stay NULL at scale. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-04 08:59:04 +00:00
Khalim Conn-Kowlessar	31da90f5eb	feat(modelling): persist recommendation.material_id from the catalogue Expand half of the recommendation_materials retirement (ADR-0017). A Plan Measure installs a single Product, so thread its catalogue id end to end — Product.id -> MeasureOption.material_id -> PlanMeasure.material_id -> recommendation.material_id — replacing the per-material BOM child table with one nullable column on the row. ProductPostgresRepository reads the id from MaterialRow; the four fabric generators set it on their Option; the orchestrator carries it onto the Plan Measure; the mirror declares + maps the column. Optional throughout (the JSON stopgap catalogue carries no ids -> NULL). The multi-measure integration test now pins each persisted measure's material_id to its seeded MaterialRow id. Migration spec (live column must be added before this deploys; contraction is the owner's next step) in docs/migrations/recommendation-material-id.md. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-04 08:26:58 +00:00
Khalim Conn-Kowlessar	c5520b82f9	feat(modelling): run_one console entrypoint for DB-less inspection Slice 3. `harness.console.run_one(epc, goal_band=...)` wires the full AraFirstRunPipeline against in-memory fakes — no Postgres, no network — runs one property, prints the sense-check table, and returns the Plan for interactive poking from a REPL at the worktree root. Defaults to the committed harness sample catalogue. Refactors the slice-1 integration test to delegate to run_one (dropping ~70 lines of duplicated wiring + the now-unused test catalogue fixture), so it exercises the shipped entrypoint rather than a parallel copy. The new console test covers run_one's print/return contract. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-04 08:14:14 +00:00
Khalim Conn-Kowlessar	9329978374	feat(modelling): sense-check table for a Plan in the DB-less harness Slice 2. `harness.plan_table.format_plan_table(plan)` renders a Plan as a plain-text table — one package summary line (baseline SAP/band -> post SAP/band, CO2 saved, cost of works + contingency, bill saved) and one line per Plan Measure (signed SAP points, cost, delivered kWh + £ savings). Pure presentation: reads the Plan, computes nothing. The DB-less First Run test now prints it (visible under `pytest -s`) so the modelled package can be eyeballed and debugged by hand. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-04 08:06:53 +00:00
Khalim Conn-Kowlessar	d5f1fc335b	test(modelling): run First Run with no database via in-memory fakes Slice 1 of the DB-less inspection harness. Complete the in-memory FakeUnitOfWork so the ModellingOrchestrator runs with no Postgres: add FakeScenarioRepository + FakePlanRepository (idempotent, keyed by (property_id, scenario_id)), expose scenario/product/plan on the fake unit, and grow FakePropertyRepo to compose the effective EPC from the EPC repo at read time — mirroring PropertyPostgresRepository, so the EPC Ingestion persists is visible to Baseline + Modelling (the through-repos hand-off, in memory). The new integration test drives the full AraFirstRunPipeline (Ingestion -> Baseline -> Modelling) against the FakeUnitOfWork — no Session ever opened — on the uninsulated 000490 fixture with its lodged recorded-performance filled in (it already carries the RHI block, so Baseline can run) and asserts a multi-measure Plan is produced. The committed product catalogue prices the wall/floor/ventilation measures it fires. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-04 07:51:44 +00:00
Khalim Conn-Kowlessar	c18968ba3c	refactor(modelling): consolidate scenario + installed_measure into the subpackage Move the scenario and installed_measure tables into infrastructure/postgres/modelling/ as full-parity SQLModel definitions (ScenarioModel, InstalledMeasureModel + MeasureType), completing the cluster consolidation. backend/app/db/models/recommendations.py is now a pure re-export shim. ScenarioModel.goal is the PortfolioGoal enum (legacy planning branches on it), sourced from domain/modelling/portfolio_goal.py; the repo's to_domain maps it to its value string, so domain Scenario.goal is now the value ("Increasing EPC") consistent with the orchestrator's check — fixing the latent name-vs-value inconsistency the old str column masked (the scenario repo test stored the enum name). Parity columns are nullable (mirror convention; live NOT-NULLs owned by Drizzle). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 22:52:35 +00:00
Khalim Conn-Kowlessar	01c2c3910e	refactor(modelling): rename the cluster SQLModel classes …Row → …Model Standardise the modelling persistence classes on the …Model suffix (PlanModel, RecommendationModel, RecommendationMaterialModel) — matching the epc_property precedent and the legacy names the rest of backend/ already imports, so the shim's plan re-export becomes literal (no alias) and the eventual shim deletion needs zero renames. The …Row→…Model sweep for the non-cluster tables (Property/Task/Material/…) waits until their live legacy …Model counterparts are retired, to avoid reintroducing dual-definition collisions. No behaviour change. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 22:42:21 +00:00
Khalim Conn-Kowlessar	c1c7b06f09	refactor(modelling): consolidate plan/recommendation models into infrastructure Move the live plan, recommendation, recommendation_materials and (retiring) plan_recommendations tables into a new infrastructure/postgres/modelling/ subpackage as single SQLModel definitions (the epc_property pattern), absorbing the rebuild's partial PlanRow/RecommendationRow mirrors and carrying full legacy column parity plus recommendation.plan_id. Out-of-cluster references are plain indexed ints (mirror convention); the live FKs are owned by the Drizzle schema. backend/app/db/models/recommendations.py becomes a re-export shim (ScenarioModel/InstalledMeasure stay for a later slice). Fix the export conftest to create SQLModel-first (so Base funding_package's FK to the now-SQLModel plan resolves) and skip the redundant drop_all on its function-scoped throwaway DB (the epc enum type is now shared across both metadatas). Resolves the pre-existing dual-definition collision: the rebuild and legacy export suites are now co-runnable. No behaviour change. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 21:00:14 +00:00
Khalim Conn-Kowlessar	b976c3abd2	feat(modelling): attribute per-measure bill savings via a telescoping cascade `_plan_for` now scores the baseline + every cumulative prefix once (`cascade_scores`, best-practice order) and reuses those Scores for both the role-3 marginal attribution and a per-measure bill cascade: bill each prefix at one Fuel Rates snapshot and take consecutive Bill deltas as each measure's marginal delivered-kWh and £ saving. Saving is signed (ventilation is negative) and telescopes exactly to the Plan headline savings, because the Plan's baseline/post Bills are now the same cascade endpoints (`bills[0]` / `bills[-1]`) — which also drops the redundant standalone baseline `calculate`. `recommendation.kwh_savings` / `energy_cost_savings` are filled from these. Adds `Bill.total_consumption_kwh` (shared by Plan + the orchestrator). Pinned end-to-end on the real calculator: Σ per-measure savings == the Plan totals (ADR-0014 amendment). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 18:01:11 +00:00
Khalim Conn-Kowlessar	198122d145	feat(modelling): derive + persist plan-level post-retrofit bills (#1152 follow-up) ModellingOrchestrator gains a constructor-injected FuelRatesRepository (mirrors Baseline): run() resolves get_current() once and reuses one BillDerivation across the batch. _plan_for prices the baseline and post-package end-states from the SapResults already on their Scores (no extra calculate) and passes the Bills to Plan. PlanRow mirror + from_domain gain the four live columns post_energy_bill / energy_bill_savings / post_energy_consumption / energy_consumption_savings. Pipeline/handler wire the fuel-rates repo. Integration tests assert the columns persist: the multi-measure (fallback) plan shows positive bill+consumption savings; the already-at-target zero-measure plan shows the current bill with exactly zero savings. Fuel-switch measures price at the new fuel for free (we bill the simulated end-state). 183 modelling/billing/orchestration/repo tests pass, pyright strict clean. Plan-level only; per-measure savings next. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 17:30:47 +00:00
Khalim Conn-Kowlessar	ced6287baa	refactor(billing): relocate Bill Derivation to domain/billing/ (cross-stage) Bill / EnergyBreakdown / BillDerivation / sap_fuel were under domain/property_baseline/ only because Baseline was built first. The Modelling stage now needs them too, so move them (and their tests) to a neutral domain/billing/ — Fuel/FuelRates already live in the shared domain/fuel_rates/. Avoids a modelling -> property_baseline cross-stage import and a package name that wrongly implies ownership (ADR-0011, ADR-0014 amendment). Pure git mv + import rewrite across 10 files; 40 billing/baseline/repo tests pass, pyright strict clean. CONTEXT.md Bill Derivation location updated. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 17:19:23 +00:00
Khalim Conn-Kowlessar	641c1bd7f6	test(modelling): pin least-cost-to-target end-to-end through the orchestrator The orchestrator already threads budget/target_sap/dependencies into optimise_package, so no orchestrator change was needed. Add an integration test proving the new objective end-to-end on the real calculator: a band-D property (~57.4) with a goal of band D — already met — yields a Plan with NO measures and zero cost (the old max-gain objective would have recommended wall+floor+vent, improving within the band it is already in). Clarified that the existing multi-measure test now exercises the max-gain fallback (goal C unreachable from D, tops out ~61). Narrowed Optional sap_points/estimated_cost through locals to keep pyright strict-clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 16:20:45 +00:00
Khalim Conn-Kowlessar	0fec069988	feat(modelling): wire the ventilation Measure Dependency into the orchestrator (#1161 ) ModellingOrchestrator builds the ventilation dependency per Property (suppressed when already mechanically ventilated) and passes it to optimise_package, so a selected wall measure forces MEV into the package before the re-score. Ventilation joins the role-3 cascade in best-practice order (walls -> roof -> ventilation -> floor) and persists as a Plan Measure carrying its real negative marginal and its cost. Added the mechanical_ventilation contingency rate (0.26, per legacy Costs.CONTINGENCIES). Integration test now seeds the ventilation Product and asserts the forced measure persists with <=0 SAP and 2x900 cost; the full-pipeline test seeds the Product too (the dependency is built for every not-yet-ventilated dwelling). On 000490 the real calculator scores MEV at -1.275 SAP. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 13:34:40 +00:00
Khalim Conn-Kowlessar	34d4748a3a	feat(modelling): wire the Optimiser into the orchestrator (#1160 ) Slice 3b — closes #1160. ModellingOrchestrator._plan_for now runs the full ADR-0016 flow instead of a single cavity measure: generate wall + roof + floor Recommendations → score each Option independently (role 1) into grouped ScoredOptions → optimise_package (grouped knapsack within budget + whole-package re-score + greedy repair toward the Scenario's SAP target) → attribute the selected set via the best-practice marginal cascade (role 3) → persist the Plan with its Plan Measures. The repair target comes from the goal: INCREASING_EPC → the goal_value band floor via Epc.sap_lower_bound(); other goals carry no SAP target yet (later slice). Best-practice order walls → roof → floor. Integration test: an uninsulated cavity wall + suspended floor (000490) driven directly through the Modelling stage off a repo-seeded EPC (the calculator fixture has no lodged recorded-performance fields, so Baseline can't run it) persists a Plan with two attributed, priced Plan Measures. The existing first-run test keeps full-pipeline coverage and now exercises real modelling (its sample EPC's uninsulated solid floor yields a floor measure). Replaces the single-measure cavity integration test (subsumed). 138 pass; pyright strict clean. Multi-phase remains descoped (ADR-0005); single-phase optimiser. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 13:07:14 +00:00
Khalim Conn-Kowlessar	c7e2aa3755	feat(modelling): ModellingOrchestrator persists a Plan end-to-end (#1157 ) Slice 4b — closes the #1157 tracer. ModellingOrchestrator.run(property_ids, scenario_ids, portfolio_id) now does real work in one Unit of Work, committed once (ADR-0011/0012/0016/0017): read Property (effective EPC) + Scenario via repos → recommend_cavity_wall → select its Option → PackageScorer.score (role-2 package total) + marginal_impacts (role-3 attribution) → build Plan/PlanMeasure → uow.plan.save → commit. - AraFirstRunPipeline / ModellingStage thread portfolio_id from the trigger body (one source of truth); handler builds the real orchestrator (unit_of_work + Sap10Calculator), dropping the Scenario/Materials stubs. - ScenarioRepository.get_many promoted to @abstractmethod now the bare-stub instantiations are gone. - New ara_first_run-style integration test: a property with an uninsulated cavity wall yields a persisted Plan + one cavity_wall_insulation Plan Measure (priced from the Product, figures present, linked by plan_id). Numeric SAP correctness is pinned separately in test_elmhurst_cascade_pins. - Existing pipeline integration test updated: seeds scenario 7 and runs the real Modelling stage (its already-insulated sample wall yields an empty package — no crash). 121 pass across repositories/modelling/orchestration/app; pyright strict clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 12:08:32 +00:00
Khalim Conn-Kowlessar	f179950519	feat(baseline): wire BillDerivation into the orchestrator and persist the Bill (ADR-0014) The PropertyBaselineOrchestrator now reads the current Fuel Rates snapshot once per batch, builds a BillDerivation, and prices each scored property's SapResult -> EnergyBreakdown into a Bill carried on PropertyBaselinePerformance (None only on the stub no-calculator path). The Bill is flattened onto nullable bill_* flat columns (per-section kwh+cost, standing charges, SEG credit, total) on the postgres table, with bill_total_annual_bill_gbp as the not-null discriminator on read-back. Section absent from the bill stays None, not 0. Updated all four orchestrator construction sites to inject the FuelRatesRepository port (handler + three test sites), and the FE migration doc to reflect the prefixed columns and that they are now populated. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 18:51:18 +00:00
Khalim Conn-Kowlessar	69995edec8	Merge branch 'main' of https://github.com/Hestia-Homes/Model into feature/per-cert-mapper-validation	2026-06-02 16:10:41 +00:00
Khalim Conn-Kowlessar	15da2d3970	feat(baseline): CalculatorRebaseliner — calculator goes load-bearing (ADR-0013 amend) Slice 5a: the promotion. Replaces StubRebaseliner in production and collapses the shadow runner into the rebaseliner (ADR-0013 amendment). - CalculatorRebaseliner runs Sap10Calculator on every Property: * sap_version < 10.2 -> Effective Performance IS the calculator output (band via Epc.from_sap_score, CO2 kg->t, PEUI rounded), reason "pre_sap10". * sap_version >= 10.2 -> Effective = lodged (API figures on-target), and the calculator only logs divergence (SAP>0.5, PEUI/CO2 1%) as a validation signal. * a calculator raise propagates -> batch aborts (ADR-0012); fix the cert at once. - Rebaseliner.rebaseline gains property_id (for the divergence log). - LoggingCalculatorShadow / the calculator_shadow seam removed from the orchestrator; its divergence-comparison logic now lives in the rebaseliner. - StubRebaseliner kept (signature updated) for orchestrator/repo unit tests. The SapResult->EnergyBreakdown adapter + BillDerivation wiring (to populate the bill block) follow once the appliances/cooking SapResult fields land. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 10:04:24 +00:00
Khalim Conn-Kowlessar	561e1b8b49	feat(baseline): run Sap10Calculator in shadow on Property Baseline (ADR-0013) Wire Sap10Calculator into PropertyBaselineOrchestrator as a non-load-bearing shadow runner. For each property it scores the Effective EPC beside the load-bearing Lodged/Effective write, catches any strict-raise -> log.error (never aborts the batch), and on success log.warning's divergence from Lodged: SAP \|continuous - lodged\| > 0.5; PEUI/CO2 > 1% relative (CO2 after kg->tonnes). Every line is tagged with sap_version so SAP-10.2 signal separates from older-spec drift (ADR-0010 Validation Cohort). Per ADR-0013, Calculated SAP10 Performance is not a persisted third value-set: effective = calculated in every baselining scenario, so the calculator IS the mechanism that produces Effective Performance (the Rebaseliner). It runs in shadow only while being hardened; when overrides/estimation land it is promoted to drive Effective and the failure posture flips to abort (ADR-0012, calculator now load-bearing). No table change. - ADR-0013 + CONTEXT (Calculated SAP10 Performance / Effective Performance / Rebaselining) record the decision. - CalculatorShadow port + LoggingCalculatorShadow + Calculator protocol. - FakeCalculatorShadow for orchestrator unit tests. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 08:01:47 +00:00
Khalim Conn-Kowlessar	1ea71a3acb	refactor(ara): rename FirstRunPipeline → AraFirstRunPipeline (PR #1139 review) Aligns the composition with its entry point (the `ara_first_run` lambda + `AraFirstRunTriggerBody`): clearer what the file does. - orchestration/first_run_pipeline.py → ara_first_run_pipeline.py - FirstRunPipeline → AraFirstRunPipeline; FirstRunCommand → AraFirstRunCommand - test files renamed to match Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-01 16:28:48 +00:00
Khalim Conn-Kowlessar	457d959b1f	refactor(property-baseline): rename baseline → property_baseline aggregate (PR #1139 review) Wholesale rename of the Baseline aggregate to PropertyBaseline for clarity / to disambiguate from baselines that appear elsewhere in Modelling. Scoped to this aggregate only — the distinct Rebaselining term (rebaseline_reason, StubRebaseliner, RebaselineNotImplemented) is deliberately untouched. - domain/baseline → domain/property_baseline; BaselinePerformance → PropertyBaselinePerformance. - repositories/baseline → repositories/property_baseline; BaselineRepository / BaselinePostgresRepository → PropertyBaseline*. - orchestration/baseline_orchestrator.py → property_baseline_orchestrator.py; BaselineOrchestrator → PropertyBaselineOrchestrator. BaselineStage → PropertyBaselineStage. - infrastructure/postgres: baseline_performance_table.py → property_baseline_performance_table.py; table `baseline_performance` → `property_baseline_performance`; Model renamed. - UnitOfWork attribute `.baseline` → `.property_baseline`. - Docs: ADR-0004 references + migration doc (renamed to property-baseline-performance-table.md) updated. CONTEXT.md glossary term ("Baseline Performance") left as-is pending a ubiquitous-language call (raised on the PR). 123 tests pass; pyright strict clean (only the unrelated pre-existing moto import errors remain). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-01 16:28:48 +00:00
Khalim Conn-Kowlessar	d2d008f5c5	perf(repos): bulk get_many / get_for_properties — batch reads, not N round-trips (#1138 ) Final slice of ADR-0012: collapse the per-property read round-trips a batch made (Baseline hydrated ~8 queries x 30 properties one at a time) into a handful of per-table IN queries. - EpcPostgresRepository: extracted a shared `_compose(rows)` from `get` (the windows + floor-dim fetches are now passed in, not fetched inline), so both `get` and the new `get_for_properties(property_ids)` build EpcPropertyData from pre-fetched rows. `get_for_properties` fetches each child table once (`WHERE epc_property_id IN ...`), groups in memory, and composes — load-whole per ADR-0002. - PropertyRepository.get_many(property_ids) -> Properties: one query for the property rows + one bulk EPC hydration, composed in input order. - BaselineOrchestrator / IngestionOrchestrator read the batch via get_many instead of N x get. - Ports + fakes gain the bulk methods. The #1129 round-trip fidelity test stays green (the compose extraction is behaviour-preserving). New tests: bulk hydration correctness + round-trips are constant w.r.t. batch size (one-per-table, proven by query count). 123 pass; pyright strict clean; AAA. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-01 16:28:48 +00:00
Khalim Conn-Kowlessar	7275850c9e	refactor(orchestration): wire stages onto the UnitOfWork; per-stage commit (#1138 ) Replaces the handler's whole-pipeline Session (one transaction across all three stages, connection pinned during Ingestion's external IO) with a Unit-of-Work per stage (ADR-0012, added here). Each stage runs its batch in one unit and commits once; any property raising aborts the batch and the subtask fails noisily. - BaselineOrchestrator(unit_of_work, rebaseliner): one unit for the batch, commit once. Raise on a pre-SAP10 property leaves the unit uncommitted. - IngestionOrchestrator(unit_of_work, epc_fetcher, geospatial_repo, solar_fetcher): fetch/write split — phase 1 fetches the whole batch (EPC / coords / solar) with NO unit open; phase 2 writes in one unit and commits. The connection is never held during external IO. Geospatial S3 repo stays injected (reference data, not transactional). - Handler: module-scoped engine (pool reused across warm invocations) + a UoW factory; whole-pipeline `with Session` gone. `build_first_run_pipeline` composes on the factory. Source clients still behind the raising seam. - ADR-0012 records the decision (per-stage boundary, all-or-nothing batch, idempotent re-run, fetch/write split, module-scoped engine). Modelling stub left untouched (no-op, no DB) per the ADR. Tests: orchestrators on a shared FakeUnitOfWork (assert persisted batch + exactly-once commit + no-commit-on-raise). New real-DB E2E integration test: real PostgresUnitOfWork, Ingestion writes the EPC → Baseline reads it back through the repo → re-run replaces, not duplicates (1 EPC row, 1 baseline row after two runs). 121 pass in tests/; pyright strict clean; AAA. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-01 16:28:48 +00:00
Khalim Conn-Kowlessar	61846665b1	feat(first-run): FirstRunPipeline E2E — Ingestion → Baseline → Modelling (#1136 ) Completes the First Run spine. Replaces the #1130 stub FirstRunPipeline with the real three-stage composition and wires it into the handler. - `FirstRunPipeline.run(command)` sequences Ingestion → Baseline → Modelling, threading only `property_ids` between stages (and `scenario_ids` into Modelling, off the command — never a prior stage's output). Stages are injected behind thin `IngestionStage` / `BaselineStage` / `ModellingStage` Protocols (the EpcFetcher/SolarFetcher idiom), so the handler owns wiring and tests substitute fakes (ADR-0011). - `ModellingOrchestrator` stub + `ScenarioRepository` / `MaterialsRepository` seam ports — `run(property_ids, scenario_ids)` reads through repos, does no scoring yet. Method shapes deferred to the Modelling per-service grills (Scenario / Scenario Phase / Snapshot / Optimised Package / Plans are rich — not pre-empted here). - Handler delegates to the real pipeline via `build_first_run_pipeline` (Postgres-backed repos off the session). The Ingestion source clients (EPC API / Google Solar / geospatial S3) are isolated behind one `_source_clients_from_env` seam that raises until the deploy/Terraform config settles — out of scope for this slice. Subtask complete/failed + CloudWatch URL still come from `@subtask_handler`. Integration test (the criterion's centrepiece): wires REAL Ingestion + REAL Baseline + stub Modelling through a shared fake EPC repo, with a repo-backed PropertyRepo composing the Property from that slice. Proves Baseline reads the very EPC Ingestion persisted — the through-repos hand-off, no in-memory coupling. Plus a composition test pinning stage order + only-property_ids threading. TDD, one test → one impl. pyright strict clean; AAA layout. 116 pass in the tests/ tree, no regressions. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-01 16:28:48 +00:00
Khalim Conn-Kowlessar	9f22b0aae8	feat(baseline): BaselineOrchestrator + BaselinePerformance aggregate (#1135 ) Stage 2 of First Run. Establishes each Property's Baseline Performance from persisted source data and writes it back — reads only from repos, never a Fetcher or HTTP (ADR-0003), so it is byte-identical whether Ingestion ran milliseconds ago or last week. Domain (`domain/baseline/`): - `Performance` VO — the four rated quantities: SAP / EPC Band / CO2 / Primary Energy Intensity. `lodged_performance(epc)` reads them off the EPC's recorded fields (PEUI = `energy_consumption_current`). - `BaselinePerformance` (ADR-0004) — the paired `lodged` + `effective` Performance + `rebaseline_reason`, plus the no-derivation part of the energy block (`space_heating_kwh` / `water_heating_kwh`, off the RHI, deterministic per ADR-0006). Both halves always populated. - `Rebaseliner` port + `StubRebaseliner`: the re-score-on-override seam (ADR-0011). SAP10 certs pass through (effective == lodged, reason "none"); a pre-SAP10 cert raises `RebaselineNotImplemented` rather than fabricating a plausible-but-wrong "none" — ML rebaselining is not wired yet. Mirrors the repo's strict-raise culture. Persistence: new `BaselineRepository` port + `BaselinePostgresRepository` + flat-column `baseline_performance` SQLModel (one row per Property). Per ADR-0004's amendment this is a standalone table, NOT columns on the retiring `property_details_epc`. Production migration is FE-owned (Drizzle) — docs/migrations/baseline-performance-table.md. Docs (grill-with-docs): corrected CONTEXT.md Lodged/Effective Performance to Primary Energy Intensity (the term collided with its own _Avoid_ entry under "heat demand") + fixed stale RHI field names; amended ADR-0004 Consequences for the standalone-table decision. Fuel split + bills (rest of EPC Energy Derivation) deferred to a follow-up — they need a Fuel Rates source (Ofgem-cap ETL) that does not exist yet. TDD, one test -> one impl: 7 tests (lodged read, rebaseliner pass-through + raise, orchestrator establish-and-persist + pre-SAP10 raise, Postgres round-trip + absent). pyright strict clean; AAA layout. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-01 16:28:48 +00:00
Khalim Conn-Kowlessar	a910ce9855	feat(ara): AraFirstRunTriggerBody + ara_first_run lambda skeleton (#1130 ) Stage-2 entry point for the First Run use case. Adds the `ara_first_run` Lambda package mirroring the `postcode_splitter` template, its typed trigger contract, and a stub `FirstRunPipeline`. - `AraFirstRunTriggerBody`: thin command of five fields — `task_id`, `sub_task_id` (UUID, lifecycle), `portfolio_id`, `property_ids`, `scenario_ids` (int business IDs). No `model_config` override, so Pydantic's default `extra="ignore"` lets the FastAPI backend add fields without breaking deployed lambdas. UPRNs / Scenario defs are deliberately off the event — read from source-of-truth tables. - Thin `handler.py`: validate-and-delegate only, via a named `dispatch_first_run` seam (testable without the Lambda runtime). Subtask status (in-progress/complete/failed) + CloudWatch log URL come for free from the existing `@subtask_handler()` decorator. - `FirstRunPipeline` (orchestration/) stub: `run(command)` receives the validated command. Declares a structural `FirstRunCommand` Protocol (the three business fields) that `AraFirstRunTriggerBody` satisfies, so orchestration needs no application-layer import — rhymes with the `EpcFetcher`/`SolarFetcher` Protocols on IngestionOrchestrator (ADR-0011). Full Ingestion→Baseline→Modelling composition lands in #1136. - Dockerfile / requirements.txt / local_handler/ mirror postcode_splitter. TDD: 7 new tests (trigger-body validation incl. forward-compat + id-types, pipeline seam, handler delegation). pyright strict clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-01 16:28:48 +00:00
Khalim Conn-Kowlessar	454456bf22	feat(ingestion): IngestionOrchestrator end-to-end (#1134 ) Stage 1 of the pipeline: per property, read its UPRN from the property row, fetch its EPC, resolve coordinates from the Geospatial reference repo, thread those into the Solar fetcher, and persist EPC + solar via repos. Fetchers never call each other — the orchestrator threads the coordinate (ADR-0011). Coordinates are reference data (deterministic from UPRN), resolved transiently to drive the solar fetch rather than persisted per-property. Depends on thin EpcFetcher/SolarFetcher Protocols (EpcClientService and GoogleSolarApiClient satisfy them structurally). Unit-tested against fakes — no DB, gov API, or network: persists EPC, threads coords into solar, skips UPRN-less properties and skips solar when coordinates are absent. pyright clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-01 16:28:48 +00:00
Khalim Conn-Kowlessar	305bffd284	refactor(ara): rename FirstRunPipeline → AraFirstRunPipeline (PR #1139 review) Aligns the composition with its entry point (the `ara_first_run` lambda + `AraFirstRunTriggerBody`): clearer what the file does. - orchestration/first_run_pipeline.py → ara_first_run_pipeline.py - FirstRunPipeline → AraFirstRunPipeline; FirstRunCommand → AraFirstRunCommand - test files renamed to match Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-01 15:00:33 +00:00
Khalim Conn-Kowlessar	c3691d9af2	refactor(property-baseline): rename baseline → property_baseline aggregate (PR #1139 review) Wholesale rename of the Baseline aggregate to PropertyBaseline for clarity / to disambiguate from baselines that appear elsewhere in Modelling. Scoped to this aggregate only — the distinct Rebaselining term (rebaseline_reason, StubRebaseliner, RebaselineNotImplemented) is deliberately untouched. - domain/baseline → domain/property_baseline; BaselinePerformance → PropertyBaselinePerformance. - repositories/baseline → repositories/property_baseline; BaselineRepository / BaselinePostgresRepository → PropertyBaseline*. - orchestration/baseline_orchestrator.py → property_baseline_orchestrator.py; BaselineOrchestrator → PropertyBaselineOrchestrator. BaselineStage → PropertyBaselineStage. - infrastructure/postgres: baseline_performance_table.py → property_baseline_performance_table.py; table `baseline_performance` → `property_baseline_performance`; Model renamed. - UnitOfWork attribute `.baseline` → `.property_baseline`. - Docs: ADR-0004 references + migration doc (renamed to property-baseline-performance-table.md) updated. CONTEXT.md glossary term ("Baseline Performance") left as-is pending a ubiquitous-language call (raised on the PR). 123 tests pass; pyright strict clean (only the unrelated pre-existing moto import errors remain). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-01 14:54:59 +00:00
Jun-te Kim	3845ac10b0	moved classifier data transformation to an easy one	2026-06-01 14:53:34 +00:00
Jun-te Kim	0febf0e6d5	classifier	2026-06-01 14:30:09 +00:00
Jun-te Kim	c9a9620527	pr review, move domain and orhcestration	2026-06-01 14:00:31 +00:00
Khalim Conn-Kowlessar	8685f8ba3a	perf(repos): bulk get_many / get_for_properties — batch reads, not N round-trips (#1138 ) Final slice of ADR-0012: collapse the per-property read round-trips a batch made (Baseline hydrated ~8 queries x 30 properties one at a time) into a handful of per-table IN queries. - EpcPostgresRepository: extracted a shared `_compose(rows)` from `get` (the windows + floor-dim fetches are now passed in, not fetched inline), so both `get` and the new `get_for_properties(property_ids)` build EpcPropertyData from pre-fetched rows. `get_for_properties` fetches each child table once (`WHERE epc_property_id IN ...`), groups in memory, and composes — load-whole per ADR-0002. - PropertyRepository.get_many(property_ids) -> Properties: one query for the property rows + one bulk EPC hydration, composed in input order. - BaselineOrchestrator / IngestionOrchestrator read the batch via get_many instead of N x get. - Ports + fakes gain the bulk methods. The #1129 round-trip fidelity test stays green (the compose extraction is behaviour-preserving). New tests: bulk hydration correctness + round-trips are constant w.r.t. batch size (one-per-table, proven by query count). 123 pass; pyright strict clean; AAA. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-31 10:33:24 +00:00
Khalim Conn-Kowlessar	48a488d1e9	refactor(orchestration): wire stages onto the UnitOfWork; per-stage commit (#1138 ) Replaces the handler's whole-pipeline Session (one transaction across all three stages, connection pinned during Ingestion's external IO) with a Unit-of-Work per stage (ADR-0012, added here). Each stage runs its batch in one unit and commits once; any property raising aborts the batch and the subtask fails noisily. - BaselineOrchestrator(unit_of_work, rebaseliner): one unit for the batch, commit once. Raise on a pre-SAP10 property leaves the unit uncommitted. - IngestionOrchestrator(unit_of_work, epc_fetcher, geospatial_repo, solar_fetcher): fetch/write split — phase 1 fetches the whole batch (EPC / coords / solar) with NO unit open; phase 2 writes in one unit and commits. The connection is never held during external IO. Geospatial S3 repo stays injected (reference data, not transactional). - Handler: module-scoped engine (pool reused across warm invocations) + a UoW factory; whole-pipeline `with Session` gone. `build_first_run_pipeline` composes on the factory. Source clients still behind the raising seam. - ADR-0012 records the decision (per-stage boundary, all-or-nothing batch, idempotent re-run, fetch/write split, module-scoped engine). Modelling stub left untouched (no-op, no DB) per the ADR. Tests: orchestrators on a shared FakeUnitOfWork (assert persisted batch + exactly-once commit + no-commit-on-raise). New real-DB E2E integration test: real PostgresUnitOfWork, Ingestion writes the EPC → Baseline reads it back through the repo → re-run replaces, not duplicates (1 EPC row, 1 baseline row after two runs). 121 pass in tests/; pyright strict clean; AAA. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-31 09:54:47 +00:00
Khalim Conn-Kowlessar	b77fe26892	feat(first-run): FirstRunPipeline E2E — Ingestion → Baseline → Modelling (#1136 ) Completes the First Run spine. Replaces the #1130 stub FirstRunPipeline with the real three-stage composition and wires it into the handler. - `FirstRunPipeline.run(command)` sequences Ingestion → Baseline → Modelling, threading only `property_ids` between stages (and `scenario_ids` into Modelling, off the command — never a prior stage's output). Stages are injected behind thin `IngestionStage` / `BaselineStage` / `ModellingStage` Protocols (the EpcFetcher/SolarFetcher idiom), so the handler owns wiring and tests substitute fakes (ADR-0011). - `ModellingOrchestrator` stub + `ScenarioRepository` / `MaterialsRepository` seam ports — `run(property_ids, scenario_ids)` reads through repos, does no scoring yet. Method shapes deferred to the Modelling per-service grills (Scenario / Scenario Phase / Snapshot / Optimised Package / Plans are rich — not pre-empted here). - Handler delegates to the real pipeline via `build_first_run_pipeline` (Postgres-backed repos off the session). The Ingestion source clients (EPC API / Google Solar / geospatial S3) are isolated behind one `_source_clients_from_env` seam that raises until the deploy/Terraform config settles — out of scope for this slice. Subtask complete/failed + CloudWatch URL still come from `@subtask_handler`. Integration test (the criterion's centrepiece): wires REAL Ingestion + REAL Baseline + stub Modelling through a shared fake EPC repo, with a repo-backed PropertyRepo composing the Property from that slice. Proves Baseline reads the very EPC Ingestion persisted — the through-repos hand-off, no in-memory coupling. Plus a composition test pinning stage order + only-property_ids threading. TDD, one test → one impl. pyright strict clean; AAA layout. 116 pass in the tests/ tree, no regressions. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-30 22:32:58 +00:00
Khalim Conn-Kowlessar	76717dfc3a	feat(baseline): BaselineOrchestrator + BaselinePerformance aggregate (#1135 ) Stage 2 of First Run. Establishes each Property's Baseline Performance from persisted source data and writes it back — reads only from repos, never a Fetcher or HTTP (ADR-0003), so it is byte-identical whether Ingestion ran milliseconds ago or last week. Domain (`domain/baseline/`): - `Performance` VO — the four rated quantities: SAP / EPC Band / CO2 / Primary Energy Intensity. `lodged_performance(epc)` reads them off the EPC's recorded fields (PEUI = `energy_consumption_current`). - `BaselinePerformance` (ADR-0004) — the paired `lodged` + `effective` Performance + `rebaseline_reason`, plus the no-derivation part of the energy block (`space_heating_kwh` / `water_heating_kwh`, off the RHI, deterministic per ADR-0006). Both halves always populated. - `Rebaseliner` port + `StubRebaseliner`: the re-score-on-override seam (ADR-0011). SAP10 certs pass through (effective == lodged, reason "none"); a pre-SAP10 cert raises `RebaselineNotImplemented` rather than fabricating a plausible-but-wrong "none" — ML rebaselining is not wired yet. Mirrors the repo's strict-raise culture. Persistence: new `BaselineRepository` port + `BaselinePostgresRepository` + flat-column `baseline_performance` SQLModel (one row per Property). Per ADR-0004's amendment this is a standalone table, NOT columns on the retiring `property_details_epc`. Production migration is FE-owned (Drizzle) — docs/migrations/baseline-performance-table.md. Docs (grill-with-docs): corrected CONTEXT.md Lodged/Effective Performance to Primary Energy Intensity (the term collided with its own _Avoid_ entry under "heat demand") + fixed stale RHI field names; amended ADR-0004 Consequences for the standalone-table decision. Fuel split + bills (rest of EPC Energy Derivation) deferred to a follow-up — they need a Fuel Rates source (Ofgem-cap ETL) that does not exist yet. 
TDD, one test -> one impl: 7 tests (lodged read, rebaseliner pass-through + raise, orchestrator establish-and-persist + pre-SAP10 raise, Postgres round-trip + absent). pyright strict clean; AAA layout. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-30 21:21:34 +00:00
Khalim Conn-Kowlessar	75fbba60fc	feat(ara): AraFirstRunTriggerBody + ara_first_run lambda skeleton (#1130 ) Stage-2 entry point for the First Run use case. Adds the `ara_first_run` Lambda package mirroring the `postcode_splitter` template, its typed trigger contract, and a stub `FirstRunPipeline`. - `AraFirstRunTriggerBody`: thin command of five fields — `task_id`, `sub_task_id` (UUID, lifecycle), `portfolio_id`, `property_ids`, `scenario_ids` (int business IDs). No `model_config` override, so Pydantic's default `extra="ignore"` lets the FastAPI backend add fields without breaking deployed lambdas. UPRNs / Scenario defs are deliberately off the event — read from source-of-truth tables. - Thin `handler.py`: validate-and-delegate only, via a named `dispatch_first_run` seam (testable without the Lambda runtime). Subtask status (in-progress/complete/failed) + CloudWatch log URL come for free from the existing `@subtask_handler()` decorator. - `FirstRunPipeline` (orchestration/) stub: `run(command)` receives the validated command. Declares a structural `FirstRunCommand` Protocol (the three business fields) that `AraFirstRunTriggerBody` satisfies, so orchestration needs no application-layer import — rhymes with the `EpcFetcher`/`SolarFetcher` Protocols on IngestionOrchestrator (ADR-0011). Full Ingestion→Baseline→Modelling composition lands in #1136. - Dockerfile / requirements.txt / local_handler/ mirror postcode_splitter. TDD: 7 new tests (trigger-body validation incl. forward-compat + id-types, pipeline seam, handler delegation). pyright strict clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-30 20:38:15 +00:00
Khalim Conn-Kowlessar	1696cccba6	feat(ingestion): IngestionOrchestrator end-to-end (#1134 ) Stage 1 of the pipeline: per property, read its UPRN from the property row, fetch its EPC, resolve coordinates from the Geospatial reference repo, thread those into the Solar fetcher, and persist EPC + solar via repos. Fetchers never call each other — the orchestrator threads the coordinate (ADR-0011). Coordinates are reference data (deterministic from UPRN), resolved transiently to drive the solar fetch rather than persisted per-property. Depends on thin EpcFetcher/SolarFetcher Protocols (EpcClientService and GoogleSolarApiClient satisfy them structurally). Unit-tested against fakes — no DB, gov API, or network: persists EPC, threads coords into solar, skips UPRN-less properties and skips solar when coordinates are absent. pyright clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-30 19:58:21 +00:00
Jun-te Kim	3e30b4af40	tests wrong environemnt	2026-05-29 16:17:06 +00:00
Jun-te Kim	8422041215	landlord overrid orchestration	2026-05-26 15:27:45 +00:00
Jun-te Kim	a747534f37	refactored to allow multiple column types	2026-05-22 15:28:26 +00:00
Jun-te Kim	675aa089c9	updated rdsap option; seperated s3 location in infrastrucutre; added open ai api	2026-05-22 14:00:33 +00:00
Jun-te Kim	61efcad27b	standardist Address	2026-05-22 10:13:32 +00:00
Jun-te Kim	0dee917094	unsanistiesed address list instead of raw address lit	2026-05-22 08:27:59 +00:00
Jun-te Kim	91bb4b6571	address list	2026-05-22 08:22:13 +00:00
Jun-te Kim	84098e28ff	raw address list repo	2026-05-22 08:17:37 +00:00
Jun-te Kim	cf14a4e3aa	rename to SAL and AssetList and RawAddresses	2026-05-22 08:14:46 +00:00
Jun-te Kim	acb306f7b9	asset list from landlord	2026-05-22 07:34:50 +00:00
Jun-te Kim	94cbf5f516	changed useraddress landlordasset list	2026-05-21 16:59:57 +00:00
Jun-te Kim	8baa4c82aa	save correct progress	2026-05-21 16:57:14 +00:00

57 commits