mirror of
https://github.com/Hestia-Homes/Model.git
synced 2026-06-08 11:17:27 +00:00
Records the grill-with-docs outcomes for the ara_first_run rebuild: three composable stage orchestrators (Ingestion/Baseline/Modelling), one lambda per use case chaining them through repos (not in-memory), and the Fetcher-vs-Repo data-source taxonomy. Amends ADR-0003's chaining rule to generalise beyond RefreshOrchestrator. Adds the pipeline-composition + First Run vocabulary to CONTEXT.md. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
16 lines
1.9 KiB
Markdown
16 lines
1.9 KiB
Markdown
# Strict separation between Ingestion and Modelling
|
|
|
|
**Status: Accepted, refined by [ADR-0011](0011-composable-stage-orchestrators.md).** The one-way flow below stands. ADR-0011 generalises the chaining rule: it is no longer "only a `RefreshOrchestrator` may chain" — it is *"only a top-level use-case pipeline orchestrator (e.g. `FirstRunPipeline`) may chain across the Ingestion→Modelling seam; the stage orchestrators communicate through repos and never call across it."*
|
|
|
|
|
|
Data flows one way only: **Ingestion → Repos → Modelling**. Modelling services never make external HTTP calls; Ingestion services never run business logic. If Modelling needs fresh data, it sees a stale record in a repo and returns; the caller (a refresh orchestrator or the FE) decides whether to ingest first. We considered allowing modelling services to call fetchers directly on cache miss — convenient — and rejected it.
|
|
|
|
The trade-off is that modelling cannot "self-heal" by going to the gov EPC API when it finds stale data. The benefit is that modelling becomes a deterministic function of repository state: same Property in the repos, same modelling output. That is the property that makes modelling unit-testable against fakes (no DB, no network, no ML lambda), reproducible, and debuggable. It also enables a per-property UI flow where fetched data is shown to the user for review and possible override **before** modelling runs.
|
|
|
|
Under the rushed timeline this constraint is more valuable, not less. Mixing fetchers into services is the easy thing to do when shipping fast; once it's done it's hard to extract.
|
|
|
|
## Consequences
|
|
|
|
- Every modelling service depends only on Repos (and other Services / domain logic). No HTTP libraries in the modelling import graph.
|
|
- A `RefreshOrchestrator` is the only thing that calls Ingestion then Modelling in sequence; nothing else may.
|
|
- "Modelling is stale, refetch in-line" is a forbidden pattern — surface staleness, do not silently repair it.
|