mirror of
https://github.com/Hestia-Homes/assessment-model.git
synced 2026-06-08 11:37:25 +00:00
# Context

This document captures the domain language used in this project. Terms here are the **canonical** ones: when more than one word exists for a concept, we pick one and treat the others as aliases to avoid.

This file grows as terms are resolved during design conversations. Concepts that haven't been examined yet are not listed.

## Language

### Bulk upload

**BulkUpload**:
A user-supplied spreadsheet of addresses for a Portfolio, transformed and matched to UPRNs before being inserted as Properties. Has an explicit lifecycle from upload through finalisation.
_Avoid_: import, batch, file upload, ingest

**ColumnMapping**:
The user's declaration of which spreadsheet column means what (e.g. column "Property Address" means `address_1`). Stored as JSON on the BulkUpload row.
_Avoid_: schema, header map, field mapping

**UPRN**:
Unique Property Reference Number, the UK national identifier for an address. Address matching attaches a UPRN to each row where possible.

**Address matching**:
The pipeline stage that splits the source file by postcode, looks up UPRNs, and produces matched-address output. Triggered via FastAPI.
_Avoid_: postcode lookup, address resolution, address lookup

**Combiner**:
The pipeline stage that aggregates the per-postcode address-matching outputs into a single combined CSV in S3, ready for review.
_Avoid_: aggregator, merger

**Finalise**:
The terminal action that reads the combiner output, inserts rows as Properties on the Portfolio, and decides whether the BulkUpload needs further review.
_Avoid_: import, commit, ingest

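As an illustration of the **ColumnMapping** entry above, here is a minimal sketch of what the stored JSON might look like and how it could be applied to one spreadsheet row. Only the `"Property Address"` to `address_1` pair comes from this glossary; the `"Post Code"` entry, the header-to-field direction of the mapping, and the `apply_column_mapping` helper are assumptions made for the example.

```python
import json

# Hypothetical ColumnMapping payload: spreadsheet header -> canonical field name.
# Only "Property Address" -> "address_1" is taken from the glossary; the
# "Post Code" entry is an illustrative assumption.
column_mapping_json = '{"Property Address": "address_1", "Post Code": "postcode"}'


def apply_column_mapping(row: dict, mapping: dict) -> dict:
    """Rename mapped spreadsheet columns to canonical fields; drop unmapped ones."""
    return {mapping[header]: value for header, value in row.items() if header in mapping}


column_mapping = json.loads(column_mapping_json)
source_row = {"Property Address": "1 Example Street", "Post Code": "AB1 2CD", "Notes": "n/a"}
mapped_row = apply_column_mapping(source_row, column_mapping)
print(mapped_row)
```

The unmapped `Notes` column is dropped in this sketch; whether the real pipeline keeps or discards unmapped columns is not stated here.
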
### Landlord overrides

**Landlord**:
The housing association supplying a Portfolio's BulkUploads. A Landlord knows facts about their properties that EPC data doesn't (e.g. that a cavity has been filled), and those facts take precedence when computing an assessment.
_Avoid_: customer, client, owner, organisation (Organisation is a separate, broader entity)

**Landlord override**:
A landlord-supplied fact about a property that takes precedence over EPC-derived defaults when computing an assessment. The end-to-end Landlord override journey has two layers: a **VocabularyMapping** layer (see the glossary entry below) and a per-Property fact layer (not yet modelled).
_Avoid_: customer data, manual override, landlord data

**VocabularyMapping**:
The translation from a Landlord's free-text description in a BulkUpload column (e.g. `"cavity: filledcavity"`) to a canonical domain enum value (e.g. `WallType.CAVITY`). Produced by a `ColumnClassifier` (today an LLM, tomorrow possibly a lookup table or rules engine) in the Model service. Stored per-Portfolio, one row per `(category, description)`. A row carries provenance (`classifier` or `user`) so user overrides survive re-classification.
_Avoid_: column mapping (that's a separate concept; see `ColumnMapping` above), classification, dictionary

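The provenance rule in the **VocabularyMapping** entry can be sketched as follows. This is an in-memory illustration under stated assumptions, not the project's storage code: the `"wall_type"` category name, the `WallType.SOLID` member, and the two `record_*` helpers are hypothetical; only `WallType.CAVITY`, the `(category, description)` key, and the `classifier`/`user` provenance values come from this glossary.

```python
from dataclasses import dataclass
from enum import Enum


class WallType(Enum):
    CAVITY = "cavity"
    SOLID = "solid"  # hypothetical second member, for illustration only


@dataclass
class VocabularyMapping:
    category: str
    description: str
    value: WallType
    provenance: str  # "classifier" or "user"


def record_classification(store, category, description, value):
    """Write a classifier result unless a user-provided row already exists."""
    key = (category, description)
    existing = store.get(key)
    if existing is not None and existing.provenance == "user":
        return existing  # user overrides survive re-classification
    store[key] = VocabularyMapping(category, description, value, "classifier")
    return store[key]


def record_user_override(store, category, description, value):
    """A user correction always replaces the row and is marked as such."""
    store[(category, description)] = VocabularyMapping(category, description, value, "user")
    return store[(category, description)]


# One Portfolio's mappings, keyed by (category, description).
store = {}
record_classification(store, "wall_type", "cavity: filledcavity", WallType.SOLID)
record_user_override(store, "wall_type", "cavity: filledcavity", WallType.CAVITY)
record_classification(store, "wall_type", "cavity: filledcavity", WallType.SOLID)
```

After the second classifier run the row still holds the user's value, which is the behaviour the provenance field exists to protect.
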
## Lifecycle

A **BulkUpload** moves through these statuses:

```
ready_for_processing
  → mapping_complete   (user submits ColumnMapping; Next.js writes)
  → processing         (Address matching triggered; Next.js writes)
  → combining          (Combiner stage running; FastAPI writes directly)
  → awaiting_review    (Combiner output in S3; FastAPI writes directly)
  → complete           (Finalise succeeded; Next.js writes)
  → failed             (FastAPI reports in-flight failure; schema only, not yet wired)
```

`complete` and `failed` are terminal.

Re-mapping (PATCHing `columnMapping`) is legal only in `ready_for_processing` and `mapping_complete`. Any later state rejects with 409.

**Two writers**: Next.js owns transitions out of `mapping_complete`, into `processing`, and the terminal Finalise outcomes. FastAPI owns `combining` and `awaiting_review`, writing them directly to the DB during the combiner run. The BulkUpload aggregate observes both.

See [ADR-0001](./docs/adr/0001-bulk-upload-state-machine.md) for the deliberate "not yet" decisions baked into this lifecycle.

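The lifecycle rules above can be sketched as a small state machine. This is an illustration under assumptions rather than the project's implementation: the names `Status`, `advance`, and `remap_status_code`, the writer labels `"nextjs"` and `"fastapi"`, and the `200` success code are all hypothetical. `failed` is left out of the transition table because it is described as schema only and not yet wired.

```python
from enum import Enum


class Status(str, Enum):
    READY_FOR_PROCESSING = "ready_for_processing"
    MAPPING_COMPLETE = "mapping_complete"
    PROCESSING = "processing"
    COMBINING = "combining"
    AWAITING_REVIEW = "awaiting_review"
    COMPLETE = "complete"
    FAILED = "failed"  # terminal, but not wired into any transition yet


# Happy-path order, in the sequence the lifecycle listing gives.
NEXT = {
    Status.READY_FOR_PROCESSING: Status.MAPPING_COMPLETE,
    Status.MAPPING_COMPLETE: Status.PROCESSING,
    Status.PROCESSING: Status.COMBINING,
    Status.COMBINING: Status.AWAITING_REVIEW,
    Status.AWAITING_REVIEW: Status.COMPLETE,
}

# Which service writes each target status (labels are assumptions).
WRITER = {
    Status.MAPPING_COMPLETE: "nextjs",
    Status.PROCESSING: "nextjs",
    Status.COMBINING: "fastapi",
    Status.AWAITING_REVIEW: "fastapi",
    Status.COMPLETE: "nextjs",
}

TERMINAL = {Status.COMPLETE, Status.FAILED}
REMAPPABLE = {Status.READY_FOR_PROCESSING, Status.MAPPING_COMPLETE}


class IllegalTransition(Exception):
    pass


def advance(current: Status, writer: str) -> Status:
    """Return the next status, or raise if the state is terminal or the writer is wrong."""
    if current in TERMINAL:
        raise IllegalTransition(f"{current.value} is terminal")
    target = NEXT[current]
    if WRITER[target] != writer:
        raise IllegalTransition(f"{writer} does not write {target.value}")
    return target


def remap_status_code(current: Status) -> int:
    """HTTP status for a PATCH of columnMapping: 409 outside the two early states."""
    return 200 if current in REMAPPABLE else 409
```

For example, `advance(Status.PROCESSING, "fastapi")` yields `Status.COMBINING`, while `remap_status_code(Status.COMBINING)` yields `409`.
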
## Relationships

- A **Portfolio** has many **BulkUploads**.
- A **BulkUpload** produces zero or more **Properties** when finalised.
- A **BulkUpload** has at most one **Task** (the orchestration handle for the FastAPI pipeline run); a Task has many **SubTasks** (one per pipeline stage: address matching, combiner).
- A **Portfolio** has many **VocabularyMappings**: one row per `(category, description)` it has ever encountered across all its BulkUploads. See [ADR-0002](./docs/adr/0002-landlord-override-vocabulary.md).

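The cardinalities listed above can be written down as plain data classes. This is a sketch for orientation only: the field names, the string-valued `stage`, and the simplified `vocabulary_mappings` dictionary are assumptions, and the real schema may differ.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SubTask:
    stage: str  # assumed labels: "address_matching" or "combiner"


@dataclass
class Task:
    sub_tasks: list = field(default_factory=list)  # one SubTask per pipeline stage


@dataclass
class Property:
    address_1: str
    uprn: Optional[str] = None  # may be absent when address matching found nothing


@dataclass
class BulkUpload:
    task: Optional[Task] = None  # at most one Task
    properties: list = field(default_factory=list)  # zero or more, filled on Finalise


@dataclass
class Portfolio:
    bulk_uploads: list = field(default_factory=list)
    # Keyed by (category, description); the value is simplified to a string here.
    vocabulary_mappings: dict = field(default_factory=dict)


task = Task(sub_tasks=[SubTask("address_matching"), SubTask("combiner")])
upload = BulkUpload(task=task)
portfolio = Portfolio(bulk_uploads=[upload])
```
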
## Example dialogue

> **Dev:** "If the **Combiner** finishes but the user hasn't clicked Finalise, what does the user see?"
> **Domain expert:** "The BulkUpload sits in `awaiting_review`. The frontend polls and shows a 'review and confirm' button. Nothing's been written to **Properties** yet."
>
> **Dev:** "And if **Finalise** runs and 30% of rows have no **UPRN**?"
> **Domain expert:** "Those still get imported as **Properties**, just without a UPRN, and the BulkUpload moves to `complete`. Manual cleanup happens later in the property table."

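The second exchange above can be sketched in code: rows without a UPRN are still turned into Properties, and the BulkUpload ends in `complete`. The CSV column names (`address_1`, `uprn`), the `finalise` function, and the dictionary shape of a Property are assumptions for illustration; the sketch also ignores the "needs further review" decision that Finalise makes, since its rule is not described here.

```python
import csv
import io


def finalise(combined_csv: str):
    """Turn every combined row into a Property dict; a missing UPRN becomes None."""
    properties = []
    for row in csv.DictReader(io.StringIO(combined_csv)):
        properties.append({"address_1": row["address_1"], "uprn": row["uprn"] or None})
    # Rows lacking a UPRN do not block completion in this sketch.
    return properties, "complete"


# Hypothetical combiner output: the second row has no UPRN.
combined = "address_1,uprn\n1 Example Street,100000000001\n2 Example Street,\n"
properties, status = finalise(combined)
```
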
## Flagged ambiguities

- "Upload" is used in the codebase to mean both the file-on-S3 and the BulkUpload row. We standardise on **BulkUpload** for the row; the file is just "the source file."
- "Onboarding" appears in some route paths (`bulk_onboarding_inputs/...`) but isn't part of this glossary; we use **BulkUpload** end-to-end.