mirror of
https://github.com/Hestia-Homes/Model.git
synced 2026-06-08 11:17:27 +00:00
79 lines
7.4 KiB
Markdown
79 lines
7.4 KiB
Markdown
# ADR-0003: Python writes landlord overrides directly to Postgres
|
|
|
|
**Status:** Accepted
|
|
**Date:** 2026-05-26
|
|
**Supersedes (in part):** [assessment-model/docs/adr/0002-landlord-override-vocabulary.md](https://github.com/.../assessment-model/blob/main/docs/adr/0002-landlord-override-vocabulary.md) — specifically the clause beginning *"Writes happen from Next.js …"*.
|
|
|
|
## Context
|
|
|
|
ADR-0002 (in the `assessment-model` TS repo) defined the `landlord_property_type_overrides` and `landlord_wall_type_overrides` tables and noted that the Model service would POST classification results to a Next.js route handler, with Next.js performing the upsert. Drizzle remained the schema source of truth.
|
|
|
|
That extra hop has not been built and is now judged unnecessary for the present scope:
|
|
|
|
- The classification result is internal — a Lambda computes it, the same Lambda persists it. No third party needs to participate in the write.
|
|
- Drizzle remains the schema's source of truth either way: the Python adapter mirrors the schema in a SQLModel row, but the migrations stay with Drizzle. Adding a Next.js route would not change which side owns schema definition.
|
|
- The Python lambda already lives next to a Postgres connection in the existing pipeline (`subtask`/`tasks` tables are written from Python today). Adding two more tables to that adapter surface is a small, well-understood change. Routing the same writes through Next.js would mean: lambda → JSON-over-HTTP → Next.js route → Drizzle → Postgres, instead of lambda → SQLAlchemy → Postgres. Three extra moving parts to ship, deploy, monitor, and authenticate for no behavioural gain.
|
|
|
|
## Decision
|
|
|
|
The Model service (specifically `applications/landlord_description_overrides/handler.py`) writes directly to `landlord_property_type_overrides` and `landlord_wall_type_overrides` via a SQLAlchemy-backed `LandlordOverrideRepository[E]` adapter. No Next.js route handler is required.
|
|
|
|
Transaction boundaries live in `infrastructure/postgres/engine.transactional_session` — a context manager that commits on clean exit and rolls back on exception. The application layer (`handler.py`) never calls `.commit()` or `.rollback()` itself; it only opens the context. Orchestration and repository code likewise never commits — keeping transaction semantics confined to one infrastructure helper.
|
|
|
|
The conflict policy lives in SQL and is identical for every override category. A single generic adapter, `LandlordOverridesRepository[E]`, implements it once; the target table is selected by the SQLModel `…Row` class passed at construction. Each category (property / built-form / wall / roof type) is that same adapter parameterised by its row class:
|
|
|
|
```sql
|
|
INSERT INTO landlord_property_type_overrides (portfolio_id, description, value, source)
|
|
VALUES …
|
|
ON CONFLICT (portfolio_id, description)
|
|
DO UPDATE SET value = EXCLUDED.value,
|
|
source = EXCLUDED.source,
|
|
updated_at = now()
|
|
WHERE landlord_property_type_overrides.source = 'classifier';
|
|
```
|
|
|
|
The `WHERE existing.source = 'classifier'` guard is load-bearing: it lets the classifier refresh its own past output while leaving `source = 'user'` rows untouched. This is the contract ADR-0002's `source` column was added for.
|
|
|
|
`UNKNOWN` values are persisted, not skipped — consistent with ADR-0002 §5. A future user override can upgrade them.
|
|
|
|
## Consequences
|
|
|
|
**Positive.**
|
|
|
|
- One fewer service to deploy, monitor, and authenticate.
|
|
- The classifier and persistence live in the same process — failures surface against a single `sub_task` row, not split across two systems.
|
|
- The Postgres adapter mirrors the existing `subtask`/`tasks` repositories, so reviewers have a precedent to compare against.
|
|
|
|
**Negative.**
|
|
|
|
- The Python repo now holds two schemas — the schema-source-of-truth Drizzle definition lives in the TS repo, and the Python `SQLModel` row class shadows it. They must stay in lockstep. Mitigations: the TS schema header comment (`landlord_overrides.ts:12`) already names the Python source-of-truth file; a future ADR may add a CI check that diffs the two.
|
|
- The boundary that ADR-0002 anticipated for pgEnum validation (a Next.js route validating incoming values before insert) is gone. Pydantic + the Python `Enum` type catch invalid values on the producing side, and Postgres's pgEnum will reject anything that slips through.
|
|
|
|
## File layout
|
|
|
|
This ADR also fixes a placement convention for Postgres adapters going forward. The codebase currently has the ChatGPT classifier split cleanly along DDD lines — port in `domain/`, adapter in `infrastructure/chatgpt/` — but the `tasks` Postgres adapter does not follow the same shape: its concrete class lives in `repositories/tasks/`, not `infrastructure/postgres/`.
|
|
|
|
The convention going forward separates the persistence *behaviour* (grouped by aggregate) from the schema *mirrors* (grouped by technology, since they share pgEnums and engine metadata):
|
|
|
|
- **Port (protocol / abstract base):** `repositories/<aggregate>/<thing>_repository.py`
|
|
- **Postgres repository adapter (concrete):** `infrastructure/<aggregate>/<aggregate>_postgres_repository.py`
|
|
- **SQLModel row class (`table=True` schema mirror):** `infrastructure/postgres/<thing>_table.py`
|
|
|
|
The `LandlordOverridesRepository` adapter follows this convention: the concrete class — the aggregate's "talker" to Postgres — lives at `infrastructure/landlord_overrides/landlord_overrides_postgres_repository.py`, while the per-category `…Row` classes stay in `infrastructure/postgres/`. The `…Row` classes are one-per-table — each mirrors a genuinely distinct Drizzle table and `value` pgEnum, and they share the single `override_source` pgEnum instance, so they belong together in the Postgres technology bucket as schema mirrors, not duplicated logic.
|
|
|
|
(This refines the placement first sketched in this ADR, which put the adapter in `infrastructure/postgres/` alongside the row classes. The adapter holds no schema — only the write path — so it groups by aggregate; only the `table=True` mirrors stay tech-bucketed.)
|
|
|
|
**Existing outliers to relocate in a follow-up:**
|
|
|
|
- `repositories/tasks/task_postgres_repository.py` → `infrastructure/tasks/task_postgres_repository.py`
|
|
- `repositories/tasks/subtask_postgres_repository.py` → `infrastructure/tasks/subtask_postgres_repository.py`
|
|
|
|
(Their `task_table.py` / `subtask_table.py` schema mirrors already sit correctly in `infrastructure/postgres/`.) Both moves are mechanical (import-path updates only). They are intentionally out of scope for the present PR.
|
|
|
|
## Out of scope (deferred to follow-up work)
|
|
|
|
- Relocating `task_postgres_repository.py` and `subtask_postgres_repository.py` into `infrastructure/tasks/` per the convention above.
|
|
- ~~Extracting a shared upsert helper / base class once a third `landlord_*_overrides` column lands — until then the per-category adapters' 95%-identical bodies are kept side-by-side for direct comparison.~~ **Done.** The per-category adapter bodies were byte-identical (varying only in their row class), so they were consolidated into one generic `LandlordOverridesRepository[E]` parameterised by row class rather than waiting for a third column.
|
|
- Switching `applications/landlord_description_overrides/handler.py` to acquire its `Session` via a `@subtask_handler()`-style decorator instead of building its own engine.
|
|
- A cross-repo PR amending ADR-0002 to point at this ADR.
|
|
- A CI check (or codegen) that diffs the Drizzle pgEnum literals against the Python `Enum.value` strings.
|