diff --git a/docs/wip/landlord-override-frontend-plan.md b/docs/wip/landlord-override-frontend-plan.md index a26deb4b..9cf4a362 100644 --- a/docs/wip/landlord-override-frontend-plan.md +++ b/docs/wip/landlord-override-frontend-plan.md @@ -1,13 +1,14 @@ # Landlord override frontend — in-flight design notes -**Status:** Paused mid-grilling (2026-05-27) -**Branch:** `feautre/additional_walltypes` +**Status:** Grilling complete (2026-05-28) — Q1–Q7 resolved; ready to promote to ADR +**Branch:** `feature/frontend_landlord_overrides` **Author:** Jun-te (with Claude, via `/grill-me`) -This is a *design-in-progress* document, not an ADR. It captures decisions made -so far on the landlord-override frontend plan so the conversation can resume -without re-litigating settled questions. Promote to an ADR once the trigger -mechanism (Q4) is resolved — that's the decision worth permanent recording. +This is a _design-in-progress_ document, not an ADR. It captures the decisions +reached during grilling so the conversation can resume without re-litigating +settled questions. All seven questions are now resolved; the trigger + +state-machine integration (Q4) is ready to promote to a frontend ADR-0003, and +the work is ready to break into issues. ## Goal @@ -23,14 +24,22 @@ Build the front-end e2e for `landlord_description_override`, starting from the rationale in [ADR-0002](../adr/0002-landlord-override-vocabulary.md). - Nothing in Next.js reads or writes them yet. - The Python lambda at - `/workspaces/home/github/Model/applications/landlord_description_overrides/handler.py` - is **not deployed** and **not wired** into the BulkUpload pipeline. It - hardcodes its trigger params (`portfolio_id`, `s3_uri`) and its source column - names (`"Property Type"`, `"Walls"`, `"Roofs"`). -- Note: ADR-0002 says writes come from Next.js POST, but the current backend - writes direct to Postgres. This drift may need to be revisited under Q4. + `Model/applications/landlord_description_overrides/handler.py` (branch + `feature/landlord_data`) is **not deployed** and **not wired** into the + BulkUpload pipeline. It hardcodes the trigger fields (handler.py:55-60) and + builds a hardcoded `ClassifiableColumn` list with `source_column`s + `"Property Type"` (→ both `property_type` **and** `built_form_type`), + `"Walls"`, `"Roofs"`. It also caps the batch to 20 rows while under test. +- **Resolved drift (was a Q4 risk):** the backend's + [ADR-0003 (python-writes-landlord-overrides-directly)](https://github.com/Hestia-Homes/Model/blob/main/docs/adr/0003-python-writes-landlord-overrides-directly.md), + accepted 2026-05-26, **supersedes** ADR-0002's "writes happen from Next.js" + clause. The lambda computes **and** persists the classification directly to + Postgres via a SQLAlchemy `LandlordOverrideRepository[E]`, with an + `ON CONFLICT … WHERE source = 'classifier'` upsert. There is **no** Next.js + POST-back. Drizzle stays schema source-of-truth; the Python `SQLModel` + shadows it. -## Decided so far +## Decided ### Q1 — Scope @@ -41,8 +50,7 @@ to the lambda → lambda needs edits to work when deployed. ### Q2 — Categories -**All four classifier categories** get independent optional slots in -[`INTERNAL_FIELDS`](../../src/app/portfolio/[slug]/(portfolio)/bulk-upload/[uploadId]/map-columns/MapColumnsClient.tsx#L14-L21): +**All four classifier categories** are independent optional fields: `property_type`, `built_form_type`, `wall_type`, `roof_type`. Rejected alternatives: (a) start with only PT+BF — wasted plumbing churn for @@ -50,58 +58,118 @@ the same migration cost; (c) collapse PT+BF into one UI slot — bakes a backend coincidence (they read the same CSV column today) into the user-facing model. -### Q2.1 — No `autoDetect` for the new slots +### Q2.1 — No `autoDetect` for classifier fields -The four new slots default to `"skip"`. The user must explicitly map them. -`autoDetect()` regex patterns are for required address-ish fields only. +The four classifier fields default to unmapped ("Not provided"). The user must +explicitly map them. `autoDetect()` is for required address-ish fields only. **Why:** Address headers are unambiguous and required, so guessing is safe and -useful. Landlord-description columns are ambiguous (a "type" column could be -PropertyType or BuiltFormType or something else) and they are optional — +useful. Landlord-description columns are ambiguous and optional — auto-detecting them would silently opt the landlord into classifier runs they didn't intend. -## Open — resume here +### Q2.2 — Mapping shape: unified `field → header` (refines Q2's mechanism) -### Q3 (in flight) — Uniqueness validation on the mapping +The mapping is **one unified map keyed by internal field, valued by source CSV +header** (`{ address_1: "Addr 1", property_type: "Property Type", +built_form_type: "Property Type", wall_type: "Walls", … }`), replacing the +current `header → field` shape +([MapColumnsClient.tsx:62-64](<../../src/app/portfolio/[slug]/(portfolio)/bulk-upload/[uploadId]/map-columns/MapColumnsClient.tsx#L62-L64>)). -Today validation only checks required fields exist -([MapColumnsClient.tsx:67-68](../../src/app/portfolio/[slug]/(portfolio)/bulk-upload/[uploadId]/map-columns/MapColumnsClient.tsx#L67-L68)); -two CSV headers can both map to `address_1` silently. +**Why:** the backend's `ClassifiableColumn` is a `(name, source_column)` pair +where **one `source_column` feeds many `name`s** — `"Property Type"` feeds both +`property_type` and `built_form_type` +([classifiable_column.py:28-31](https://github.com/Hestia-Homes/Model/blob/feature/landlord_data/orchestration/classifiable_column.py)). +A `header → field` map (one value per header) **cannot express** that; the +chosen mechanism in Q2 (extending the single `INTERNAL_FIELDS` dropdown) would +leave `built_form_type` unmappable. `field → header` matches the backend model +1:1 and lets multiple categories share a header. -- (a) Leave it alone — backend last-wins. -- (b) Enforce uniqueness only on the four new slots. -- (c) Enforce uniqueness everywhere except `skip`. **Recommended.** +**Consequences:** UI inverts to one row per internal field with a header-picker; +`autoDetect` inverts to "best header per field"; "skip" becomes a per-field +"Not provided"; rewrite +[`transformFile`](<../../src/app/api/portfolio/[portfolioId]/bulk-uploads/[uploadId]/start-address-matching/route.ts#L17-L54>) +(currently iterates `header → field`), the `z.record` route schema, and +**migrate existing persisted `columnMapping` rows** (semantics flip) — see Q5. -### Q4 (queued — biggest) — Trigger mechanism +### Q3 — Sharing / uniqueness rule (reframed) -How does Next.js invoke the lambda once mapping is complete? SQS message? -Direct lambda invoke? HTTP endpoint? And what's the state-machine integration — -new `BulkUpload` status, or runs orthogonally to address matching? +A source header may feed **at most one address/reference field**, but **any +number of classifier fields** (PT + BF → `"Property Type"` is required, so +classifier sharing must be allowed). Required-field validation stays: +`address_1` and `postcode` must each be assigned a header. -This drives both the deployment work and the lambda edits. Likely worth its -own ADR once decided. +This **replaces** the original "enforce uniqueness everywhere except skip" +framing, which was wrong once classifier header-sharing became a requirement. -### Q5 (queued) — Persistence of the extended mapping +### Q4 — Trigger mechanism + state-machine integration (the big one) -Current `bulkAddressUploads.columnMapping` is a `Record` and -naturally accommodates the new slots. Confirm no separate table is needed. +- **Transport:** reuse + [`triggerFastApiPipeline`](../../src/lib/bulkUpload/server.ts) with a **new + FastAPI endpoint**; payload `{ task_id, sub_task_id, s3_uri, portfolio_id, + column_mapping }`. FastAPI turns the POST into the SQS subtask envelope the + handler TODO (handler.py:51-54) references. Not a direct lambda invoke, not + Next.js → SQS. +- **`s3_uri` is the ORIGINAL upload** (`upload.s3Bucket/s3Key`), **not** the + address-matching transformed CSV — the description columns and their original + header names only survive in the original (the address transform strips every + non-address column). +- **Writes:** lambda writes overrides directly to Postgres (ADR-0003); no + POST-back. +- **State machine:** the classifier runs as a **subtask under the same address + task** (not a separate task, not a new `BulkUpload` status). Both subtasks + fire together at the **"Start address matching"** action. Safe: the combiner + globs S3 `ara_raw_outputs/{task_id}/`, which the classifier never writes to, + so the combined address output is not affected; `subtask_handler` does no + sibling-completion gating. +- **Progress honesty:** add a nullable **`service`/`kind` discriminator to + `sub_task`** (existing rows = address/legacy) so the progress view shows + address batches vs classification separately and attributes failures + correctly. Update the 3 subtask-count sites + `OnboardingProgress`. -### Q6 (queued) — Lambda edits +### Q5 — Persistence + migration of the inverted mapping -Handler hardcodes `source_column="Property Type" / "Walls" / "Roofs"`; needs -to read the mapping from the trigger body. -`LandlordDescriptionOverridesTriggerBody` already exists — check what fields -it has vs needs. +Reuse `bulkAddressUploads.columnMapping` (jsonb `Record`); **no +separate table**. The Q2.2 inversion (`header → field` → `field → header`) is +handled by a **one-shot data migration**: invert each non-skip entry; on a +legacy duplicate (two headers → one field, which the new Q3 rule forbids +anyway) last-write-wins; `skip` entries drop. The migration is a no-op on empty +tables, so it's safe regardless of data volume. `validateMapping` +([server.ts:97-102](../../src/lib/bulkUpload/server.ts#L97-L102)) and the route +schema must be rewritten for the inverted shape (they currently check the +**values** for `address_1`/`postcode`). -### Q7 (queued) — Review/Edit UI for classified mappings +### Q6 — Lambda edits -Is the per-row review/edit surface in scope for this iteration, or deferred? -User has not addressed yet. ADR-0002 calls this "the future override -frontend" and treats it as deferred work — but "front end e2e from -bulk_upload" could reasonably include it. +`LandlordDescriptionOverridesTriggerBody` (task_id, sub_task_id, s3_uri, +portfolio_id; `extra="allow"`) gains a **`column_mapping: dict[str, str]`** +field carrying **only the classifier subset** (`category → source header`); the +frontend extracts the four classifier keys from its unified map before sending. +The handler keeps a fixed registry of the four category builders (each owns its +enum, repo, and any hint such as the wall construction-date one) and supplies +`source_column` from `column_mapping`, **skipping categories the user didn't +map**. Both `property_type` and `built_form_type` carry `"Property Type"`, so +PT+BF sharing falls out for free. Drop the hardcoded trigger and the 20-row cap. -## Resuming +### Q7 — Results UI: read-only this iteration -Re-read this file, then ask Q3. Don't re-litigate Q1/Q2 unless the user -reopens them. +This iteration ships a **read-only** results surface: it reads the four +`landlord_*_overrides` tables and shows each `description → value` (per +portfolio) so the pipeline is observably e2e. **No editing / write-back** — the +user-override write path (correcting a row to `source='user'`, which the +classifier upsert won't overwrite) is the deferred "future override frontend" +ADR-0002 anticipates. + +## Next steps + +Grilling is complete (Q1–Q7). Suggested follow-ups: + +1. Promote Q4 (trigger + state-machine integration) to a **frontend ADR-0003**; + cross-link the backend's ADR-0003 (which already superseded ADR-0002's + Next.js-writes clause). +2. Break the work into issues (`/to-issues`): (a) invert the mapping shape + UI + + migration + validation; (b) `sub_task` discriminator + progress view; + (c) classifier trigger (new FastAPI endpoint + payload, fire at "Start"); + (d) lambda edits (trigger body + registry-from-mapping); (e) read-only + results view. +3. Confirm the working branch (see header) before implementation starts.