save plans for landlord overrid

2026-06-30 12:55:02 +00:00 · 2026-05-27 10:13:59 +00:00 · 2026-05-27 10:13:59 +00:00 · 4981b0ee12
commit 4981b0ee12
parent 899ac432ea
1 changed files with 118 additions and 50 deletions
--- a/docs/wip/landlord-override-frontend-plan.md
+++ b/docs/wip/landlord-override-frontend-plan.md
@ -1,13 +1,14 @@
 # Landlord override frontend — in-flight design notes

-**Status:** Paused mid-grilling (2026-05-27)
-**Branch:** `feautre/additional_walltypes`
+**Status:** Grilling complete (2026-05-28) — Q1–Q7 resolved; ready to promote to ADR
+**Branch:** `feature/frontend_landlord_overrides`
 **Author:** Jun-te (with Claude, via `/grill-me`)

-This is a *design-in-progress* document, not an ADR. It captures decisions made
-so far on the landlord-override frontend plan so the conversation can resume
-without re-litigating settled questions. Promote to an ADR once the trigger
-mechanism (Q4) is resolved — that's the decision worth permanent recording.
+This is a _design-in-progress_ document, not an ADR. It captures the decisions
+reached during grilling so the conversation can resume without re-litigating
+settled questions. All seven questions are now resolved; the trigger +
+state-machine integration (Q4) is ready to promote to a frontend ADR-0003, and
+the work is ready to break into issues.

 ## Goal

@ -23,14 +24,22 @@ Build the front-end e2e for `landlord_description_override`, starting from the
  rationale in [ADR-0002](../adr/0002-landlord-override-vocabulary.md).
 - Nothing in Next.js reads or writes them yet.
 - The Python lambda at
-  `/workspaces/home/github/Model/applications/landlord_description_overrides/handler.py`
-  is **not deployed** and **not wired** into the BulkUpload pipeline. It
-  hardcodes its trigger params (`portfolio_id`, `s3_uri`) and its source column
-  names (`"Property Type"`, `"Walls"`, `"Roofs"`).
- Note: ADR-0002 says writes come from Next.js POST, but the current backend
-  writes direct to Postgres. This drift may need to be revisited under Q4.
+  `Model/applications/landlord_description_overrides/handler.py` (branch
+  `feature/landlord_data`) is **not deployed** and **not wired** into the
+  BulkUpload pipeline. It hardcodes the trigger fields (handler.py:55-60) and
+  builds a hardcoded `ClassifiableColumn` list with `source_column`s
+  `"Property Type"` (→ both `property_type` **and** `built_form_type`),
+  `"Walls"`, `"Roofs"`. It also caps the batch to 20 rows while under test.
+- **Resolved drift (was a Q4 risk):** the backend's
+  [ADR-0003 (python-writes-landlord-overrides-directly)](https://github.com/Hestia-Homes/Model/blob/main/docs/adr/0003-python-writes-landlord-overrides-directly.md),
+  accepted 2026-05-26, **supersedes** ADR-0002's "writes happen from Next.js"
+  clause. The lambda computes **and** persists the classification directly to
+  Postgres via a SQLAlchemy `LandlordOverrideRepository[E]`, with an
+  `ON CONFLICT … WHERE source = 'classifier'` upsert. There is **no** Next.js
+  POST-back. Drizzle stays schema source-of-truth; the Python `SQLModel`
+  shadows it.

-## Decided so far
+## Decided

 ### Q1 — Scope

@ -41,8 +50,7 @@ to the lambda → lambda needs edits to work when deployed.

 ### Q2 — Categories

-**All four classifier categories** get independent optional slots in
-[`INTERNAL_FIELDS`](../../src/app/portfolio/[slug]/(portfolio)/bulk-upload/[uploadId]/map-columns/MapColumnsClient.tsx#L14-L21):
+**All four classifier categories** are independent optional fields:
 `property_type`, `built_form_type`, `wall_type`, `roof_type`.

 Rejected alternatives: (a) start with only PT+BF — wasted plumbing churn for
@ -50,58 +58,118 @@ the same migration cost; (c) collapse PT+BF into one UI slot — bakes a
 backend coincidence (they read the same CSV column today) into the
 user-facing model.

-### Q2.1 — No `autoDetect` for the new slots
+### Q2.1 — No `autoDetect` for classifier fields

-The four new slots default to `"skip"`. The user must explicitly map them.
-`autoDetect()` regex patterns are for required address-ish fields only.
+The four classifier fields default to unmapped ("Not provided"). The user must
+explicitly map them. `autoDetect()` is for required address-ish fields only.

 **Why:** Address headers are unambiguous and required, so guessing is safe and
-useful. Landlord-description columns are ambiguous (a "type" column could be
-PropertyType or BuiltFormType or something else) and they are optional —
+useful. Landlord-description columns are ambiguous and optional —
 auto-detecting them would silently opt the landlord into classifier runs they
 didn't intend.

-## Open — resume here
+### Q2.2 — Mapping shape: unified `field → header` (refines Q2's mechanism)

-### Q3 (in flight) — Uniqueness validation on the mapping
+The mapping is **one unified map keyed by internal field, valued by source CSV
+header** (`{ address_1: "Addr 1", property_type: "Property Type",
+built_form_type: "Property Type", wall_type: "Walls", … }`), replacing the
+current `header → field` shape
+([MapColumnsClient.tsx:62-64](<../../src/app/portfolio/[slug]/(portfolio)/bulk-upload/[uploadId]/map-columns/MapColumnsClient.tsx#L62-L64>)).

-Today validation only checks required fields exist
-([MapColumnsClient.tsx:67-68](../../src/app/portfolio/[slug]/(portfolio)/bulk-upload/[uploadId]/map-columns/MapColumnsClient.tsx#L67-L68));
-two CSV headers can both map to `address_1` silently.
+**Why:** the backend's `ClassifiableColumn` is a `(name, source_column)` pair
+where **one `source_column` feeds many `name`s** — `"Property Type"` feeds both
+`property_type` and `built_form_type`
+([classifiable_column.py:28-31](https://github.com/Hestia-Homes/Model/blob/feature/landlord_data/orchestration/classifiable_column.py)).
+A `header → field` map (one value per header) **cannot express** that; the
+chosen mechanism in Q2 (extending the single `INTERNAL_FIELDS` dropdown) would
+leave `built_form_type` unmappable. `field → header` matches the backend model
+1:1 and lets multiple categories share a header.

- (a) Leave it alone — backend last-wins.
- (b) Enforce uniqueness only on the four new slots.
- (c) Enforce uniqueness everywhere except `skip`. **Recommended.**
+**Consequences:** UI inverts to one row per internal field with a header-picker;
+`autoDetect` inverts to "best header per field"; "skip" becomes a per-field
+"Not provided"; rewrite
+[`transformFile`](<../../src/app/api/portfolio/[portfolioId]/bulk-uploads/[uploadId]/start-address-matching/route.ts#L17-L54>)
+(currently iterates `header → field`), the `z.record` route schema, and
+**migrate existing persisted `columnMapping` rows** (semantics flip) — see Q5.

-### Q4 (queued — biggest) — Trigger mechanism
+### Q3 — Sharing / uniqueness rule (reframed)

-How does Next.js invoke the lambda once mapping is complete? SQS message?
-Direct lambda invoke? HTTP endpoint? And what's the state-machine integration —
-new `BulkUpload` status, or runs orthogonally to address matching?
+A source header may feed **at most one address/reference field**, but **any
+number of classifier fields** (PT + BF → `"Property Type"` is required, so
+classifier sharing must be allowed). Required-field validation stays:
+`address_1` and `postcode` must each be assigned a header.

-This drives both the deployment work and the lambda edits. Likely worth its
-own ADR once decided.
+This **replaces** the original "enforce uniqueness everywhere except skip"
+framing, which was wrong once classifier header-sharing became a requirement.

-### Q5 (queued) — Persistence of the extended mapping
+### Q4 — Trigger mechanism + state-machine integration (the big one)

-Current `bulkAddressUploads.columnMapping` is a `Record<string, string>` and
-naturally accommodates the new slots. Confirm no separate table is needed.
+- **Transport:** reuse
+  [`triggerFastApiPipeline`](../../src/lib/bulkUpload/server.ts) with a **new
+  FastAPI endpoint**; payload `{ task_id, sub_task_id, s3_uri, portfolio_id,
+  column_mapping }`. FastAPI turns the POST into the SQS subtask envelope the
+  handler TODO (handler.py:51-54) references. Not a direct lambda invoke, not
+  Next.js → SQS.
+- **`s3_uri` is the ORIGINAL upload** (`upload.s3Bucket/s3Key`), **not** the
+  address-matching transformed CSV — the description columns and their original
+  header names only survive in the original (the address transform strips every
+  non-address column).
+- **Writes:** lambda writes overrides directly to Postgres (ADR-0003); no
+  POST-back.
+- **State machine:** the classifier runs as a **subtask under the same address
+  task** (not a separate task, not a new `BulkUpload` status). Both subtasks
+  fire together at the **"Start address matching"** action. Safe: the combiner
+  globs S3 `ara_raw_outputs/{task_id}/`, which the classifier never writes to,
+  so the combined address output is not affected; `subtask_handler` does no
+  sibling-completion gating.
+- **Progress honesty:** add a nullable **`service`/`kind` discriminator to
+  `sub_task`** (existing rows = address/legacy) so the progress view shows
+  address batches vs classification separately and attributes failures
+  correctly. Update the 3 subtask-count sites + `OnboardingProgress`.

-### Q6 (queued) — Lambda edits
+### Q5 — Persistence + migration of the inverted mapping

-Handler hardcodes `source_column="Property Type" / "Walls" / "Roofs"`; needs
-to read the mapping from the trigger body.
-`LandlordDescriptionOverridesTriggerBody` already exists — check what fields
-it has vs needs.
+Reuse `bulkAddressUploads.columnMapping` (jsonb `Record<string, string>`); **no
+separate table**. The Q2.2 inversion (`header → field` → `field → header`) is
+handled by a **one-shot data migration**: invert each non-skip entry; on a
+legacy duplicate (two headers → one field, which the new Q3 rule forbids
+anyway) last-write-wins; `skip` entries drop. The migration is a no-op on empty
+tables, so it's safe regardless of data volume. `validateMapping`
+([server.ts:97-102](../../src/lib/bulkUpload/server.ts#L97-L102)) and the route
+schema must be rewritten for the inverted shape (they currently check the
+**values** for `address_1`/`postcode`).

-### Q7 (queued) — Review/Edit UI for classified mappings
+### Q6 — Lambda edits

-Is the per-row review/edit surface in scope for this iteration, or deferred?
-User has not addressed yet. ADR-0002 calls this "the future override
-frontend" and treats it as deferred work — but "front end e2e from
-bulk_upload" could reasonably include it.
+`LandlordDescriptionOverridesTriggerBody` (task_id, sub_task_id, s3_uri,
+portfolio_id; `extra="allow"`) gains a **`column_mapping: dict[str, str]`**
+field carrying **only the classifier subset** (`category → source header`); the
+frontend extracts the four classifier keys from its unified map before sending.
+The handler keeps a fixed registry of the four category builders (each owns its
+enum, repo, and any hint such as the wall construction-date one) and supplies
+`source_column` from `column_mapping`, **skipping categories the user didn't
+map**. Both `property_type` and `built_form_type` carry `"Property Type"`, so
+PT+BF sharing falls out for free. Drop the hardcoded trigger and the 20-row cap.

-## Resuming
+### Q7 — Results UI: read-only this iteration

-Re-read this file, then ask Q3. Don't re-litigate Q1/Q2 unless the user
-reopens them.
+This iteration ships a **read-only** results surface: it reads the four
+`landlord_*_overrides` tables and shows each `description → value` (per
+portfolio) so the pipeline is observably e2e. **No editing / write-back** — the
+user-override write path (correcting a row to `source='user'`, which the
+classifier upsert won't overwrite) is the deferred "future override frontend"
+ADR-0002 anticipates.
+
+## Next steps
+
+Grilling is complete (Q1–Q7). Suggested follow-ups:
+
+1. Promote Q4 (trigger + state-machine integration) to a **frontend ADR-0003**;
+   cross-link the backend's ADR-0003 (which already superseded ADR-0002's
+   Next.js-writes clause).
+2. Break the work into issues (`/to-issues`): (a) invert the mapping shape + UI
+   + migration + validation; (b) `sub_task` discriminator + progress view;
+   (c) classifier trigger (new FastAPI endpoint + payload, fire at "Start");
+   (d) lambda edits (trigger body + registry-from-mapping); (e) read-only
+   results view.
+3. Confirm the working branch (see header) before implementation starts.