From af86d53397a71c1e589a8c5ef333569acbacbdb7 Mon Sep 17 00:00:00 2001 From: Jun-te Kim Date: Tue, 2 Jun 2026 13:46:55 +0000 Subject: [PATCH] Confirm building-part ordering on awaiting_review (#297) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Adds the multiEntryOrdering jsonb column + interactive order picker: the largest-count multi-entry sample is shown with a building-part dropdown per file position (one Main building + Extensions), validated as a permutation. A PATCH route persists { count: permutation } + confirmed, and Finalise is disabled until the ordering is confirmed when the upload is multi-entry. Migration for the new column is intentionally not included here — generated after origin/main is merged (its multi_entry_summary migration lands first). Co-Authored-By: Claude Opus 4.8 (1M context) --- docs/wip/multi-entry-ordering-plan.md | 178 ++++++++++++++++++ .../[uploadId]/multi-entry-ordering/route.ts | 49 +++++ src/app/db/schema/bulk_address_uploads.ts | 13 ++ .../[uploadId]/OnboardingProgress.tsx | 156 ++++++++++++--- src/lib/bulkUpload/client.ts | 21 +++ src/lib/bulkUpload/multiEntry.ts | 31 +++ src/lib/bulkUpload/server.ts | 42 +++++ 7 files changed, 467 insertions(+), 23 deletions(-) create mode 100644 docs/wip/multi-entry-ordering-plan.md create mode 100644 src/app/api/portfolio/[portfolioId]/bulk-uploads/[uploadId]/multi-entry-ordering/route.ts diff --git a/docs/wip/multi-entry-ordering-plan.md b/docs/wip/multi-entry-ordering-plan.md new file mode 100644 index 0000000..045f952 --- /dev/null +++ b/docs/wip/multi-entry-ordering-plan.md @@ -0,0 +1,178 @@ +# Multi-entry building-part ordering — in-flight design notes + +**Status:** Grilling complete (2026-06-02) — ready to break into issues +**Branch:** `feature/frontend_landlord_overrides` +**Author:** Jun-te (with Claude, via `/grill-me`) + +A _design-in-progress_ document, not the ADR. It records the decisions reached +during grilling so the conversation can resume without re-litigating settled +questions. The flow + schema decision is promoted to +[ADR-0004](../adr/0004-multi-entry-building-part-ordering.md); new domain terms +are promoted to [CONTEXT.md](../../CONTEXT.md#building-parts). + +## Goal + +After address matching and classification finish, a single address row can carry +**comma-separated entries** in physical-element columns — e.g. +`Walls = "Cavity: AsBuilt (1976-1982), Cavity: FilledCavity"`, +`Roofs = "Flat: As Built, PitchedNormalLoftAccess: 200mm"`. Each entry is a +**building part** (main building + extensions). The order is ambiguous and a +**consistent per-file mistake**, so we capture the correct ordering from the user +**once per file** and persist it on the BulkUpload for a later consumer. + +## Backstory / ground truth (verified against the example file + code) + +- In `ARA AddressProfiling_Download_28-04-2026_10501 (2).xlsx` (32,213 data + rows): **0 UPRNs appear in more than one row** — multi-entry is + comma-separated values **inside one cell**, never multiple rows per address. +- In a multi-entry row the multi-valued columns **agree on count** (Walls=2 ∧ + Roofs=2) while whole-dwelling columns stay at 1 (`Property Type` = `"House: + EndTerrace"`). So position *i* is the **same building part across every + multi-valued column**. +- The classifier today **discards** this: [`get_col_to_description_mappings`](/workspaces/home/github/Model/orchestration/landlord_description_overrides_orchestrator.py) + does `value.split(",")` into a **`set`** — orderless, deduped. Correct for the + vocabulary layer (description→enum), but it drops exactly the + position/building-part association this feature needs. +- This is the **per-Property building-part fact** territory ADR-0002 deferred + ("a per-Property fact layer (not yet modelled)"). We are **not** building that + layer here — only **capturing** the ordering it will need. + +## Decided + +### Q1 — Order semantics: full reorder, keyed by count + +Position *i* = a building part. The user supplies a **permutation per distinct +entry-count**; persisted as `{ count: permutation }`. This iteration captures +only the **largest-count** sample (see Q5). + +### Q1.1 — Order scope: one ordering across all columns + +A single per-count permutation realigns **every** multi-valued column at once +(index-aligned — Walls[i] and Roofs[i] are the same part). Not per-column. +Matches the data (counts agree across columns). + +### Q1.2 — Mixed counts: single-value columns are whole-property + +A 1-entry column (e.g. `Property Type`) is a **whole-dwelling** fact attached to +the property; only columns with N>1 are sliced into building parts. No padding. + +### Q2 — Scope: capture + persist ordering only + +Detect multi-entry, show one sample address + our classification, capture the +per-count ordering, persist on the BulkUpload. **Not** in scope: the +per-Property fact table or writing main/extension facts at finalise. The +ordering is stored for a later consumer. + +### Q2.1 — Editable verification IS in scope (expands Q2) + +The "verify classification" step lets the user **correct** a classification, +written back as `source='user'`. This deliberately picks up ADR-0002 Q7's +deferred **vocabulary** user-override write path — distinct from the per-Property +fact layer, which stays deferred. + +### Q3 — Placement: on the `awaiting_review` surface + +Render the flow on the existing +[OnboardingProgress](<../../src/app/portfolio/[slug]/(portfolio)/bulk-upload/[uploadId]/OnboardingProgress.tsx>) +page when `status === "awaiting_review"`. Classification finishes *before* the +combiner (both subtasks must complete → combiner → `awaiting_review`), so by the +time Finalise is offered the classification output exists. No new route. + +### Q3.1 — Flow: two-step stepper, steps appear independently + +- **Step 1 — Verify classification** — shows whenever **≥1 classifier column** + was mapped. +- **Step 2 — Confirm order** — shows only when **multi-entry was detected**. +- A file with classifier columns but no multi-entry shows only Step 1; a file + with neither goes straight to Finalise. + +### Q3.2 — Gate: both steps gate Finalise (where each applies) + +`canFinalize = status==="awaiting_review" && (noClassifierCols || verifyAck) && +(noMultiEntry || orderingConfirmed)`. Two flags persisted. Finalise is one +click but the button stays disabled until its applicable gates are satisfied. + +### Q4 — Verify step lists the sample address's entries only + +Step 1 lists just the descriptions in the **one sample address** (matches "one +address"). Because a correction is per-`(portfolio, description)`, editing one +changes the mapping **portfolio-wide** for that text — the UI must say so. A +spot-check, not full-vocabulary coverage. + +### Q4.1 — Write-back: Next.js upsert, `source='user'`, single row (as built) + +A Next.js route handler / server action upserts the `landlord_*_overrides` row +by `(portfolio_id, description)` setting `value` + `source='user'`, validating +against the pgEnum. **Schema unchanged** — we keep ADR-0002's `UNIQUE +(portfolio_id, description)` and flip the single row's source in place. The +Python classifier's existing `ON CONFLICT … WHERE source='classifier'` +([landlord_overrides_postgres_repository.py:84-91](/workspaces/home/github/Model/infrastructure/landlord_overrides/landlord_overrides_postgres_repository.py#L84)) +then never re-clobbers it. + +> Considered and **rejected**: two rows per description (classifier + user) with +> read-time `user > classifier` resolution. It buys "revert to our suggestion" + +> provenance, and is cheap now (no readers exist yet), but reopens ADR-0002's +> `UNIQUE` decision and migrates Drizzle + 4 Python tables + the conflict target. +> Not worth it for this iteration; the single-row flip already gives "user wins". +> This is the first Next.js writer of a `source='user'` row. + +### Q5 — Which sample: the largest-count row + +Show one sample address — the row with the **most** building parts — so ordering +it reveals the fullest convention. In the common case (only N=2) that is a +single 2-part address. + +### Q5.1 — Reorder UI: label each position + +Lay the file's entries out as rows (position 0, 1, …), each with a building-part +dropdown (**Main building** / **Extension 1** / …). Assigning labels yields the +permutation and validates (each part used once, exactly one Main building). All +multi-valued columns are shown together, each raw entry annotated with our +classified enum, so the user sanity-checks classification **and** alignment. + +### Q6 — Detection: at start, persist a summary + +Compute the multi-entry summary in the **start-address-matching POST** +([route.ts:106](<../../src/app/api/portfolio/[portfolioId]/bulk-uploads/[uploadId]/start-address-matching/route.ts#L106>)) +where the full `rows` are already parsed in memory — which columns are +multi-valued, the distinct counts (with row-counts so we can pick the largest), +and the largest-count sample (address + per-column raw entries). Avoids +re-reading a 32k-row file at render. Classification enums are joined at render +from the override tables. + +### Q7 — Persistence: two jsonb columns on `bulk_address_uploads` + +- `multiEntrySummary jsonb` — written at start (detection). +- `multiEntryOrdering jsonb` — written at confirm: `{ count: permutation }` plus + `verifyAck` / `orderingConfirmed` flags (final shape TBD; may split flags into + their own columns). + +No new table — mirrors how `columnMapping` lives on the upload row. + +## Risks / load-bearing assumptions + +1. **Consistent-mistake assumption.** All rows of a given count share one + ordering convention. The whole "ask once" design rests on this; if a file + mixes conventions within a count, a single per-count permutation is wrong. +2. **Largest-count-only capture.** Smaller counts stay unpopulated in the map. + A future consumer (or a later UI iteration) needs a derivation rule to apply + the convention to other counts. +3. **Normalization coupling — mitigated.** To join the sample's raw entries to + the override tables the frontend must match the backend's `split(",")` → + `strip` → `lower`. **Resolution:** store the *normalized* description keys in + `multiEntrySummary` at start (the route already holds the rows), so the + render-time join is exact-match — no cross-repo string-normalization drift. +4. **Portfolio-wide blast radius.** A verify-step edit changes the mapping for + every row with that description, not just the sample address. Must be + messaged in the UI. + +## Suggested issues (`/to-issues`) + +1. Schema: two jsonb columns on `bulk_address_uploads` + migration. +2. Detection at start: compute + persist `multiEntrySummary` (with normalized + description keys). +3. Verify step: list sample descriptions → enum (join override tables), + editable; Next.js upsert route writing `source='user'`; `verifyAck` flag. +4. Order step: largest-count sample, position→part dropdowns → permutation; + persist `multiEntryOrdering`; `orderingConfirmed` flag. +5. Gate: wire `canFinalize` to the two flags; conditional stepper rendering. diff --git a/src/app/api/portfolio/[portfolioId]/bulk-uploads/[uploadId]/multi-entry-ordering/route.ts b/src/app/api/portfolio/[portfolioId]/bulk-uploads/[uploadId]/multi-entry-ordering/route.ts new file mode 100644 index 0000000..676032f --- /dev/null +++ b/src/app/api/portfolio/[portfolioId]/bulk-uploads/[uploadId]/multi-entry-ordering/route.ts @@ -0,0 +1,49 @@ +import { setMultiEntryOrdering } from "@/lib/bulkUpload/server"; +import { NextRequest, NextResponse } from "next/server"; +import { getServerSession } from "next-auth"; +import { AuthOptions } from "@/app/api/auth/[...nextauth]/authOptions"; +import { z } from "zod"; + +const PatchSchema = z.object({ + // entry-count -> permutation (part slot -> file position). See ADR-0004. + permutations: z.record(z.string(), z.array(z.number().int().nonnegative())), +}); + +export async function PATCH( + request: NextRequest, + { params }: { params: Promise<{ portfolioId: string; uploadId: string }> } +) { + const session = await getServerSession(AuthOptions); + if (!session) return NextResponse.json({ error: "Unauthorized" }, { status: 401 }); + + const { uploadId } = await params; + + let body; + try { + body = PatchSchema.parse(await request.json()); + } catch { + return NextResponse.json({ error: "Invalid input" }, { status: 400 }); + } + + try { + const result = await setMultiEntryOrdering(uploadId, body.permutations); + switch (result.kind) { + case "ok": + return NextResponse.json(result.upload, { status: 200 }); + case "not_found": + return NextResponse.json({ error: "Not found" }, { status: 404 }); + case "wrong_state": + return NextResponse.json( + { error: `Cannot set ordering in state '${result.current}'` }, + { status: 409 } + ); + case "not_multi_entry": + return NextResponse.json({ error: "Upload has no multi-entry rows" }, { status: 409 }); + case "invalid_ordering": + return NextResponse.json({ error: result.reason }, { status: 422 }); + } + } catch (error) { + console.error("Failed to save multi-entry ordering:", error); + return NextResponse.json({ error: "Internal server error" }, { status: 500 }); + } +} diff --git a/src/app/db/schema/bulk_address_uploads.ts b/src/app/db/schema/bulk_address_uploads.ts index 8166551..6d6238a 100644 --- a/src/app/db/schema/bulk_address_uploads.ts +++ b/src/app/db/schema/bulk_address_uploads.ts @@ -25,6 +25,17 @@ export interface MultiEntrySummary { sample: MultiEntrySample | null; } +// User-confirmed building-part ordering (ADR-0004). Keyed by entry-count so it +// can hold more than one count later; this iteration populates only the +// largest. permutations[count][k] = the 0-based file position holding building +// part k, where 0 = Main building, 1..N-1 = Extension 1..N-1. +// e.g. { "2": [1, 0] } => for 2-part rows the main building is file position 1. +export interface MultiEntryOrdering { + permutations: Record; + // Set once the user confirms; gates Finalise when the upload is multi-entry. + confirmed: boolean; +} + export const bulkAddressUploads = pgTable("bulk_address_uploads", { id: uuid("id").defaultRandom().primaryKey(), portfolioId: text("portfolio_id").notNull(), @@ -38,6 +49,8 @@ export const bulkAddressUploads = pgTable("bulk_address_uploads", { // Multi-entry building-part detection, computed at start-address-matching // and read by the awaiting_review review surface (ADR-0004). multiEntrySummary: jsonb("multi_entry_summary").$type(), + // User-confirmed building-part ordering for the multi-entry sample (ADR-0004). + multiEntryOrdering: jsonb("multi_entry_ordering").$type(), taskId: uuid("task_id"), combinedOutputS3Uri: text("combined_output_s3_uri"), createdAt: timestamp("created_at", { withTimezone: true }).notNull().defaultNow(), diff --git a/src/app/portfolio/[slug]/(portfolio)/bulk-upload/[uploadId]/OnboardingProgress.tsx b/src/app/portfolio/[slug]/(portfolio)/bulk-upload/[uploadId]/OnboardingProgress.tsx index b80ae1c..b5ae7f4 100644 --- a/src/app/portfolio/[slug]/(portfolio)/bulk-upload/[uploadId]/OnboardingProgress.tsx +++ b/src/app/portfolio/[slug]/(portfolio)/bulk-upload/[uploadId]/OnboardingProgress.tsx @@ -1,14 +1,22 @@ "use client"; +import { useState } from "react"; import { useRouter } from "next/navigation"; import Link from "next/link"; -import { ArrowRightIcon } from "@heroicons/react/24/outline"; +import { ArrowRightIcon, CheckCircleIcon } from "@heroicons/react/24/outline"; import { useBulkUploadProgress, + useConfirmMultiEntryOrdering, useFinalize, useRequestCombine, } from "@/lib/bulkUpload/client"; -import type { MultiEntrySample } from "@/lib/bulkUpload/multiEntry"; +import { + partLabel, + isPermutation, + assignmentToPermutation, + type MultiEntrySample, +} from "@/lib/bulkUpload/multiEntry"; +import type { MultiEntryOrdering } from "@/app/db/schema/bulk_address_uploads"; interface Props { portfolioSlug: string; @@ -59,14 +67,15 @@ export default function OnboardingProgress({ const isImporting = upload.status === "awaiting_review"; const canRunCombiner = taskDone && !taskFailed && upload.status === "processing"; - const canFinalize = upload.status === "awaiting_review"; + const isAwaitingReview = upload.status === "awaiting_review"; - // Multi-entry building-part sample, shown read-only on the review surface - // (ADR-0004). Ordering confirmation arrives in a later slice. - const multiEntrySample = - upload.status === "awaiting_review" - ? (upload.multiEntrySummary?.sample ?? null) - : null; + // Multi-entry building-part sample on the review surface (ADR-0004). When the + // upload is multi-entry, Finalise is gated on the user confirming the order. + const multiEntrySample = isAwaitingReview + ? (upload.multiEntrySummary?.sample ?? null) + : null; + const orderingConfirmed = upload.multiEntryOrdering?.confirmed ?? false; + const canFinalize = isAwaitingReview && (!multiEntrySample || orderingConfirmed); return (
@@ -128,9 +137,16 @@ export default function OnboardingProgress({ )}
- {multiEntrySample && } + {multiEntrySample && ( + + )} - {(canRunCombiner || canFinalize) && ( + {(canRunCombiner || isAwaitingReview) && (
{canRunCombiner && ( combine.mutate()} /> )} - {canFinalize && ( + {isAwaitingReview && ( finalize.mutate(undefined, { onSuccess: () => router.refresh() }) } @@ -175,35 +193,78 @@ export default function OnboardingProgress({ ); } -// Read-only preview of the largest-count multi-entry row (ADR-0004). Each -// comma-separated entry is a building part; the user will confirm their order -// in a later slice. Positions are shown 1-based, unlabelled for now. -function MultiEntrySamplePanel({ sample }: { sample: MultiEntrySample }) { +// Interactive building-part ordering for the largest-count multi-entry sample +// (ADR-0004). The user labels each file position with a building part (one Main +// building + Extensions); the labels must form a permutation. Confirming +// persists the ordering and unlocks Finalise. +function MultiEntryOrderingPanel({ + sample, + ordering, + portfolioId, + uploadId, +}: { + sample: MultiEntrySample; + ordering: MultiEntryOrdering | null; + portfolioId: string; + uploadId: string; +}) { + const confirm = useConfirmMultiEntryOrdering(portfolioId, uploadId); + const count = sample.count; + + // assignment[filePosition] = building-part slot. Seed from a stored ordering + // (slot -> position, so invert) or default to identity (main building first). + const [assignment, setAssignment] = useState(() => { + const stored = ordering?.permutations?.[String(count)]; + if (stored && stored.length === count) { + const seeded = new Array(count); + stored.forEach((position, slot) => { + seeded[position] = slot; + }); + return seeded; + } + return Array.from({ length: count }, (_, i) => i); + }); + + const confirmed = ordering?.confirmed ?? false; + const valid = isPermutation(assignment); + + const setSlot = (position: number, slot: number) => + setAssignment((prev) => prev.map((s, i) => (i === position ? slot : s))); + + const onConfirm = () => { + if (!valid) return; + confirm.mutate({ + permutations: { [String(count)]: assignmentToPermutation(assignment) }, + }); + }; + return (

- Multiple building parts detected + Confirm building-part order

{sample.address ? {sample.address} : "An address"}{" "} - has {sample.count} building parts (e.g. a main building and extensions). - You'll be asked to confirm their order before finalising. + has {count} building parts. Tell us which entry is the main building and + which are extensions — we'll apply the same order to every{" "} + {count}-part row in this file.

- + {sample.columns.map((column) => ( ))} + - {Array.from({ length: sample.count }).map((_, position) => ( + {Array.from({ length: count }).map((_, position) => ( {sample.columns.map((column) => ( @@ -211,11 +272,54 @@ function MultiEntrySamplePanel({ sample }: { sample: MultiEntrySample }) { {column.entries[position]?.raw ?? "—"} ))} + ))}
PositionEntry {column.header} Building part
{position + 1} + +
+ + {!valid && ( +

+ Each part (Main building, Extension 1, …) must be used exactly once. +

+ )} + +
+ + {confirmed && !confirm.isPending && ( + + + Order confirmed + + )} +
+ + {confirm.error && ( +

{confirm.error.message}

+ )}
); } @@ -224,19 +328,25 @@ function StageButton({ label, activeLabel, isPending, + disabled = false, + disabledReason, onClick, }: { label: string; activeLabel: string; isPending: boolean; + disabled?: boolean; + disabledReason?: string; onClick: () => void; }) { + const blocked = isPending || disabled; return (