Adds the multiEntryOrdering jsonb column + interactive order picker: the
largest-count multi-entry sample is shown with a building-part dropdown per
file position (one Main building + Extensions), validated as a permutation.
A PATCH route persists { count: permutation } + confirmed, and Finalise is
disabled until the ordering is confirmed when the upload is multi-entry.
Migration for the new column is intentionally not included here — generated
after origin/main is merged (its multi_entry_summary migration lands first).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Multi-entry building-part ordering — in-flight design notes
Status: Grilling complete (2026-06-02) — ready to break into issues
Branch: feature/frontend_landlord_overrides
Author: Jun-te (with Claude, via /grill-me)
A design-in-progress document, not the ADR. It records the decisions reached during grilling so the conversation can resume without re-litigating settled questions. The flow + schema decision is promoted to ADR-0004; new domain terms are promoted to CONTEXT.md.
Goal
After address matching and classification finish, a single address row can carry
comma-separated entries in physical-element columns — e.g.
Walls = "Cavity: AsBuilt (1976-1982), Cavity: FilledCavity",
Roofs = "Flat: As Built, PitchedNormalLoftAccess: 200mm". Each entry is a
building part (main building + extensions). The order is ambiguous and a
consistent per-file mistake, so we capture the correct ordering from the user
once per file and persist it on the BulkUpload for a later consumer.
Backstory / ground truth (verified against the example file + code)
- In ARA AddressProfiling_Download_28-04-2026_10501 (2).xlsx (32,213 data rows): 0 UPRNs appear in more than one row. Multi-entry is comma-separated values inside one cell, never multiple rows per address.
- In a multi-entry row the multi-valued columns agree on count (Walls=2 ∧ Roofs=2) while whole-dwelling columns stay at 1 (Property Type="House: EndTerrace"). So position i is the same building part across every multi-valued column.
- The classifier today discards this: get_col_to_description_mappings does value.split(",") into a set (orderless, deduped). Correct for the vocabulary layer (description→enum), but it drops exactly the position/building-part association this feature needs.
- This is the per-Property building-part fact territory ADR-0002 deferred ("a per-Property fact layer (not yet modelled)"). We are not building that layer here, only capturing the ordering it will need.
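To make the lost information concrete, here is a minimal TypeScript sketch contrasting a set-style split with an order-preserving one. Both helper names are hypothetical and this is not the classifier's code (which is Python); it only illustrates why a set cannot answer "which entry was at position i".

```typescript
// Hypothetical helpers for illustration; not the classifier's actual code.

// Set-style split: fine for building a vocabulary, but position and duplicates are gone.
function splitAsSet(cell: string): Set<string> {
  return new Set(cell.split(",").map((s) => s.trim()));
}

// Order-preserving split: index i is the i-th entry as written in the file.
function splitOrdered(cell: string): string[] {
  return cell.split(",").map((s) => s.trim());
}

const wallsCell = "Cavity: AsBuilt (1976-1982), Cavity: FilledCavity";
const orderedWalls = splitOrdered(wallsCell);
// orderedWalls[1] is the second entry in the cell; a Set offers no such index.
```

If two building parts share the same description (for example "Flat: As Built, Flat: As Built"), the set collapses them into one element, while the ordered array still has two positions.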
Decided
Q1 — Order semantics: full reorder, keyed by count
Position i = a building part. The user supplies a permutation per distinct
entry-count; persisted as { count: permutation }. This iteration captures
only the largest-count sample (see Q5).
Q1.1 — Order scope: one ordering across all columns
A single per-count permutation realigns every multi-valued column at once (index-aligned — Walls[i] and Roofs[i] are the same part). Not per-column. Matches the data (counts agree across columns).
Q1.2 — Mixed counts: single-value columns are whole-property
A 1-entry column (e.g. Property Type) is a whole-dwelling fact attached to
the property; only columns with N>1 are sliced into building parts. No padding.
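Q1.1 and Q1.2 together could be sketched as below. This is an illustration under one loudly stated assumption: permutation[part] is taken to be the file position that holds that building part, with part 0 as the Main building. The notes do not fix that direction, and the function and type names are hypothetical.

```typescript
// ASSUMPTION: permutation[part] = file position holding that part (part 0 = Main building).
type Permutation = number[];
type OrderingByCount = Record<number, Permutation>;

// Apply the single per-count permutation to every multi-valued column at once.
// Columns with one entry are whole-property facts and pass through untouched (no padding).
function reorderColumns(
  columns: Record<string, string[]>,
  ordering: OrderingByCount,
): Record<string, string[]> {
  const out: Record<string, string[]> = {};
  for (const [name, entries] of Object.entries(columns)) {
    const perm = ordering[entries.length];
    out[name] =
      entries.length > 1 && perm !== undefined
        ? perm.map((pos) => entries[pos])
        : entries;
  }
  return out;
}

const reordered = reorderColumns(
  {
    Walls: ["Cavity: AsBuilt (1976-1982)", "Cavity: FilledCavity"],
    Roofs: ["Flat: As Built", "PitchedNormalLoftAccess: 200mm"],
    "Property Type": ["House: EndTerrace"],
  },
  { 2: [1, 0] }, // the file lists the extension first, then the main building
);
```

Because one permutation is applied to all columns, Walls[i] and Roofs[i] still describe the same part after reordering. A count with no stored permutation is left in file order here; what a real consumer should do in that case is an open question in these notes.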
Q2 — Scope: capture + persist ordering only
Detect multi-entry, show one sample address + our classification, capture the per-count ordering, persist on the BulkUpload. Not in scope: the per-Property fact table or writing main/extension facts at finalise. The ordering is stored for a later consumer.
Q2.1 — Editable verification IS in scope (expands Q2)
The "verify classification" step lets the user correct a classification,
written back as source='user'. This deliberately picks up ADR-0002 Q7's
deferred vocabulary user-override write path — distinct from the per-Property
fact layer, which stays deferred.
Q3 — Placement: on the awaiting_review surface
Render the flow on the existing OnboardingProgress page when
status === "awaiting_review". Classification finishes before the
combiner (both subtasks must complete → combiner → awaiting_review), so by the
time Finalise is offered the classification output exists. No new route.
Q3.1 — Flow: two-step stepper, steps appear independently
- Step 1 — Verify classification — shows whenever ≥1 classifier column was mapped.
- Step 2 — Confirm order — shows only when multi-entry was detected.
- A file with classifier columns but no multi-entry shows only Step 1; a file with neither goes straight to Finalise.
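The three cases above reduce to two independent conditions. A minimal sketch, with a hypothetical function name and step labels that are not taken from the codebase:

```typescript
// Hypothetical names; the step identifiers are illustrative only.
type StepId = "verify-classification" | "confirm-order";

// Each step appears independently of the other.
function visibleSteps(
  mappedClassifierColumns: number,
  multiEntryDetected: boolean,
): StepId[] {
  const steps: StepId[] = [];
  if (mappedClassifierColumns >= 1) steps.push("verify-classification");
  if (multiEntryDetected) steps.push("confirm-order");
  return steps;
}
```

An empty result corresponds to the "goes straight to Finalise" case.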
Q3.2 — Gate: both steps gate Finalise (where each applies)
canFinalize = status==="awaiting_review" && (noClassifierCols || verifyAck) && (noMultiEntry || orderingConfirmed). Two flags persisted. Finalise is one
click but the button stays disabled until its applicable gates are satisfied.
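Written out as a function, the gate might look like the following sketch. The GateState shape and its field names are assumptions for illustration; only the boolean expression itself comes from the notes.

```typescript
// Hypothetical state shape; only the boolean logic mirrors the notes.
interface GateState {
  status: string;
  hasClassifierCols: boolean;
  hasMultiEntry: boolean;
  verifyAck: boolean;
  orderingConfirmed: boolean;
}

// A gate only blocks Finalise when its step applies to this upload.
function canFinalize(s: GateState): boolean {
  return (
    s.status === "awaiting_review" &&
    (!s.hasClassifierCols || s.verifyAck) &&
    (!s.hasMultiEntry || s.orderingConfirmed)
  );
}
```

A file with no classifier columns and no multi-entry rows passes both gates vacuously, so Finalise is enabled as soon as the status is awaiting_review.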
Q4 — Verify step lists the sample address's entries only
Step 1 lists just the descriptions in the one sample address (matches "one
address"). Because a correction is per-(portfolio, description), editing one
changes the mapping portfolio-wide for that text — the UI must say so. A
spot-check, not full-vocabulary coverage.
Q4.1 — Write-back: Next.js upsert, source='user', single row (as built)
A Next.js route handler / server action upserts the landlord_*_overrides row
by (portfolio_id, description) setting value + source='user', validating
against the pgEnum. Schema unchanged — we keep ADR-0002's UNIQUE (portfolio_id, description) and flip the single row's source in place. The
Python classifier's existing ON CONFLICT … WHERE source='classifier'
(landlord_overrides_postgres_repository.py:84-91) then never re-clobbers it.
Considered and rejected: two rows per description (classifier + user) with read-time
user > classifier resolution. It buys "revert to our suggestion" + provenance, and is cheap now (no readers exist yet), but reopens ADR-0002's UNIQUE decision and migrates Drizzle + 4 Python tables + the conflict target. Not worth it for this iteration; the single-row flip already gives "user wins". This is the first Next.js writer of a source='user' row.
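The single-row "user wins" behaviour can be modelled in memory to check the reasoning. This is not the Drizzle or SQL code: the class, its methods, and the enum values in the usage lines are hypothetical, and the Map key stands in for the UNIQUE (portfolio_id, description) constraint.

```typescript
// In-memory model only; the real write path is a Postgres upsert.
type Source = "classifier" | "user";
interface OverrideRow { value: string; source: Source }

class OverrideModel {
  private rows = new Map<string, OverrideRow>();
  private allowed: readonly string[];

  constructor(allowed: readonly string[]) {
    this.allowed = allowed; // stands in for the pgEnum's members
  }

  private key(portfolioId: string, description: string): string {
    return JSON.stringify([portfolioId, description]);
  }

  // Mirrors ON CONFLICT … WHERE source='classifier': only classifier rows are overwritten.
  classifierUpsert(portfolioId: string, description: string, value: string): void {
    const k = this.key(portfolioId, description);
    const existing = this.rows.get(k);
    if (existing !== undefined && existing.source !== "classifier") return;
    this.rows.set(k, { value, source: "classifier" });
  }

  // The user write flips the single row in place after validating the value.
  userUpsert(portfolioId: string, description: string, value: string): void {
    if (!this.allowed.includes(value)) throw new Error(`invalid enum value: ${value}`);
    this.rows.set(this.key(portfolioId, description), { value, source: "user" });
  }

  get(portfolioId: string, description: string): OverrideRow | undefined {
    return this.rows.get(this.key(portfolioId, description));
  }
}

// Hypothetical enum values, for illustration only.
const model = new OverrideModel(["cavity_as_built", "cavity_filled"]);
model.classifierUpsert("p1", "cavity: filledcavity", "cavity_as_built");
model.userUpsert("p1", "cavity: filledcavity", "cavity_filled");
model.classifierUpsert("p1", "cavity: filledcavity", "cavity_as_built"); // later re-run
```

After the re-run the row still carries the user's value, which is the property the single-row flip relies on. What the model cannot express is the rejected alternative's "revert to our suggestion", since the classifier's original value is overwritten.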
Q5 — Which sample: the largest-count row
Show one sample address — the row with the most building parts — so ordering it reveals the fullest convention. In the common case (only N=2) that is a single 2-part address.
Q5.1 — Reorder UI: label each position
Lay the file's entries out as rows (position 0, 1, …), each with a building-part dropdown (Main building / Extension 1 / …). Assigning labels yields the permutation and validates (each part used once, exactly one Main building). All multi-valued columns are shown together, each raw entry annotated with our classified enum, so the user sanity-checks classification and alignment.
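Deriving the permutation from the dropdown labels, with validation, might be sketched as follows. Two assumptions are made here that the notes leave open: each dropdown value is a part index (0 = Main building, 1 = Extension 1, and so on), and the stored permutation maps part index to file position. The function name is hypothetical.

```typescript
// ASSUMPTIONS: labels[pos] is the part index chosen for file position pos
// (0 = Main building, k = Extension k); result[part] = file position of that part.
function labelsToPermutation(labels: number[]): number[] | null {
  const n = labels.length;
  const perm: number[] = new Array<number>(n).fill(-1);
  for (let pos = 0; pos < n; pos++) {
    const part = labels[pos];
    // Reject out-of-range parts and parts assigned twice.
    if (!Number.isInteger(part) || part < 0 || part >= n || perm[part] !== -1) {
      return null;
    }
    perm[part] = pos;
  }
  return perm;
}
```

Requiring every part index 0..N-1 exactly once also enforces "exactly one Main building", because part 0 is one of the required indices. For example, labels [2, 0, 1] (first entry is Extension 2, second is Main building, third is Extension 1) yield [1, 2, 0] under these assumptions.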
Q6 — Detection: at start, persist a summary
Compute the multi-entry summary in the start-address-matching POST (route.ts:106),
where the full rows are already parsed in memory: which columns are
multi-valued, the distinct counts (with row-counts so we can pick the largest),
and the largest-count sample (address + per-column raw entries). Avoids
re-reading a 32k-row file at render. Classification enums are joined at render
from the override tables.
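A single pass over the parsed rows is enough to produce all three parts of the summary. The sketch below assumes rows are plain string-keyed objects and that the caller knows which columns to inspect; the interface, field names, and function name are hypothetical, not the shape used in route.ts.

```typescript
// Hypothetical summary shape; the real multiEntrySummary may differ.
interface MultiEntrySummarySketch {
  multiValuedColumns: string[];
  rowCountsByCount: Record<number, number>; // entry-count -> number of rows
  sample: { address: string; entries: Record<string, string[]> } | null;
}

function summarizeMultiEntry(
  rows: Record<string, string>[],
  addressColumn: string,
  candidateColumns: string[],
): MultiEntrySummarySketch {
  const multi = new Set<string>();
  const rowCountsByCount: Record<number, number> = {};
  let sample: MultiEntrySummarySketch["sample"] = null;
  let best = 1;

  for (const row of rows) {
    const entries: Record<string, string[]> = {};
    let count = 1;
    for (const col of candidateColumns) {
      const parts = (row[col] ?? "")
        .split(",")
        .map((s) => s.trim())
        .filter((s) => s.length > 0);
      if (parts.length > 1) {
        multi.add(col);
        entries[col] = parts;
        count = Math.max(count, parts.length);
      }
    }
    if (count > 1) {
      rowCountsByCount[count] = (rowCountsByCount[count] ?? 0) + 1;
      // Keep the first row seen with the largest count as the sample.
      if (count > best) {
        best = count;
        sample = { address: row[addressColumn] ?? "", entries };
      }
    }
  }
  return { multiValuedColumns: [...multi], rowCountsByCount, sample };
}
```

Keeping the first row that reaches the largest count is an arbitrary tie-break chosen for the sketch; the notes only require that the sample be a largest-count row.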
Q7 — Persistence: two jsonb columns on bulk_address_uploads
- multiEntrySummary jsonb: written at start (detection).
- multiEntryOrdering jsonb: written at confirm, holding { count: permutation } plus verifyAck / orderingConfirmed flags (final shape TBD; may split flags into their own columns).
No new table — mirrors how columnMapping lives on the upload row.
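Since the final shape is still TBD, the following is only one possible reading of the multiEntryOrdering column, with a reader-side check. The interface, the "orderings" field name, and both functions are assumptions. One detail that holds regardless of the final shape: JSON object keys are strings, so a count of 2 comes back from jsonb as the key "2".

```typescript
// One possible shape (final shape is TBD in the notes); all names are hypothetical.
interface MultiEntryOrderingSketch {
  orderings: Record<string, number[]>; // keys are counts as strings, e.g. "2"
  verifyAck: boolean;
  orderingConfirmed: boolean;
}

// True when p is a permutation of 0..n-1.
function isPermutation(p: unknown, n: number): p is number[] {
  return (
    Array.isArray(p) &&
    p.length === n &&
    new Set(p).size === n &&
    p.every((v) => Number.isInteger(v) && v >= 0 && v < n)
  );
}

// Validate an untyped jsonb value before trusting it.
function parseMultiEntryOrdering(raw: unknown): MultiEntryOrderingSketch | null {
  if (typeof raw !== "object" || raw === null) return null;
  const { orderings: rawOrderings, verifyAck, orderingConfirmed } =
    raw as Record<string, unknown>;
  if (typeof verifyAck !== "boolean" || typeof orderingConfirmed !== "boolean") return null;
  if (typeof rawOrderings !== "object" || rawOrderings === null) return null;

  const orderings: Record<string, number[]> = {};
  for (const [count, perm] of Object.entries(rawOrderings)) {
    // Each permutation's length must equal the count it is keyed by.
    if (!isPermutation(perm, Number(count))) return null;
    orderings[count] = perm;
  }
  return { orderings, verifyAck, orderingConfirmed };
}
```

If the flags are later split into their own columns, only the two boolean checks would move; the per-count permutation check would stay the same.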
Risks / load-bearing assumptions
- Consistent-mistake assumption. All rows of a given count share one ordering convention. The whole "ask once" design rests on this; if a file mixes conventions within a count, a single per-count permutation is wrong.
- Largest-count-only capture. Smaller counts stay unpopulated in the map. A future consumer (or a later UI iteration) needs a derivation rule to apply the convention to other counts.
- Normalization coupling (mitigated). To join the sample's raw entries to the override tables the frontend must match the backend's split(",")→strip→lower. Resolution: store the normalized description keys in multiEntrySummary at start (the route already holds the rows), so the render-time join is exact-match, with no cross-repo string-normalization drift.
- Portfolio-wide blast radius. A verify-step edit changes the mapping for every row with that description, not just the sample address. Must be messaged in the UI.
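The normalization mitigation can be sketched as "normalize once at start, exact-match at render". The names below are hypothetical, and the sketch assumes JavaScript's trim() and toLowerCase() agree with the backend's strip and lower for the descriptions in practice; the two may differ on unusual Unicode input, which is why the key is computed in one place and stored.

```typescript
// Hypothetical shapes and names; a sketch of the mitigation, not the route's code.
interface SampleEntry {
  raw: string; // shown to the user as written in the file
  key: string; // normalized once at start, stored in the summary
}

// Done once, when the summary is written.
function toSampleEntries(cell: string): SampleEntry[] {
  return cell.split(",").map((part) => {
    const raw = part.trim();
    return { raw, key: raw.toLowerCase() };
  });
}

// Done at render: exact-match lookup, no re-normalization.
function joinClassification(
  entries: SampleEntry[],
  overridesByDescription: Map<string, string>,
): Array<SampleEntry & { value: string | null }> {
  return entries.map((e) => ({
    ...e,
    value: overridesByDescription.get(e.key) ?? null,
  }));
}
```

A null value marks an entry with no override row for its key, which the verify step would need to display in some way; how it does so is not specified in the notes.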
Suggested issues (/to-issues)
- Schema: two jsonb columns on bulk_address_uploads + migration.
- Detection at start: compute + persist multiEntrySummary (with normalized description keys).
- Verify step: list sample descriptions → enum (join override tables), editable; Next.js upsert route writing source='user'; verifyAck flag.
- Order step: largest-count sample, position→part dropdowns → permutation; persist multiEntryOrdering; orderingConfirmed flag.
- Gate: wire canFinalize to the two flags; conditional stepper rendering.