docs: handover — flagged numbers were stale (different branch), Part 1 is the task

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Khalim Conn-Kowlessar 2026-06-04 12:11:12 +00:00
parent 8bd8ff8e5c
commit 1085842395

@@ -43,47 +43,20 @@ at 1e-4. The user will provide the cert numbers + drop the PDFs. For each:
worksheet but cascade 0/wrong → calculator; input field absent in `epc` → mapper or
extractor. Fix one cause = one slice.
### Part 2 — six flagged certs to fetch fresh and debug
### Part 2 — (secondary) re-check the previously-flagged certs on THIS branch
The user's dashboard flags these (their numbers, **sign = lodged - our**):
A dashboard once flagged six certs (0240, 0390-2954-3640, 2130, 6035, 7536, 9390). **Those
numbers are STALE — they came from a branch WITHOUT this branch's fixes** (the user confirmed
this). Do not chase them. On THIS branch the picture is different and mostly settled:
| cert | lodged | their "our" | their Δ |
|---|---|---|---|
| 0240-0200-5706-2365-8010 | 73 | 71.73 | +1.27 |
| 0390-2954-3640-2196-4175 | 60 | 66.85 | -6.85 |
| 2130-1033-4050-5007-8395 | 82 | 83.35 | -1.35 |
| 6035-7729-2309-0879-2296 | 70 | 67.81 | +2.19 |
| 7536-3827-0600-0600-0276 | 68 | 69.07 | -1.07 |
| 9390-2722-3520-2105-8715 | 67 | 71.24 | -4.24 |
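The Δ column follows from the stated convention (lodged minus our). A minimal sketch that recomputes it from the table's own figures; the names `FLAGGED` and `delta` are illustrative and not part of the repository:

```python
# Lodged SAP and the dashboard's "our" value, copied from the table above.
FLAGGED = {
    "0240-0200-5706-2365-8010": (73, 71.73),
    "0390-2954-3640-2196-4175": (60, 66.85),
    "2130-1033-4050-5007-8395": (82, 83.35),
    "6035-7729-2309-0879-2296": (70, 67.81),
    "7536-3827-0600-0600-0276": (68, 69.07),
    "9390-2722-3520-2105-8715": (67, 71.24),
}


def delta(lodged, ours):
    """Signed gap in the table's convention: lodged minus our, to 2 d.p."""
    return round(lodged - ours, 2)


for cert, (lodged, ours) in FLAGGED.items():
    # A negative value means the dashboard scored the cert above the lodged SAP.
    print(f"{cert}  {delta(lodged, ours):+.2f}")
```

Four of the six gaps come out negative under this convention, which is why the sign matters when comparing against residuals quoted elsewhere.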
- 7536 (68.924, +1), 2130 (83.78, +2), 0240 (1) — concluded **0240-like** (API-only
residuals; see per-cert notes below). 0390-2954-3640 pins at **+0** (exact).
- 6035 (+2.19) and 9390 (community, 2) carry documented open residuals (see notes) but are
lower-priority and not worksheet-backed.
### ⚠ CRITICAL — reconcile the numbers FIRST, before debugging
**The user's flagged numbers DO NOT match the golden-fixtures cascade.** All six certs are
already golden fixtures (`tests/domain/sap10_calculator/rdsap/fixtures/golden/<cert>.json`),
and the cascade gives different values:
- **0390-2954-3640 is pinned at resid +0** (our cascade = 60, EXACTLY lodged) — but the user
flags it at **66.85 (-6.85)**. A 6.85 SAP gap can't be staleness.
- 7536 (their 69.07) and 2130 (their 83.35) are **pre-this-session** values — the S0380.214
roof fix moved 7536 → 68.924, and the S0380.215 wall fix moved 2130 → 83.78.
So the user's numbers come from a **different computation** than the golden cascade. Two
hypotheses, test both before assuming the cascade is wrong:
1. **Fresh API JSON ≠ curated fixture.** The golden fixtures were bulk-fetched once
(`scripts/fetch_cohort2_api_jsons.py`, which *skips certs whose JSON already exists*) and
some may have been hand-corrected since. **Fetch each cert fresh and `diff` the raw JSON
against the committed fixture.** If they differ, the fixture was curated and the fresh raw
data is what the user's pipeline sees — debug the FRESH data. This is the most likely cause
and exactly why the user wants a fresh fetch.
2. **A different SAP engine.** The production stack (`backend/SearchEpc.py` →
`etl/epc_clean/epc_attributes/*` → `backend/engine/engine.py`) is a SEPARATE mapping +
scorer from `cert_to_inputs`. If the user's dashboard is produced there, that's a different
code path than the golden cascade. Ask the user which pipeline the table came from.
Do NOT start "fixing" the cascade to hit the user's numbers until you know which pipeline
produced them. The golden cascade is worksheet-validated for 47 certs; chasing a dashboard
number from a different stack would regress it.
So Part 2 is only worth touching if a **fresh fetch differs from the committed fixture**
(curated/hand-corrected fixtures can mask raw-API mapper behaviour) — `diff` fresh vs fixture
and debug the delta. Otherwise these are done; the real new work is **Part 1**.
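One way to run the fresh-vs-fixture comparison is a structural diff of the two JSON documents rather than a textual one, so key order and whitespace do not produce noise. This is a sketch only: `json_delta`, `diff_files` and the `MISSING` sentinel are hypothetical helpers, and the field names in the usage note are made up for illustration, not taken from the real EPC schema.

```python
import json
from pathlib import Path

MISSING = "<missing>"  # sentinel for a key present on only one side


def json_delta(fixture, fresh, path=""):
    """Return (path, fixture_value, fresh_value) for every leaf that differs."""
    if isinstance(fixture, dict) and isinstance(fresh, dict):
        out = []
        for key in sorted(set(fixture) | set(fresh)):
            out += json_delta(
                fixture.get(key, MISSING), fresh.get(key, MISSING), f"{path}/{key}"
            )
        return out
    if isinstance(fixture, list) and isinstance(fresh, list) and len(fixture) == len(fresh):
        out = []
        for i, (a, b) in enumerate(zip(fixture, fresh)):
            out += json_delta(a, b, f"{path}/{i}")
        return out
    # Scalars, type mismatches, and lists of different length are reported whole.
    return [] if fixture == fresh else [(path, fixture, fresh)]


def diff_files(fixture_path, fresh_path):
    """Load two JSON files and return their structural differences."""
    fixture = json.loads(Path(fixture_path).read_text())
    fresh = json.loads(Path(fresh_path).read_text())
    return json_delta(fixture, fresh)
```

For example, `json_delta({"walls": [{"u_value": 0.3}]}, {"walls": [{"u_value": 0.45}]})` returns `[("/walls/0/u_value", 0.3, 0.45)]`. An empty list means the fixture matches the fresh fetch and, per the paragraph above, Part 2 needs no further work for that cert.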
---