chore(tooling): add median column to corpus profiler; document shutter non-implementation

profile_corpus_error.py: print the median signed error alongside the
mean in each feature bucket. The mean is dragged by fat-tail register
anomalies (e.g. electric room heaters: mean -1.09 but median -0.01).
The median is the outlier-resistant lens for finding TRUE systematic
slices, so hunt by |median|, not |signed|.

heat_transmission.py: document why permanent-shutter R is deliberately
NOT applied (Elmhurst uses the R=0.04 curtain allowance on every window,
incl. those with insulated shutters, proven on case 46; the API-path
trial worsens MAE). Comment-only.

pyright not installed locally, so the strict type gate was not run.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-30 13:10:47 +00:00 · 2026-06-23 08:46:00 +00:00 · 2026-06-23 08:46:00 +00:00 · 2ac5ec6eb5
commit 2ac5ec6eb5
parent 6a4539f26b
2 changed files with 27 additions and 2 deletions
--- a/domain/sap10_calculator/worksheet/heat_transmission.py
+++ b/domain/sap10_calculator/worksheet/heat_transmission.py
@@ -129,6 +129,25 @@ _DEFAULT_STOREY_HEIGHT_M: Final[float] = 2.5
 _CONSERVATORY_WALL_THICKNESS_MM: Final[int] = 300
 # SAP10.2 §3.2 curtain/blind thermal resistance applied to windows (and
 # roof windows) — turns raw window U into the worksheet's (27) effective U.
+#
+# PERMANENT SHUTTERS ARE DELIBERATELY NOT APPLIED. RdSAP 10 Table 24 note
+# (PDF p.51) genuinely specifies a larger R in formula (2) for permanent
+# shutters (0.13 uninsulated / 0.16 insulated) instead of this 0.04 curtain
+# allowance — but the accredited Elmhurst engine ("SAP 10 WORKSHEET ...
+# Version 10.2, February 2022") does NOT implement it, and the lodged
+# register we target was produced by that engine. Proven on Khalim's
+# "simulated case 46" worksheet: its INSULATED-shutter window (Metal,
+# U_raw 1.74) bills on (27) at U_eff 1.6268 = 1/(1/1.74 + 0.04), NOT the
+# R=0.16 value 1.3611; its UNINSULATED-shutter window (Wood, U_raw 1.69)
+# bills at 1.5830 = 1/(1/1.69 + 0.04), not the R=0.13 value 1.3856 — i.e.
+# R=0.04 on every window regardless of the lodged shutter type. A reverted
+# trial (`_window_added_resistance_m2k_per_w`, in git history) re-broke the
+# Elmhurst-pinned 000565 e2e fixture. The gov-API path was separately tested
+# (2026-06-21): 13 corpus certs lodge shuttered windows; applying the spec R
+# WORSENS cohort MAE 1.067 -> 1.245 (8 of 13 certs move away from lodged,
+# cert 41 +2.19 -> +3.94); the only gain is band-gaming 3 borderline
+# under-raters into within-0.5. Net negative, and it would diverge from the
+# engine that produced the lodged SAPs. Re-enable only if Elmhurst ships it.
 _WINDOW_CURTAIN_RESISTANCE_M2K_PER_W: Final[float] = 0.04

 # SAP10 glazing-type code (the cascade enum used on `SapWindow.glazing_type`,
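
The case-46 figures quoted in the comment above can be re-derived from the series-resistance formula it cites, U_eff = 1/(1/U_raw + R). The sketch below is only an arithmetic check: `effective_window_u` and `CASE_46_WINDOWS` are hypothetical names introduced for illustration, not functions or constants from `heat_transmission.py`, and the inputs are the U_raw and R values stated in the comment.

```python
def effective_window_u(u_raw: float, added_resistance: float = 0.04) -> float:
    """Effective U after adding a series resistance: 1 / (1/U_raw + R).

    Hypothetical helper for illustration; the default R is the 0.04
    curtain allowance quoted in the comment.
    """
    if u_raw <= 0:
        raise ValueError("u_raw must be positive")
    return 1.0 / (1.0 / u_raw + added_resistance)


# (label, U_raw, R that the Table 24 note would call for), as quoted above.
CASE_46_WINDOWS = [
    ("insulated shutter, Metal", 1.74, 0.16),
    ("uninsulated shutter, Wood", 1.69, 0.13),
]

for label, u_raw, spec_r in CASE_46_WINDOWS:
    curtain_u = effective_window_u(u_raw)          # R = 0.04 on every window
    shutter_u = effective_window_u(u_raw, spec_r)  # R the spec note specifies
    print(f"{label:26s} curtain={curtain_u:.4f} shutter={shutter_u:.4f}")
```

Rounded to four decimals, the curtain values match the worksheet figures quoted in the comment (1.6268 and 1.5830), and the shutter values match the rejected alternatives (1.3611 and 1.3856).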
--- a/scripts/profile_corpus_error.py
+++ b/scripts/profile_corpus_error.py
@@ -182,11 +182,17 @@ def main() -> None:
            w05 = sum(1 for e in es if abs(e) < 0.5)
            mabs = stats.mean(abs(e) for e in es)
            waste = (cnt - w05) * mabs
+            # MEDIAN signed error is the outlier-RESISTANT bias lens. The
+            # `signed` mean is dragged by the fat-tail register anomalies, so a
+            # cohort can show a large mean bias while being symmetric noise
+            # (e.g. electric room heaters: mean -1.09 but median -0.01). Hunt
+            # slices by |median|, not |signed| — only |median| flags a TRUE
+            # one-directional systematic shift worth fixing.
            bucket_lines.append((waste, (
                f"  {fn:22s}={val:<20.20s} n={cnt:4d} "
                f"within0.5={w05 / cnt * 100:4.0f}% "
-                f"signed={stats.mean(es):+6.2f} mean|err|={mabs:5.2f} "
-                f"[waste={waste:6.0f}]"
+                f"signed={stats.mean(es):+6.2f} median={stats.median(es):+6.2f} "
+                f"mean|err|={mabs:5.2f} [waste={waste:6.0f}]"
            )))
    print(f"TOP ERROR-CARRYING BUCKETS (n_out x mean|err|; min-n={min_n}):")
    for _, line in sorted(bucket_lines, key=lambda x: -x[0])[:40]:
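
To make the mean-versus-median point in the new comment concrete, here is a minimal sketch on made-up data. The error lists and the `bias_summary` helper are hypothetical and not taken from the corpus or from `profile_corpus_error.py`; it also assumes that `stats` in the script is the standard-library `statistics` module.

```python
import statistics as stats

# Hypothetical signed errors (predicted minus lodged), not corpus data.
# Symmetric noise around zero plus a single fat-tail register anomaly.
noisy_cohort = [-0.3, -0.1, 0.0, 0.1, 0.2, -12.0]
# Every cert shifted the same way: a one-directional systematic bias.
shifted_cohort = [-1.2, -0.9, -1.0, -1.1, -0.8]


def bias_summary(errors: list[float]) -> tuple[float, float]:
    """Return (mean, median) of the signed errors for one bucket."""
    return stats.mean(errors), stats.median(errors)


for name, errors in [("noisy", noisy_cohort), ("shifted", shifted_cohort)]:
    mean_err, median_err = bias_summary(errors)
    print(f"{name:8s} signed={mean_err:+6.2f} median={median_err:+6.2f}")
```

In the noisy cohort the single anomaly pulls the mean far below zero while the median stays near zero. In the shifted cohort the two agree, which is the pattern the comment treats as a systematic slice worth fixing.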