Use NullPool as a graceful ceiling for the one-connection-per-lambda design

The invocation is architecturally one DB connection at a time (read up front, sequential write Units of Work, overrides resolved on the unit's own session). Keep that as the design intent, but back it with NullPool instead of a fixed pool_size=1 pool: each checkout opens a fresh connection and closes it on return, so there is no pool slot to exhaust.

The difference is the failure mode if a path ever regresses and holds two Sessions at once. A pool_size=1/max_overflow=0 pool turns that into a hard 30s dead-lock that fails the whole invocation ("QueuePool limit of size 1 overflow 0 reached, connection timed out"): the second checkout waits out the pool timeout for a slot that the same code path is still holding. NullPool instead opens a transient second connection for that instant and the Lambda keeps running. The design target stays one connection; NullPool just keeps the Lambda alive if we slip.

The single-connection invariant itself is still enforced in the Unit of Work (overrides read on the unit's own session) and pinned by the regression test, which uses its own strict pool_size=1 engine so it asserts the architecture regardless of the production NullPool choice.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
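The two failure modes described above can be reproduced outside the Lambda. The following is a minimal sketch, not the project's code: it assumes plain SQLAlchemy with an in-memory SQLite database in place of Postgres, builds its engines with `create_engine` directly instead of the project's `make_engine`, and shortens `pool_timeout` so the strict case fails quickly. The helper name `can_hold_two_connections` and both engine names are hypothetical.

```python
from sqlalchemy import create_engine, exc, text
from sqlalchemy.engine import Engine
from sqlalchemy.pool import NullPool, QueuePool


def can_hold_two_connections(engine: Engine) -> bool:
    """Return True if a second connection can be checked out while the
    first is still held, False if the pool times out instead."""
    with engine.connect() as first:
        first.execute(text("SELECT 1"))
        try:
            # Overlapping checkout: the situation the commit guards against.
            with engine.connect() as second:
                second.execute(text("SELECT 1"))
        except exc.TimeoutError:
            # QueuePool had no free slot before pool_timeout elapsed.
            return False
    return True


# Strict ceiling (assumed settings): one slot, no overflow. pool_timeout is
# shortened from SQLAlchemy's 30 s default so the demonstration fails fast.
strict_engine = create_engine(
    "sqlite://",
    poolclass=QueuePool,
    pool_size=1,
    max_overflow=0,
    pool_timeout=0.2,
)

# Graceful ceiling: NullPool opens a connection per checkout and closes it
# on return, so there is no slot to exhaust.
graceful_engine = create_engine("sqlite://", poolclass=NullPool)

strict_ok = can_hold_two_connections(strict_engine)
graceful_ok = can_hold_two_connections(graceful_engine)
print(f"strict pool allows overlap: {strict_ok}")
print(f"NullPool allows overlap:    {graceful_ok}")
```

Under these assumptions the strict engine should report that the overlap is refused and the NullPool engine that it is tolerated. That asymmetry is why a regression test built on a strict one-slot engine can still pin the single-connection design while production runs on NullPool.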
2026-06-30 13:10:47 +00:00 · 2026-06-24 17:10:23 +00:00 · 2026-06-24 17:10:23 +00:00 · fb308cfaea
commit fb308cfaea
parent de71f9abb6
1 changed files with 31 additions and 21 deletions
--- a/applications/modelling_e2e/handler.py
+++ b/applications/modelling_e2e/handler.py
@@ -18,18 +18,19 @@ All Measure Types are considered: pricing goes through
 and heating gaps) are priced from the committed off-catalogue overlay instead of
 crashing.
-DB engine is module-scoped so the connection pool is reused across warm
-invocations (ADR-0012). The pool holds a single connection (``pool_size=1``): the
-handler reads everything up front — overrides, Scenario, a catalogue snapshot, and
-stored Solar — through one short-lived read Session, closes it, then writes each
-Property in a sequential Unit of Work, so the read and write Sessions never
-overlap. The orchestrator shares the same engine and releases its connection
-between bookkeeping commits, so one invocation uses one DB connection at a time.
+The DB engine is module-scoped (ADR-0012). Architecturally each invocation uses
+one DB connection at a time: the handler reads everything up front — overrides,
+Scenario, a catalogue snapshot, and stored Solar — through one short-lived read
+Session, closes it, then writes each Property in a sequential Unit of Work whose
+overrides resolve on its own session, so no two Sessions ever overlap. The engine
+uses ``NullPool`` rather than a fixed pool so that target is a graceful ceiling,
+not a hard one: a fresh connection is opened per checkout and closed on return,
+so there is no pool slot to exhaust — any future accidental overlap opens a
+transient second connection instead of dead-locking the Lambda.
 """
 from __future__ import annotations
 import dataclasses
 import io
 import os
 from collections.abc import Callable, Generator
@@ -39,6 +40,7 @@ from typing import Any, Optional, cast
 import boto3
 import pandas as pd  # pyright: ignore[reportMissingTypeStubs]
 from sqlalchemy import Engine, text
+from sqlalchemy.pool import NullPool
 from sqlmodel import Session
 from datatypes.epc.domain.epc_property_data import (
@@ -136,26 +138,34 @@ def _get_engine() -> Engine:
    global _engine
    if _engine is None:
        config = PostgresConfig.from_env(dict(os.environ))
-        # One connection per invocation: the handler reads everything up front
-        # through one short-lived read Session, closes it, then writes each
-        # Property in a sequential Unit of Work — so the read and write Sessions
-        # never overlap and a single pooled connection suffices. The orchestrator
-        # shares this engine (see ``_shared_engine_orchestrator``) and releases
-        # its connection between bookkeeping commits, so it holds none during the
-        # work. 32 concurrent containers × 1 connection = 32 against RDS.
-        _engine = make_engine(dataclasses.replace(config, pool_size=1, max_overflow=0))
+        # Architecturally one connection per invocation: the handler reads
+        # everything up front through one short-lived read Session, closes it,
+        # then writes each Property in a sequential Unit of Work — and the Unit of
+        # Work resolves overrides on its own session — so no two Sessions overlap
+        # and a single connection suffices. 32 concurrent containers × 1 = 32
+        # against RDS.
+        #
+        # NullPool, not a fixed pool, enforces that as a *graceful* ceiling rather
+        # than a hard one: each checkout opens a fresh connection and closes it on
+        # return, so there is no pool slot to exhaust. If a future code path ever
+        # holds two Sessions at once it opens a second connection for that instant
+        # instead of dead-locking on a 1-slot pool and failing the whole
+        # invocation (the "QueuePool limit of size 1 overflow 0 reached" timeout).
+        # The design target stays one connection; NullPool just keeps the Lambda
+        # running if we ever regress it.
+        _engine = make_engine(config, poolclass=NullPool)
    return _engine
@contextmanager
 def _shared_engine_orchestrator() -> Generator[TaskOrchestrator, None, None]:
-    """A ``TaskOrchestrator`` on the same module-scoped pooled engine as the
-    modelling work — not a separate per-invocation NullPool engine.
+    """A ``TaskOrchestrator`` on the same module-scoped engine as the modelling
+    work, not a separate one.
-    Its repositories commit on every ``save``/``create``, releasing the pooled
+    Its repositories commit on every ``save``/``create``, releasing the
    connection between bookkeeping calls, so it holds none while the wrapped
-    handler body runs. Combined with the read-then-write handler structure and
-    ``pool_size=1``, the whole invocation uses one DB connection at a time."""
+    handler body runs. Combined with the read-then-write handler structure, the
+    whole invocation uses one DB connection at a time."""
    engine = _get_engine()
    with Session(engine) as session:
        yield TaskOrchestrator(