Use NullPool as a graceful ceiling for the one-connection-per-lambda design

The invocation is architecturally one DB connection at a time (read up front, sequential write Units of Work, overrides resolved on the unit's own session). Keep that as the design intent, but back it with NullPool instead of a fixed pool_size=1 pool: each checkout opens a fresh connection and closes it on return, so there is no pool slot to exhaust.

The difference is the failure mode if a path ever regresses and holds two Sessions at once. A pool_size=1/max_overflow=0 pool turns that into a hard 30s dead-lock that fails the whole invocation ("QueuePool limit of size 1 overflow 0 reached, connection timed out"). NullPool instead opens a transient second connection for that instant and the Lambda keeps running. The design target stays one connection; NullPool just keeps it alive if we slip.

The single-connection invariant itself is still enforced in the Unit of Work (overrides read on the unit's own session) and pinned by the regression test, which uses its own strict pool_size=1 engine so it asserts the architecture regardless of the production NullPool choice.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
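The two failure modes described in the message can be sketched with a stdlib-only toy model. This is not SQLAlchemy and not code from this repository: `OneSlotPool`, `NoPool` and `hold_two_at_once` are hypothetical names, the connection is a plain `object()`, and the short timeout stands in for the 30s pool timeout mentioned above.

```python
import queue


class OneSlotPool:
    """Toy model of a fixed pool with one slot and no overflow (hypothetical)."""

    def __init__(self, timeout):
        self._slots = queue.Queue(maxsize=1)
        self._slots.put(object())  # the single pooled "connection"
        self._timeout = timeout

    def checkout(self):
        try:
            # Blocks until the slot is returned, then gives up.
            return self._slots.get(timeout=self._timeout)
        except queue.Empty:
            raise TimeoutError("pool of size 1 exhausted, checkout timed out") from None

    def checkin(self, conn):
        self._slots.put(conn)


class NoPool:
    """Toy model of NullPool-style behaviour: open on checkout, close on return."""

    def __init__(self):
        self.open_now = 0  # connections currently open
        self.peak = 0      # highest number open at the same time

    def checkout(self):
        self.open_now += 1
        self.peak = max(self.peak, self.open_now)
        return object()  # a fresh "connection" every time

    def checkin(self, conn):
        self.open_now -= 1  # "closing" the connection


def hold_two_at_once(pool):
    """Simulate the regression: a second checkout while the first is still held."""
    first = pool.checkout()
    try:
        second = pool.checkout()
    except TimeoutError:
        pool.checkin(first)
        return "timed out"
    pool.checkin(second)
    pool.checkin(first)
    return "ok"


strict = OneSlotPool(timeout=0.05)
graceful = NoPool()
print(hold_two_at_once(strict))    # timed out
print(hold_two_at_once(graceful))  # ok
print(graceful.peak)               # 2
```

In the toy model the strict pool fails the overlapping caller, while the no-pool variant briefly reaches two open connections and returns to zero, which is the "graceful ceiling" trade-off the message argues for.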
2026-06-30 13:10:47 +00:00 · 2026-06-24 17:10:23 +00:00 · 2026-06-24 17:10:23 +00:00 · fb308cfaea
commit fb308cfaea
parent de71f9abb6
1 changed file with 31 additions and 21 deletions
--- a/applications/modelling_e2e/handler.py
+++ b/applications/modelling_e2e/handler.py
@@ -18,18 +18,19 @@ All Measure Types are considered: pricing goes through
 and heating gaps) are priced from the committed off-catalogue overlay instead of
 crashing.

-DB engine is module-scoped so the connection pool is reused across warm
-invocations (ADR-0012). The pool holds a single connection (``pool_size=1``): the
-handler reads everything up front — overrides, Scenario, a catalogue snapshot, and
-stored Solar — through one short-lived read Session, closes it, then writes each
-Property in a sequential Unit of Work, so the read and write Sessions never
-overlap. The orchestrator shares the same engine and releases its connection
-between bookkeeping commits, so one invocation uses one DB connection at a time.
+The DB engine is module-scoped (ADR-0012). Architecturally each invocation uses
+one DB connection at a time: the handler reads everything up front — overrides,
+Scenario, a catalogue snapshot, and stored Solar — through one short-lived read
+Session, closes it, then writes each Property in a sequential Unit of Work whose
+overrides resolve on its own session, so no two Sessions ever overlap. The engine
+uses ``NullPool`` rather than a fixed pool so that target is a graceful ceiling,
+not a hard one: a fresh connection is opened per checkout and closed on return,
+so there is no pool slot to exhaust — any future accidental overlap opens a
+transient second connection instead of dead-locking the Lambda.
 """

 from __future__ import annotations

-import dataclasses
 import io
 import os
 from collections.abc import Callable, Generator
@@ -39,6 +40,7 @@ from typing import Any, Optional, cast
 import boto3
 import pandas as pd  # pyright: ignore[reportMissingTypeStubs]
 from sqlalchemy import Engine, text
+from sqlalchemy.pool import NullPool
 from sqlmodel import Session

 from datatypes.epc.domain.epc_property_data import (
@@ -136,26 +138,34 @@ def _get_engine() -> Engine:
    global _engine
    if _engine is None:
        config = PostgresConfig.from_env(dict(os.environ))
-        # One connection per invocation: the handler reads everything up front
-        # through one short-lived read Session, closes it, then writes each
-        # Property in a sequential Unit of Work — so the read and write Sessions
-        # never overlap and a single pooled connection suffices. The orchestrator
-        # shares this engine (see ``_shared_engine_orchestrator``) and releases
-        # its connection between bookkeeping commits, so it holds none during the
-        # work. 32 concurrent containers × 1 connection = 32 against RDS.
-        _engine = make_engine(dataclasses.replace(config, pool_size=1, max_overflow=0))
+        # Architecturally one connection per invocation: the handler reads
+        # everything up front through one short-lived read Session, closes it,
+        # then writes each Property in a sequential Unit of Work — and the Unit of
+        # Work resolves overrides on its own session — so no two Sessions overlap
+        # and a single connection suffices. 32 concurrent containers × 1 = 32
+        # against RDS.
+        #
+        # NullPool, not a fixed pool, enforces that as a *graceful* ceiling rather
+        # than a hard one: each checkout opens a fresh connection and closes it on
+        # return, so there is no pool slot to exhaust. If a future code path ever
+        # holds two Sessions at once it opens a second connection for that instant
+        # instead of dead-locking on a 1-slot pool and failing the whole
+        # invocation (the "QueuePool limit of size 1 overflow 0 reached" timeout).
+        # The design target stays one connection; NullPool just keeps the Lambda
+        # running if we ever regress it.
+        _engine = make_engine(config, poolclass=NullPool)
    return _engine


@contextmanager
 def _shared_engine_orchestrator() -> Generator[TaskOrchestrator, None, None]:
-    """A ``TaskOrchestrator`` on the same module-scoped pooled engine as the
-    modelling work — not a separate per-invocation NullPool engine.
+    """A ``TaskOrchestrator`` on the same module-scoped engine as the modelling
+    work, not a separate one.

-    Its repositories commit on every ``save``/``create``, releasing the pooled
+    Its repositories commit on every ``save``/``create``, releasing the
    connection between bookkeeping calls, so it holds none while the wrapped
-    handler body runs. Combined with the read-then-write handler structure and
-    ``pool_size=1``, the whole invocation uses one DB connection at a time."""
+    handler body runs. Combined with the read-then-write handler structure, the
+    whole invocation uses one DB connection at a time."""
    engine = _get_engine()
    with Session(engine) as session:
        yield TaskOrchestrator(