LLD 12 — Child Session Lifecycle, Quotas, Cascade, Sweepers (as-built)
| Field | Value |
|---|---|
| Status | Merged as of PR #440 (squash: 289e919) |
| Date | 2026-04-15 |
| Parent HLD | #430 |
| Milestone | 9 (Agent Runtime Optimisation & Security Hardening) |
| Delivery issue | #432 |
This document captures the as-built invariants of the
SubagentCoordinator. It is intentionally lean — full design rationale
lives in HLD #430 and the per-file doc comments in
internal/runtime/subagent/coordinator/. The Race-Freedom Invariants
section (§3) is the normative contract that operators and future authors
of multi-replica coordinator variants must not violate.
1. Scope
The coordinator owns child session lifecycle: quota admission (4
dimensions — depth, fanout, tree-size, concurrent), cycle detection,
cascade termination, timeout + orphan sweepers, Redis per-tenant slot
counter with self-healing reconciler, and a single-replica daemon binary
(cmd/subagent-coordinator). Out of scope: tool surface (LLD 11),
result delivery / rendezvous (LLD 13), approval integration (LLD 14).
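The four quota dimensions can be pictured as one config struct whose tenant overrides are bounded by platform ceilings. The following is a minimal sketch only: MaxFanout, the helper names, and all numeric values are assumptions made for illustration, and the real QuotaConfig and ApplyTenantQuotas in quotas.go may behave differently.

```go
package main

import "fmt"

// QuotaConfig holds the four admission dimensions. Field names are
// illustrative; the real struct lives in quotas.go.
type QuotaConfig struct {
	MaxDepth      int
	MaxFanout     int
	MaxTreeSize   int
	MaxConcurrent int
}

// clamp falls back to def when the tenant value is unset (<= 0) and
// caps any tenant value at the platform ceiling.
func clamp(tenant, def, ceiling int) int {
	if tenant <= 0 {
		return def
	}
	if tenant > ceiling {
		return ceiling
	}
	return tenant
}

// applyTenantQuotas is a hypothetical stand-in for ApplyTenantQuotas.
func applyTenantQuotas(tenant, defaults, ceilings QuotaConfig) QuotaConfig {
	return QuotaConfig{
		MaxDepth:      clamp(tenant.MaxDepth, defaults.MaxDepth, ceilings.MaxDepth),
		MaxFanout:     clamp(tenant.MaxFanout, defaults.MaxFanout, ceilings.MaxFanout),
		MaxTreeSize:   clamp(tenant.MaxTreeSize, defaults.MaxTreeSize, ceilings.MaxTreeSize),
		MaxConcurrent: clamp(tenant.MaxConcurrent, defaults.MaxConcurrent, ceilings.MaxConcurrent),
	}
}

func main() {
	defaults := QuotaConfig{MaxDepth: 3, MaxFanout: 5, MaxTreeSize: 50, MaxConcurrent: 8}
	ceilings := QuotaConfig{MaxDepth: 5, MaxFanout: 10, MaxTreeSize: 200, MaxConcurrent: 32}
	tenant := QuotaConfig{MaxDepth: 9, MaxTreeSize: 100} // fanout and concurrent left unset

	fmt.Printf("%+v\n", applyTenantQuotas(tenant, defaults, ceilings))
}
```

In this sketch a tenant asking for depth 9 is capped at the ceiling of 5, while unset dimensions inherit the platform default.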
2. Components
| File | Responsibility |
|---|---|
| coordinator.go | Public surface: Invoke, Cancel, CascadeOnParentTerminated |
| checks.go | Three quota-check helpers (depth / fanout / tree-size) run inside one RO tx |
| quotas.go | QuotaConfig, platform defaults + ceilings, ApplyTenantQuotas, snapshot encoder |
| redis.go | SlotManager: per-tenant concurrent counter + reconciler |
| termination.go | CascadeTerminator: DFS-post-order tree termination |
| sweeper.go | SweepTimeouts + SweepOrphans: two cross-tenant batch-claim passes |
| scheduler.go | Daemon: single-replica tick loop (timeout / orphan / reconcile) |
| lease.go | AcquireLease: PG advisory-lock session-scoped singleton gate |
| pgadapter.go | pgxpool-backed walker, pending-count, org-list, bridge status store |
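The DFS-post-order termination listed for termination.go can be illustrated as follows. This is a sketch under assumptions: the in-memory children map, the session IDs, and the decision to include the root in the output are all hypothetical, and the real CascadeTerminator walks database rows. It also assumes the tree is acyclic, which the coordinator's cycle detection is meant to guarantee.

```go
package main

import "fmt"

// postOrder returns every session under root with each child listed
// before its parent, so leaves are terminated first.
func postOrder(children map[string][]string, root string) []string {
	var out []string
	var visit func(id string)
	visit = func(id string) {
		for _, c := range children[id] {
			visit(c) // descend before emitting the current node
		}
		out = append(out, id)
	}
	visit(root)
	return out
}

func main() {
	// Hypothetical tree: root has children a and b; a has a1 and a2.
	children := map[string][]string{
		"root": {"a", "b"},
		"a":    {"a1", "a2"},
	}
	fmt.Println(postOrder(children, "root")) // [a1 a2 a b root]
}
```

Terminating in this order means no session is ever marked terminated while one of its descendants is still live.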
3. Race-Freedom Invariants
The coordinator’s quota-correctness guarantees depend on three runtime invariants. If any one of them is violated, the depth / fanout / tree-size checks can oversubscribe under concurrent load. Invariants 3.1 and 3.2 are enforced by wiring decisions today; invariant 3.3 is a property of the Redis data model.
3.1 Single-replica coordinator (founder decision #6)
The cmd/subagent-coordinator binary is pinned to replicas=1 at
deploy time AND gated by a session-scoped PostgreSQL advisory lock
(pg_try_advisory_lock(LeaseKey()), see lease.go). A second replica
blocks in the acquire loop until the incumbent exits (SIGTERM / crash)
and PostgreSQL releases the lock on connection close.
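The acquire loop described above can be sketched roughly as below. The tryLockFunc type, acquireLease, and demoAcquire are hypothetical names for illustration; in the real lease.go the attempt would presumably run pg_try_advisory_lock on a connection that is then held for the life of the process, since a session-scoped lock is released when that connection closes.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// tryLockFunc is a hypothetical stand-in for one non-blocking attempt, e.g.
// running SELECT pg_try_advisory_lock($1) on a dedicated connection.
type tryLockFunc func(ctx context.Context) (bool, error)

// acquireLease polls try until it succeeds, fails, or the context ends.
func acquireLease(ctx context.Context, try tryLockFunc, interval time.Duration) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		ok, err := try(ctx)
		if err != nil {
			return err
		}
		if ok {
			return nil // this process is now the single active replica
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C: // incumbent still holds the lock; retry
		}
	}
}

// demoAcquire simulates an incumbent that holds the lock for two polls.
func demoAcquire() (attempts int, err error) {
	try := func(context.Context) (bool, error) {
		attempts++
		return attempts >= 3, nil
	}
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	err = acquireLease(ctx, try, 5*time.Millisecond)
	return attempts, err
}

func main() {
	attempts, err := demoAcquire()
	fmt.Println(attempts, err)
}
```

The second replica does no coordinator work while it sits in this loop, which is what keeps the deployment effectively single-replica even if replicas is misconfigured.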
This invariant is what makes the DB-backed quota checks in
runAdmissionChecks race-safe:
- CheckDepth reads agent_sessions.delegation_depth for the parent.
- CheckFanout runs COUNT(*) FROM subagent_invocations WHERE parent_session_id = $1 AND status = 'pending'.
- CheckTreeSize runs a WITH RECURSIVE walk under the root.
All three are COUNT-style reads inside a single RO transaction. Two
concurrent Invoke calls for the same parent, executing on two
coordinator replicas, would each observe the same pre-write COUNT and
each admit, so the parent gains two new children when the limit had
room for only one.
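The interleaving can be replayed deterministically. The sketch below is an in-memory model, not coordinator code: checkFanout, simulateTwoReplicas, and simulateSerial are hypothetical names, and the integer stands in for the pending-row COUNT.

```go
package main

import "fmt"

// checkFanout models the COUNT-style read: admit only while the
// observed pending count is below the limit.
func checkFanout(observed, limit int) bool { return observed < limit }

// simulateTwoReplicas replays the unsafe interleaving: both replicas
// read the count before either one writes its new row.
func simulateTwoReplicas(pending, limit int) int {
	seenA := pending // replica A reads
	seenB := pending // replica B reads the same pre-write value
	if checkFanout(seenA, limit) {
		pending++ // replica A inserts
	}
	if checkFanout(seenB, limit) {
		pending++ // replica B inserts based on a stale read
	}
	return pending
}

// simulateSerial runs the same two admissions one after the other,
// as a single replica with a serial executor would.
func simulateSerial(pending, limit int) int {
	for i := 0; i < 2; i++ {
		if checkFanout(pending, limit) {
			pending++
		}
	}
	return pending
}

func main() {
	fmt.Println("two replicas:", simulateTwoReplicas(4, 5)) // 6, over the limit
	fmt.Println("serialised:  ", simulateSerial(4, 5))      // 5, at the limit
}
```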
3.2 Per-session serial executor
Within the single coordinator process, Invoke calls for the same
parent session are serialised by the session-scoped executor upstream
(the MCP tool handler in LLD 11 runs inside the parent worker’s single
goroutine). Two concurrent Invoke calls for the same parent cannot
happen in-process today.
This invariant is implicit — it is not enforced by the coordinator
itself; it is a property of the LLD 11 tool-call invocation path. A
future code path that calls Invoke from two goroutines for the same
parent (e.g., a speculative-execution experiment) would need to add
its own parent-scoped lock or the checks would race even under the
single-replica guarantee.
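One possible shape for such a parent-scoped lock is a keyed mutex, sketched below. Everything here is hypothetical (parentLocks, lockFor, admitAll, the "parent-1" ID); the coordinator does not contain this code today, and the sketch never evicts map entries, which a real implementation would need to handle.

```go
package main

import (
	"fmt"
	"sync"
)

// parentLocks hands out one mutex per parent session ID.
type parentLocks struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

// lockFor returns the mutex for parentID, creating it on first use.
func (p *parentLocks) lockFor(parentID string) *sync.Mutex {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.locks == nil {
		p.locks = make(map[string]*sync.Mutex)
	}
	l, ok := p.locks[parentID]
	if !ok {
		l = &sync.Mutex{}
		p.locks[parentID] = l
	}
	return l
}

// admitAll fires n concurrent admissions for one parent and returns
// how many passed the fanout check.
func admitAll(n, limit int) int {
	var pl parentLocks
	pending := 0 // stand-in for the pending-row COUNT
	admitted := 0
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			l := pl.lockFor("parent-1")
			l.Lock()
			defer l.Unlock()
			if pending < limit { // check
				pending++ // write, still under the parent lock
				admitted++
			}
		}()
	}
	wg.Wait()
	return admitted
}

func main() {
	fmt.Println(admitAll(50, 5)) // 5
}
```

Holding the lock across both the check and the write is the point: releasing it between the two would reintroduce the race.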
3.3 Redis INCR atomic concurrent-slot counter
The MaxConcurrent dimension is enforced via SlotManager.AcquireSlot,
which uses a Redis INCR/DECR pair. Unlike the DB-backed dimensions,
this remains race-safe even under multi-replica coordinator deployment
because Redis INCR is atomic on a single key. The test
TestSlotManager_FanOutBomb_100Concurrent pen-tests this claim under
-race with 100 true-concurrent goroutines at limit=8 and asserts
exactly 8 succeed / 92 denied.
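The increment-then-check pattern can be modelled in memory with an atomic integer playing the role of the Redis key. This is a sketch under assumptions: slotCounter, acquire, release, and fanOutBomb are hypothetical names, the rollback-on-overflow step is assumed rather than taken from redis.go, and TTLs and the reconciler are omitted.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// slotCounter emulates one per-tenant Redis key; atomic.AddInt64 stands
// in for INCR (+1) and DECR (-1).
type slotCounter struct{ n int64 }

// acquire increments first and inspects the returned value, so no two
// callers can ever observe the same slot number.
func (s *slotCounter) acquire(limit int64) bool {
	if atomic.AddInt64(&s.n, 1) > limit {
		atomic.AddInt64(&s.n, -1) // over the limit: give the slot back
		return false
	}
	return true
}

// release frees a slot previously granted by acquire.
func (s *slotCounter) release() { atomic.AddInt64(&s.n, -1) }

// fanOutBomb launches the given number of concurrent acquires and
// returns how many were granted. Slots are never released here.
func fanOutBomb(goroutines int, limit int64) (granted int64) {
	var s slotCounter
	var wg sync.WaitGroup
	for i := 0; i < goroutines; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if s.acquire(limit) {
				atomic.AddInt64(&granted, 1)
			}
		}()
	}
	wg.Wait()
	return granted
}

func main() {
	fmt.Println(fanOutBomb(100, 8)) // 8
}
```

Unlike the COUNT-style checks, there is no window between the read and the write: the increment is the read, so the result does not depend on how many processes are calling it.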
3.4 Failure mode if violated
Running more than one coordinator replica against the same database —
by mis-configuring the deployment to replicas>1 and bypassing the
advisory-lock gate, or by running the daemon in-process alongside an
orchestrator-embedded coordinator — causes quota oversubscription
under concurrent load on the depth, fanout, and tree-size dimensions.
The concurrent dimension remains correct (Redis INCR is atomic).
Observable symptoms:
- IncInvocation("ok") counts exceed the product of per-tenant limits.
- subagent_invocations rows exist at delegation_depth > MaxDepth.
- Tree-walk queries observe more than MaxTreeSize descendants.
3.5 Deferred: multi-replica coordinator
Multi-replica coordinator with advisory-lock-driven leader election —
e.g., per-parent pg_advisory_xact_lock(hashtext(parent_session_id))
or migrating the DB checks to row-level UPDATE ... WHERE CAS — is
deferred to post-Wave-3 maturity. The single-replica gate keeps
the operational surface smaller while Wave 3 stabilises.
4. Operational hooks
- cmd/subagent-coordinator: single-replica daemon binary. Runs the timeout / orphan / reconcile tick loops.
- Metrics (OTel): subagent_invocations_total, subagent_quota_denials_total, subagent_cascade_terminations_total, subagent_sweeper_actions_total, subagent_sweeper_cross_tenant_reads_total (item 5), subagent_invoke_duration_seconds.
- Migration: 044_subagent_coordinator.up.sql, additive only (enum extension + partial index on (org_id, parent_session_id) WHERE status='pending').
5. References
- HLD: #430
- Delivery issue: #432
- Hardening tracker: #442
- Prior-art lease: internal/runtime/approval/scheduler_lease.go
- Prior-art BYPASSRLS posture: internal/mcp/middleware/aggregator.go