
LLD 12 — Child Session Lifecycle, Quotas, Cascade, Sweepers (as-built)

| Field | Value |
| --- | --- |
| Status | Merged as of PR #440 (squash: 289e919) |
| Date | 2026-04-15 |
| Parent HLD | #430 |
| Milestone | 9 (Agent Runtime Optimisation & Security Hardening) |
| Delivery issue | #432 |

This document captures the as-built invariants of the SubagentCoordinator. It is intentionally lean — full design rationale lives in HLD #430 and the per-file doc comments in internal/runtime/subagent/coordinator/. The Race-Freedom Invariants section (§3) is the normative contract that operators and future authors of multi-replica coordinator variants must not violate.


1. Scope

The coordinator owns child session lifecycle: quota admission (4 dimensions — depth, fanout, tree-size, concurrent), cycle detection, cascade termination, timeout + orphan sweepers, Redis per-tenant slot counter with self-healing reconciler, and a single-replica daemon binary (cmd/subagent-coordinator). Out of scope: tool surface (LLD 11), result delivery / rendezvous (LLD 13), approval integration (LLD 14).

2. Components

| File | Responsibility |
| --- | --- |
| coordinator.go | Public surface: Invoke, Cancel, CascadeOnParentTerminated |
| checks.go | Three quota-check helpers (depth / fanout / tree-size) run inside one RO tx |
| quotas.go | QuotaConfig, platform defaults + ceilings, ApplyTenantQuotas, snapshot encoder |
| redis.go | SlotManager: per-tenant concurrent counter + reconciler |
| termination.go | CascadeTerminator: DFS-post-order tree termination |
| sweeper.go | SweepTimeouts + SweepOrphans: two cross-tenant batch-claim passes |
| scheduler.go | Daemon: single-replica tick loop (timeout / orphan / reconcile) |
| lease.go | AcquireLease: PG advisory-lock session-scoped singleton gate |
| pgadapter.go | pgxpool-backed walker, pending-count, org-list, bridge status store |

3. Race-Freedom Invariants

The coordinator’s quota-correctness guarantees depend on three runtime invariants. If invariant 1 or 2 is violated, the depth / fanout / tree-size checks can oversubscribe under concurrent load; invariant 3 is what keeps the concurrent dimension correct regardless of replica count. Invariants 1 and 2 are enforced by wiring decisions today; invariant 3 is a property of the Redis data model.

3.1 Single-replica coordinator (founder decision #6)

The cmd/subagent-coordinator binary is pinned to replicas=1 at deploy time AND gated by a session-scoped PostgreSQL advisory lock (pg_try_advisory_lock(LeaseKey()), see lease.go). Because pg_try_advisory_lock returns false immediately instead of blocking, a second replica stays in the acquire loop, doing no coordinator work, until the incumbent exits (SIGTERM / crash) and PostgreSQL releases the lock on connection close.

This invariant is what makes the DB-backed quota checks in runAdmissionChecks race-safe:

  • CheckDepth reads agent_sessions.delegation_depth for the parent.
  • CheckFanout runs COUNT(*) FROM subagent_invocations WHERE parent_session_id = $1 AND status = 'pending'.
  • CheckTreeSize runs a WITH RECURSIVE walk under the root.

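All three are read-only checks (COUNT-style queries or a plain column read) inside a single RO transaction, so they observe state without reserving capacity. Two concurrent Invoke calls for the same parent, executing on two coordinator replicas, would each observe the same pre-write count and each admit. With N pending children and a fanout limit of N+1, the parent would end up with N+2 children.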

3.2 Per-session serial executor

Within the single coordinator process, Invoke calls for the same parent session are serialised by the session-scoped executor upstream (the MCP tool handler in LLD 11 runs inside the parent worker’s single goroutine). Two concurrent Invoke calls for the same parent cannot happen in-process today.

This invariant is implicit — it is not enforced by the coordinator itself; it is a property of the LLD 11 tool-call invocation path. A future code path that calls Invoke from two goroutines for the same parent (e.g., a speculative-execution experiment) would need to add its own parent-scoped lock or the checks would race even under the single-replica guarantee.

3.3 Redis INCR atomic concurrent-slot counter

The MaxConcurrent dimension is enforced via SlotManager.AcquireSlot, which uses a Redis INCR/DECR pair. Unlike the DB-backed dimensions, this remains race-safe even under multi-replica coordinator deployment because Redis INCR is atomic on a single key. The test TestSlotManager_FanOutBomb_100Concurrent stress-tests this claim under -race with 100 truly concurrent goroutines at limit=8 and asserts that exactly 8 succeed and 92 are denied.

3.4 Failure mode if violated

Running more than one coordinator replica against the same database — by mis-configuring the deployment to replicas>1 and bypassing the advisory-lock gate, or by running the daemon in-process alongside an orchestrator-embedded coordinator — causes quota oversubscription under concurrent load on the depth, fanout, and tree-size dimensions. The concurrent dimension remains correct (Redis INCR is atomic).

Observable symptoms:

  • IncInvocation("ok") counts exceed the product of per-tenant limits.
  • subagent_invocations rows exist at delegation_depth > MaxDepth.
  • Tree-walk queries observe more than MaxTreeSize descendants.

3.5 Deferred: multi-replica coordinator

A multi-replica coordinator with advisory-lock-driven leader election is deferred to post-Wave-3 maturity. Candidate mechanisms include a per-parent pg_advisory_xact_lock(hashtext(parent_session_id)) or migrating the DB checks to row-level UPDATE ... WHERE CAS. The single-replica gate keeps the operational surface smaller while Wave 3 stabilises.


4. Operational hooks

  • cmd/subagent-coordinator: single-replica daemon binary. Runs the timeout / orphan / reconcile tick loops.
  • Metrics (OTel): subagent_invocations_total, subagent_quota_denials_total, subagent_cascade_terminations_total, subagent_sweeper_actions_total, subagent_sweeper_cross_tenant_reads_total (item 5), subagent_invoke_duration_seconds.
  • Migration: 044_subagent_coordinator.up.sql — additive only (enum extension + partial index on (org_id, parent_session_id) WHERE status='pending').

5. References

  • HLD: #430
  • Delivery issue: #432
  • Hardening tracker: #442
  • Prior-art lease: internal/runtime/approval/scheduler_lease.go
  • Prior-art BYPASSRLS posture: internal/mcp/middleware/aggregator.go