LLD: Wave 2 LLD 8 — Approval Delegation + Clearance Enforcement

Field | Value
Parent HLD | docs/hld/agent-runtime-wave-2-approval-pause-resume.md §9
PRD | #380 (P4.3.9 Delegation)
Tracker | #381
Issue | #419
PR | #426
Milestone | Wave 2
Author | backend-sme (as-built), principal-architect (original spec)
Date | 2026-04-13 (backfill)

1. Scope

Layer a delegation primitive onto the LLD 6 approval state machine so an approver holding a pending governance_approvals row can hand decision authority to another operator, subject to a clearance check, a chain-depth cap, a TTL, and a cycle-detector. The engine exposes

  • Delegate(approval_id, from_member_id, to_member_id, reason, expires_at)
  • chain retrieval through GetApproval.delegation_chain
  • authoritative-approver resolution consumed by RecordDecision

on top of the existing governance_approvals table via a new child table approval_delegations. Every transition writes one approval_events row (event_type='delegated') so the full chain is reconstructible from the hash-chained audit trail in isolation from approval_delegations.

Non-goals: policy-resolver-driven "original approver" identity (deferred to LLD 9), SCIM-sourced member clearance, chain UI for ops consoles, background TTL sweep, escalation on delegation timeout (separate from approval timeout).

2. Schema / migration SQL

Migration 042 — see internal/context/store/migrations/042_approval_delegations.up.sql for canonical text.

CREATE TABLE approval_delegations (
    id             UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    org_id         TEXT NOT NULL,
    approval_id    UUID NOT NULL REFERENCES governance_approvals(id) ON DELETE CASCADE,
    chain_position INT NOT NULL CHECK (chain_position >= 1),
    from_member_id TEXT NOT NULL,
    to_member_id   TEXT NOT NULL,
    to_clearance   INT NOT NULL,
    reason         TEXT,
    expires_at     TIMESTAMPTZ NOT NULL,
    revoked_at     TIMESTAMPTZ,
    created_at     TIMESTAMPTZ NOT NULL DEFAULT now(),
    UNIQUE (approval_id, chain_position)
);

ALTER TABLE approval_delegations ENABLE ROW LEVEL SECURITY;
ALTER TABLE approval_delegations FORCE ROW LEVEL SECURITY;

CREATE POLICY scope_isolation ON approval_delegations
    USING (org_id = current_setting('app.org_id', true));

CREATE INDEX ix_approval_delegations_chain
    ON approval_delegations(approval_id, chain_position);

CREATE INDEX ix_approval_delegations_to_active
    ON approval_delegations(org_id, to_member_id)
    WHERE revoked_at IS NULL;

Reversible. Down migration drops indexes, policy, and the table. ON DELETE CASCADE on approval_id keeps state consistent if an approval row is ever hard-deleted (should not happen in production but supports dev fixtures).

No CHECK constraint on chain depth. The cap is policy-tunable; enforcing it in application code lets future waves raise the limit without a migration.

3. Go interfaces

// internal/runtime/approval/delegation.go

type DelegationManager struct {
	store      Store                 // pg persistence
	clearance  MemberClearanceLookup // LLD 8 seam, PG users table (issue #428)
	dispatcher Dispatcher            // optional; fan-out to delegatee channels
	now        func() time.Time      // injectable
	maxDepth   int                   // MaxChainDepth = 3
	defaultTTL time.Duration         // DefaultDelegationTTL = 24h
}

// Delegate inserts a new chain hop. Validation order:
//  1. params validation (non-empty IDs; from != to)
//  2. approval loaded; must be status=pending
//  3. existing chain loaded
//  4. depth check: len(activeChain) + 1 <= MaxChainDepth
//  5. cycle check: to_member must not appear earlier (as from OR to)
//  6. current-approver check: subsequent hops must match currentTail
//     (OR the original approver when all prior hops have lapsed;
//     see §4.3 and §4.4)
//  7. clearance check: clearance(to) >= approval.required_clearance
//  8. TTL clamp: expires_at = min(req or now+default, approval.expires_at)
//  9. Insert + append 'delegated' approval_event + fan-out
func (m *DelegationManager) Delegate(ctx context.Context, p DelegateParams) (DelegateOutcome, error)

// CurrentApprover reports the active tail ONLY. ok=false when chain
// is empty OR all hops have lapsed. Callers enforcing the all-lapsed
// fallback MUST use AuthoritativeApprover instead.
func (m *DelegationManager) CurrentApprover(ctx context.Context, orgID, approvalID string) (memberID string, ok bool, err error)

// AuthoritativeApprover is the function RecordDecision consults.
//   no chain    → ("", false)     // no restriction
//   active tail → (<tail>, true)  // only tail may record
//   all lapsed  → (<orig>, true)  // fallback to chain[0].FromMemberID
func (m *DelegationManager) AuthoritativeApprover(ctx context.Context, orgID, approvalID string) (memberID string, ok bool, err error)

// MemberClearanceLookup is the seam through which the PG users table
// (issue #428) supplies clearance integers. Production impl lives in
// internal/runtime/approval/clearance (PGLookup with 60s/5s TTL cache).
// ok=false => member unknown OR account disabled; Delegate treats
// both as ErrInsufficientClearance (fail closed).
type MemberClearanceLookup interface {
	LookupClearance(ctx context.Context, orgID, memberID string) (clearance int32, ok bool)
}

Constants: MaxChainDepth = 3, DefaultDelegationTTL = 24h.

Sentinel errors: ErrInsufficientClearance, ErrChainDepthExceeded, ErrCycleDetected, ErrAlreadyResolved, ErrNotCurrentApprover, ErrSelfDelegation.
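Validation steps 4, 5, and 8 are pure functions of the chain and a few timestamps, so they can be sketched in isolation. The following is a minimal illustration, not the shipped code: the `hop` type and the helper names `checkDepth`, `checkCycle`, and `clampTTL` are hypothetical, and the cycle check here scans every hop, including lapsed ones, which is an assumption about the as-built behaviour.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

const (
	MaxChainDepth        = 3
	DefaultDelegationTTL = 24 * time.Hour
)

var (
	ErrChainDepthExceeded = errors.New("delegation chain depth exceeded")
	ErrCycleDetected      = errors.New("delegation cycle detected")
)

// hop is a hypothetical, trimmed-down stand-in for one approval_delegations row.
type hop struct {
	From, To string
	Active   bool // outcome of the read-time activity check
}

// checkDepth mirrors step 4: only active hops count toward the cap.
func checkDepth(chain []hop) error {
	active := 0
	for _, h := range chain {
		if h.Active {
			active++
		}
	}
	if active+1 > MaxChainDepth {
		return ErrChainDepthExceeded
	}
	return nil
}

// checkCycle mirrors step 5: the delegatee must not already appear in the
// chain, either as a delegator or as a delegatee.
func checkCycle(chain []hop, to string) error {
	for _, h := range chain {
		if h.From == to || h.To == to {
			return ErrCycleDetected
		}
	}
	return nil
}

// clampTTL mirrors step 8. A zero requested time means "use the default".
func clampTTL(requested, now, approvalDeadline time.Time) time.Time {
	exp := requested
	if exp.IsZero() {
		exp = now.Add(DefaultDelegationTTL)
	}
	if exp.After(approvalDeadline) {
		exp = approvalDeadline
	}
	return exp
}

func main() {
	chain := []hop{{From: "alice", To: "bob", Active: true}}
	fmt.Println(checkCycle(chain, "alice")) // delegation cycle detected
	fmt.Println(checkDepth(chain))          // <nil>

	now := time.Date(2026, 4, 13, 12, 0, 0, 0, time.UTC)
	deadline := now.Add(6 * time.Hour)
	fmt.Println(clampTTL(time.Time{}, now, deadline).Equal(deadline)) // true
}
```

Keeping these checks free of store access makes each one testable without a database, which fits the isolation tests listed in §6.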

4. State transitions and authority rules

4.1 Chain evolution

Each approval starts with an empty chain. Delegate appends rows with monotonic chain_position starting at 1. A revoke marks revoked_at; an expiry is detected at read time (expires_at <= now). No row is ever mutated except for revoked_at.

4.2 Hop activity

func (l *DelegationLink) IsActive(now time.Time) bool {
	return l.RevokedAt == nil && now.Before(l.ExpiresAt)
}

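currentTail additionally consults MemberClearanceLookup, so a hop whose delegatee account is disabled is treated as lapsed. This implements the founder-approved rule "if the delegatee's account is disabled, the delegation becomes ineffective; approval reverts to the next-up in the chain". Here "next-up" means the nearest earlier hop that is still active; if none remains, the §4.3 fallback to the original approver applies.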
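A tail resolver consistent with this rule could walk the chain from newest to oldest and stop at the first hop that is both active and backed by a live account. The sketch below is an illustration under assumptions: `staticLookup`, `demoChain`, `demoCtx`, and `demoNow` are hypothetical fixtures, `DelegationLink` is trimmed to three fields, and the real `currentTail` signature is not shown in this document.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// DelegationLink is trimmed to the fields this sketch needs.
type DelegationLink struct {
	ToMemberID string
	ExpiresAt  time.Time
	RevokedAt  *time.Time
}

func (l *DelegationLink) IsActive(now time.Time) bool {
	return l.RevokedAt == nil && now.Before(l.ExpiresAt)
}

type MemberClearanceLookup interface {
	LookupClearance(ctx context.Context, orgID, memberID string) (clearance int32, ok bool)
}

// staticLookup is a hypothetical in-memory lookup; absent members read as ok=false.
type staticLookup map[string]int32

func (s staticLookup) LookupClearance(_ context.Context, _, memberID string) (int32, bool) {
	c, ok := s[memberID]
	return c, ok
}

// currentTail returns the newest hop that is active and whose delegatee the
// lookup still recognises. Unknown or disabled delegatees are skipped.
func currentTail(ctx context.Context, lk MemberClearanceLookup, orgID string,
	chain []DelegationLink, now time.Time) (string, bool) {
	for i := len(chain) - 1; i >= 0; i-- {
		hop := &chain[i]
		if !hop.IsActive(now) {
			continue
		}
		if _, ok := lk.LookupClearance(ctx, orgID, hop.ToMemberID); !ok {
			continue // disabled or unknown account: treat the hop as lapsed
		}
		return hop.ToMemberID, true
	}
	return "", false
}

// Example fixtures (hypothetical values).
var (
	demoCtx = context.Background()
	demoNow = time.Date(2026, 4, 13, 12, 0, 0, 0, time.UTC)
)

// demoChain models alice→bob→carol with both hops unexpired at demoNow.
func demoChain() []DelegationLink {
	return []DelegationLink{
		{ToMemberID: "bob", ExpiresAt: demoNow.Add(time.Hour)},
		{ToMemberID: "carol", ExpiresAt: demoNow.Add(time.Hour)},
	}
}

func main() {
	// carol is missing from the lookup (disabled), so authority reverts to bob.
	tail, ok := currentTail(demoCtx, staticLookup{"bob": 3}, "org-1", demoChain(), demoNow)
	fmt.Println(tail, ok) // bob true
}
```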

4.3 Authoritative approver resolution

Chain state | CurrentApprover | AuthoritativeApprover
empty | ("", false) | ("", false)
at least one active hop | (tail.to, true) | (tail.to, true)
non-empty, all lapsed | ("", false) | (chain[0].from, true)

RecordDecision MUST consult AuthoritativeApprover. The original-approver fallback (chain[0].FromMemberID) closes the security gap flagged in PR #426 architect review — a chain whose every hop had lapsed previously admitted any operator.

CurrentApprover is retained as the narrower "is there an active delegatee right now?" helper. It is used by introspection call sites (dashboard rendering) that specifically do not want the fallback baked into their answer.
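The difference between the two resolvers can be shown with a small, store-free sketch. The `link` type, the precomputed `Active` flag, and the `mayRecord` helper are hypothetical simplifications; the real methods take a context and load the chain themselves.

```go
package main

import "fmt"

// link is a hypothetical, trimmed-down chain row with activity precomputed.
type link struct {
	FromMemberID, ToMemberID string
	Active                   bool
}

// currentApprover reports the active tail only.
func currentApprover(chain []link) (string, bool) {
	for i := len(chain) - 1; i >= 0; i-- {
		if chain[i].Active {
			return chain[i].ToMemberID, true
		}
	}
	return "", false
}

// authoritativeApprover adds the all-lapsed fallback to the original approver.
func authoritativeApprover(chain []link) (string, bool) {
	if len(chain) == 0 {
		return "", false // no chain: no restriction
	}
	if tail, ok := currentApprover(chain); ok {
		return tail, true
	}
	return chain[0].FromMemberID, true
}

// mayRecord sketches how a RecordDecision guard might apply the result.
func mayRecord(chain []link, caller string) bool {
	who, restricted := authoritativeApprover(chain)
	return !restricted || who == caller
}

func main() {
	dead := []link{{FromMemberID: "alice", ToMemberID: "bob", Active: false}}
	fmt.Println(mayRecord(nil, "mallory"))  // true: no chain, LLD 6 behaviour
	fmt.Println(mayRecord(dead, "mallory")) // false: all hops lapsed
	fmt.Println(mayRecord(dead, "alice"))   // true: original approver
}
```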

4.4 Subsequent-hop guard in Delegate

Mirror rule: if the chain is non-empty and there is no active tail, a subsequent hop is allowed only when p.FromMemberID == chain[0].FromMemberID (the original approver). Any other operator attempting to extend a dead chain is rejected with ErrNotCurrentApprover.
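As a rough sketch of this guard, the expected delegator is the active tail when one exists and the original approver otherwise. The `link` type and the `checkDelegator` name are hypothetical; the first-hop branch reflects the §11 note that `from_member_id` is unchecked on the first hop.

```go
package main

import (
	"errors"
	"fmt"
)

var ErrNotCurrentApprover = errors.New("caller is not the current approver")

// link is a hypothetical, trimmed-down chain row with activity precomputed.
type link struct {
	FromMemberID, ToMemberID string
	Active                   bool
}

// checkDelegator decides whether `from` may append the next hop.
func checkDelegator(chain []link, from string) error {
	if len(chain) == 0 {
		return nil // first hop: from_member_id is accepted as given
	}
	expected := chain[0].FromMemberID // all-lapsed fallback
	for i := len(chain) - 1; i >= 0; i-- {
		if chain[i].Active {
			expected = chain[i].ToMemberID // active tail takes precedence
			break
		}
	}
	if from != expected {
		return ErrNotCurrentApprover
	}
	return nil
}

func main() {
	dead := []link{{FromMemberID: "alice", ToMemberID: "bob", Active: false}}
	fmt.Println(checkDelegator(dead, "alice")) // <nil>
	fmt.Println(checkDelegator(dead, "bob"))   // caller is not the current approver
}
```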

5. Founder decision #7 — clearance rule

Delegation is allowed when

to_member.clearance >= approval.required_clearance

NOT when to_member.clearance >= from_member.clearance. The semantic is "can this operator discharge this specific approval?" so the delegator's own clearance on other decisions is not at issue. A low-clearance delegator CAN hand a high-clearance delegatee an approval the delegator could not personally have approved. This matches PRD P4.3.9 phrasing and the industry convention (least-privilege, no privilege chaining).

Locked in by TestDelegate_FounderDecision7_RuleIsVsRequired_NotVsFrom.
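The rule reduces to a comparison in which the delegator's clearance is not an input at all. The sketch below assumes `required_clearance` is an `int32` like the lookup result; `checkClearance` is a hypothetical helper name.

```go
package main

import (
	"errors"
	"fmt"
)

var ErrInsufficientClearance = errors.New("insufficient clearance")

// checkClearance compares the delegatee against the approval's requirement.
// The delegator's clearance is deliberately absent from the signature.
// found=false (unknown or disabled member) fails closed.
func checkClearance(toClearance int32, found bool, required int32) error {
	if !found || toClearance < required {
		return ErrInsufficientClearance
	}
	return nil
}

func main() {
	// A delegator at clearance 2 hands an approval requiring 4 to a delegatee at 5.
	fmt.Println(checkClearance(5, true, 4))  // <nil>
	fmt.Println(checkClearance(3, true, 4))  // insufficient clearance
	fmt.Println(checkClearance(5, false, 4)) // insufficient clearance
}
```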

6. Unit + integration test plan (as-built)

All tests in internal/runtime/approval/{delegation_test.go, service_delegation_test.go, grpcserver_delegate_test.go}.

Isolation (delegation_test.go)

  • TestDelegate_RejectsSelf: from == to rejected
  • TestDelegate_RejectsAlreadyResolved: non-pending approval rejected
  • TestDelegate_InsufficientClearance_Rejected: to below required_clearance
  • TestDelegate_UnknownDelegatee_FailsClosed: LookupClearance ok=false
  • TestDelegate_DisabledDelegatee_Rejected: disabled account rejected
  • TestDelegate_FounderDecision7_RuleIsVsRequired_NotVsFrom: founder rule lock-in
  • TestDelegate_TTLClampedToApprovalDeadline: requested TTL > approval TTL clamped
  • TestDelegate_DefaultTTL_AppliedWhenOmitted: default 24h, clamped
  • TestDelegate_CycleDetected: A→B→A refused
  • TestDelegate_ChainDepthLimit: 4th hop on depth-3 chain refused
  • TestDelegate_SubsequentHopMustMatchTail: non-tail subsequent hop refused
  • TestDelegate_ChainPositionMonotonic: positions increment without gaps
  • TestDelegate_EmitsAuditEvent: approval_events.event_type='delegated'
  • TestDelegate_DispatchesNotificationToDelegatee: fan-out fires
  • TestDelegate_RevokedHop_IgnoredForTailAndCycle: revoke clears tail but not cycle history
  • TestCurrentApprover_{EmptyChain,ActiveChain,DisabledTail,AllDisabled,ExpiredTail}: tail-only semantics
  • TestAuthoritativeApprover_{EmptyChain,ActiveChain,AllLapsed}: fallback semantics

Service-level (service_delegation_test.go)

  • TestService_Delegate_EndToEnd: Request → Delegate → delegatee approves → session resumes
  • TestService_RecordDecision_RejectsNonTail: former holder and third party rejected
  • TestService_RecordDecision_NoChainDelegates_StillWorks: LLD 6 parity when no chain
  • TestService_Delegate_NotConfigured_ReturnsError: no Clearance wiring → Delegate refuses
  • TestService_Get_PopulatesDelegationChain: GetApproval.delegation_chain hydrated
  • TestService_RecordDecision_AllLapsedChain_RevertsToOriginalApprover: security fix; all-lapsed → only original approver accepted
  • TestService_RecordDecision_PartialLapsedChain_OnlyActiveTailMayRecord: active tail wins over original-approver fallback
  • TestService_Delegate_AlreadyResolvedApproval: ErrAlreadyResolved surfaced

gRPC (grpcserver_delegate_test.go)

  • TestGRPC_Delegate_*: error mapping to InvalidArgument, FailedPrecondition, PermissionDenied, NotFound
  • TestGRPC_RecordDecision_NonTail_PermissionDenied: ErrNotCurrentApprover → PermissionDenied

7. gRPC error mapping

Go sentinel | gRPC code | Caller semantic
ErrInsufficientClearance | PermissionDenied | delegatee cannot hold this approval
ErrChainDepthExceeded | FailedPrecondition | chain saturated
ErrCycleDetected | FailedPrecondition | delegate would re-enter chain
ErrAlreadyResolved | FailedPrecondition | approval terminal
ErrNotCurrentApprover | PermissionDenied | caller is not the authoritative approver
ErrSelfDelegation | InvalidArgument | from == to
ErrNotFound | NotFound | no approval for (org, approval_id)
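The mapping above can be expressed as a single switch over `errors.Is`, which also handles wrapped sentinels. To stay compilable without the gRPC module, this sketch uses a local `code` string type as a stand-in for the real status codes. The `toCode` name and the `Internal` fallback for unmapped errors are assumptions, not part of the table.

```go
package main

import (
	"errors"
	"fmt"
)

// code is a local stand-in for the gRPC status code type.
type code string

const (
	InvalidArgument    code = "InvalidArgument"
	FailedPrecondition code = "FailedPrecondition"
	PermissionDenied   code = "PermissionDenied"
	NotFound           code = "NotFound"
	Internal           code = "Internal" // assumed fallback for unmapped errors
)

var (
	ErrInsufficientClearance = errors.New("insufficient clearance")
	ErrChainDepthExceeded    = errors.New("chain depth exceeded")
	ErrCycleDetected         = errors.New("cycle detected")
	ErrAlreadyResolved       = errors.New("approval already resolved")
	ErrNotCurrentApprover    = errors.New("not current approver")
	ErrSelfDelegation        = errors.New("self delegation")
	ErrNotFound              = errors.New("approval not found")
)

// toCode maps a non-nil engine error to the code listed in the table.
func toCode(err error) code {
	switch {
	case errors.Is(err, ErrSelfDelegation):
		return InvalidArgument
	case errors.Is(err, ErrInsufficientClearance),
		errors.Is(err, ErrNotCurrentApprover):
		return PermissionDenied
	case errors.Is(err, ErrChainDepthExceeded),
		errors.Is(err, ErrCycleDetected),
		errors.Is(err, ErrAlreadyResolved):
		return FailedPrecondition
	case errors.Is(err, ErrNotFound):
		return NotFound
	default:
		return Internal
	}
}

func main() {
	wrapped := fmt.Errorf("delegate approval: %w", ErrCycleDetected)
	fmt.Println(toCode(wrapped)) // FailedPrecondition
}
```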

8. Rollout plan

Wiring status

  • Core engine: IN PLACE (PR #426)
  • MemberClearanceLookup implementation: WIRED as of issue #428. internal/runtime/approval/clearance.PGLookup reads from the users table (migration 012) — (org_id, id, clerk_user_id, clearance, status, deleted_at) is the authoritative source. cmd/agent-orchestrator/main.go constructs it with clearance.New(pool, clearance.Config{...}) and passes it as ServiceConfig.Clearance so DelegationManager is instantiated.
  • Cache: 60s positive / 5s negative TTL in-process. Multi-replica deployments converge within TTL without cross-replica invalidation.
  • Fail-closed: DB error OR soft-deleted row OR status IN ('suspended', 'removed') → ok=false → ErrInsufficientClearance. A demoted operator whose JWT still carries a stale clearance=5 claim cannot escalate; the DB row wins.
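The cache behaviour described above (60s for positive results, 5s for negative ones) can be illustrated with a small wrapper around any MemberClearanceLookup. This is a sketch only and not the shipped PGLookup: `cachedLookup`, `countingBackend`, `fakeClock`, and `demoCtx` are hypothetical names, and the wrapper makes no attempt to coalesce concurrent misses.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

type MemberClearanceLookup interface {
	LookupClearance(ctx context.Context, orgID, memberID string) (clearance int32, ok bool)
}

type cacheEntry struct {
	clearance int32
	ok        bool
	expires   time.Time
}

// cachedLookup is a hypothetical TTL wrapper around a slower lookup.
type cachedLookup struct {
	inner          MemberClearanceLookup
	now            func() time.Time
	posTTL, negTTL time.Duration

	mu      sync.Mutex
	entries map[string]cacheEntry
}

func newCachedLookup(inner MemberClearanceLookup, now func() time.Time) *cachedLookup {
	return &cachedLookup{
		inner: inner, now: now,
		posTTL: 60 * time.Second, negTTL: 5 * time.Second,
		entries: make(map[string]cacheEntry),
	}
}

func (c *cachedLookup) LookupClearance(ctx context.Context, orgID, memberID string) (int32, bool) {
	key := orgID + "\x00" + memberID
	now := c.now()
	c.mu.Lock()
	e, hit := c.entries[key]
	c.mu.Unlock()
	if hit && now.Before(e.expires) {
		return e.clearance, e.ok
	}
	// Miss or expired entry: ask the backend, then cache with the TTL
	// that matches the outcome (short for negative results).
	clearance, ok := c.inner.LookupClearance(ctx, orgID, memberID)
	ttl := c.negTTL
	if ok {
		ttl = c.posTTL
	}
	c.mu.Lock()
	c.entries[key] = cacheEntry{clearance: clearance, ok: ok, expires: now.Add(ttl)}
	c.mu.Unlock()
	return clearance, ok
}

// countingBackend is a hypothetical stand-in for the users-table query.
type countingBackend struct {
	members map[string]int32
	calls   int
}

func (b *countingBackend) LookupClearance(_ context.Context, _, memberID string) (int32, bool) {
	b.calls++
	c, ok := b.members[memberID]
	return c, ok
}

// fakeClock lets the example advance time deterministically.
type fakeClock struct{ t time.Time }

func (f *fakeClock) Now() time.Time       { return f.t }
func (f *fakeClock) AdvanceSeconds(n int) { f.t = f.t.Add(time.Duration(n) * time.Second) }

var demoCtx = context.Background()

func main() {
	clock := &fakeClock{t: time.Date(2026, 4, 13, 12, 0, 0, 0, time.UTC)}
	backend := &countingBackend{members: map[string]int32{"bob": 3}}
	lookup := newCachedLookup(backend, clock.Now)

	lookup.LookupClearance(demoCtx, "org-1", "bob")
	lookup.LookupClearance(demoCtx, "org-1", "bob") // served from cache
	fmt.Println(backend.calls)                      // 1

	clock.AdvanceSeconds(61) // positive entry has expired
	lookup.LookupClearance(demoCtx, "org-1", "bob")
	fmt.Println(backend.calls) // 2
}
```

The short negative TTL means a newly enabled member becomes visible within a few seconds, while the longer positive TTL bounds how long a just-disabled account can still be served from cache.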

Feature flag

None. The lookup-presence check at service construction (Clearance != nil) is the implicit flag — a deployment that wires the lookup has LLD 8 on; one that does not has LLD 6 semantics unchanged.

Rollback

Revoke by writing revoked_at = now() on the delegation row. Supported by the store (RevokeDelegation(ctx, orgID, approvalID, chainPosition)). No gRPC surface in Wave 2 — ops-only via psql for now.

MTTR to disable delegation platform-wide: unset ServiceConfig.Clearance, restart. < 30 s.

9. Observability

Counters (added in this PR):

  • approval_delegations_total{tenant, result=created|revoked|expired} — chain-mutation volume by outcome
  • approval_delegations_clearance_denied_total{tenant} — failed delegations due to clearance gap; a signal of RBAC drift

Emitted through the ServiceMetrics interface exactly as the existing approval counters; the orchestrator wires them through OtelServiceMetrics onto the shared runtime/metrics Instruments.

Only the created and clearance_denied buckets have live call sites in Wave 2. revoked and expired labels are defined so the counter is forward-compatible once revoke / expiry call sites land.

10. Open questions resolved

# | Question | Resolution
1 | MemberClearanceLookup source | users table (migration 012). Resolved by issue #428; see internal/runtime/approval/clearance/ for the DB-backed implementation with 60s/5s TTL cache. The original "Clerk custom claim" sketch was never shipped; the table-backed path is the authoritative source of record.
2 | Chain depth, default TTL | 3 hops, 24 h. Code constants. Change is one code edit, no migration.
3 | RecordDecision on all-lapsed chain | Revert to chain[0].FromMemberID (original delegator). Fixed in PR #426 per architect review; do NOT defer to LLD 9.
4 | Background TTL sweep | Not required for correctness. Every read path invokes IsActive(now). No stale-delegation usage risk. Follow-up for dashboard latency can land in Wave 3 if the reads become hot.
5 | Original approver identity (vs. policy-derived) | Currently chain[0].FromMemberID. LLD 9 policy resolver can tighten this; the fallback is a floor, not a ceiling, and refining it is strictly additive.

11. Known edge cases

  • from_member_id on the first hop is unchecked. The approval-creation-time policy already gated who could see the approval; we accept that as the initial grant. The value is captured verbatim in the audit row.
  • Revoked + expired simultaneously: IsActive returns false once either condition is met. Ordering does not matter.
  • Disabled delegator mid-chain does NOT invalidate the chain going forward. Only the delegatee's liveness matters at read time; the delegator's role was to hand authority over, which they already did.
  • Concurrent Delegate race surfaces as ErrIdempotencyCollision from the UNIQUE (approval_id, chain_position) constraint — the loser retries after re-reading the chain.
  • Approval hard-deleted while chain exists: ON DELETE CASCADE on approval_id tears down delegations too. Hard-delete is not a production code path.
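The retry-after-re-read behaviour for the concurrent Delegate race can be sketched against an in-memory store that imitates the UNIQUE (approval_id, chain_position) constraint. Everything except ErrIdempotencyCollision is hypothetical here (`memStore`, `raceOnce`, `delegateWithRetry`), and a real retry would also re-run the full validation sequence after re-reading the chain, so the loser may well be rejected on the second pass.

```go
package main

import (
	"errors"
	"fmt"
)

var ErrIdempotencyCollision = errors.New("idempotency collision")

// memStore imitates the unique constraint for a single approval.
type memStore struct {
	positions map[int]string // chain_position -> to_member_id
	raceOnce  bool           // simulate a rival writer winning the first attempt
}

func (s *memStore) chainLen() int { return len(s.positions) }

func (s *memStore) insert(pos int, to string) error {
	if s.raceOnce {
		s.raceOnce = false
		s.positions[pos] = "rival" // rival commits the same position first
	}
	if _, taken := s.positions[pos]; taken {
		return ErrIdempotencyCollision
	}
	s.positions[pos] = to
	return nil
}

// delegateWithRetry re-reads the chain before every attempt and retries only
// on a position collision. Validation is omitted to keep the sketch short.
func delegateWithRetry(s *memStore, to string, maxAttempts int) (int, error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		pos := s.chainLen() + 1
		if err = s.insert(pos, to); err == nil {
			return pos, nil
		}
		if !errors.Is(err, ErrIdempotencyCollision) {
			return 0, err
		}
	}
	return 0, err
}

func main() {
	s := &memStore{positions: map[int]string{}, raceOnce: true}
	pos, err := delegateWithRetry(s, "bob", 3)
	fmt.Println(pos, err) // 2 <nil>
}
```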

12. Estimated size

M–L (1.5–2 weeks): migration + chain logic + clearance lookup seam + gRPC error mapping + dispatcher fan-out to delegatee + unit and service integration tests + audit event + as-built doc + metrics. Story points ≈ 8.

Shipped in PR #426. LLD doc backfilled post-merge per architect follow-up.