
ADR-0009: Wave H tool visibility — narrowed to residuals after overlap with PR #599

  • Status: Accepted
  • Date: 2026-04-17
  • Decision owners: Principal Architect, Founder (Option B pre-approval)
  • Related: Issue #593, Issue #567, PR #599, PRD #549 §3.8, HLD #556 §§9.1–9.2, LLD #560 addendum

Context

HLD v1.1 introduced Wave H — "Org Unit Tool Visibility" — with the goal of delivering:

  1. org_unit_tools + tool_config_conflicts schema (migration 064).
  2. ToolVisibilityResolver (set-algebra most-restrictive-wins).
  3. 6 RPCs on OrgUnitService for attach/detach/override request/resolve plus effective-tool queries.
  4. Per-tool merger registry (MCP repos, KB namespaces, strict).
  5. 10k-case fuzz invariant check.
  6. Golden fixtures for the HLD §9.2 / PRD §3.8 worked examples.
  7. tool_resolver_duration_seconds{depth_bucket} Prometheus histogram.
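
The "set-algebra most-restrictive-wins" rule in item (2) can be illustrated with a minimal sketch. It assumes each level of the hierarchy carries an allow-list and that the effective set is the intersection along the ancestor chain. `ToolSet`, `intersect`, and `effective` are hypothetical names for illustration, not the shipped `ToolVisibilityResolver` API, and the sketch does not model the resolver's direct/inherited/sibling precedence.

```go
package main

import (
	"fmt"
	"sort"
)

// ToolSet is a hypothetical allow-list: the set of values a level permits.
type ToolSet map[string]struct{}

func newToolSet(names ...string) ToolSet {
	s := make(ToolSet, len(names))
	for _, n := range names {
		s[n] = struct{}{}
	}
	return s
}

// intersect returns the elements present in both a and b.
func intersect(a, b ToolSet) ToolSet {
	out := make(ToolSet)
	for k := range a {
		if _, ok := b[k]; ok {
			out[k] = struct{}{}
		}
	}
	return out
}

// effective folds the chain root-first. Every level can only narrow the
// result, which is what "most-restrictive-wins" means in this sketch.
func effective(chain []ToolSet) ToolSet {
	if len(chain) == 0 {
		return ToolSet{}
	}
	eff := chain[0]
	for _, level := range chain[1:] {
		eff = intersect(eff, level)
	}
	return eff
}

// sorted returns the set's members in a stable order for printing.
func sorted(s ToolSet) []string {
	out := make([]string, 0, len(s))
	for k := range s {
		out = append(out, k)
	}
	sort.Strings(out)
	return out
}

func main() {
	chain := []ToolSet{
		newToolSet("repo-a", "repo-b", "repo-c"), // root
		newToolSet("repo-a", "repo-b"),           // mid-level unit
		newToolSet("repo-b", "repo-c"),           // leaf unit
	}
	fmt.Println(sorted(effective(chain))) // prints [repo-b]
}
```

In this model a leaf can never regain a value that an ancestor removed, because intersection is monotonically narrowing.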

Issue #567 was filed and implemented before HLD v1.1 introduced its wave numbering. PR #599 shipped on 2026-04-17 at 06:41 UTC, covering items (1)–(4) end-to-end under the old numbering scheme. When issue #593 (v1.1 "Wave H") was subsequently opened, the backend-sme audit (issue comment 4267171485) found that the already-merged PR #599 fully covers #593's acceptance criteria for items (1)–(4), leaving only three hardening residuals (items 5–7) unshipped:

  • The Prometheus histogram tool_resolver_duration_seconds{depth_bucket}.
  • The 10k-case fuzz corpus.
  • The HLD §9.2 worked-example golden test fixtures.

Decision

Per the founder's Option B pre-approval ("when overlap is clean and only residuals remain, narrow and proceed without re-asking"), we do NOT re-open or rewrite the already-merged surface area. We:

  1. Narrow #593's scope to the three residuals only.
  2. Deliver them in a focused follow-up PR against main.
  3. Record the overlap and acceptance here so future readers understand why Wave H is split across #567/#599 (schema + resolver + RPCs) and #593/this-PR (observability + fuzz + goldens).

What PR #599 Shipped (already on main)

| Area | File | Notes |
|---|---|---|
| Schema | internal/context/store/migrations/064_org_unit_tools.up.sql / .down.sql | org_unit_tools + tool_config_conflicts with RLS FORCE. rls-check.sh clean. |
| Class registry | internal/compliance/classregistry/scopes.go | Both tables Operational/Medium. |
| Proto | proto/upsquad/orgunit/v1/orgunit.proto | AttachTool, DetachTool, ListToolsForUnit, ListToolsForMember, RequestToolOverride, ResolveToolOverride + ToolBinding/EffectiveToolBinding/ToolConfigConflict messages + ToolVisibility enum (in store constants). |
| Store | internal/orgunit/tool_store.go | Upsert/SoftDelete/Get/List + conflict insert/get/update. config_hash via pg sha256() (stored in DB, not Go layer). |
| Resolver | internal/orgunit/tool_visibility.go | EffectiveTools (direct > inherited > sibling), AttachTool subset validator, Request/ResolveOverride. |
| Per-tool semantics | internal/orgunit/tool_mergers/*.go | Registry + GenericStrictMerger + MCPReposMerger + KBNamespacesMerger. |
| RBAC | internal/orgunit/handler_perms.go | 5 new permissions: tools.attach, tools.detach, tools.view, tools.override_request, tools.override_resolve. |
| Tests | internal/orgunit/tool_visibility_test.go + tool_mergers/*_test.go + migration static | 16+ tests, all under -race. |

What This PR Adds (the residuals)

| Residual | File | Rationale |
|---|---|---|
| Metric: tool_resolver_duration_seconds{depth_bucket} | internal/orgunit/tool_visibility_metrics.go + emit point in tool_visibility.go EffectiveTools | Wired once, in the resolver: bounded 4-series cardinality (0, 1-2, 3-5, 6+), bucketed from len(ancestor_ids). Registered via promauto so it auto-attaches to the existing /metrics scrape. |
| 10k-case fuzz corpus + invariant | internal/orgunit/tool_visibility_fuzz_test.go | Random trees (3–12 units, 2–6 tools, 1–6 members) × 10k iterations, deterministic seed 0xC0FFEE. Asserts: no child's effective tool config exceeds an inherit_down ancestor's ceiling. |
| HLD §9.2 golden fixtures | internal/orgunit/testdata/tool_visibility_goldens/*.json + tool_visibility_goldens_test.go | Four hand-authored fixtures anchored to the PRD §3.8 worked example (Parent + SRE), HLD §9.1 subset accept/reject, and HLD §9.2 3-level most-restrictive cascade. The test auto-discovers *.json files; the minimum fixture count is hard-pinned at 3 so accidental deletion fails loudly. |
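
The shape of the fuzz invariant can be sketched as follows. This is a simplified model under stated assumptions, not the contents of tool_visibility_fuzz_test.go: tool configs are reduced to bitmasks, every ancestor is treated as inherit_down, members are omitted, and `unit`, `resolver`, `randomTree`, `checkInvariant`, and `runFuzz` are hypothetical names. Only the seed 0xC0FFEE, the 10k iteration count, and the tree-size ranges come from the table above.

```go
package main

import (
	"fmt"
	"math/rand"
)

// unit is a hypothetical org-unit node: parent index (-1 for the root)
// and a bitmask where bit i set means tool i is allowed at this unit.
type unit struct {
	parent  int
	allowed uint8
}

// resolver computes the effective bitmask for units[i].
type resolver func(units []unit, i int) uint8

// randomTree builds 3-12 units over 2-6 tools; each non-root unit
// attaches to a randomly chosen earlier unit, so the result is a tree.
func randomTree(rng *rand.Rand) []unit {
	n := 3 + rng.Intn(10)   // 3..12 units
	tools := 2 + rng.Intn(5) // 2..6 tools
	mask := uint8(1)<<tools - 1
	units := make([]unit, n)
	for i := range units {
		parent := -1
		if i > 0 {
			parent = rng.Intn(i)
		}
		units[i] = unit{parent: parent, allowed: uint8(rng.Intn(256)) & mask}
	}
	return units
}

// intersectUp is a toy most-restrictive-wins resolver: AND up the chain.
func intersectUp(units []unit, i int) uint8 {
	eff := units[i].allowed
	for p := units[i].parent; p >= 0; p = units[p].parent {
		eff &= units[p].allowed
	}
	return eff
}

// checkInvariant fails if any unit's effective set contains a tool that
// one of its ancestors does not allow (the ancestor's ceiling).
func checkInvariant(units []unit, resolve resolver) error {
	for i := range units {
		eff := resolve(units, i)
		for p := units[i].parent; p >= 0; p = units[p].parent {
			if eff&^units[p].allowed != 0 {
				return fmt.Errorf("unit %d exceeds ceiling of ancestor %d", i, p)
			}
		}
	}
	return nil
}

// runFuzz replays the same corpus on every run thanks to the fixed seed.
func runFuzz(iterations int, resolve resolver) error {
	rng := rand.New(rand.NewSource(0xC0FFEE))
	for it := 0; it < iterations; it++ {
		if err := checkInvariant(randomTree(rng), resolve); err != nil {
			return fmt.Errorf("iteration %d: %w", it, err)
		}
	}
	return nil
}

func main() {
	if err := runFuzz(10_000, intersectUp); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok")
}
```

Passing the resolver as a parameter keeps the invariant check independent of the implementation under test, so a resolver that ignores ancestors is reported rather than silently accepted.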

Why This Is Acceptable

  • No re-implementation risk: we are only adding observability and tests. No production code path is altered except the single time.Now() / Observe() call pair in EffectiveTools.
  • Spec fidelity preserved: golden fixtures are hand-authored from the PRD/HLD text and reviewed; README explicitly forbids regenerating from code.
  • Cardinality discipline: the only new label (depth_bucket) has 4 values forever — no per-org / per-unit / per-tool labels, no cardinality explosion.
  • Clean merge on top of #599: PR #599 is already on main and provides the runtime behaviour. This PR lands on top without touching migrations, protos, or RPCs.
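
The time.Now() / Observe() pair and the four-value depth_bucket label mentioned above might look roughly like the sketch below. `depthBucket(int)` and the bucket values (0, 1-2, 3-5, 6+) are named in this ADR; the exact boundaries of the switch, the `observer` interface, the `recorder` type, and `effectiveTools` are assumptions made for illustration. The real code registers a Prometheus histogram via promauto, which this stdlib-only sketch replaces with a stand-in.

```go
package main

import (
	"fmt"
	"time"
)

// The label set is closed: four constants, so at most four series.
const (
	bucketZero    = "0"
	bucketShallow = "1-2"
	bucketMid     = "3-5"
	bucketDeep    = "6+"
)

// depthBucket maps the ancestor-chain length onto one of the four labels.
func depthBucket(depth int) string {
	switch {
	case depth <= 0:
		return bucketZero
	case depth <= 2:
		return bucketShallow
	case depth <= 5:
		return bucketMid
	default:
		return bucketDeep
	}
}

// observer is a hypothetical stand-in for a labelled histogram.
type observer interface {
	Observe(bucket string, seconds float64)
}

// recorder counts observations per bucket so the sketch is inspectable.
type recorder struct{ counts map[string]int }

func (r *recorder) Observe(bucket string, _ float64) { r.counts[bucket]++ }

// effectiveTools shows only the instrumentation: one timestamp at entry
// and one deferred observation at exit, labelled by chain depth.
func effectiveTools(ancestorIDs []string, obs observer) {
	start := time.Now()
	defer func() {
		obs.Observe(depthBucket(len(ancestorIDs)), time.Since(start).Seconds())
	}()
	// ... resolution work would run here ...
}

func main() {
	rec := &recorder{counts: map[string]int{}}
	effectiveTools(nil, rec)
	effectiveTools([]string{"org", "team"}, rec)
	effectiveTools(make([]string, 7), rec)
	// prints: 1 1 0 1
	fmt.Println(rec.counts[bucketZero], rec.counts[bucketShallow], rec.counts[bucketMid], rec.counts[bucketDeep])
}
```

Deriving the label from a closed function rather than from request data is what keeps the series count fixed regardless of how many orgs, units, or tools exist.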

Consequences

  • Wave H is now complete: (schema + resolver + RPCs + mergers + RBAC) via PR #599, (observability + fuzz + goldens) via this PR.
  • Future changes to ToolVisibilityResolver, simulateEffective, or the attach-time subset validator will be regression-caught by the goldens. Any violation of the subset invariant will be caught by the fuzz run.
  • Follow-ups:
    • When OrgUnitService.ListEffectiveTools is eventually called in production traffic, tool_resolver_duration_seconds will gain non-zero samples; dashboards should chart p95 by depth_bucket.
    • If a fifth depth bucket is ever needed (e.g., large fleets with deeper hierarchies), extend depthBucket(int) and add an enum constant in the same file — do not add free-form labels.
    • Migration numbering: 064_org_unit_tools.up.sql already occupies migration 064. Wave H residuals do NOT add a new migration.