ADR-0009: Wave H tool visibility — narrowed to residuals after overlap with PR #599
- Status: Accepted
- Date: 2026-04-17
- Decision owners: Principal Architect, Founder (Option B pre-approval)
- Related: Issue #593, Issue #567, PR #599, PRD #549 §3.8, HLD #556 §§9.1–9.2, LLD #560 addendum
Context
HLD v1.1 introduced Wave H — "Org Unit Tool Visibility" — with the goal of delivering:
1. `org_unit_tools` + `tool_config_conflicts` schema (migration 064).
2. `ToolVisibilityResolver` (set-algebra most-restrictive-wins).
3. 6 RPCs on `OrgUnitService` for attach/detach/override request/resolve plus effective-tool queries.
4. Per-tool merger registry (MCP repos, KB namespaces, strict).
5. 10k-case fuzz invariant check.
6. Golden fixtures for the HLD §9.2 / PRD §3.8 worked examples.
7. `tool_resolver_duration_seconds{depth_bucket}` Prometheus histogram.
Issue #567 was filed and implemented before HLD v1.1 introduced its wave numbering. PR #599 shipped on 2026-04-17 at 06:41 UTC, covering items (1)–(4) end-to-end under the old numbering scheme. When issue #593 (v1.1 "Wave H") was subsequently opened, the backend-sme audit (issue comment 4267171485) found 100% overlap between #593's acceptance criteria for those items and the already-merged PR #599, leaving only three hardening residuals unshipped:
- The Prometheus histogram `tool_resolver_duration_seconds{depth_bucket}`.
- The 10k-case fuzz corpus.
- The HLD §9.2 worked-example golden test fixtures.
Decision
Per the founder's Option B pre-approval ("when overlap is clean and only residuals remain, narrow and proceed without re-asking"), we do NOT re-open or rewrite the already-merged surface area. We:
- Narrow #593's scope to the three residuals only.
- Deliver them in a focused follow-up PR against `main`.
- Record the overlap and acceptance here so future readers understand why Wave H is split across #567/#599 (schema + resolver + RPCs) and #593/this-PR (observability + fuzz + goldens).
What PR #599 Shipped (already on main)
| Area | File | Notes |
|---|---|---|
| Schema | internal/context/store/migrations/064_org_unit_tools.up.sql / .down.sql | org_unit_tools + tool_config_conflicts with RLS FORCE. rls-check.sh clean. |
| Class registry | internal/compliance/classregistry/scopes.go | Both tables Operational/Medium. |
| Proto | proto/upsquad/orgunit/v1/orgunit.proto | AttachTool, DetachTool, ListToolsForUnit, ListToolsForMember, RequestToolOverride, ResolveToolOverride + ToolBinding/EffectiveToolBinding/ToolConfigConflict messages + ToolVisibility enum (in store constants). |
| Store | internal/orgunit/tool_store.go | Upsert/SoftDelete/Get/List + conflict insert/get/update. config_hash via pg sha256() (stored in DB, not Go layer). |
| Resolver | internal/orgunit/tool_visibility.go | EffectiveTools (direct > inherited > sibling), AttachTool subset validator, Request/ResolveOverride. |
| Per-tool semantics | internal/orgunit/tool_mergers/*.go | Registry + GenericStrictMerger + MCPReposMerger + KBNamespacesMerger. |
| RBAC | internal/orgunit/handler_perms.go | 5 new permissions: tools.attach, tools.detach, tools.view, tools.override_request, tools.override_resolve. |
| Tests | internal/orgunit/tool_visibility_test.go + tool_mergers/*_test.go + migration static | 16+ tests, all under -race. |
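
The resolver and attach-time subset validator rows above can be illustrated with a small set-algebra sketch. This is a hypothetical simplification, not the code in `tool_visibility.go`: it models one tool's config as a set of allowed strings (for example MCP repos) and treats each ancestor as a ceiling. The names `allowSet`, `isSubset`, and `mostRestrictive` are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// allowSet is a hypothetical stand-in for one tool's config: the set of
// values (for example repo names) a unit is allowed to use.
type allowSet map[string]struct{}

func newAllowSet(vals ...string) allowSet {
	s := make(allowSet, len(vals))
	for _, v := range vals {
		s[v] = struct{}{}
	}
	return s
}

// isSubset reports whether every element of child is also in ceiling.
// An attach-time validator could reject a binding when this is false.
func isSubset(child, ceiling allowSet) bool {
	for v := range child {
		if _, ok := ceiling[v]; !ok {
			return false
		}
	}
	return true
}

// mostRestrictive intersects the sets along an ancestor chain, so the
// narrowest ceiling wins regardless of where it sits in the chain.
func mostRestrictive(chain ...allowSet) allowSet {
	out := newAllowSet()
	if len(chain) == 0 {
		return out
	}
	for v := range chain[0] {
		out[v] = struct{}{}
	}
	for _, s := range chain[1:] {
		for v := range out {
			if _, ok := s[v]; !ok {
				delete(out, v) // deleting during range is safe in Go
			}
		}
	}
	return out
}

// sorted returns the set's members in a stable order for printing.
func sorted(s allowSet) []string {
	out := make([]string, 0, len(s))
	for v := range s {
		out = append(out, v)
	}
	sort.Strings(out)
	return out
}

func main() {
	parent := newAllowSet("repo-a", "repo-b", "repo-c")
	team := newAllowSet("repo-a", "repo-b")
	squad := newAllowSet("repo-b", "repo-d")

	fmt.Println(isSubset(team, parent))  // team stays under the parent ceiling
	fmt.Println(isSubset(squad, parent)) // squad asks for repo-d, which parent lacks
	fmt.Println(sorted(mostRestrictive(parent, team, squad)))
}
```

Intersection is order-independent, which is one reason a most-restrictive-wins rule is convenient to property-test: any permutation of the ancestor chain should yield the same effective set.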
What This PR Adds (the residuals)
| Residual | File | Rationale |
|---|---|---|
| Metric: tool_resolver_duration_seconds{depth_bucket} | internal/orgunit/tool_visibility_metrics.go + emit point in tool_visibility.go EffectiveTools | Wire-once-in-the-resolver: bounded 4-series cardinality (0, 1-2, 3-5, 6+), bucketed from len(ancestor_ids). Registered via promauto so it auto-attaches to the existing /metrics scrape. |
| 10k-case fuzz corpus + invariant | internal/orgunit/tool_visibility_fuzz_test.go | Random trees (3–12 units, 2–6 tools, 1–6 members) × 10k iterations, deterministic seed 0xC0FFEE. Asserts: no child's effective tool config exceeds an inherit_down ancestor's ceiling. |
| HLD §9.2 golden fixtures | internal/orgunit/testdata/tool_visibility_goldens/*.json + tool_visibility_goldens_test.go | Four hand-authored fixtures anchored to the PRD §3.8 worked example (Parent + SRE), HLD §9.1 subset accept/reject, and HLD §9.2 3-level most-restrictive cascade. The test auto-discovers *.json files; the minimum fixture count is hard-pinned at 3 so accidental deletion fails loudly. |
Why This Is Acceptable
- No re-implementation risk: we are only adding observability and tests. No production code path is altered except the single `time.Now()`/`Observe()` call pair in `EffectiveTools`.
- Spec fidelity preserved: golden fixtures are hand-authored from the PRD/HLD text and reviewed; the README explicitly forbids regenerating them from code.
- Cardinality discipline: the only new label (`depth_bucket`) has a fixed set of 4 values, with no per-org / per-unit / per-tool labels and no cardinality explosion.
- Seamless merge with #599: PR #599 is already on `main` and providing the runtime behaviour. This PR merges on top without touching migrations, protos, or RPCs.
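
The bucketing and the single timing pair described above might look roughly like the following. It is a stdlib-only sketch: `observeFunc` is a hypothetical hook standing in for a histogram's observe call, and `timedResolve` is an assumed wrapper name. The bucket boundaries (0, 1-2, 3-5, 6+) are taken from the residuals table; the body of `depthBucket` is otherwise a guess at one reasonable implementation.

```go
package main

import (
	"fmt"
	"time"
)

// depthBucket maps an ancestor-chain length to one of four fixed label
// values, so the metric's label cardinality cannot grow with tree depth.
func depthBucket(depth int) string {
	switch {
	case depth <= 0:
		return "0"
	case depth <= 2:
		return "1-2"
	case depth <= 5:
		return "3-5"
	default:
		return "6+"
	}
}

// observeFunc is a hypothetical hook. With the Prometheus Go client it
// could be backed by a HistogramVec's WithLabelValues(bucket).Observe(seconds).
type observeFunc func(bucket string, seconds float64)

// timedResolve wraps a resolver call with one time.Now / observe pair.
func timedResolve(ancestorIDs []string, observe observeFunc, resolve func()) {
	start := time.Now()
	resolve()
	observe(depthBucket(len(ancestorIDs)), time.Since(start).Seconds())
}

func main() {
	counts := map[string]int{}
	record := func(bucket string, seconds float64) { counts[bucket]++ }

	// One chain per bucket: depths 0, 1, 3, and 7.
	chains := [][]string{nil, {"root"}, {"root", "eng", "sre"}, make([]string, 7)}
	for _, c := range chains {
		timedResolve(c, record, func() {})
	}
	fmt.Println(counts)
}
```

Deriving the label from a closed function rather than from caller-supplied strings is what keeps the series count bounded.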
Consequences
- Wave H is now complete: (schema + resolver + RPCs + mergers + RBAC) via PR #599, (observability + fuzz + goldens) via this PR.
- Future changes to `ToolVisibilityResolver`, `simulateEffective`, or the attach-time subset validator will be regression-caught by the goldens. Any violation of the subset invariant will be caught by the fuzz run.
- Follow-ups:
  - When `OrgUnitService.ListEffectiveTools` is eventually called in production traffic, `tool_resolver_duration_seconds` will gain non-zero samples; dashboards should chart p95 by `depth_bucket`.
  - If a fifth depth bucket is ever needed (e.g., large fleets with deeper hierarchies), extend `depthBucket(int)` and add an `enum` constant in the same file; do not add free-form labels.
  - PR numbering: `064_org_unit_tools.up.sql` already occupies migration 064. Wave H residuals do NOT add a new migration.