ADR-0013: Abstraction Principle — open primitives adopted directly, cloud services abstracted behind interfaces
- Status: Proposed (awaiting founder sign-off — amends ADR-0012)
- Date: 2026-06-01
- Decision owners: Founder (proposer + sign-off), Principal Architect (technical validation), Backend SME (implementer), Product Manager (positioning consistency)
- Amends: ADR-0012 — adds the abstraction principle that makes its positioning architecturally enforceable
- Related:
  - upsquad-ai/upsquad-core#1184 (Agent Decision Receipt): the receipt's RuleHash field feeds from these provider interfaces
  - upsquad-ai/work-tracker#56 (GEAP eval tracker): empirical basis for the matrix below
  - Per-component evals upsquad-ai/upsquad-core#1185–#1199: each named cloud service was probed and categorised
Context
ADR-0012 committed UpsQuad to the agnostic workforce-governance middleware positioning, with a reshape matrix that named multiple cloud services to "adopt" (Model Armor, Vertex Gen AI Evaluation, Cloud Trace, Agent Runtime managed deployment, etc.). The framing was load-bearing for the positioning's credibility but left "adopt" ambiguous between two architectures:
- Architecture A (direct adoption): UpsQuad's middleware calls Google's cloud services directly under the hood. Every prompt + response flows through modelarmor.googleapis.com. Our content-safety layer is Google's content-safety layer.
- Architecture B (abstracted adoption): UpsQuad's middleware exposes provider interfaces. Cloud-provider services are one option behind those interfaces; on-prem / open-weights / multi-cloud alternatives are others. The tenant (or per-tenant config) picks which one runs.
Architecture A directly contradicts the agnostic positioning ADR-0012 commits to. It also forecloses ~50–70 % of the addressable market under that positioning (AWS-only customers, Azure-only customers, on-prem regulated tenants in FinServ / HealthCare / Defense / Gov, EU sovereignty-sensitive deployments, and customers running open-weights models on-prem).
Architecture B is consistent with the positioning and matches the durable pattern used by every middleware play that survived big-cloud bundling (Stripe abstracts over acquirers / issuers / networks; Datadog abstracts over cloud monitoring; Auth0 abstracts over identity providers; Snowflake abstracts over cloud warehouses; HashiCorp abstracts over cloud IaC).
This ADR closes the ambiguity by codifying Architecture B and naming the interfaces.
Decision
UpsQuad adopts the following Abstraction Principle:
Open primitives — protocols, specifications, OSS libraries, framework code that runs inside our process — are adopted directly.
Cloud-provider services — anything requiring a network call to a third-party cloud API — are adopted behind a provider interface, with the cloud-provider implementation as one option among multiple providers.
Rationale: direct dependency on a single cloud provider's services structurally contradicts the agnostic-middleware positioning ADR-0012 commits to. The abstraction is the architectural mechanism that makes the positioning enforceable rather than rhetorical.
This principle binds engineering decisions downstream. Every LLD that adopts a third-party capability must answer two questions: is this an open primitive (direct) or a cloud service (abstracted)? And, when abstracted, which interface does it implement?
Direct adopt — open primitives
These run inside our process or are protocol/spec implementations. No cloud-API network round-trip per call. Direct dependency is acceptable because portability is the substrate's property, not ours to engineer.
| Primitive | Why direct is correct |
|---|---|
| ADK (Agent Development Kit) | Apache-2.0 Python/Go/Java/TS library, runs in agent process, same library is the natural authoring SDK whether the runtime target is Vertex, Bedrock, or our own. |
| A2A protocol | Wire protocol, Linux Foundation governance. We implement endpoints; we call endpoints. Zero GCP coupling. |
| SPIFFE / SPIRE pattern | CNCF spec for workload identity. We deploy our own issuer or run SPIRE. Google's auto-provisioning is the pattern we adopt, not their implementation. |
| OpenTelemetry semantic conventions | CNCF spec + libraries. We instrument; the export destination is per-tenant config (see TraceExporter below). |
| in-toto / SLSA / Sigstore Rekor | Open specs underpinning the Decision Receipt (#1184). Receipt registers to Rekor for public-tenant case OR a tenant-private SCITT service OR our own transparency log. |
| MCP protocol spec | Wire protocol. We are already an MCP consumer; we will be an MCP producer when the catalog is published. |
| OAuth 2.0 / OIDC / SAML | Auth standards. Already abstracted via Clerk + our clearance layer. |
Abstract behind an interface — every cloud service
The provider interfaces below sit in internal/ packages and are consumed by internal/mcp/middleware, internal/runtime, internal/governance, and the audit/observability pipeline. Each interface has at least one cloud-provider implementation (for tenants who want it) and at least one on-prem / open-weights / portable implementation (for tenants who can't or won't use the cloud option).
| Cloud service | Interface | Primary cloud provider | Alternatives |
|---|---|---|---|
| Model Armor | ContentSafety | providers/modelarmor (GCP tenants default) | Llama Guard, ShieldGemma, Prompt Guard (on-prem), Bedrock Guardrails, Azure AI Content Safety, Anthropic content classifier, OpenAI Moderation, passthrough (explicit opt-out) |
| Vertex Gen AI Evaluation | EvalProvider | providers/vertex_eval | LangSmith, Galileo, Humanloop, in-house harness |
| Cloud Trace | TraceExporter (already abstracted via OTel) | OTel → GCP exporter | OTel → Honeycomb / Grafana Tempo / Datadog / local Grafana / raw OTLP |
| Cloud Logging | LogExporter (already abstracted via OTel logs) | OTel → GCP exporter | OTel → Grafana Loki / Datadog / raw S3 |
| Vertex MCP servers (the 9 sharded toolset paths) | MCPToolSource (already exists in middleware) | providers/vertex_mcp | Any other MCP server registry; these are tools per tenant, not infra |
| Agent Runtime managed deployment | RuntimeDeployment | providers/vertex_runtime (one cloud target) | Our internal runtime (pkg/runtimepb) for multi-tenant + on-prem; AWS Bedrock Agents target (future); raw Cloud Run / EKS / GKE target |
| Agent Registry | AgentCatalog (we already have a catalog) | providers/gcp_registry (mirror for GCP tenants if desired) | Our tenant-scoped catalog (primary) |
| GCP MCP servers — BigQuery / GCS / Logging / etc. | (already abstracted — they are tools) | Each is one tool source per tenant config | Any other MCP server |
| AWS Bedrock Guardrails (future) | ContentSafety | providers/bedrock_guardrails | Same interface as Model Armor |
| Azure AI Content Safety (future) | ContentSafety | providers/azure_content_safety | Same |
Do not adopt
These are explicitly rejected — they are architecturally incompatible with the positioning, not just abstract-able.
| Service | Why not |
|---|---|
| Agent Gateway + Semantic Governance Policies | Architectural opposite (LLM-as-judge governance with explicit "LLMs make mistakes; encourage human review"). Our deterministic cascade is the differentiator; we sell against this. |
| Projects in Gemini Enterprise | Workspace-distribution play. Cannot follow Google there. Reframe Quad as embedded surface for B2B SaaS instead (per ADR-0012). |
| Agent Studio (no-code visual builder) | Targets Workspace IT buyer. Different category. |
Concrete interface shape
The ContentSafety interface is the canonical worked example; the same shape applies to EvalProvider, RuntimeDeployment, AgentCatalog.
```go
// internal/guardrail/content/safety.go
package content

import "context"

// ContentSafety is the provider-agnostic interface the middleware calls.
// Category, Confidence, and Finding are not defined in this ADR.
type ContentSafety interface {
	SanitizePrompt(ctx context.Context, req SanitizeRequest) (*Verdict, error)
	SanitizeResponse(ctx context.Context, req SanitizeRequest) (*Verdict, error)
}

type SanitizeRequest struct {
	TenantID   string
	AgentID    string
	Text       string
	Categories []Category // PII, PromptInjection, Hate, Jailbreak, MaliciousURL, ...
	Threshold  Confidence // Low, MediumAndAbove, High
}

type Verdict struct {
	MatchFound bool
	Findings   []Finding
	ProviderID string // "model-armor" | "llama-guard" | "shield-gemma" | ...
	LatencyMs  int
	RuleHash   string // feeds decision-receipt provenance (#1184)
}
```
Per-tenant config the middleware reads at request time:
```yaml
tenant_id: acme_corp
content_safety:
  provider: model-armor        # or llama-guard, shield-gemma, etc.
  threshold: MEDIUM_AND_ABOVE
  categories: [prompt_injection, pii, jailbreak]
  fallback: passthrough        # behaviour if provider unreachable
```
The middleware calls contentSafety.SanitizePrompt(...) per tool call without knowledge of which provider is behind it. Provider swap is config-only. Same pattern for the other interfaces above.
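The config-only swap described above can be sketched as a small provider registry. This is a minimal illustration, not the actual middleware: `Registry`, `TenantConfig`, `withFallback`, `Check`, and the stub providers are hypothetical names, the types are trimmed copies of the ADR's, and no real cloud API is called.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"strings"
)

// Trimmed copies of the ADR's types so the sketch compiles on its own.
type SanitizeRequest struct{ TenantID, Text string }

type Verdict struct {
	MatchFound bool
	ProviderID string
}

type ContentSafety interface {
	SanitizePrompt(ctx context.Context, req SanitizeRequest) (*Verdict, error)
}

// passthrough is the explicit opt-out provider: it never flags anything.
type passthrough struct{}

func (passthrough) SanitizePrompt(context.Context, SanitizeRequest) (*Verdict, error) {
	return &Verdict{ProviderID: "passthrough"}, nil
}

// keywordStub is a hypothetical stand-in for a real provider; it calls no cloud API.
type keywordStub struct{ id, needle string }

func (k keywordStub) SanitizePrompt(_ context.Context, req SanitizeRequest) (*Verdict, error) {
	hit := strings.Contains(strings.ToLower(req.Text), k.needle)
	return &Verdict{MatchFound: hit, ProviderID: k.id}, nil
}

// unreachable simulates a provider whose backend cannot be reached.
type unreachable struct{}

func (unreachable) SanitizePrompt(context.Context, SanitizeRequest) (*Verdict, error) {
	return nil, errors.New("provider unreachable")
}

// TenantConfig mirrors the provider and fallback keys of the content_safety block.
type TenantConfig struct{ Provider, Fallback string }

// withFallback consults the fallback provider only when the primary returns an error.
type withFallback struct{ primary, fallback ContentSafety }

func (w withFallback) SanitizePrompt(ctx context.Context, req SanitizeRequest) (*Verdict, error) {
	if v, err := w.primary.SanitizePrompt(ctx, req); err == nil {
		return v, nil
	}
	return w.fallback.SanitizePrompt(ctx, req)
}

// Registry maps provider IDs from tenant config to implementations.
type Registry map[string]ContentSafety

// Resolve picks the provider named in cfg; a missing provider field is an error.
func (r Registry) Resolve(cfg TenantConfig) (ContentSafety, error) {
	if cfg.Provider == "" {
		return nil, errors.New("content_safety.provider is required")
	}
	p, ok := r[cfg.Provider]
	if !ok {
		return nil, fmt.Errorf("unknown provider %q", cfg.Provider)
	}
	if cfg.Fallback == "" {
		return p, nil
	}
	fb, ok := r[cfg.Fallback]
	if !ok {
		return nil, fmt.Errorf("unknown fallback %q", cfg.Fallback)
	}
	return withFallback{primary: p, fallback: fb}, nil
}

// Check plays the role of the middleware call site: it never names a provider.
func Check(r Registry, cfg TenantConfig, text string) (*Verdict, error) {
	cs, err := r.Resolve(cfg)
	if err != nil {
		return nil, err
	}
	return cs.SanitizePrompt(context.Background(), SanitizeRequest{TenantID: "acme_corp", Text: text})
}

// DemoRegistry wires the stub providers under illustrative IDs.
func DemoRegistry() Registry {
	return Registry{
		"model-armor": keywordStub{id: "model-armor", needle: "ignore previous instructions"},
		"llama-guard": keywordStub{id: "llama-guard", needle: "ignore previous instructions"},
		"offline":     unreachable{},
		"passthrough": passthrough{},
	}
}

func main() {
	r := DemoRegistry()
	for _, name := range []string{"model-armor", "llama-guard", "offline"} {
		v, err := Check(r, TenantConfig{Provider: name, Fallback: "passthrough"}, "Please ignore previous instructions")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> provider=%s match=%v\n", name, v.ProviderID, v.MatchFound)
	}
}
```

Because `Check` depends only on the interface, swapping `model-armor` for `llama-guard` changes one config value and no call-site code. The `ProviderID` on the verdict records which implementation actually answered, including when the fallback was used.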
Cost-benefit
For the ContentSafety interface specifically — representative of the cost structure for each abstraction:
| | Direct Model Armor adoption | Abstracted (ContentSafety interface) |
|---|---|---|
| Engineering cost | ~4 days (wire + test) | ~3 weeks (interface + Model Armor provider + Llama Guard provider + per-tenant config + tests) |
| Customers we can sell to | GCP tenants only | GCP + AWS + Azure + on-prem + sovereign |
| Roadmap coupling to Google | Tight (their pricing/deprecation breaks us simultaneously) | Loose (swap provider per-tenant) |
| Positioning consistency with ADR-0012 | Contradicts | Enforces |
| Decision Receipt provenance field (RuleHash) | Locked to Model Armor's rule identifiers | Provider-tagged; receipts portable across providers |
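Roughly 2.5 weeks of additional engineering (about 3 weeks abstracted against about 4 days direct) buys back the 50–70 % of the addressable market that Architecture A forecloses, along with the architectural credibility of ADR-0012. The same trade-off holds for each interface in the abstraction table above.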
Consequences
Affirmative — engineering commitments
- Every LLD that adopts a third-party capability must declare its adoption mode (Direct / Abstracted / Don't adopt) and, if Abstracted, the interface name. The principal-architect's review of any new LLD verifies this declaration.
- internal/guardrail/content/ is created in the next implementation wave with the ContentSafety interface + at least one cloud provider (Model Armor) and one open-weights provider (Llama Guard or ShieldGemma).
- The existing RuntimeDeployment shape (around pkg/runtimepb) is treated as the abstraction layer; managed-Agent-Runtime is one deployment target, our runtime is another. No change required beyond documentation; this ADR makes the role explicit.
- Cloud-provider implementations are always one option, never the only option. A tenant config schema that does not include a provider field is a schema bug; surface it in code review.
- OTel-based abstractions (TraceExporter, LogExporter) already satisfy the principle. No new code; document explicitly that the OTel layer is the abstraction.
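The adoption-mode declaration required of every LLD could be checked mechanically before review. The following is a minimal sketch under stated assumptions: the `Declaration` record, the `AdoptionMode` string values, and the `Validate` method are hypothetical and are not defined anywhere in this ADR.

```go
package main

import "fmt"

// AdoptionMode is the three-way classification this ADR requires of every LLD.
type AdoptionMode string

const (
	Direct     AdoptionMode = "direct"
	Abstracted AdoptionMode = "abstracted"
	DoNotAdopt AdoptionMode = "do-not-adopt"
)

// Declaration is a hypothetical record an LLD would carry per third-party capability.
type Declaration struct {
	Capability string
	Mode       AdoptionMode
	Interface  string // required only when Mode is Abstracted
}

// Validate enforces the rule: a mode must be declared, and an abstracted
// capability must name the interface it sits behind.
func (d Declaration) Validate() error {
	switch d.Mode {
	case Direct, DoNotAdopt:
		return nil
	case Abstracted:
		if d.Interface == "" {
			return fmt.Errorf("%s: abstracted adoption must name its interface", d.Capability)
		}
		return nil
	default:
		return fmt.Errorf("%s: adoption mode %q is not direct, abstracted, or do-not-adopt", d.Capability, d.Mode)
	}
}

func main() {
	decls := []Declaration{
		{Capability: "A2A protocol", Mode: Direct},
		{Capability: "Model Armor", Mode: Abstracted, Interface: "ContentSafety"},
		{Capability: "Vertex Gen AI Evaluation", Mode: Abstracted}, // interface missing
		{Capability: "Agent Studio", Mode: DoNotAdopt},
		{Capability: "Cloud Trace"}, // mode missing
	}
	for _, d := range decls {
		if err := d.Validate(); err != nil {
			fmt.Println("REJECT:", err)
			continue
		}
		fmt.Println("ok:", d.Capability)
	}
}
```

A check of this kind only confirms that a declaration exists and is well-formed. Whether the declared mode is the right one remains a judgment for the principal-architect's review.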
Negative — what we accept
- ~2.5 weeks of additional engineering per provider interface, up front. Real cost. Funded by the customer-segment unlock.
- Per-tenant configuration surface grows. Each abstracted interface adds tenant config fields. Acceptable; multi-tenant SaaS already lives with this.
- Testing matrix grows. Each interface has N providers × M tenant configs. Use contract tests + provider conformance suites to keep this tractable.
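A provider conformance suite, as mentioned in the last item above, might look like the sketch below. It runs one set of provider-agnostic checks against every implementation. The three contract clauses checked here (non-empty `ProviderID`, non-empty `RuleHash`, `MatchFound` agreeing with `Findings`) are illustrative assumptions, not an agreed contract, and both stub providers are hypothetical.

```go
package main

import (
	"context"
	"fmt"
)

type SanitizeRequest struct{ TenantID, Text string }

type Verdict struct {
	MatchFound bool
	Findings   []string // simplified; the ADR uses []Finding
	ProviderID string
	RuleHash   string
}

type ContentSafety interface {
	SanitizePrompt(ctx context.Context, req SanitizeRequest) (*Verdict, error)
}

// Conformance runs the same checks against any provider and returns
// a description of each contract clause the provider violated.
func Conformance(p ContentSafety) []string {
	var failed []string
	v, err := p.SanitizePrompt(context.Background(), SanitizeRequest{TenantID: "t1", Text: "hello"})
	if err != nil {
		return append(failed, "benign prompt returned error: "+err.Error())
	}
	if v == nil {
		return append(failed, "nil verdict returned without an error")
	}
	if v.ProviderID == "" {
		failed = append(failed, "ProviderID is empty")
	}
	if v.RuleHash == "" {
		failed = append(failed, "RuleHash is empty")
	}
	if v.MatchFound != (len(v.Findings) > 0) {
		failed = append(failed, "MatchFound disagrees with Findings")
	}
	return failed
}

// conformingStub satisfies every clause.
type conformingStub struct{}

func (conformingStub) SanitizePrompt(context.Context, SanitizeRequest) (*Verdict, error) {
	return &Verdict{ProviderID: "stub-a", RuleHash: "sha256:demo"}, nil
}

// leakyStub reports a match with no findings and omits RuleHash.
type leakyStub struct{}

func (leakyStub) SanitizePrompt(context.Context, SanitizeRequest) (*Verdict, error) {
	return &Verdict{MatchFound: true, ProviderID: "stub-b"}, nil
}

func main() {
	suite := []struct {
		name string
		impl ContentSafety
	}{
		{"conforming", conformingStub{}},
		{"leaky", leakyStub{}},
	}
	for _, s := range suite {
		fmt.Printf("%s: %d violation(s) %v\n", s.name, len(Conformance(s.impl)), Conformance(s.impl))
	}
}
```

Under this approach each new provider adds one row to the suite, so the checks scale with the number of providers, not with every provider and tenant-config combination.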
What this changes for ADR-0012's reshape matrix
ADR-0012's reshape matrix (the table mapping each GEAP surface to Keep / Adopt / Interop / No change) is reinterpreted through the abstraction principle:
- "Adopt" for an open primitive (ADK, A2A, SPIFFE pattern, OTel semconv, in-toto, MCP) means direct dependency: these run inside our process or implement open specs.
- "Adopt" for a cloud service (Model Armor, Vertex Eval, Cloud Trace, Agent Runtime managed deployment, GCP MCP servers, Agent Registry) means provider-interface integration: the cloud service is one provider among multiple behind an UpsQuad-owned interface.
Operationally, ADR-0012's matrix should be re-rendered with two added columns: Adoption mode and Interface name. The principal-architect can do this as a follow-up to ADR-0012 if desired, or the original matrix can stay as-is with this ADR understood to amend its semantics.
Risks bound to this decision
- Risk: engineering treats "abstracted" as a future-state aspiration and ships direct-dependency code today. Mitigation: principal-architect reviews every LLD against the principle; PR review rejects code that directly imports a cloud-provider SDK from the middleware/runtime/observability paths.
- Risk: provider interfaces leak provider-specific semantics, so swap becomes impractical even though the interface exists. Mitigation: write contract tests that exercise the interface against ≥2 providers in CI; if a contract test only passes for one provider, the interface is leaky.
- Risk: the 2.5-week-per-interface cost compounds and delays shipping. Mitigation: sequence the interfaces by customer urgency (ContentSafety first, needed for any pilot; EvalProvider second; observability is already abstracted via OTel; runtime deployment can stay informal until a second target is real).
- Risk: founder or customer pressure to "just integrate Model Armor and ship" pushes Architecture A back into the codebase. Mitigation: this ADR is the durable artefact. Reference it in any future debate.
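The first mitigation above (rejecting direct cloud-SDK imports from the guarded paths) could be backed by an automated check alongside PR review. The sketch below uses Go's standard `go/parser` in `ImportsOnly` mode. The `guardedDirs` list includes only the two directories this ADR names, since the observability path is not named, and `forbiddenPrefixes` is an illustrative denylist and not an agreed policy. `Violations` and `demoSrc` are hypothetical names.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

// guardedDirs are the paths where direct cloud-SDK imports are rejected.
var guardedDirs = []string{"internal/mcp/middleware/", "internal/runtime/"}

// forbiddenPrefixes is an illustrative denylist of cloud SDK import prefixes.
var forbiddenPrefixes = []string{
	"cloud.google.com/go/",
	"github.com/aws/aws-sdk-go-v2/",
	"github.com/Azure/azure-sdk-for-go/",
}

func hasAnyPrefix(s string, prefixes []string) bool {
	for _, p := range prefixes {
		if strings.HasPrefix(s, p) {
			return true
		}
	}
	return false
}

// Violations returns the forbidden imports found in src when path is guarded.
// Only the import declarations are parsed; function bodies are never inspected.
func Violations(path, src string) ([]string, error) {
	if !hasAnyPrefix(path, guardedDirs) {
		return nil, nil
	}
	f, err := parser.ParseFile(token.NewFileSet(), path, src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var bad []string
	for _, imp := range f.Imports {
		p, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return nil, err
		}
		if hasAnyPrefix(p, forbiddenPrefixes) {
			bad = append(bad, p)
		}
	}
	return bad, nil
}

// demoSrc is a made-up file that imports a GCP SDK package directly.
const demoSrc = `package runtime

import (
	"context"

	"cloud.google.com/go/storage"
)
`

func main() {
	paths := []string{"internal/runtime/deploy.go", "providers/modelarmor/client.go"}
	for _, path := range paths {
		bad, err := Violations(path, demoSrc)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("%s: %d violation(s) %v\n", path, len(bad), bad)
	}
}
```

The same source is flagged under `internal/runtime/` and accepted under `providers/`, which matches the principle: cloud SDKs may be imported only by the provider implementations that sit behind an interface.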
Alternatives considered
- Edit ADR-0012 in place. Rejected — ADR-0012 is merged; editing its body retroactively breaks the audit trail. New ADR amending the prior is the correct pattern.
- Leave the ambiguity, trust engineering judgement at each LLD. Rejected — the cost of one direct-dependency adoption shipped into production is unwinding it from every tenant configuration. Cheaper to set the rule now.
- Adopt direct dependency for "GCP-friendly tenants only" and worry about portability later. Rejected — defers the architectural debt to the customer onboarding conversation, where it is most expensive to discover.
- Build everything in-house (no third-party cloud services at all). Rejected — ADR-0012 explicitly commits to adopting external primitives where they exist; this ADR refines how, not whether.
Sign-off
- Founder (Vaisakh): architectural alignment with ADR-0012's positioning
- Principal Architect: LLD-review enforcement of the principle
- Backend lead (Ashik): implementation implications for internal/mcp/middleware, internal/runtime, internal/guardrail/content/ (new)
- Product Manager: positioning consistency in PRD and customer messaging