Skip to main content

ADR-0013: Abstraction Principle — open primitives adopted directly, cloud services abstracted behind interfaces

  • Status: Proposed (awaiting founder sign-off — amends ADR-0012)
  • Date: 2026-06-01
  • Decision owners: Founder (proposer + sign-off), Principal Architect (technical validation), Backend SME (implementer), Product Manager (positioning consistency)
  • Amends: ADR-0012 — adds the abstraction principle that makes its positioning architecturally enforceable
  • Related:
    • upsquad-ai/upsquad-core#1184 (Agent Decision Receipt) — the receipt's RuleHash field feeds from these provider interfaces
    • upsquad-ai/work-tracker#56 (GEAP eval tracker) — empirical basis for the matrix below
    • Per-component evals upsquad-ai/upsquad-core#1185–#1199 — each named cloud service was probed and categorised

Context

ADR-0012 committed UpsQuad to the agnostic workforce-governance middleware positioning, with a reshape matrix that named multiple cloud services to "adopt" (Model Armor, Vertex Gen AI Evaluation, Cloud Trace, Agent Runtime managed deployment, etc.). The framing was load-bearing for the positioning's credibility but left "adopt" ambiguous between two architectures:

  • Architecture A (direct adoption): UpsQuad's middleware calls Google's cloud services directly under the hood. Every prompt + response flows through modelarmor.googleapis.com. Our content-safety layer is Google's content-safety layer.
  • Architecture B (abstracted adoption): UpsQuad's middleware exposes provider interfaces. Cloud-provider services are one option behind those interfaces; on-prem / open-weights / multi-cloud alternatives are others. The tenant (or per-tenant config) picks which one runs.

Architecture A directly contradicts the agnostic positioning ADR-0012 commits to. It also forecloses ~50–70 % of the addressable market under that positioning (AWS-only customers, Azure-only customers, on-prem regulated tenants in FinServ / HealthCare / Defense / Gov, EU sovereignty-sensitive deployments, and customers running open-weights models on-prem).

Architecture B is consistent with the positioning and matches the durable pattern used by every middleware play that survived big-cloud bundling (Stripe abstracts over acquirers / issuers / networks; Datadog abstracts over cloud monitoring; Auth0 abstracts over identity providers; Snowflake abstracts over cloud warehouses; HashiCorp abstracts over cloud IaC).

This ADR closes the ambiguity by codifying Architecture B and naming the interfaces.

Decision

UpsQuad adopts the following Abstraction Principle:

Open primitives — protocols, specifications, OSS libraries, framework code that runs inside our process — are adopted directly.

Cloud-provider services — anything requiring a network call to a third-party cloud API — are adopted behind a provider interface, with the cloud-provider implementation as one option among multiple providers.

Rationale: direct dependency on a single cloud provider's services structurally contradicts the agnostic-middleware positioning ADR-0012 commits to. The abstraction is the architectural mechanism that makes the positioning enforceable rather than rhetorical.

This principle binds engineering decisions downstream. Every LLD that adopts a third-party capability must answer: is this an open primitive (direct) or a cloud service (abstracted)? and, when abstracted, which interface does it implement?

Direct adopt — open primitives

These run inside our process or are protocol/spec implementations. No cloud-API network round-trip per call. Direct dependency is acceptable because portability is the substrate's property, not ours to engineer.

PrimitiveWhy direct is correct
ADK (Agent Development Kit)Apache-2.0 Python/Go/Java/TS library, runs in agent process, same library is the natural authoring SDK whether the runtime target is Vertex, Bedrock, or our own.
A2A protocolWire protocol, Linux Foundation governance. We implement endpoints; we call endpoints. Zero GCP coupling.
SPIFFE / SPIRE patternCNCF spec for workload identity. We deploy our own issuer or run SPIRE. Google's auto-provisioning is the pattern we adopt, not their implementation.
OpenTelemetry semantic conventionsCNCF spec + libraries. We instrument; the export destination is per-tenant config (see TraceExporter below).
in-toto / SLSA / Sigstore RekorOpen specs underpinning the Decision Receipt (#1184). Receipt registers to Rekor for public-tenant case OR a tenant-private SCITT service OR our own transparency log.
MCP protocol specWire protocol. We are already an MCP consumer; we will be an MCP producer when the catalog is published.
OAuth 2.0 / OIDC / SAMLAuth standards. Already abstracted via Clerk + our clearance layer.

Abstract behind an interface — every cloud service

The provider interfaces below sit in internal/ packages and are consumed by internal/mcp/middleware, internal/runtime, internal/governance, and the audit/observability pipeline. Each interface has at least one cloud-provider implementation (for tenants who want it) and at least one on-prem / open-weights / portable implementation (for tenants who can't or won't use the cloud option).

Cloud serviceInterfacePrimary cloud providerAlternatives
Model ArmorContentSafetyproviders/modelarmor (GCP tenants default)Llama Guard, ShieldGemma, Prompt Guard (on-prem), Bedrock Guardrails, Azure AI Content Safety, Anthropic content classifier, OpenAI Moderation, passthrough (explicit opt-out)
Vertex Gen AI EvaluationEvalProviderproviders/vertex_evalLangSmith, Galileo, Humanloop, in-house harness
Cloud TraceTraceExporter (already abstracted via OTel)OTel → GCP exporterOTel → Honeycomb / Grafana Tempo / Datadog / local Grafana / raw OTLP
Cloud LoggingLogExporter (already abstracted via OTel logs)OTel → GCP exporterOTel → Grafana Loki / Datadog / raw S3
Vertex MCP servers (the 9 sharded toolset paths)MCPToolSource (already exists in middleware)providers/vertex_mcpAny other MCP server registry; these are tools per tenant, not infra
Agent Runtime managed deploymentRuntimeDeploymentproviders/vertex_runtime (one cloud target)Our internal runtime (pkg/runtimepb) for multi-tenant + on-prem; AWS Bedrock Agents target (future); raw Cloud Run / EKS / GKE target
Agent RegistryAgentCatalog (we already have a catalog)providers/gcp_registry (mirror for GCP tenants if desired)Our tenant-scoped catalog (primary)
GCP MCP servers — BigQuery / GCS / Logging / etc.(already abstracted — they are tools)Each is one tool source per tenant configAny other MCP server
AWS Bedrock Guardrails (future)ContentSafetyproviders/bedrock_guardrailsSame interface as Model Armor
Azure AI Content Safety (future)ContentSafetyproviders/azure_content_safetySame

Do not adopt

These are explicitly rejected — they are architecturally incompatible with the positioning, not just abstract-able.

ServiceWhy not
Agent Gateway + Semantic Governance PoliciesArchitectural opposite (LLM-as-judge governance with explicit "LLMs make mistakes; encourage human review"). Our deterministic cascade is the differentiator; we sell against this.
Projects in Gemini EnterpriseWorkspace-distribution play. Cannot follow Google there. Reframe Quad as embedded surface for B2B SaaS instead (per ADR-0012).
Agent Studio (no-code visual builder)Targets Workspace IT buyer. Different category.

Concrete interface shape

The ContentSafety interface is the canonical worked example; the same shape applies to EvalProvider, RuntimeDeployment, AgentCatalog.

// internal/guardrail/content/safety.go
package content

type ContentSafety interface {
SanitizePrompt(ctx context.Context, req SanitizeRequest) (*Verdict, error)
SanitizeResponse(ctx context.Context, req SanitizeRequest) (*Verdict, error)
}

type SanitizeRequest struct {
TenantID string
AgentID string
Text string
Categories []Category // PII, PromptInjection, Hate, Jailbreak, MaliciousURL, ...
Threshold Confidence // Low, MediumAndAbove, High
}

type Verdict struct {
MatchFound bool
Findings []Finding
ProviderID string // "model-armor" | "llama-guard" | "shield-gemma" | ...
LatencyMs int
RuleHash string // feeds decision-receipt provenance (#1184)
}

Per-tenant config the middleware reads at request time:

tenant_id: acme_corp
content_safety:
provider: model-armor # or llama-guard, shield-gemma, etc.
threshold: MEDIUM_AND_ABOVE
categories: [prompt_injection, pii, jailbreak]
fallback: passthrough # behaviour if provider unreachable

The middleware calls contentSafety.SanitizePrompt(...) per tool call without knowledge of which provider is behind it. Provider swap is config-only. Same pattern for the other interfaces above.

Cost-benefit

For the ContentSafety interface specifically — representative of the cost structure for each abstraction:

Direct Model Armor adoptionAbstracted (ContentSafety interface)
Engineering cost~4 days (wire + test)~3 weeks (interface + Model Armor provider + Llama Guard provider + per-tenant config + tests)
Customers we can sell toGCP tenants onlyGCP + AWS + Azure + on-prem + sovereign
Roadmap coupling to GoogleTight (their pricing/deprecation breaks us simultaneously)Loose (swap provider per-tenant)
Positioning consistency with ADR-0012ContradictsEnforces
Decision Receipt provenance field (RuleHash)Locked to Model Armor's rule identifiersProvider-tagged; receipts portable across providers

~2.5 weeks of additional engineering buys back 50–70 % of the addressable market and the architectural credibility of ADR-0012. The math is overwhelming for every interface in the table above.

Consequences

Affirmative — engineering commitments

  1. Every LLD that adopts a third-party capability must declare its adoption mode (Direct / Abstracted / Don't adopt) and, if Abstracted, the interface name. The principal-architect's review of any new LLD verifies this declaration.
  2. internal/guardrail/content/ is created in the next implementation wave with the ContentSafety interface + at least one cloud provider (Model Armor) and one open-weights provider (Llama Guard or ShieldGemma).
  3. The existing RuntimeDeployment shape (around pkg/runtimepb) is treated as the abstraction layer; managed-Agent-Runtime is one deployment target, our runtime is another. No change required beyond documentation; this ADR makes the role explicit.
  4. Cloud-provider implementations are always one option, never the only option. A tenant config schema that does not include a provider field is a schema bug — surface it in code review.
  5. OTel-based abstractions (TraceExporter, LogExporter) already satisfy the principle. No new code; document explicitly that the OTel layer is the abstraction.

Negative — what we accept

  1. ~2.5 weeks of additional engineering per provider interface, up front. Real cost. Funded by the customer-segment unlock.
  2. Per-tenant configuration surface grows. Each abstracted interface adds tenant config fields. Acceptable; multi-tenant SaaS already lives with this.
  3. Testing matrix grows. Each interface has N providers × M tenant configs. Use contract tests + provider conformance suites to keep this tractable.

What this changes for ADR-0012's reshape matrix

ADR-0012's reshape matrix (the table mapping each GEAP surface to Keep / Adopt / Interop / No change) is reinterpreted through the abstraction principle:

  • "Adopt" for an open primitive (ADK, A2A, SPIFFE pattern, OTel semconv, in-toto, MCP) means direct dependency — these run inside our process or implement open specs.
  • "Adopt" for a cloud service (Model Armor, Vertex Eval, Cloud Trace, Agent Runtime managed deployment, GCP MCP servers, Agent Registry) means provider-interface integration — the cloud service is one provider among multiple behind a UpsQuad-owned interface.

Operationally, ADR-0012's matrix should be re-rendered with two added columns: Adoption mode and Interface name. The principal-architect can do this as a follow-up to ADR-0012 if desired, or the original matrix can stay as-is with this ADR understood to amend its semantics.

Risks bound to this decision

  1. Risk: engineering treats "abstracted" as a future-state aspiration and ships direct-dependency code today. Mitigation: principal-architect reviews every LLD against the principle; PR review rejects code that directly imports a cloud-provider SDK from the middleware/runtime/observability paths.
  2. Risk: provider interfaces leak provider-specific semantics, so swap becomes impractical even though the interface exists. Mitigation: write contract tests that exercise the interface against ≥2 providers in CI; if a contract test only passes for one provider, the interface is leaky.
  3. Risk: the 2.5-week-per-interface cost compounds and delays shipping. Mitigation: sequence the interfaces by customer urgency (ContentSafety first — needed for any pilot; EvalProvider second; observability is already abstracted via OTel; runtime deployment can stay informal until a second target is real).
  4. Risk: founder or customer pressure to "just integrate Model Armor and ship" pushes Architecture A back into the codebase. Mitigation: this ADR is the durable artefact. Reference it in any future debate.

Alternatives considered

  1. Edit ADR-0012 in place. Rejected — ADR-0012 is merged; editing its body retroactively breaks the audit trail. New ADR amending the prior is the correct pattern.
  2. Leave the ambiguity, trust engineering judgement at each LLD. Rejected — the cost of one direct-dependency adoption shipped into production is unwinding it from every tenant configuration. Cheaper to set the rule now.
  3. Adopt direct dependency for "GCP-friendly tenants only" and worry about portability later. Rejected — defers the architectural debt to the customer onboarding conversation, where it is most expensive to discover.
  4. Build everything in-house (no third-party cloud services at all). Rejected — ADR-0012 explicitly commits to adopting external primitives where they exist; this ADR refines how, not whether.

Sign-off

  • Founder (Vaisakh) — architectural alignment with ADR-0012's positioning
  • Principal Architect — LLD-review enforcement of the principle
  • Backend lead (Ashik) — implementation implications for internal/mcp/middleware, internal/runtime, internal/guardrail/content/ (new)
  • Product Manager — positioning consistency in PRD and customer messaging