ADR-0002: Dual Business Model (Cloud SaaS + On-Premise Licensing)

Status: Proposed
Date: 2026-04-09
Deciders: Vaisakh, Ashik (founders — sign-off pending); Principal Architect (proposal)
Supersedes: None
Related: ADR-0001 (hybrid local dev stack), #59 (CloudSQL deferral, closed not planned), PRD v1.7 (parallel draft — adds On-Premise business model section and Feature Tier Matrix), parallel CLAUDE.md update (two-mode framing + vendor-neutrality tenet)

1. Context

Until now, UpsQuad has been positioned as a Cloud SaaS product only. In this planning round, the founders have committed to a new business model: an On-Premise licensing offering sold alongside the existing Cloud SaaS offering. Both models are durable commitments. On-Premise is not a tactical carve-out for a single customer; it is a product line.

The rationale is market reach. UpsQuad is positioned as a multi-vertical platform. Three target verticals will not adopt a shared multi-tenant SaaS for data sovereignty and regulatory reasons:

Government — classified or citizen-PII workloads, residency mandates, air-gap requirements.
Legal — attorney–client privilege, matter confidentiality, jurisdictional data handling.
Healthcare (regulated segments) — PHI handling beyond what a standard HIPAA BAA covers.

Without an On-Premise path, these verticals are unreachable. With one, they unlock — but only if the same binary runs on-prem with no fork, because maintaining two codebases for a pre-revenue platform is not viable.

This decision has second-order architectural consequences that must be locked in now, during Phase 5, so that code written in Phases 5–10 does not accumulate Cloud-only assumptions that have to be unwound in Phase 11. In particular, vendor-neutrality — which was previously an implicit phase-specific posture (we haven't stood up managed infra yet per ADR-0001 §8) — becomes a durable tenet. This ADR captures that shift.

This ADR also strengthens the #59 closure (permanent CloudSQL dev instance, closed not planned). The #59 reasoning relied on "we haven't committed to a managed-infra dependency yet." On-Premise licensing turns that temporary posture into a permanent architectural constraint: even post-tripwire, the non-Cloud code paths must continue to work. Vendor-neutrality is no longer "nice to have until we have a customer"; it is "required because we are shipping to customers who host it themselves."

2. Decision

We adopt a dual business model with a single codebase. The same Context Engine binary runs in two modes:

Cloud mode — UpsQuad-operated SaaS, multi-tenant, priced per Cloud terms (seat-based Copilots + runtime × agent-kind metering for Agent Teams). UpsQuad operates the infrastructure.
On-Premise mode — customer-hosted on a CNCF-conformant Kubernetes cluster, distributed as a Helm chart, priced as a tier-based licensing fee with no usage metering and no phone-home by default. Air-gap is a first-class supported deployment.

Mode selection is a runtime configuration of the licensing module. There is no build flag, no separate binary, no fork, and no //go:build tag gating core behaviour.

Feature differences between the two modes are gated at runtime through the licensing module and the Feature Tier Matrix (owned by PRD v1.7).

3. Architectural consequences (normative)

The following six decisions are normative. Code written in Phase 5 and later must comply. Violations at merge time are treated the same as ADR-0001 §4 lint violations: a hard fail, not a review nit.

3.1 Vendor-neutrality constraint

No vendor-specific cloud SDK imports are permitted in internal/context/** or any other business-logic package. In particular:

Forbidden imports outside the cloud boundary:
- cloud.google.com/go/...
- github.com/aws/aws-sdk-go, github.com/aws/aws-sdk-go-v2/...
- github.com/Azure/azure-sdk-for-go/...
- Any other vendor-specific cloud SDK.
Permitted boundary directory: internal/cloud/<vendor>/, e.g. internal/cloud/gcs/, internal/cloud/openai/, internal/cloud/gemini/. All vendor-specific code lives here and only here. Business-logic packages import the abstract interface (§3.2, §3.3), never the vendor SDK.

internal/cloud/ is chosen over deployments/cloud/ because deployments/ is conventionally infra-as-code (Pulumi/Helm), whereas these are Go packages that ship inside the binary. Keeping them under internal/cloud/ makes the Go import graph the source of truth for the boundary, which is what a lint rule can enforce.
Enforcement: a forbidigo rule (or a small AST analyzer under scripts/lint/) added to golangci-lint that fails CI on any internal/context/** (and in due course, internal/workflow/**, internal/employee/**, etc.) import of the forbidden prefixes above. Mirrors the ADR-0001 §4 pattern.
Current known violations (tech debt, surfaced as follow-up issues, fixed in Phase 11 per the roadmap):
- Memory snapshot path (PR #36, internal/context/memory/...) imports the GCS client directly. Must move behind §3.2.
- Compaction path (PR #35, internal/context/compaction/...) and guardrail secondary-model path consume OPENAI_API_KEY / Gemini SDK directly. Must move behind §3.3.
- Any other direct vendor imports identified by an audit grep at the start of Phase 11.
These are not fixed in this ADR. Phase 5 continues with the current code unchanged. Phase 11 is the remediation window. New code from this ADR's acceptance onwards must not add new violations.
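
The "small AST analyzer" option named under Enforcement could look roughly like the sketch below. It relies only on the standard library's go/parser. The function name forbiddenImports and the prefix table are illustrative and do not describe an existing script in the repository; the catch-all bullet ("any other vendor-specific cloud SDK") would need explicit entries of its own.

```go
package main

import (
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

// forbiddenPrefixes mirrors the named entries in the list above.
// Hypothetical table: extend it as further vendor SDKs are identified.
var forbiddenPrefixes = []string{
	"cloud.google.com/go",
	"github.com/aws/aws-sdk-go",
	"github.com/aws/aws-sdk-go-v2",
	"github.com/Azure/azure-sdk-for-go",
}

// forbiddenImports parses only the import declarations of one Go source
// file and returns, in source order, every import path that equals a
// forbidden prefix or lives underneath it.
func forbiddenImports(filename, src string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, filename, src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var hits []string
	for _, spec := range f.Imports {
		// spec.Path.Value is the quoted literal as written in the source.
		path, err := strconv.Unquote(spec.Path.Value)
		if err != nil {
			return nil, err
		}
		for _, prefix := range forbiddenPrefixes {
			if path == prefix || strings.HasPrefix(path, prefix+"/") {
				hits = append(hits, path)
				break
			}
		}
	}
	return hits, nil
}
```

A CI wrapper would walk the guarded directories (internal/context/** to begin with), call this once per .go file, and exit non-zero on any hit. Files under internal/cloud/ are simply never passed in.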

3.2 Storage abstraction (interface-first)

A new package internal/context/storage (or equivalent — final name at implementer's discretion during Phase 11) declares the blob storage interface. The existing GCS client moves to internal/cloud/gcs and implements the interface. An S3/MinIO implementation for On-Premise is a Phase 11 deliverable and is not written now.

Interface (normative; exact method set may be refined during Phase 11 implementation, but the shape is fixed):

package storage

import (
    "context"
    "errors"
    "io"
    "time"
)

// BlobStorage is the minimum surface the Context Engine needs from any blob
// store (GCS, S3, MinIO, Azure Blob, local filesystem for tests). All
// implementations must be safe for concurrent use.
type BlobStorage interface {
    // Get returns a reader for the object at key. Caller must Close the reader.
    // Returns ErrNotFound if the key does not exist.
    Get(ctx context.Context, key string) (io.ReadCloser, error)

    // Put writes the full contents of r to key. Any existing object at key is
    // overwritten. The implementation is responsible for durability before
    // returning nil.
    Put(ctx context.Context, key string, r io.Reader, opts PutOptions) error

    // Delete removes the object at key. Deleting a non-existent key is not an
    // error (idempotent).
    Delete(ctx context.Context, key string) error

    // Stat returns metadata for the object at key. Returns ErrNotFound if the
    // key does not exist.
    Stat(ctx context.Context, key string) (ObjectInfo, error)

    // List enumerates keys under prefix. Pagination is handled by the
    // implementation; callers iterate until the returned iterator is
    // exhausted. Listings are eventually consistent unless the implementation
    // documents otherwise.
    List(ctx context.Context, prefix string) Iterator
}

type PutOptions struct {
    ContentType  string
    CacheControl string
    Metadata     map[string]string
}

type ObjectInfo struct {
    Key          string
    Size         int64
    ETag         string
    ContentType  string
    LastModified time.Time
    Metadata     map[string]string
}

type Iterator interface {
    Next(ctx context.Context) (ObjectInfo, bool, error)
    Close() error
}

// Sentinel errors returned by all implementations.
var (
    ErrNotFound = errors.New("storage: object not found")
)

Constraints:

Business-logic packages (internal/context/memory/..., future snapshot and export paths) take a BlobStorage via dependency injection. No package under internal/context/** imports internal/cloud/gcs or any other vendor impl directly.
Phase 5 code that currently imports cloud.google.com/go/storage directly is pre-existing tech debt (§3.1) and is exempt until Phase 11. Net-new code from ADR acceptance must comply.
No feature of BlobStorage may rely on GCS-specific capabilities (e.g., GCS Object Lifecycle Management, resumable upload session IDs, signed URL formats). If a feature is needed that a reasonable S3-compatible store cannot provide, it must be raised as a new ADR.
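
To illustrate the dependency-injection constraint, here is a minimal sketch of an in-memory fake and a business-logic function that receives it. To stay short it restates only the Get, Put, and Delete methods of the interface above under a local name (blobStore) and trims PutOptions to one field; a real fake would also implement Stat and List. The names memStore, saveSnapshot, and roundTrip, and the "snapshots/" key layout, are assumptions made for the example.

```go
package main

import (
	"bytes"
	"context"
	"errors"
	"io"
	"sync"
)

var ErrNotFound = errors.New("storage: object not found")

// PutOptions is cut down to one field to keep the sketch short.
type PutOptions struct{ ContentType string }

// blobStore is the subset of BlobStorage exercised here. Stat and List are
// omitted for brevity and would be needed to satisfy the full interface.
type blobStore interface {
	Get(ctx context.Context, key string) (io.ReadCloser, error)
	Put(ctx context.Context, key string, r io.Reader, opts PutOptions) error
	Delete(ctx context.Context, key string) error
}

// memStore is a hypothetical in-memory fake, safe for concurrent use.
type memStore struct {
	mu      sync.RWMutex
	objects map[string][]byte
}

func newMemStore() *memStore { return &memStore{objects: make(map[string][]byte)} }

func (m *memStore) Get(_ context.Context, key string) (io.ReadCloser, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	data, ok := m.objects[key]
	if !ok {
		return nil, ErrNotFound
	}
	return io.NopCloser(bytes.NewReader(data)), nil
}

func (m *memStore) Put(_ context.Context, key string, r io.Reader, _ PutOptions) error {
	data, err := io.ReadAll(r) // read fully before taking the lock
	if err != nil {
		return err
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	m.objects[key] = data // overwrites any existing object at key
	return nil
}

func (m *memStore) Delete(_ context.Context, key string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	delete(m.objects, key) // deleting a missing key is a no-op, so Delete is idempotent
	return nil
}

// saveSnapshot stands in for business logic: it receives the store by
// injection and never names a vendor package.
func saveSnapshot(ctx context.Context, store blobStore, id string, payload []byte) error {
	opts := PutOptions{ContentType: "application/octet-stream"}
	return store.Put(ctx, "snapshots/"+id, bytes.NewReader(payload), opts)
}

// roundTrip writes a snapshot, reads it back, deletes it, and confirms that
// a later Get reports ErrNotFound. It returns the payload that was read.
func roundTrip(payload string) (string, error) {
	ctx := context.Background()
	store := newMemStore()
	if err := saveSnapshot(ctx, store, "s1", []byte(payload)); err != nil {
		return "", err
	}
	rc, err := store.Get(ctx, "snapshots/s1")
	if err != nil {
		return "", err
	}
	defer rc.Close()
	got, err := io.ReadAll(rc)
	if err != nil {
		return "", err
	}
	if err := store.Delete(ctx, "snapshots/s1"); err != nil {
		return "", err
	}
	if _, err := store.Get(ctx, "snapshots/s1"); !errors.Is(err, ErrNotFound) {
		return "", errors.New("expected ErrNotFound after delete")
	}
	return string(got), nil
}
```

Because saveSnapshot depends only on the interface, the same call site would work unchanged against a GCS-backed or S3/MinIO-backed implementation wired in at process startup.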

3.3 LLM gateway abstraction (interface-first)

A new package internal/context/llm (or equivalent) declares the LLM gateway interface. Existing OpenAI and Gemini clients move to internal/cloud/openai and internal/cloud/gemini respectively and implement the interface. A future internal/cloud/openai_compatible implementation covers the "bring-your-own-LLM" case for On-Premise customers (any OpenAI-API-compatible endpoint — vLLM, Ollama, LocalAI, Together, Fireworks, etc.). That implementation is not written now; only the interface and the re-homing of existing clients is in scope for Phase 11.

Interface (normative shape):

package llm

import "context"

// LLMGateway is the sole entry point for any LLM call from the Context Engine.
// No package under internal/context/** may import a vendor-specific LLM SDK
// directly once Phase 11 remediation lands.
type LLMGateway interface {
    // Complete runs a chat-completion style request. Streaming is modeled via
    // the returned CompletionStream; for non-streaming callers, helpers in
    // this package collapse the stream to a single response.
    Complete(ctx context.Context, req CompletionRequest) (CompletionStream, error)

    // Embed returns vector embeddings for the supplied inputs. Batch size
    // limits are implementation-specific and surfaced via ModelInfo.
    Embed(ctx context.Context, req EmbedRequest) (EmbedResponse, error)

    // ModelInfo returns the declared capabilities of the configured model:
    // context window, max output tokens, supports tool-calling, supports
    // structured output, pricing dimension hints, etc.
    ModelInfo(ctx context.Context) (ModelInfo, error)
}

type CompletionRequest struct {
    Model       string
    Messages    []Message
    Tools       []ToolSpec
    MaxTokens   int
    Temperature float32
    Metadata    map[string]string // tenant_id, agent_id, trace_id, etc.
}

type CompletionStream interface {
    Next(ctx context.Context) (CompletionChunk, bool, error)
    Close() error
}

type EmbedRequest struct {
    Model  string
    Inputs []string
}

type EmbedResponse struct {
    Vectors [][]float32
    Usage   Usage
}

type ModelInfo struct {
    Provider           string
    Model              string
    ContextWindow      int
    MaxOutputTokens    int
    SupportsToolCalls  bool
    SupportsStructured bool
}

type Usage struct {
    PromptTokens     int
    CompletionTokens int
    TotalTokens      int
}
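
The Complete comment mentions helpers that collapse a stream into a single response. A minimal sketch of such a helper follows. CompletionChunk is not defined in this ADR, so the single Delta text field is an assumption, as is the reading that Next returns false once the stream is exhausted. The names collectText, sliceStream, and collectDemo are hypothetical.

```go
package main

import (
	"context"
	"strings"
)

// CompletionChunk is assumed to carry one text delta per chunk.
type CompletionChunk struct{ Delta string }

type CompletionStream interface {
	Next(ctx context.Context) (CompletionChunk, bool, error)
	Close() error
}

// collectText drains the stream and concatenates the deltas. The stream is
// closed on every return path.
func collectText(ctx context.Context, s CompletionStream) (string, error) {
	var b strings.Builder
	for {
		chunk, ok, err := s.Next(ctx)
		if err != nil {
			_ = s.Close() // the Next error is the one worth reporting
			return "", err
		}
		if !ok {
			return b.String(), s.Close()
		}
		b.WriteString(chunk.Delta)
	}
}

// sliceStream is a test fake that replays a fixed list of deltas.
type sliceStream struct{ chunks []string }

func (s *sliceStream) Next(context.Context) (CompletionChunk, bool, error) {
	if len(s.chunks) == 0 {
		return CompletionChunk{}, false, nil
	}
	c := CompletionChunk{Delta: s.chunks[0]}
	s.chunks = s.chunks[1:]
	return c, true, nil
}

func (s *sliceStream) Close() error { return nil }

// collectDemo runs collectText over a fake stream built from parts.
func collectDemo(parts ...string) (string, error) {
	return collectText(context.Background(), &sliceStream{chunks: parts})
}
```

Non-streaming callers would use a helper of this kind, while streaming callers would loop over Next themselves.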

Constraints:

Nothing under internal/context/** may read OPENAI_API_KEY, GEMINI_API_KEY, or any other provider-specific secret directly. All provider configuration lives behind the gateway. The gateway resolves credentials from a Config that is wired at process startup.
CompletionRequest.Metadata is the hand-off point for per-call attribution (tenant_id, agent_id, trace_id). The metering and audit subsystems read from here — they do not attempt to tap the provider SDK directly.
On-Premise customers configure a single LLMGateway implementation (usually openai_compatible pointing at their internal endpoint). Cloud uses one or more concrete implementations (OpenAI for primary, Gemini for secondary per current compaction design).
If a request truly requires a capability no openai_compatible endpoint can provide (e.g., Gemini-specific vision-with-grounding), that capability must be declared via ModelInfo.SupportsXxx and the caller must degrade gracefully when false. Hard-coding "this only works on Gemini" inside internal/context/** is forbidden.
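
The last constraint asks callers to check declared capabilities and degrade gracefully. One possible policy is sketched below: drop tool specifications when the model does not support tool calls, and clamp MaxTokens to the declared output limit. The structs are trimmed copies of the ones above; ToolSpec's Name field and the function adaptRequest are assumptions, and other degradation policies (for example, returning an error to the user) are equally valid.

```go
package main

// Trimmed copies of the ADR's structs, limited to the fields used here.
type ModelInfo struct {
	MaxOutputTokens   int
	SupportsToolCalls bool
}

// ToolSpec is not defined in this ADR; the Name field is hypothetical.
type ToolSpec struct{ Name string }

type CompletionRequest struct {
	Model     string
	Tools     []ToolSpec
	MaxTokens int
}

// adaptRequest returns a copy of req that the described model can serve,
// plus a flag telling the caller whether anything was removed or clamped.
// req is passed by value, so the caller's request is left untouched.
func adaptRequest(req CompletionRequest, info ModelInfo) (CompletionRequest, bool) {
	degraded := false
	if len(req.Tools) > 0 && !info.SupportsToolCalls {
		req.Tools = nil // fall back to a plain completion
		degraded = true
	}
	if info.MaxOutputTokens > 0 && req.MaxTokens > info.MaxOutputTokens {
		req.MaxTokens = info.MaxOutputTokens
		degraded = true
	}
	return req, degraded
}
```

The caller branches on what ModelInfo reports and never on the provider's name, which keeps business logic free of "only works on Gemini" assumptions.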

3.4 Licensing module (interface stub now, real implementation Phase 11)

A new package internal/context/licensing (or equivalent) declares the licensing interface. A no-op Cloud implementation that returns "all features enabled, no expiry" is written as part of the Phase 5/6 scaffolding work so that downstream code can reference feature flags via the licensing interface from day one — rather than via environment variables, build tags, or hard-coded booleans that will have to be unwound later.

The real On-Premise implementation — cryptographic validation, grace periods, tier matrix enforcement — is Phase 11.

Interface (normative shape):

package licensing

import (
    "context"
    "time"
)

// LicenseValidator is the sole authority on what features are enabled in a
// running Context Engine process. Business-logic code asks this interface,
// not environment variables, not build tags, not hard-coded constants.
type LicenseValidator interface {
    // Validate checks the current license state. Called at startup and
    // periodically thereafter. Returns an error if the license is invalid,
    // expired beyond grace period, or cannot be verified.
    Validate(ctx context.Context) error

    // FeatureEnabled reports whether a named feature is currently available.
    // Unknown features return false (fail-closed). The feature name space is
    // owned by the Feature Tier Matrix in PRD v1.7.
    FeatureEnabled(ctx context.Context, feature string) bool

    // ExpiresAt returns the hard expiry of the current license. For the
    // Cloud no-op implementation, this returns a sentinel far-future time.
    ExpiresAt(ctx context.Context) time.Time

    // Tier returns the current tier identifier (e.g. "cloud", "onprem-basic",
    // "onprem-enterprise", "onprem-government"). Used for metrics labeling
    // and the Feature Tier Matrix lookup.
    Tier(ctx context.Context) string
}
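
The no-op Cloud implementation described at the top of this section could be as small as the sketch below. The interface is restated so the example compiles on its own. The far-future sentinel value, the type name cloudValidator, and the feature name "bulk_export" are assumptions made for illustration.

```go
package main

import (
	"context"
	"time"
)

type LicenseValidator interface {
	Validate(ctx context.Context) error
	FeatureEnabled(ctx context.Context, feature string) bool
	ExpiresAt(ctx context.Context) time.Time
	Tier(ctx context.Context) string
}

// farFuture is an arbitrary sentinel; the ADR does not fix a value.
var farFuture = time.Date(9999, time.December, 31, 0, 0, 0, 0, time.UTC)

// cloudValidator is the no-op Cloud implementation: always valid, every
// feature enabled, never expiring.
type cloudValidator struct{}

func (cloudValidator) Validate(context.Context) error              { return nil }
func (cloudValidator) FeatureEnabled(context.Context, string) bool { return true }
func (cloudValidator) ExpiresAt(context.Context) time.Time         { return farFuture }
func (cloudValidator) Tier(context.Context) string                 { return "cloud" }

// Compile-time check that the stub satisfies the interface.
var _ LicenseValidator = cloudValidator{}

func newCloudValidator() LicenseValidator { return cloudValidator{} }

// exportAllowed shows the intended call pattern: business logic asks the
// validator instead of reading an environment variable or a build tag.
// "bulk_export" is a hypothetical feature name.
func exportAllowed(v LicenseValidator) bool {
	return v.FeatureEnabled(context.Background(), "bulk_export")
}
```

Call sites written against this stub keep working when the real On-Premise validator is swapped in, because they depend only on the interface.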

Constraints:

Phase 5 code that currently branches on environment variables for feature gating must migrate to FeatureEnabled once the no-op Cloud implementation lands. Net-new code written after ADR acceptance must use FeatureEnabled from the start.
FeatureEnabled is fail-closed: unknown feature names return false. This means adding a new feature requires explicitly registering it in the tier matrix; you cannot "accidentally" ship an enabled feature by forgetting the matrix entry. The rationale is that an On-Premise customer on a lower tier must not get a higher-tier feature through a matrix omission.
The no-op Cloud implementation returns true for any feature name (it is the one exception to fail-closed). This is acceptable because the Cloud deployment is operator-controlled — if a feature shouldn't ship, we don't deploy the code. In On-Premise, we can't control that, so fail-closed is mandatory.
Trust root decision: the On-Premise implementation uses an offline asymmetric signature model. Licenses are JSON documents signed with an UpsQuad-owned Ed25519 private key, and the binary ships the corresponding public key. Validation is purely offline: no check-in to an UpsQuad server is required. This is mandatory to support true air-gap deployments (Government, Legal) where outbound network traffic is not permitted. A grace period and periodic re-validation catch clock skew and tampered system time. Rotating a compromised signing key is handled by shipping a new release artifact, not by a runtime call-home. Flagged for founder review in §6, but the default decision is offline asymmetric.
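
The offline check in the trust-root paragraph above can be sketched with the standard library's crypto/ed25519. The License fields, the detached-signature layout, the error names, and the 30-day grace value used in the self-check are all assumptions; the ADR only fixes "signed JSON, Ed25519, verified offline". The key pair here is generated on the spot purely so the example is self-contained; a real binary would embed the public key.

```go
package main

import (
	"bytes"
	"crypto/ed25519"
	"crypto/rand"
	"encoding/json"
	"errors"
	"time"
)

// License fields are hypothetical.
type License struct {
	Customer  string    `json:"customer"`
	Tier      string    `json:"tier"`
	ExpiresAt time.Time `json:"expires_at"`
}

var (
	ErrBadSignature = errors.New("licensing: signature does not verify")
	ErrExpired      = errors.New("licensing: expired beyond grace period")
)

// verifyLicense checks the detached signature over the raw JSON bytes
// before decoding them, then applies expiry plus grace relative to now.
// No network access is involved at any point.
func verifyLicense(pub ed25519.PublicKey, raw, sig []byte, now time.Time, grace time.Duration) (License, error) {
	if !ed25519.Verify(pub, raw, sig) {
		return License{}, ErrBadSignature
	}
	var lic License
	if err := json.Unmarshal(raw, &lic); err != nil {
		return License{}, err
	}
	if now.After(lic.ExpiresAt.Add(grace)) {
		return License{}, ErrExpired
	}
	return lic, nil
}

// selfCheck signs a license with a throwaway key, verifies it, and then
// confirms that editing the tier in the JSON invalidates the signature.
func selfCheck() (tier string, tamperRejected bool, err error) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return "", false, err
	}
	now := time.Date(2026, time.April, 9, 0, 0, 0, 0, time.UTC)
	grace := 30 * 24 * time.Hour // illustrative value only
	raw, err := json.Marshal(License{Customer: "example", Tier: "onprem-enterprise", ExpiresAt: now.AddDate(1, 0, 0)})
	if err != nil {
		return "", false, err
	}
	sig := ed25519.Sign(priv, raw)
	lic, err := verifyLicense(pub, raw, sig, now, grace)
	if err != nil {
		return "", false, err
	}
	tampered := bytes.Replace(raw, []byte("onprem-enterprise"), []byte("onprem-government"), 1)
	_, terr := verifyLicense(pub, tampered, sig, now, grace)
	return lic.Tier, errors.Is(terr, ErrBadSignature), nil
}
```

Verifying the signature over the exact bytes before parsing avoids any dependence on JSON canonicalisation. Passing now in as a parameter leaves room for the clock-tampering checks mentioned above without changing the verification logic.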

3.5 Helm chart as primary On-Premise distribution

The canonical On-Premise distribution is a Helm chart at deployments/helm/upsquad/ (created in Phase 11; does not exist today). Customers install it into their own CNCF-conformant Kubernetes cluster. OCI image distribution is via ghcr.io (per ADR-0001 §9d) with a per-release immutable tag; air-gap customers pull the images into their own registry via a documented mirroring procedure.

Pulumi stays for Cloud only. The existing empty Pulumi scaffold at infra/pulumi/upsquad-infra/ (created per ADR-0001 §9e, merged in PR #61) remains the home for Cloud-only managed-infra declarations — CloudSQL, GKE Autopilot, Memorystore, Cloud Load Balancers, Secret Manager, etc. — once the ADR-0001 §8 tripwire fires.

Normative constraint on Pulumi: No Pulumi resource should reference a vendor-specific managed service whose underlying functionality is also required in On-Premise, unless an equivalent on-prem path exists. Concretely:

CloudSQL is fine (Postgres is required everywhere, the on-prem path is customer-managed Postgres via Helm values — Pulumi just provisions the Cloud-specific flavor).
Memorystore is fine for the same reason (Redis, customer-managed on-prem).
Cloud Tasks, Pub/Sub, Firestore, and Bigtable are not fine for anything on the critical path, because they have no on-prem equivalent and would force a Cloud-only code branch in internal/context/**. If a Pulumi change lands that adds such a service on a critical path, it is an ADR violation.

3.6 Feature Tier Matrix hand-off

The authoritative catalog of which features exist in which tier lives in PRD v1.7's Feature Tier Matrix section (being drafted in parallel by the Product Manager). The licensing module reads from this matrix at runtime via the feature-name keys in FeatureEnabled. Matrix changes are PRD changes and follow the normal PRD review cycle; they are not code changes.

The architect's constraint is that the matrix structure must be a flat feature_name → {tiers: [...], default_enabled: bool} mapping — no nested logic, no boolean expressions. If a feature's availability depends on some other feature being present, that is modelled by the caller checking both features, not by expressions inside the matrix. This keeps the matrix machine-readable and auditable, which matters when an On-Premise customer asks "what exactly am I paying for at the Enterprise tier."
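
A minimal sketch of the flat matrix shape and a fail-closed lookup follows. The feature names ("audit_log", "audit_export") are hypothetical, and the ADR does not spell out the semantics of default_enabled; this sketch reads it as an on/off switch applied on top of tier membership. The dependency between two features is expressed by the caller, as the paragraph above requires.

```go
package main

// Entry mirrors the flat shape: feature_name -> {tiers, default_enabled}.
type Entry struct {
	Tiers          []string
	DefaultEnabled bool
}

// Matrix is keyed by feature name. It holds no nested logic and no
// boolean expressions.
type Matrix map[string]Entry

// Enabled is fail-closed: an unknown feature, a tier that is not listed,
// or DefaultEnabled == false (an assumed reading) all yield false.
func (m Matrix) Enabled(tier, feature string) bool {
	e, ok := m[feature]
	if !ok || !e.DefaultEnabled {
		return false
	}
	for _, t := range e.Tiers {
		if t == tier {
			return true
		}
	}
	return false
}

// canExportAudit models a dependency between features in the caller:
// export requires both the audit log and the export feature.
func canExportAudit(m Matrix, tier string) bool {
	return m.Enabled(tier, "audit_log") && m.Enabled(tier, "audit_export")
}

// sampleMatrix returns illustrative data, not the PRD v1.7 matrix.
func sampleMatrix() Matrix {
	return Matrix{
		"audit_log":    {Tiers: []string{"cloud", "onprem-enterprise", "onprem-government"}, DefaultEnabled: true},
		"audit_export": {Tiers: []string{"onprem-government"}, DefaultEnabled: true},
	}
}
```

Because each entry is a plain record, the whole matrix can be serialised and diffed as data, which supports the auditability goal stated above.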

4. Consequences

4.1 What we gain

Addressable market expansion. Government, Legal, and regulated Healthcare become reachable. These verticals were previously unreachable under a Cloud-only posture and are disproportionately large per-customer relative to Cloud SMB.
Code-quality forcing function. Vendor abstractions force cleaner seams between business logic and infra. The same seams make testing easier — fakes/in-memory implementations of BlobStorage and LLMGateway drop straight into unit tests without needing the ADR-0001 compose stack for every suite.
Durable CloudSQL deferral. The #59 closure (no permanent CloudSQL dev instance until a tripwire fires) was justified on phase-specific grounds. Vendor-neutrality as a tenet makes the same justification durable: the compose stack continues to cover the regression class we actually care about, and even post-tripwire, the non-managed paths stay exercised because on-prem customers run them.
Licensing module now, not later. Having the LicenseValidator interface in place from Phase 5 means every feature flag added between now and Phase 11 is already in the right shape when the real implementation arrives. No retrofit.

4.2 What we accept

Every future backend change must consider "does this work on-prem too?" This is an ongoing review-time obligation. The principal architect enforces it during code review; the forbidigo lint enforces the hardest cases automatically.
Abstraction indirection cost. Small runtime overhead from interface calls, small authoring overhead in creating two implementations (real + test fake) per abstraction. Judged acceptable.
Phase 11 is real work, not a rename. Moving existing GCS and LLM code behind the new interfaces, writing the S3/MinIO implementation, writing the openai_compatible implementation, writing the real LicenseValidator, and authoring the Helm chart together amount to a substantial phase. This ADR does not minimise that work; it commits to it.
Feature parity pressure. Cloud customers will see features that On-Premise customers don't have (e.g., managed observability, certain integrations). The Feature Tier Matrix is how we make that legible. The tension — "the Cloud version is better than what I'm paying Enterprise for" — is a product and pricing problem to be managed in PRD v1.7, not a code problem.
Observability stack gap. Prometheus + Grafana as currently assumed for Cloud may not be acceptable in all on-prem environments. See §6.
Support surface expansion. Supporting customer-managed Kubernetes clusters across unknown distributions (OpenShift, Rancher, vanilla kubeadm, EKS-A, etc.) is a support-cost line item. Not an architectural problem but flagged so the founders size it.

4.3 Tech debt surfaced now (to be tracked as follow-up issues after this ADR lands)

Each becomes a new GitHub issue labelled tech-debt phase-11 when this ADR is accepted:

GCS hardcoded in memory snapshot path. internal/context/memory/... imports cloud.google.com/go/storage directly. Must move behind §3.2. Estimated: medium — the call sites are localised but the interface design requires care.
OPENAI_API_KEY consumed directly in multiple places. Compaction (PR #35), guardrail secondary-model, and assembly pipeline callers all read the env var directly and call the OpenAI SDK. Must move behind §3.3. Estimated: medium-large — many call sites, but mostly mechanical once the gateway interface is stable.
Gemini SDK consumed directly in compaction. Same remediation as above; goes under internal/cloud/gemini.
Prometheus + Grafana assumed Cloud-only. The observability stack is not obviously on-prem-friendly for air-gap customers who may have their own monitoring. Needs a "bring-your-own-observability" clause in PRD v1.7 — the engine exposes /metrics in Prometheus format, and that's the contract. Whether the customer runs Prometheus themselves is their call. Estimated: small (documentation + Helm chart values).
Audit of remaining vendor imports. A grep -r "cloud.google.com\|aws-sdk\|azure-sdk" internal/ at the start of Phase 11 will identify anything the above misses. Estimated: small.
Secret plumbing. internal/context/** packages that read env vars for provider credentials must migrate to a config-passed-in model so the licensing and cloud boundary owns where secrets come from. Estimated: small.
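
The config-passed-in model from the secret plumbing item can be sketched as follows. Credentials are resolved once at startup through a lookup function with the same signature as os.LookupEnv and handed to the vendor-side gateway; nothing downstream reads the environment. The variable names UPSQUAD_LLM_BASE_URL and UPSQUAD_LLM_API_KEY, the GatewayConfig struct, and the gateway type are hypothetical.

```go
package main

import "errors"

// GatewayConfig is a hypothetical bundle of provider settings.
type GatewayConfig struct {
	BaseURL string
	APIKey  string
}

// loadGatewayConfig resolves provider settings once, at process startup.
// lookup has the same signature as os.LookupEnv, so main can pass that
// function in directly, while tests pass a map-backed fake.
func loadGatewayConfig(lookup func(string) (string, bool)) (GatewayConfig, error) {
	baseURL, ok := lookup("UPSQUAD_LLM_BASE_URL")
	if !ok || baseURL == "" {
		return GatewayConfig{}, errors.New("llm: base URL not configured")
	}
	// An empty key is tolerated for unauthenticated in-cluster endpoints.
	apiKey, _ := lookup("UPSQUAD_LLM_API_KEY")
	return GatewayConfig{BaseURL: baseURL, APIKey: apiKey}, nil
}

// gateway stands in for an implementation under internal/cloud/<vendor>/.
// It is the only component that holds the resolved credentials.
type gateway struct{ cfg GatewayConfig }

func newGateway(cfg GatewayConfig) *gateway { return &gateway{cfg: cfg} }

// endpoint builds the chat-completions URL used by OpenAI-compatible APIs.
func (g *gateway) endpoint() string { return g.cfg.BaseURL + "/v1/chat/completions" }
```

Business-logic packages would receive the gateway through its interface and so have no reason to know which variables, files, or secret stores the credentials came from.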

5. Alternatives rejected

Two codebases (fork for On-Premise). Doubles engineering cost instantly. Divergence is not "possible" — it is certain, because the pressure to ship a Cloud-only feature will always be stronger than the pressure to backport. Every diverged line is a support nightmare and a security-patch nightmare. Rejected.
Cloud-only, no On-Premise. Rejected because the founders have committed to On-Premise as a new business model. Also rejected because it would cede Government, Legal, and regulated Healthcare to competitors with an on-prem story.
On-Premise as a VM appliance (OVA / qcow2 image). Rejected in favor of Helm. Modern enterprise customers in the target verticals operate Kubernetes. VM appliances are painful to update (stateful upgrade dance), painful to scale, and painful to monitor against existing customer tooling. Helm puts us inside the customer's existing operational model.
Usage metering on On-Premise too ("phone-home lite"). Rejected because customers paying a tier-based licensing fee will not tolerate phone-home — especially in Government and Legal, where any outbound traffic is a compliance event. Optional opt-in telemetry is discussed in §6 but is not the default.
Cloud-only with a "managed dedicated" offering (UpsQuad operates a single-tenant Cloud stack for enterprise customers). Rejected because it still puts the customer's data inside UpsQuad's infrastructure perimeter, which is the exact objection the target verticals raise. Managed dedicated addresses cost-of-multi-tenancy concerns, not data-sovereignty concerns. The target verticals care about the latter.

6. Open questions (for founder sign-off)

The architect has a default recommendation for each of these decisions but wants explicit founder input before the ADR flips to Accepted.

Licensing key trust root — offline asymmetric key vs online check-in?
- Recommendation: offline asymmetric (Ed25519 signed license JSON).
- Rationale: Government and Legal verticals will reject online check-in as a matter of policy. Offline asymmetric supports true air-gap, and a compromised signing key is a new release, not a runtime call-home. The operational downside (no remote kill switch) is acceptable because revocation at the contract level is the real control; a kill switch we can never land in an air-gapped customer environment is not a real control anyway.
- Alternative: short-lived tokens with a weekly online check-in, with a 30-day grace period. Rejected as the default because the air-gap case makes it impossible.
Telemetry on On-Premise — zero phone-home, or opt-in only?
- Recommendation: zero phone-home for Government and Legal tiers; opt-in for Healthcare and commercial On-Premise tiers, off by default.
- Rationale: Matches the data-sovereignty posture that makes On-Premise attractive in the first place. Opt-in collection is still useful for non-regulated on-prem customers who want us to help them troubleshoot, but it must not be on by default. A single default-on telemetry field will cost us a Government deal the first time it's audited.
- Alternative: always-opt-in, uniformly. Also defensible — simpler to explain.
Observability stack on On-Premise — Prometheus + Grafana only, or allow customer-supplied?
- Recommendation: customer-supplied is first-class. The engine exposes /metrics in Prometheus format and that is the contract. The Helm chart ships optional Prometheus + Grafana sub-charts for customers who don't already have a monitoring stack, disabled by default (observability.bundled = false).
- Rationale: Most enterprise customers in the target verticals already run Prometheus, Datadog, Dynatrace, Splunk, Elastic, or OpenTelemetry Collector pipelines. Forcing them to run ours on top is a deal-loser. Shipping the bundled stack as an optional convenience for customers without existing tooling is the best of both.
- Alternative: ship bundled observability as the default and require customers to opt out. Rejected — too opinionated for enterprise procurement.

7. Files affected by this ADR

No code changes in this PR. This ADR is a blueprint. The file list below is informational — it tells downstream agents where things will land in Phase 9+ / Phase 11, not what to create today.

internal/context/storage/ — new interface package (Phase 11)
internal/cloud/gcs/ — destination for existing GCS client (Phase 11 move)
internal/context/llm/ — new interface package (Phase 11)
internal/cloud/openai/ — destination for existing OpenAI client (Phase 11 move)
internal/cloud/gemini/ — destination for existing Gemini client (Phase 11 move)
internal/cloud/openai_compatible/ — new BYO-LLM implementation (Phase 11)
internal/context/licensing/ — new interface package (no-op Cloud implementation lands in Phase 9; real On-Premise implementation in Phase 11)
deployments/helm/upsquad/ — new Helm chart (Phase 11)
scripts/lint/ — forbidigo rules for vendor-neutrality enforcement (follow-up to this ADR, can land in Phase 5)
infra/pulumi/upsquad-infra/ — unchanged in scope; clarified here as Cloud-only (no on-prem constructs ever land here)
CLAUDE.md — being updated in a parallel PR to add the vendor-neutrality tenet and the two-mode framing
docs/UpSquad_Complete_PRD.md — being updated to v1.7 in a parallel PR to add the On-Premise business model section and the Feature Tier Matrix

8. What this ADR does NOT do

It does not implement any of the abstractions. No new .go files, no code moves. Phase 11 is the implementation window for §3.2, §3.3, §3.5. The LicenseValidator no-op Cloud stub (§3.4) is the earliest implementation and can land as early as Phase 9.
It does not change the in-flight Phase 4 / Phase 5 scope. #22 versioning, #57 compose stack downstream tasks, #58 re-validation gate — all continue unchanged.
It does not reopen #59 (CloudSQL deferral). It strengthens the closure argument.
It does not weaken any ADR-0001 hard condition. The compose stack, PgBouncer configuration, lint rules, and smoke test contract in ADR-0001 remain fully in force. ADR-0002 adds obligations; it subtracts none.
It does not commit to a specific On-Premise customer, a specific pricing tier, or a specific launch date. Those are product decisions owned by PRD v1.7.
It does not approve any managed-infra spend. ADR-0001 §8 still governs when managed infra is stood up.
