ADR-0010: Gosec runs informational-only until G115 baseline is triaged
- Status: Proposed (awaiting founder sign-off per CLAUDE.md escalation list: security-adjacent policy change)
- Date: 2026-04-20
- Decision owners: DevOps Engineer (proposer), Principal Architect (review), Founder (sign-off)
- Related: Issue #728 §P3, Workflow `.github/workflows/security-scan.yml`, ADR-0011 (future: G115 per-site remediation)
Context
The Security Scan GitHub Actions workflow had a 96% failure rate over 438 runs / 14 days (2026-04-06 → 2026-04-20). Root-cause investigation (issue #728 P3) sampled 14 recent failing runs and produced the following per-job failure rate:
| Job (check-run name) | Required by branch protection | Fails in sample |
|---|---|---|
| Go Vulnerability Check (govulncheck) | Yes | 0 / 12 |
| Container Image Scan (trivy) | Yes | 0 / 12 |
| Data-Class Registry Coverage (class-coverage) | Yes | 0 / 12 |
| Go Security Analysis (gosec) | No | 12 / 12 |
(The remaining 2 of 14 runs were from PR #732's path-filter feature branch and were excluded as branch-local noise.)
Gosec is the sole driver of the 96% figure. It is not in the required-checks list, so its failures never blocked a merge — the signal has been red-rotting without anyone acting on it. 54 findings were reported, uniform across all sampled runs:
- 51 × G115 (CWE-190, integer overflow `int` → `int32`): all in gRPC response marshaling where the source `int` is already bounded by page size, CSV row count, or scope registry size. Example sites: `internal/bulkimport/service.go` (8 sites), `internal/audit/grpcserver/server.go` (3 sites), `internal/compliance/grpcserver.go` (4 sites), plus 36 more in proto response builders. One existing `// #nosec G115 -- bounded by page_size and wall-clock` annotation at `internal/audit/grpcserver/server.go:200` confirms the team has already reviewed this pattern once and accepted it.
- 2 × G118 (CWE-400, goroutine uses `context.Background`) at `internal/gateway/jti.go:183` and `cmd/compliance-engine/main.go:355`. Both sites have inline comments documenting intentional context-detachment (avoid caller-timeout aborting a long-running refresh / lease-release).
- 1 × G703 (CWE-22, path traversal via taint) at `cmd/reconcile-report/main.go:145`. Internal ops CLI tool; `outputDir` is an operator-supplied flag and `reportDate.Format("2006-01-02")` is a controlled date string. Not a network-exposed surface.
None of the 54 findings is an actionable remote-exploit vector. They are a mix of proto-marshaling false positives (G115), documented intentional patterns (G118), and one low-risk internal-CLI site (G703).
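The G115 pattern described above can be sketched as follows. This is a minimal illustration, not code from the repository: `maxPageSize`, `rowCount`, and `toInt32Checked` are hypothetical names. It shows two ways a narrowing `int` → `int32` conversion can be made safe: capping the value before the cast and annotating the cast, or converting through an explicit range check.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// maxPageSize is a hypothetical upper bound enforced on every list response.
const maxPageSize = 1000

// errOutOfRange is returned when a value cannot be represented as an int32.
var errOutOfRange = errors.New("value does not fit in int32")

// toInt32Checked converts with an explicit range check, so an out-of-range
// value becomes an error instead of silently wrapping.
func toInt32Checked(n int) (int32, error) {
	if n < math.MinInt32 || n > math.MaxInt32 {
		return 0, errOutOfRange
	}
	return int32(n), nil
}

// rowCount caps the slice at maxPageSize before converting its length,
// so the narrowing cast cannot overflow. The annotation records why.
func rowCount(items []string) int32 {
	if len(items) > maxPageSize {
		items = items[:maxPageSize]
	}
	return int32(len(items)) // #nosec G115 -- bounded by maxPageSize
}

func main() {
	fmt.Println(rowCount(make([]string, 5000))) // capped at maxPageSize
	v, err := toInt32Checked(42)
	fmt.Println(v, err)
}
```

Which of the two forms suits a given site is a per-site judgment; the checked helper trades a little verbosity for not needing an annotation at all.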
Decision
Set continue-on-error: true on the gosec job in .github/workflows/security-scan.yml, mirroring the existing govulncheck pattern (in the same workflow, adopted for the analogous stdlib-CVE-noise problem).
This keeps gosec in the pipeline: the check-run continues to be posted on every PR, the logs continue to be captured, and new findings introduced in future commits remain visible to reviewers. It merely stops gosec from failing the overall workflow until the G115 baseline is addressed.
We do not remove gosec, do not add it to branch protection, and do not silence any specific rule via allowlist. Those are separate future decisions to be made after the G115 remediation ADR is written (ADR-0011, follow-up).
Consequences
Positive:
- Security Scan's measured success rate jumps from ~4% to ≥80% once the change lands. AC-P1-c ("post-fix success rate > 80% over 7-day window") becomes measurable with confidence: the 3 non-gosec jobs have a 100% success rate in the sample, so the workflow pass-rate will track the `changes` job's pass-rate (near-100%) once gosec stops being a gate.
- Eliminates false-red noise in PR check-run UIs. Reviewers stop habituating to a red "Security Scan" badge on every PR.
- Zero loss of signal — the gosec log is still attached to every run. A reviewer or scheduled scan can still surface new findings.
Negative:
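- Any newly introduced gosec finding (potentially including a real vulnerability) no longer fails the workflow. Because gosec was never a required check, the red workflow status was the only automated signal it produced. Mitigations: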
  - The ADR-0011 follow-up will categorize every existing finding and either fix it in code or annotate it with a justified `// #nosec <rule> -- <reason>`. After that, re-enabling blocking mode becomes viable.
  - If a critical vulnerability surfaces in the interim, the code-review process (required architect approval on every PR) remains the primary line of defence.
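To make "justified annotation" concrete, the sketch below checks that a comment follows the `// #nosec <rule> -- <reason>` shape with a non-empty reason. It is an illustration only: `nosecPattern` and `parseNosec` are hypothetical helpers, not part of gosec or of this repository, and the pattern assumes a single rule ID per annotation.

```go
package main

import (
	"fmt"
	"regexp"
)

// nosecPattern matches "// #nosec <rule> -- <reason>" where the rule is a
// single gosec-style ID (G followed by digits) and the reason is non-empty.
var nosecPattern = regexp.MustCompile(`^//\s*#nosec\s+(G\d+)\s+--\s+(\S.*)$`)

// parseNosec returns the rule ID and justification from a comment, or
// ok=false when the comment is not a justified #nosec annotation.
func parseNosec(comment string) (rule, reason string, ok bool) {
	m := nosecPattern.FindStringSubmatch(comment)
	if m == nil {
		return "", "", false
	}
	return m[1], m[2], true
}

func main() {
	comments := []string{
		"// #nosec G115 -- bounded by page_size and wall-clock",
		"// #nosec G115",
		"// #nosec",
	}
	for _, c := range comments {
		rule, reason, ok := parseNosec(c)
		fmt.Printf("%q -> ok=%v rule=%q reason=%q\n", c, ok, rule, reason)
	}
}
```

Only the first comment in the example carries both a rule and a reason; the other two would be rejected under this convention.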
Neutral:
- The 54 findings documented above constitute a baseline. ADR-0011 should treat any finding not on the documented baseline as a new issue that requires remediation. A simple mechanism: stored SARIF from the current baseline, compared run-over-run via `gosec-baseline` or equivalent (follow-up work, out of scope for this ADR).
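A run-over-run comparison of that kind could look roughly like the sketch below. It is an assumption-laden illustration, not the follow-up design: `Finding` and `newFindings` are hypothetical, and the code compares already-parsed findings without reading SARIF or any real scanner output. Findings are counted per rule and file, not per line, so that unrelated edits shifting line numbers do not register as new issues.

```go
package main

import (
	"fmt"
	"sort"
)

// Finding is a hypothetical, minimal view of one scanner result.
type Finding struct {
	Rule string // e.g. "G115"
	File string // path relative to the repository root
}

// newFindings reports every rule/file pair whose count in current exceeds
// its count in baseline, formatted as "<rule> <file> (+<extra>)".
func newFindings(baseline, current []Finding) []string {
	allowed := make(map[Finding]int)
	for _, f := range baseline {
		allowed[f]++
	}
	seen := make(map[Finding]int)
	for _, f := range current {
		seen[f]++
	}
	var out []string
	for f, n := range seen {
		if n > allowed[f] {
			out = append(out, fmt.Sprintf("%s %s (+%d)", f.Rule, f.File, n-allowed[f]))
		}
	}
	sort.Strings(out) // map iteration order is random; sort for stable output
	return out
}

func main() {
	baseline := []Finding{
		{"G115", "internal/audit/grpcserver/server.go"},
		{"G703", "cmd/reconcile-report/main.go"},
	}
	// The current run repeats the baseline and adds one more G115 in the same file.
	current := append([]Finding{{"G115", "internal/audit/grpcserver/server.go"}}, baseline...)
	for _, line := range newFindings(baseline, current) {
		fmt.Println(line)
	}
}
```

One known limitation of counting per file: a fix and a new finding of the same rule in the same file cancel out, so a stricter comparison would need a more stable per-finding fingerprint.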
Rollback
Revert the two-line workflow edit (continue-on-error: true + the explanatory comment block) via a revert PR. No data migration, no configuration drift — gosec returns to blocking mode immediately on merge.
Out of scope
- Per-site `// #nosec` annotations for the 3 non-G115 findings. Deferred to ADR-0011 / a small follow-up PR once this ADR is accepted.
- Platform-wide gosec rule exclusions (e.g., `-exclude=G115`). Considered but rejected as a first step: a config-level exclusion hides future real G115 findings along with the baseline; the `continue-on-error` approach keeps all findings visible.
- Moving gosec into branch-protection required-checks. Cannot do until the baseline is remediated; deferred to ADR-0011.
- Modifying branch protection, required checks list, or any other gate in this workflow.