ADR-0012: Security controls posture (M1)

Context

RFC-0006 surfaced six load-bearing security decisions that needed MASTER's call before #11 (Auth.js) and #16 (admin) could be scoped: Auth.js scope; CSP style-src posture given Radix portal inline-style behaviour; rate-limit topology; audit-log retention; vulnerability disclosure surface; and Cloudflare paid-tier acceptance. The threat model and security baseline shipped alongside the RFC and surface ~50 controls, most of which are unambiguous; this ADR captures the six that aren't.

Decision

Q1 — Auth.js: canonical defaults plus targeted hardening. Defaults of Auth.js v5 (checks: ["pkce", "state"], JWE-encrypted JWT in __Secure- HttpOnly cookies, SameSite=Lax, automatic session-ID rotation on sign-in) plus: explicit allowDangerousEmailAccountLinking: false per provider; callbacks.signIn rejecting unverified emails; events.signIn / signOut / linkAccount wired to audit_log; least-privilege OAuth scopes (openid email profile, read:user user:email); custom pages.error to suppress verbose errors; session.maxAge = 7 days; admin authz defence-in-depth (middleware and auth() check inside every admin server component).
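A minimal sketch of two of the Q1 hardening checks, written as pure functions so they can be unit-tested without Auth.js. It assumes the provider profile exposes an email_verified boolean (for providers that do not, the flag would have to be fetched separately, which is not shown). The names ProviderProfile, AdminSession, allowSignIn and isAdmin are hypothetical, as is the role field on the session; in the real config allowSignIn would be called from callbacks.signIn, and isAdmin from both the middleware and every admin server component.

```typescript
// Hypothetical minimal shapes; a real project would use the types exported by Auth.js.
interface ProviderProfile {
  email?: string | null;
  email_verified?: boolean; // assumption: normalised to a boolean before this check
}

interface AdminSession {
  user?: { email?: string | null; role?: string }; // `role` is an assumed custom claim
  expires: string; // ISO timestamp
}

// Q1: session.maxAge = 7 days, expressed in seconds.
const SESSION_MAX_AGE_SECONDS = 7 * 24 * 60 * 60;

// Intended for callbacks.signIn: reject when the email is missing or unverified.
function allowSignIn(profile: ProviderProfile | undefined): boolean {
  if (!profile || !profile.email) return false;
  return profile.email_verified === true;
}

// Intended to run in BOTH the middleware and each admin server component,
// so that a bypass of one layer does not expose admin actions.
function isAdmin(session: AdminSession | null, now: Date = new Date()): boolean {
  if (!session || !session.user) return false;
  if (new Date(session.expires).getTime() <= now.getTime()) return false;
  return session.user.role === "admin";
}
```

Keeping the checks free of framework imports means the same function can back both layers of the defence-in-depth rule, so the two checks cannot drift apart.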

Q2 — CSP style-src: interim 'unsafe-inline', tighten when Radix supports nonces. Land style-src 'self' 'unsafe-inline' with a documented residual risk and a // TODO(security) marker in next.config.ts. script-src stays nonce-locked the whole time ('self' 'strict-dynamic' 'nonce-{REQUEST_NONCE}'). Track the upstream Radix nonce-API issue; flip to nonce-only style-src when a Radix release supports it. Re-evaluate at the quarterly stack review (SB-082).

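Concrete starting CSP. The trailing /* ... */ note on the style-src line is annotation only: CSP has no comment syntax, so strip it before the header ships. Pass the final policy through the Google CSP Evaluator before committing: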

default-src 'self';
script-src 'self' 'strict-dynamic' 'nonce-{REQUEST_NONCE}';
style-src 'self' 'unsafe-inline'; /* Radix portal limitation; tighten later */
img-src 'self' data: <retailer-cdn-allow-list>;
font-src 'self';
connect-src 'self';
frame-ancestors 'none';
form-action 'self';
base-uri 'none';
object-src 'none';
upgrade-insecure-requests;
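One way to assemble that policy per request is sketched below. The function name buildCsp and the imgHosts parameter are hypothetical; imgHosts stands in for the retailer CDN allow-list, which this ADR leaves unspecified. The nonce is assumed to be generated freshly for each request by the caller.

```typescript
// Builds the Q2 starting policy for a single request.
// `nonce` must be unique per request; `imgHosts` is a placeholder for the
// retailer CDN allow-list (not specified in the ADR).
function buildCsp(nonce: string, imgHosts: string[] = []): string {
  // Refuse anything that could break out of the quoted nonce source.
  if (!/^[A-Za-z0-9+\/=_-]+$/.test(nonce)) {
    throw new Error("nonce must be base64 or base64url text");
  }
  const directives: string[] = [
    "default-src 'self'",
    `script-src 'self' 'strict-dynamic' 'nonce-${nonce}'`,
    // TODO(security): Radix portal limitation; move to nonce-only when supported.
    "style-src 'self' 'unsafe-inline'",
    ["img-src 'self' data:", ...imgHosts].join(" "),
    "font-src 'self'",
    "connect-src 'self'",
    "frame-ancestors 'none'",
    "form-action 'self'",
    "base-uri 'none'",
    "object-src 'none'",
    "upgrade-insecure-requests",
  ];
  return directives.join("; ");
}
```

Keeping the Radix note as a code comment rather than inside the policy string leaves the emitted header free of the annotation.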

Q3 — Rate-limit topology: Cloudflare WAF only at M1; app-side bucket the moment #17 attaches CPC payouts. One free-tier Rate Limiting Rule scoped to /api/* POST + /api/go/* + /api/auth/* at 60 req/min/IP, action = managed challenge, plus zone-wide Bot Fight Mode. Sufficient when the worst-case targeted-listing inflation is metric corruption. The threat upgrades to direct theft the moment third-party CPC payouts attach to clicks; at that point #17 must ship a Postgres-backed per-(ipHash, listingId) token bucket as a non-negotiable acceptance criterion. Mandatory tie: any #17 PR that ships CPC payout without the bucket is a regression of this ADR.
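The per-(ipHash, listingId) token bucket that #17 must ship can be sketched as follows. This is an in-memory illustration of the algorithm only: the ADR calls for a Postgres-backed store, and the class name ClickBucket, the capacity, and the refill rate are all assumptions rather than decided values.

```typescript
interface Bucket {
  tokens: number;
  updatedMs: number;
}

// In-memory stand-in for the Postgres-backed bucket; one bucket per (ipHash, listingId).
class ClickBucket {
  private readonly buckets = new Map<string, Bucket>();
  private readonly capacity: number;
  private readonly refillPerSec: number;

  constructor(capacity: number, refillPerSec: number) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
  }

  // Returns true and consumes one token if the click is allowed.
  take(ipHash: string, listingId: string, nowMs: number): boolean {
    const key = `${ipHash}:${listingId}`;
    const prev = this.buckets.get(key) ?? { tokens: this.capacity, updatedMs: nowMs };
    // Refill in proportion to elapsed time, never above capacity.
    const elapsedSec = Math.max(0, nowMs - prev.updatedMs) / 1000;
    const tokens = Math.min(this.capacity, prev.tokens + elapsedSec * this.refillPerSec);
    const allowed = tokens >= 1;
    this.buckets.set(key, { tokens: allowed ? tokens - 1 : tokens, updatedMs: nowMs });
    return allowed;
  }
}
```

Keying on the listing as well as the IP hash is what separates this from the Cloudflare rule: one client can still browse many listings, but cannot inflate clicks on a single one.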

Q4 — Audit-log retention: indefinite in Postgres at M1. No archival policy. Table is genuinely tiny at hobby scale (10s of rows/day). Transition to 1 year hot in Postgres + cold archive to R2 indefinitely when (a) the table crosses ~1 GB, or (b) #40 lands a retention obligation that constrains audit_log. Don't pre-engineer archival before there's data to archive.
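To put the ~1 GB trigger in perspective, the arithmetic below estimates how long the table takes to reach it at a constant write rate. The 50 rows/day and 1,000 bytes/row figures are illustrative assumptions (the ADR only says "10s of rows/day" and gives no row size), and daysUntilThreshold is a hypothetical helper.

```typescript
// Days until a table reaches `thresholdBytes` at a constant write rate.
function daysUntilThreshold(thresholdBytes: number, rowsPerDay: number, bytesPerRow: number): number {
  if (rowsPerDay <= 0 || bytesPerRow <= 0) return Infinity;
  return thresholdBytes / (rowsPerDay * bytesPerRow);
}

const ONE_GB = 1e9;
// Assumed: 50 rows/day at roughly 1,000 bytes per row including index overhead.
const daysToOneGb = daysUntilThreshold(ONE_GB, 50, 1000); // 20000 days
const yearsToOneGb = daysToOneGb / 365;                   // roughly 55 years
```

Under these assumptions the size trigger is decades away, so condition (b), a retention obligation from #40, is the more likely reason to revisit.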

Q5 — Vulnerability disclosure: security.txt + GPG key + minimal credits page. Publish /.well-known/security.txt listing the contact email, the GPG key fingerprint, and an Expires date 90 days out (RFC 9116 makes the Expires field mandatory), which means the file must be re-issued before each expiry. Upload the public key to a keyserver. Maintain a one-line public credits page on the docs site. Cost ≈ $0; responsible-disclosure surface present from launch. Bug-bounty platform sign-up (Bugcrowd / HackerOne) is explicitly out of scope at hobby budget; revisit when affiliate revenue is real and worth defending. Tracked by SB-081.
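A sketch of rendering the file with RFC 9116 field names follows. The helper renderSecurityTxt and its input shape are hypothetical, the fingerprint check assumes a version 4 OpenPGP key (40 hex digits), and the credits-page URL is whatever the docs site ends up using.

```typescript
interface SecurityTxtInput {
  contactEmail: string;
  gpgFingerprint: string;     // assumed v4 key: 40 hex digits, spaces allowed
  acknowledgmentsUrl: string; // the public credits page
  now: Date;
  expiryDays?: number;        // defaults to the 90 days chosen in Q5
}

// Renders the body of /.well-known/security.txt.
function renderSecurityTxt(input: SecurityTxtInput): string {
  const fpr = input.gpgFingerprint.replace(/\s+/g, "").toLowerCase();
  if (!/^[0-9a-f]{40}$/.test(fpr)) {
    throw new Error("expected a 40-hex-digit fingerprint");
  }
  const days = input.expiryDays ?? 90;
  const expires = new Date(input.now.getTime() + days * 24 * 60 * 60 * 1000);
  return [
    `Contact: mailto:${input.contactEmail}`,
    `Expires: ${expires.toISOString()}`,
    `Encryption: openpgp4fpr:${fpr}`,
    `Acknowledgments: ${input.acknowledgmentsUrl}`,
  ].join("\n") + "\n";
}
```

Generating the file from a function rather than hand-editing it makes the 90-day re-issue a matter of re-running a build step.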

Q6 — Cloudflare paid security features: Free tier at M1. Bot Fight Mode + the single free Rate Limiting Rule + Turnstile cover M1's surface. Trigger to upgrade to Pro ($20/mo): traffic justifies a second rate-limit rule (separate /api/auth/* from /api/go/*), OR Super Bot Fight Mode's better headless / "likely automated" classification becomes necessary. Stop and ask is the rule when a security control discovered during implementation requires Pro or higher; this ADR does not pre-authorise the upgrade. Bot Management (Enterprise) explicitly out of scope.

Meta — single combined ADR. All six decisions land in this one ADR rather than six separate ones, since they are closely related and reading them together is materially easier than navigating six ADRs.

Re-evaluation cadence. Q1 and Q2 are independent of legal/privacy; revisit at the quarterly stack review. Q3 reopens at #17 implementation. Q4 and Q5 reopen if #40 lands a data-breach-notification obligation that constrains them. Q6 reopens on traffic.

Consequences

Positive

  • Q1: closes the email-collision merge attack (Auth.js Surface C threat) and the "OAuth callback didn't sign in but we don't know who tried" repudiation gap. Cost: ~1 day of config work over canonical defaults — small.
  • Q2: ships shadcn/ui + Radix immediately without forking the library or losing accessibility primitives. script-src stays nonce-locked, which is where the bulk of the XSS containment value lives.
  • Q3: M1 surface stays lean. No per-click DB write. The implementation hook to #17 is explicit and unambiguous, so the upgrade can't be forgotten.
  • Q4: zero archival code at M1. The decision can be reversed whenever the table actually grows.
  • Q5: responsible-disclosure surface from day one. A serious researcher has a published path that doesn't require Twitter callouts.
  • Q6: $20/mo of Cloudflare Pro stays unspent until traffic justifies it. Trigger is documented.

Negative

  • Q2 residual XSS-via-style risk: an attacker who lands an injection can change colours and layout or expose hidden elements via inline styles. The risk is bounded (HttpOnly cookies cannot be exfiltrated via styles) but not zero.
  • Q3 metric-corruption risk at M1: targeted-listing click inflation is possible on free-tier CF rate limiting, since the rule is a coarse per-IP bucket with no per-listing dimension. Acceptable when clicks aren't paid for; not acceptable once they are.
  • Q4 indefinite growth of audit_log: storage cost will eventually grow. The cost is bounded by hobby-scale write volume (10s of rows/day) and the row size; even at M2+ scale this stays small for a long time.
  • Q5 mailbox burden: security@961tech.com (or MASTER's existing email) will receive both legitimate disclosures and noise/spam. Manageable; not free.
  • Q6 residual headless-bot click inflation: Free-tier Bot Fight Mode misses some "likely automated" traffic that Super Bot Fight Mode catches. Acceptable at current traffic.

Neutral

  • The threat model surfaces ~50 controls. This ADR locks 6. The other ~44 are either already in place, trivial one-file changes, or scheduled into the M1/M2 ticket flow per Reference → Security baseline. Status flip from proposed to decided on the affected baseline rows happens in a follow-up commit.
  • This ADR does not author a per-CVE response runbook or DR/RTO/RPO posture; those belong to #43 observability and RFC-0001 hosting respectively.

Alternatives considered

Q1 Alternative: canonical defaults only

Rejected. The targeted hardening costs ~1 day; the residual risks it closes (email-collision merge, OAuth callback repudiation gap) are real attack vectors against an aggregator that handles user accounts and admin actions. Canonical defaults alone leave #11 Auth.js under-specified.

Q1 Alternative: maximal (DB-strategy sessions, short JWT + refresh, mandatory provider 2FA)

Rejected. DB-strategy sessions add a DB roundtrip per request that materially worsens the Cloudflare Workers economics established by ADR-0006 hosting. Mandatory provider 2FA is unenforceable from our side. Net cost too high for the marginal gain at hobby scale.

Q2 Alternative: nonce-only style-src with Radix monkey-patching

Rejected. Radix does not currently expose a nonce prop on its portals. Patching the library is brittle across Radix upgrades and creates ongoing maintenance debt that a solo project cannot afford.

Q2 Alternative: skip Radix Popover/Tooltip/Dialog, build CSS-only

Rejected. Loses the shadcn/ui value-add (RFC-0004 chose shadcn precisely for these primitives) and forfeits accessibility wins those primitives provide.

Q3 Alternative: CF + app-side bucket from M1

Rejected at M1. Without CPC payouts attached to clicks, "metric corruption" is the worst case: an annoyance, not theft. The DB-write cost and counter-storage decision aren't justified before #17 attaches real money. Coupling the upgrade to #17 keeps M1 lean without leaving the gap unowned.

Q3 Alternative: CF + Workers Durable Objects bucket

Rejected. Durable Objects is paid past free tier and adds CF-binding complexity. The Postgres-backed bucket (Q3 Option B) is cheaper and architecturally simpler when the upgrade trigger fires.

Q4 Alternative: 90 days hot, no archive

Rejected. Forensic capacity capped at 90 days. The aggressive deletion has zero practical value at hobby scale where the table is tiny anyway.

Q4 Alternative: tiered retention (auth events 1 year; admin mutations forever; queries 90 days)

Rejected at M1. Adds policy code and per-event-class complexity for no current benefit. Revisit if #40 lands an obligation that demands tiering.

Q5 Alternative: nothing

Rejected. Bad PR exposure on the day a real CVE lands. Reports would come in via random channels (GitHub issue, Twitter DM) with no SLA and no triage path. security.txt + GPG + credits page is one file plus one email plus one keyserver upload; skipping it saves too little effort to justify the exposure.

Q5 Alternative: Bugcrowd / HackerOne

Rejected at hobby budget. HackerOne Lite starts at $99/mo; Bugcrowd's hobby tier varies. Both attract low-quality reports unless scope is set very tightly. Defer until affiliate revenue is real.

Q6 Alternative: pre-authorise Pro tier

Rejected. The triggers for the upgrade are documented in the decision (second rate-limit rule needed; Super Bot Fight Mode classification needed). Pre-authorising the spend before either trigger fires is unprincipled. Stop and ask is the explicit rule for any ad-hoc upgrade requirement discovered during implementation.

Meta Alternative: one ADR per question (six ADRs)

Rejected. The six questions interlock: Q1 sets the auth surface that Q3 protects via rate limits; Q4 retention is constrained by Q5 disclosure expectations and Q1's audit_log writes; Q6 is implicit in Q3 (rate-limit rule count and Bot Fight Mode level). Reading them together is materially easier than chasing six cross-linked ADRs.

References