Issue #43: KPIs, observability, performance budget

  • Issue: #43
  • Started: 2026-04-28
  • Completed: 2026-04-28

Goal

Three new docs that lock the M1 measurement story:

  1. A KPI reference at docs/reference/kpis.md ordering every metric by tier (north-star → secondary → health), each with its source, target, and cadence.
  2. A decision-time RFC at docs/rfc/0007-observability-stack.md picking the log shipper, error reporter, metrics layer, alerting destination, and dashboard surface — under a $20/mo M1 ceiling, constrained by RFC-0001 Cloudflare hosting.
  3. A performance budget reference at docs/reference/performance-budget.md with per-surface (frontend, backend, scraper, cost) numeric budgets, each anchored in a persona, retailer, or RFC fact.

After this lands, #28 page design, #19 hosting, #18 scraper queue, #14 alerts, #29 DB, and #44 security all have a measurement contract to design against.

Out of scope

  • Implementation. No code, no package.json changes, no wrangler config. The RFC describes the chosen stack; wiring it up is its own ticket after #19 lands.
  • Telegram bridge. MASTER's existing telegram-notify MCP is intentionally not wired in by default — alerts default to Cloudflare email Notifications. Telegram is an opt-in MASTER decision, surfaced as an open question.
  • Long-form analytics product. No funnels, retention cohorts, or segmentation tooling at M1. Cloudflare Web Analytics + Workers Analytics Engine cover the north-star metric without a dedicated product-analytics SaaS.
  • APM / distributed tracing. Deferred to M2+. Inside a Cloudflare Worker, performance.now() only advances after I/O, so CPU-bound spans report zero duration and server-side traces stay low-signal until tracing matures.

Approach

This is a decision-time RFC, not an exploratory one. The constraints (Cloudflare hosting per RFC-0001, a $5–20/mo budget, a mobile-Lebanese audience, a solo evening project, and a "ship M1 now" mindset) narrow the option space sharply. Five vendor options were researched in parallel via superpowers:dispatching-parallel-agents; the verdicts converged on a Cloudflare-native primary stack with Sentry Free and Axiom Free as zero-cost augments. KPI tiering follows the standard north-star → secondary → health pyramid, anchored in the personas §5.6 wedge pains and §6.3 LTV signals. The performance budget anchors every number in a persona, a retailer fact, or an RFC constraint rather than adopting Lighthouse defaults uncritically.

Steps

  • Step 1: Read context.

  • Issue #43 body (read from local docs because gh is unauthenticated on this box), personas.md §5.2 + §5.6 + §7, competitive-landscape.md §3.7 (craft baseline) + §4.4 (Lebanese retailers), tech-stack.md (no observability today), retailers.md (scraper context), RFC-0001 (Cloudflare Pages + Workers), RFC-0002 (pg-boss), architecture/deployment.md, and architecture/ingest-pipeline.md.
  • Verification: scope crystallized into the three deliverables above.

  • Step 2: Dispatch parallel vendor research.

  • Five subagents in one message, each researching one option with current 2026 pricing pages:
    1. Sentry — error tracking, Cloudflare Workers SDK, free-tier specifics
    2. Highlight.io vs PostHog — all-in-one observability + product analytics
    3. Plausible vs Cloudflare Web Analytics — pageviews + RUM Web Vitals
    4. Cloudflare-native primitives — Workers Observability, Logs, Analytics Engine, Logpush, Web Analytics, Notifications
    5. Log shipping destinations — Axiom, Better Stack, Datadog, R2 + grep, Baselime/Cloudflare-acquired, Grafana Loki
  • Verification: all five returned pricing verified against the live pricing pages, with stale claims flagged.

  • Step 3: Pick the stack and write the RFC.

  • Convergence: Cloudflare-native primary (Web Analytics + Workers Observability + Analytics Engine) + Sentry Free for error grouping + Axiom Free for 30-day log retention. Total incremental observability spend = $0 above the $5/mo Workers Paid baseline already in RFC-0001.
  • Verification: docs/rfc/0007-observability-stack.md drafted with full alternatives table, trade-offs, and open questions.

  • Step 4: Write the KPI reference.

  • North-star: Weekly Qualified Outbound Clicks (WQOC) — distinct visitors per ISO week who clicked through to a retailer via /api/go/.... Defensible because the click is the moment the aggregator delivers value; everything before is intent.
  • Secondary tracks (Builder, Casual, Aggregator-core, Health) per persona-track and ticket scope.
  • Verification: docs/reference/kpis.md reads top-to-bottom, and every metric has a source, target, and cadence.
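
To make the WQOC definition concrete, here is a minimal sketch of the count: distinct visitors per ISO week with at least one /api/go/ clickout. The event tuple shape (visitor_id, timestamp, path) and the function name are assumptions for illustration; the real source depends on the metrics layer RFC-0007 picks.

```python
from datetime import datetime, timezone


def weekly_qualified_outbound_clicks(events):
    """Count distinct visitors with at least one /api/go/ clickout, per ISO week.

    `events` is an iterable of (visitor_id, timestamp, path) tuples with
    timezone-aware timestamps. This shape is hypothetical, not the project's schema.
    """
    visitors_by_week = {}
    for visitor_id, ts, path in events:
        if not path.startswith("/api/go/"):
            continue  # only retailer clickouts qualify
        iso = ts.astimezone(timezone.utc).isocalendar()  # (year, week, weekday)
        week_key = f"{iso[0]}-W{iso[1]:02d}"
        # A set deduplicates repeat clicks by the same visitor within a week.
        visitors_by_week.setdefault(week_key, set()).add(visitor_id)
    return {week: len(ids) for week, ids in sorted(visitors_by_week.items())}
```

A visitor who clicks out twice in the same ISO week counts once, and a page view without a clickout does not count at all, which matches the "distinct visitors ... who clicked through" wording above.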

  • Step 5: Write the performance budget.

  • Lebanese mobile reality (personas §5.2): mobile-first non-negotiable, 4G + occasional generator outages, image weight matters acutely.
  • Per-surface targets:
    • Frontend Core Web Vitals: LCP < 2.5s p75, INP < 200ms p75, CLS < 0.1, TTFB < 600ms.
    • Frontend bundle: initial JS < 100KB gzipped, hero image < 200KB.
    • Backend: product-list < 100ms p95, search < 300ms p95, clickout 302 < 50ms p95.
    • Scraper: full-roster < 30min.
    • Cost: < $20/mo all-in for M1.
  • Verification: every number in docs/reference/performance-budget.md has a rationale citation.
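
The percentile targets above can be checked mechanically once samples exist. The following is a rough sketch, assuming latency samples in milliseconds and a nearest-rank percentile; the metric keys and function names are illustrative, and only the budgets that name a percentile in this plan are included.

```python
# (percentile, budget in ms) per metric, copied from the per-surface targets.
# The metric keys are illustrative names, not identifiers from the project.
BUDGETS_MS = {
    "lcp": (75, 2500),
    "inp": (75, 200),
    "product_list": (95, 100),
    "search": (95, 300),
    "clickout_302": (95, 50),
}


def percentile(samples, pct):
    """Nearest-rank percentile for a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceil(pct/100 * n)
    return ordered[rank - 1]


def over_budget(samples_by_metric, budgets=BUDGETS_MS):
    """Return {metric: observed} for metrics whose percentile is not under budget."""
    failures = {}
    for metric, samples in samples_by_metric.items():
        pct, limit_ms = budgets[metric]
        observed = percentile(samples, pct)
        if observed >= limit_ms:  # budgets are strict "<" targets
            failures[metric] = observed
    return failures
```

An empty result means every sampled metric is inside its budget. Nearest-rank is one of several percentile conventions; a RUM vendor may interpolate differently, so small differences near a threshold are expected.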

  • Step 6: Cross-link.

  • Update docs/reference/index.md (add kpis + performance-budget rows).
  • Update docs/rfc/index.md (add RFC-0007 row).
  • Update docs/plans/index.md (add this plan row).
  • Verification: all three index updates land in the same commit.
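
A small sketch of how the cross-link step could be spot-checked before committing: given the text of each index file, report which expected targets are not mentioned. The helper name and the substring match are assumptions; the plan's own filename is not stated here, so docs/plans/index.md is left out of the expected set.

```python
# Index file -> link targets it should mention, taken from Step 6 and the Goal list.
EXPECTED_ROWS = {
    "docs/reference/index.md": ["kpis.md", "performance-budget.md"],
    "docs/rfc/index.md": ["0007-observability-stack.md"],
}


def missing_index_rows(index_texts, expected=EXPECTED_ROWS):
    """Return {index_path: [targets not mentioned]} given {index_path: file text}."""
    missing = {}
    for index_path, targets in expected.items():
        text = index_texts.get(index_path, "")  # an unread index counts as empty
        absent = [target for target in targets if target not in text]
        if absent:
            missing[index_path] = absent
    return missing
```

The function takes file contents rather than paths so it can be exercised without touching the working tree; reading the files is left to the caller.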

  • Step 7: Build docs strict + commit.

  • mkdocs build --strict to catch broken refs.
  • One atomic commit with the message docs(foundation): kpis + observability stack + perf budget (#43). No push, no PR.

Risks

| Risk | Likelihood | Mitigation |
| --- | --- | --- |
| RFC numbering collision (parallel worktrees may also be authoring 0005/0006) | medium | Used 0007 per ticket prompt instruction. If a collision surfaces, renumber in a follow-up commit before merge; the slug is unique. |
| Vendor pricing drifts before MASTER reads the RFC | low | Every pricing claim flagged with verification date (2026-04-28) and source URL. Re-spot-check before signoff if delayed >30 days. |
| Cloudflare Analytics Engine billing flips on (currently "you will not be billed" disclaimer in docs) | low | Cost trajectory analyzed under the published $0.25/M-write rate; even at full billing, M1 stays under $1/mo extra. |
| Sentry free-tier email-only alerts feel anaemic | medium | Wired only as a backstop; primary alerting is Cloudflare email Notifications + (optional) Tail Worker → Discord/Slack webhook. Telegram bridge surfaced as an open question for MASTER. |
| Performance budget numbers feel pulled from the air | low | Each anchored to personas §5.2 / a retailer fact / an RFC constraint / a verified Cloudflare PoP datum. Rationale column makes the source explicit. |
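
The Analytics Engine mitigation rests on simple arithmetic, sketched below under the $0.25 per million writes rate cited in the table. The 50,000 writes/day volume is a made-up illustration, not a measured or projected figure, and the sketch ignores any free allotment.

```python
RATE_USD_PER_MILLION_WRITES = 0.25  # published rate cited in the Risks table


def monthly_write_cost(writes_per_day, days=30):
    """Cost in USD of Analytics Engine writes over one month at the cited rate."""
    return writes_per_day * days / 1_000_000 * RATE_USD_PER_MILLION_WRITES


def max_monthly_writes(budget_usd):
    """Number of writes per month that a given USD budget covers at the cited rate."""
    return budget_usd / RATE_USD_PER_MILLION_WRITES * 1_000_000


# Hypothetical volume: 50,000 data points per day.
example_cost = monthly_write_cost(50_000)   # 1.5M writes/month
writes_under_one_dollar = max_monthly_writes(1.0)
```

At the hypothetical volume the monthly cost comes to $0.375, and the "under $1/mo extra" claim holds for anything up to 4 million writes a month at this rate.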

Tests

  • mkdocs build --strict passes (catches broken cross-refs, missing files).
  • All four new files (the three docs above plus this plan) render with no link warnings.

No code, no vitest. Foundation ticket.

Doc updates

Per Contributing → what needs updating:

  • Reference: docs/reference/kpis.md (new), docs/reference/performance-budget.md (new), docs/reference/index.md (rows added)
  • RFC: docs/rfc/0007-observability-stack.md (new), docs/rfc/index.md (row added)
  • Plan: this file + docs/plans/index.md (row added)
  • Architecture: architecture/deployment.md § Observability is currently a stub — leave to the implementation ticket that follows RFC-0007 acceptance, not this foundation work
  • Glossary: no new terms emerged; defer
  • Issue body: not updating from this branch — closes via PR

Rollback

Pure docs. git revert <sha> and re-run mkdocs build --strict. No infra to unwind.