QuizBase · Docs

Performance#

QuizBase is built and load-tested as a developer-grade API. This page lists the measured numbers, the methodology behind them, and how we keep them honest.

All figures below come from k6 sustained load tests against production (quizbase.runriva.com) on 2026-05-07. We re-run baseline + spike scenarios quarterly. Latest raw output: k6-baseline-2026-05-07.json.

Service Level Objectives#

| Metric | Target | Measured (2026-05-07) |
|---|---|---|
| p95 response time (any endpoint, sustained 50 RPS) | < 500ms | < 100ms for 8 of 9 endpoints; < 140ms for narrow random |
| Error rate (sustained 50 RPS) | < 1% | 0.00% |
| Sustained throughput | 50 RPS | 200 RPS verified |
| Uptime | 99.9% | Live status |

An x-performance extension on every operation in the OpenAPI spec reports the same numbers per endpoint, machine-readable for your monitoring or SDK telemetry.
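As a rough sketch of consuming those per-endpoint numbers, the snippet below walks a fetched openapi.json and collects every x-performance block. The field names inside the block (p95Ms, sustainedRps, lastMeasured) are illustrative assumptions; check the actual spec for the real shape.

```javascript
// Collect per-endpoint x-performance blocks from an OpenAPI document.
// Field names inside the block are assumed, not confirmed by the spec.
function collectPerformance(openapi) {
  const rows = [];
  for (const [path, methods] of Object.entries(openapi.paths ?? {})) {
    for (const [method, op] of Object.entries(methods)) {
      const perf = op && op["x-performance"];
      if (perf) rows.push({ path, method, ...perf });
    }
  }
  return rows;
}

// Minimal spec fragment for illustration:
const spec = {
  paths: {
    "/api/v1/categories": {
      get: {
        "x-performance": { p95Ms: 31, sustainedRps: 50, lastMeasured: "2026-05-07" },
      },
    },
  },
};
console.log(collectPerformance(spec));
```

A monitoring job could diff these values against live p95s to catch drift between documented and observed latency.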

Per-endpoint performance#

Sustained 50 RPS for 5 minutes (60% public discovery / 30% browse / 10% random + PK lookup). All thresholds met with 5-8× margin.

| Endpoint | p50 | p95 | p99 | Sustained RPS |
|---|---|---|---|---|
| GET /api/v1/categories | 24ms | 31ms | 45ms | 50 |
| GET /api/v1/stats | 24ms | 31ms | 45ms | 50 |
| GET /api/v1/topics | 24ms | 31ms | 45ms | 50 |
| GET /api/v1/tags | 22ms | 31ms | 45ms | 50 |
| GET /api/v1/subcategories | 24ms | 30ms | 45ms | 50 |
| GET /api/v1/questions | 97ms | 112ms | 200ms | 50 |
| GET /api/v1/questions/random (broad) | 138ms | 158ms | 250ms | 50 |
| GET /api/v1/questions/random (narrow) | 110ms | 180ms | 280ms | 30 |
| GET /api/v1/questions/:id | 90ms | 102ms | 180ms | 50 |
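The 60/30/10 traffic mix above can be sketched as a simple weighted bucket picker, the same shape a load script would use to choose which endpoint group each virtual request hits. The bucket names are illustrative; only the weights come from the test description.

```javascript
// Weighted traffic mix from the load test:
// 60% public discovery / 30% browse / 10% random + PK lookup.
const mix = [
  { bucket: "discovery", weight: 0.6 },  // categories, stats, topics, tags, subcategories
  { bucket: "browse", weight: 0.3 },     // GET /api/v1/questions
  { bucket: "random+pk", weight: 0.1 },  // random + GET /questions/:id
];

// r is a uniform draw in [0, 1); passing it in keeps the function deterministic.
function pickBucket(r) {
  let cumulative = 0;
  for (const { bucket, weight } of mix) {
    cumulative += weight;
    if (r < cumulative) return bucket;
  }
  return mix[mix.length - 1].bucket; // guard against floating-point rounding
}
```

In a real script, `pickBucket(Math.random())` would run once per iteration to keep the mix statistically close to the target ratios.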

/v1/me, /v1/usage, /v1/languages, /v1/topics/:slug, and /v1/report are not part of the load mix (small payloads, low-traffic endpoints). Single-request warm latency is under 100ms for all of them, and the p95 < 500ms SLO still applies.

Burst capacity#

The same mix at 200 RPS for 30 seconds (4× the sustained baseline) showed no degradation: p95 numbers were in line with the baseline (categories 30ms, questions 88ms, random 98ms), with a 0.00% error rate across 8,499 requests in the burst window. The ceiling is somewhere above 200 RPS; the test was capped by client-side k6 VUs, not the server.

What this means for you: a Show-HN spike or a viral integration doesn’t need a coordination call with us first. Hit the API hard.

Methodology#

  • Tool: k6 v1.7.1 (Grafana), constant-arrival-rate and ramping-arrival-rate executors.
  • Scenarios: baseline (50 RPS / 5 min), spike (50→100→50 RPS / 1m45s), burst (50→200→50 RPS / 50s), narrow filter (30 RPS / 3 min).
  • Total prod requests in this round: 38,502 across all four scenarios.
  • Smoke key: an internal benchmarking key with rate-limit bypass — system under test was production, but the limiter was bypassed so the numbers measure app + database + cache behavior, not the limiter itself.
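For orientation, here is a minimal sketch of what the baseline scenario could look like as a k6 options object. The rate, duration, and thresholds come from the methodology and SLOs above; the VU sizing is illustrative, and in a real k6 script this would be written `export const options`.

```javascript
// Baseline scenario sketch: constant-arrival-rate, 50 RPS for 5 minutes.
// VU counts are illustrative sizing, not the values used in the actual runs.
const options = {
  scenarios: {
    baseline: {
      executor: "constant-arrival-rate",
      rate: 50,             // requests per second
      timeUnit: "1s",
      duration: "5m",
      preAllocatedVUs: 50,  // illustrative
      maxVUs: 200,
    },
  },
  thresholds: {
    http_req_duration: ["p(95)<500"], // SLO: p95 under 500ms
    http_req_failed: ["rate<0.01"],   // SLO: error rate under 1%
  },
};
```

The spike and burst scenarios described above would use the ramping-arrival-rate executor instead, stepping `rate` up and back down across stages.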

Quarterly drift mechanism#

We re-run pnpm load:baseline + pnpm load:spike quarterly. Each run produces a fresh k6-baseline-DATE-summary.json and updates the per-endpoint x-performance block in openapi.json. If p95 regresses by more than 1.5× vs the previous run we investigate before marking the new run as the baseline (deployment, dependency upgrade, data growth, query plan change).
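The 1.5× regression gate described above amounts to a per-endpoint comparison of the new p95 against the previous baseline. A minimal sketch, with hypothetical input maps of endpoint to p95 in milliseconds:

```javascript
// Flag endpoints whose new p95 regressed past factor × the previous baseline.
// Endpoints absent from the previous run are never flagged (no baseline to compare).
function regressions(previous, current, factor = 1.5) {
  return Object.keys(current).filter(
    (endpoint) => current[endpoint] > factor * (previous[endpoint] ?? Infinity)
  );
}

// Hypothetical example: questions p95 jumps from 112ms to 190ms (> 1.5×), so it is flagged.
const prev = { "/api/v1/categories": 31, "/api/v1/questions": 112 };
const next = { "/api/v1/categories": 33, "/api/v1/questions": 190 };
console.log(regressions(prev, next)); // → ["/api/v1/questions"]
```

Any endpoint in the returned list would trigger the investigation step (deployment, dependency upgrade, data growth, query plan change) before the new run becomes the baseline.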

The lastMeasured field on every x-performance block is the date of the last successful baseline.

Raw data#

For skeptics and benchmarkers: k6-baseline-2026-05-07.json — full k6 summary metrics (p50/p90/p95/p99 per endpoint, throughput, request counts). Sanitized — sample IDs and any internal headers stripped, only aggregates retained.

See also#