# Performance
QuizBase is built and load-tested as a developer-grade API. This page lists the measured numbers, the methodology behind them, and how we keep them honest.
All figures below come from k6 sustained load tests against production (quizbase.runriva.com) on 2026-05-07. We re-run the baseline and spike scenarios quarterly. Latest raw output: `k6-baseline-2026-05-07.json`.
## Service Level Objectives
| Metric | Target | Measured (2026-05-07) |
|---|---|---|
| p95 response time (any endpoint, sustained 50 RPS) | < 500ms | < 100ms for 8 of 9 endpoints, < 140ms for narrow random |
| Error rate (sustained 50 RPS) | < 1% | 0.00% |
| Sustained throughput | 50 RPS | 50 RPS sustained; 200 RPS burst verified |
| Uptime | 99.9% | Live status |
The `x-performance` extension on every operation in the OpenAPI spec reports the same numbers per endpoint — machine-readable for your monitoring or SDK telemetry.
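As a minimal sketch, you could pull those per-operation blocks into your own tooling. The traversal below assumes the spec is plain JSON; the field names inside `x-performance` (`p95` in ms here; `lastMeasured` is documented below) are illustrative assumptions, so check the spec itself for the exact shape:

```python
import json

SLO_P95_MS = 500  # the SLO target from the table above

def collect_performance(spec: dict) -> dict[str, dict]:
    """Map 'METHOD /path' -> x-performance block for every operation."""
    out = {}
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            # Skip non-operation keys like "parameters" (a list, not a dict).
            if isinstance(op, dict) and "x-performance" in op:
                out[f"{method.upper()} {path}"] = op["x-performance"]
    return out

# Hypothetical spec fragment mirroring the measured numbers above.
spec = json.loads("""{
  "paths": {
    "/api/v1/categories": {
      "get": {"x-performance": {"p95": 31, "lastMeasured": "2026-05-07"}}
    }
  }
}""")

perf = collect_performance(spec)
slow = [ep for ep, block in perf.items() if block["p95"] >= SLO_P95_MS]
print(perf)  # {'GET /api/v1/categories': {'p95': 31, 'lastMeasured': '2026-05-07'}}
print(slow)  # [] -- everything inside the 500 ms SLO
```

Wiring this into a dashboard alert gives you the same regression signal we use internally, without scraping this page.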
## Per-endpoint performance
Sustained 50 RPS for 5 minutes (60% public discovery / 30% browse / 10% random + PK lookup). All thresholds met with 5-8× margin.
| Endpoint | p50 | p95 | p99 | Sustained RPS |
|---|---|---|---|---|
| GET /api/v1/categories | 24ms | 31ms | 45ms | 50 |
| GET /api/v1/stats | 24ms | 31ms | 45ms | 50 |
| GET /api/v1/topics | 24ms | 31ms | 45ms | 50 |
| GET /api/v1/tags | 22ms | 31ms | 45ms | 50 |
| GET /api/v1/subcategories | 24ms | 30ms | 45ms | 50 |
| GET /api/v1/questions | 97ms | 112ms | 200ms | 50 |
| GET /api/v1/questions/random (broad) | 138ms | 158ms | 250ms | 50 |
| GET /api/v1/questions/random (narrow) | 110ms | 180ms | 280ms | 30 |
| GET /api/v1/questions/:id | 90ms | 102ms | 180ms | 50 |
`/v1/me`, `/v1/usage`, `/v1/languages`, `/v1/topics/:slug`, and `/v1/report` are not part of the load mix (small payloads, low-traffic endpoints). Single-request warm latency is < 100ms for all of them; the SLO of p95 < 500ms still applies.
## Burst capacity
The same mix at 200 RPS for 30 seconds (4× the sustained baseline) showed zero degradation — burst p95s (categories 30ms, questions 88ms, random 98ms) matched or beat the baseline. 0.00% error rate, 8,499 total requests in the burst window. The ceiling is somewhere above 200 RPS — testing was capped by client-side k6 VUs, not the server.
What this means for you: a Show-HN spike or a viral integration doesn’t need a coordination call with us first. Hit the API hard.
## Methodology
- Tool: k6 v1.7.1 (Grafana), `constant-arrival-rate` and `ramping-arrival-rate` executors.
- Scenarios: baseline (50 RPS / 5 min), spike (50→100→50 RPS / 1m45s), burst (50→200→50 RPS / 50s), narrow filter (30 RPS / 3 min).
- Total prod requests in this round: 38,502 across all four scenarios.
- Smoke key: an internal benchmarking key with rate-limit bypass — the system under test was production, but the limiter was bypassed, so the numbers measure app + database + cache behavior, not the limiter itself.
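If you want to post-process the raw output yourself, a sketch like the one below works, assuming the file follows k6's `--summary-export` layout: a top-level `metrics` object whose trend metrics carry percentile keys such as `"p(95)"`. The sample values here are illustrative, not copied from the real export:

```python
import json

def p95_by_metric(summary: dict) -> dict[str, float]:
    """Extract the p95 value from every trend metric in a k6 summary export."""
    out = {}
    for name, values in summary.get("metrics", {}).items():
        # Counters like http_reqs have no percentiles; keep trends only.
        if isinstance(values, dict) and "p(95)" in values:
            out[name] = values["p(95)"]
    return out

# Illustrative fragment in the assumed --summary-export shape.
sample = json.loads("""{
  "metrics": {
    "http_req_duration": {"avg": 41.2, "p(95)": 112.0},
    "http_reqs": {"count": 15000, "rate": 50.0}
  }
}""")

print(p95_by_metric(sample))  # {'http_req_duration': 112.0}
```

In a real run you would `json.load` the published `k6-baseline-DATE.json` instead of the inline sample; per-endpoint breakdowns would appear as tagged sub-metrics if the test script defines per-endpoint thresholds.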
## Quarterly drift mechanism
We re-run `pnpm load:baseline` + `pnpm load:spike` quarterly. Each run produces a fresh `k6-baseline-DATE-summary.json` and updates the per-endpoint `x-performance` block in `openapi.json`. If p95 regresses by more than 1.5× vs the previous run, we investigate before marking the new run as the baseline (deployment, dependency upgrade, data growth, query plan change).
The `lastMeasured` field on every `x-performance` block is the date of the last successful baseline.
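The 1.5× gate can be sketched in a few lines; endpoint names and p95 values here are illustrative, not a published implementation:

```python
REGRESSION_FACTOR = 1.5  # the gate described above

def regressions(prev: dict[str, float], new: dict[str, float]) -> list[str]:
    """Endpoints whose new p95 (ms) exceeds REGRESSION_FACTOR * previous p95."""
    return [ep for ep, p95 in new.items()
            if ep in prev and p95 > REGRESSION_FACTOR * prev[ep]]

# Hypothetical baselines: questions drifted a little, categories regressed.
prev = {"GET /api/v1/questions": 112, "GET /api/v1/categories": 31}
new = {"GET /api/v1/questions": 118, "GET /api/v1/categories": 55}

print(regressions(prev, new))  # ['GET /api/v1/categories']  (55 > 1.5 * 31)
```

Endpoints that pass the gate simply become the new baseline; flagged ones trigger an investigation before the numbers are published.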
## Raw data
For skeptics and benchmarkers: `k6-baseline-2026-05-07.json` — full k6 summary metrics (p50/p90/p95/p99 per endpoint, throughput, request counts). Sanitized — sample IDs and any internal headers stripped, only aggregates retained.
## See also
- API Reference — interactive Scalar render of the spec.
- `/openapi.json` — OpenAPI 3.1, `x-performance` per operation.
- Errors and retries — what `429` looks like and how to back off.
- Rate limits in practice — pacing strategies that keep you under 429.