# Performance
QuizBase is built and load-tested as a developer-grade API. This page lists the measured numbers, the methodology behind them, and how we keep them honest.
All figures below come from k6 sustained load tests against production (quizbase.runriva.com) on 2026-05-07. We re-run the baseline and spike scenarios quarterly. Latest raw output: `k6-baseline-2026-05-07.json`.
## Service Level Objectives
| Metric | Target | Measured (2026-05-07) |
|---|---|---|
| p95 response time (any endpoint, sustained 50 RPS) | < 500ms | < 100ms for 8 of 9 endpoints, < 140ms for narrow random |
| Error rate (sustained 50 RPS) | < 1% | 0.00% |
| Sustained throughput | 50 RPS | 50 RPS sustained; 200 RPS burst verified |
| Uptime | 99.9% | Live status |
The `x-performance` extension on every operation in the OpenAPI spec reports the same numbers per endpoint — machine-readable for your monitoring or SDK telemetry.
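As a minimal sketch, you could pull those per-operation blocks into your own tooling. The traversal below assumes the spec is plain JSON; the field names inside `x-performance` (`p95` in ms here; `lastMeasured` is documented below) are illustrative assumptions, so check the spec itself for the exact shape:

```python
import json

SLO_P95_MS = 500  # the SLO target from the table above

def collect_performance(spec: dict) -> dict[str, dict]:
    """Map 'METHOD /path' -> x-performance block for every operation."""
    out = {}
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            # Skip non-operation keys like "parameters" (a list, not a dict).
            if isinstance(op, dict) and "x-performance" in op:
                out[f"{method.upper()} {path}"] = op["x-performance"]
    return out

# Hypothetical spec fragment mirroring the measured numbers above.
spec = json.loads("""{
  "paths": {
    "/api/v1/categories": {
      "get": {"x-performance": {"p95": 31, "lastMeasured": "2026-05-07"}}
    }
  }
}""")

perf = collect_performance(spec)
slow = [ep for ep, block in perf.items() if block["p95"] >= SLO_P95_MS]
print(perf)  # {'GET /api/v1/categories': {'p95': 31, 'lastMeasured': '2026-05-07'}}
print(slow)  # [] -- everything inside the 500 ms SLO
```

Wiring this into a dashboard alert gives you the same regression signal we use internally, without scraping this page.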
## Per-endpoint performance
Sustained 50 RPS for 5 minutes (60% public discovery / 30% browse / 10% random + PK lookup). All thresholds met with 5-8× margin.
| Endpoint | p50 | p95 | p99 | Sustained RPS |
|---|---|---|---|---|
| GET /api/v1/categories | 24ms | 31ms | 45ms | 50 |
| GET /api/v1/stats | 24ms | 31ms | 45ms | 50 |
| GET /api/v1/topics | 24ms | 31ms | 45ms | 50 |
| GET /api/v1/tags | 22ms | 31ms | 45ms | 50 |
| GET /api/v1/subcategories | 24ms | 30ms | 45ms | 50 |
| GET /api/v1/questions | 97ms | 112ms | 200ms | 50 |
| GET /api/v1/questions/random (broad) | 138ms | 158ms | 250ms | 50 |
| GET /api/v1/questions/random (narrow) | 110ms | 180ms | 280ms | 30 |
| GET /api/v1/questions/:id | 90ms | 102ms | 180ms | 50 |
`/v1/me`, `/v1/usage`, `/v1/languages`, `/v1/topics/:slug`, and `/v1/report` are not part of the load mix (small payloads, low-traffic endpoints). Single-request warm latency is < 100ms for all of them; the SLO of p95 < 500ms still applies.
## Burst capacity
The same mix at 200 RPS for 30 seconds (4× the sustained baseline) showed zero degradation — burst p95s (categories 30ms, questions 88ms, random 98ms) matched or beat the baseline. 0.00% error rate, 8,499 total requests in the burst window. The ceiling is somewhere above 200 RPS — testing was capped by client-side k6 VUs, not the server.
What this means for you: a Show-HN spike or a viral integration doesn’t need a coordination call with us first. Hit the API hard.
## Methodology
- Tool: k6 v1.7.1 (Grafana), `constant-arrival-rate` and `ramping-arrival-rate` executors.
- Scenarios: baseline (50 RPS / 5 min), spike (50→100→50 RPS / 1m45s), burst (50→200→50 RPS / 50s), narrow filter (30 RPS / 3 min).
- Total prod requests in this round: 38,502 across all four scenarios.
- Smoke key: an internal benchmarking key with rate-limit bypass — the system under test was production, but the limiter was bypassed, so the numbers measure app + database + cache behavior, not the limiter itself.
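If you want to post-process the raw output yourself, a sketch like the one below works, assuming the file follows k6's `--summary-export` layout: a top-level `metrics` object whose trend metrics carry percentile keys such as `"p(95)"`. The sample values here are illustrative, not copied from the real export:

```python
import json

def p95_by_metric(summary: dict) -> dict[str, float]:
    """Extract the p95 value from every trend metric in a k6 summary export."""
    out = {}
    for name, values in summary.get("metrics", {}).items():
        # Counters like http_reqs have no percentiles; keep trends only.
        if isinstance(values, dict) and "p(95)" in values:
            out[name] = values["p(95)"]
    return out

# Illustrative fragment in the assumed --summary-export shape.
sample = json.loads("""{
  "metrics": {
    "http_req_duration": {"avg": 41.2, "p(95)": 112.0},
    "http_reqs": {"count": 15000, "rate": 50.0}
  }
}""")

print(p95_by_metric(sample))  # {'http_req_duration': 112.0}
```

In a real run you would `json.load` the published `k6-baseline-DATE.json` instead of the inline sample; per-endpoint breakdowns would appear as tagged sub-metrics if the test script defines per-endpoint thresholds.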
## Quarterly drift mechanism
We re-run `pnpm load:baseline` + `pnpm load:spike` quarterly. Each run produces a fresh `k6-baseline-DATE-summary.json` and updates the per-endpoint `x-performance` block in `openapi.json`. If p95 regresses by more than 1.5× vs the previous run, we investigate before marking the new run as the baseline (deployment, dependency upgrade, data growth, query plan change).
The `lastMeasured` field on every `x-performance` block is the date of the last successful baseline.
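The 1.5× gate can be sketched in a few lines; endpoint names and p95 values here are illustrative, not a published implementation:

```python
REGRESSION_FACTOR = 1.5  # the gate described above

def regressions(prev: dict[str, float], new: dict[str, float]) -> list[str]:
    """Endpoints whose new p95 (ms) exceeds REGRESSION_FACTOR * previous p95."""
    return [ep for ep, p95 in new.items()
            if ep in prev and p95 > REGRESSION_FACTOR * prev[ep]]

# Hypothetical baselines: questions drifted a little, categories regressed.
prev = {"GET /api/v1/questions": 112, "GET /api/v1/categories": 31}
new = {"GET /api/v1/questions": 118, "GET /api/v1/categories": 55}

print(regressions(prev, new))  # ['GET /api/v1/categories']  (55 > 1.5 * 31)
```

Endpoints that pass the gate simply become the new baseline; flagged ones trigger an investigation before the numbers are published.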
## Raw data
For skeptics and benchmarkers: `k6-baseline-2026-05-07.json` — full k6 summary metrics (p50/p90/p95/p99 per endpoint, throughput, request counts). Sanitized — sample IDs and any internal headers stripped, only aggregates retained.
## See also
- API Reference — interactive Scalar render of the spec.
- `/openapi.json` — OpenAPI 3.1, `x-performance` per operation.
- Errors and retries — what `429` looks like and how to back off.
- Rate limits in practice — pacing strategies that keep you under 429.