Rate limits in practice#
Plan your capacity#
| Tier | Burst (per 10s) | Requests/day | Good for |
|---|---|---|---|
| Free | 10 | 500 | Prototypes, hobby apps, low-traffic blogs |
| Indie | 30 | 10,000 | Indie devs, side projects, small SaaS |
| Pro | 200 | 100,000 | Production apps, games, high-traffic sites |
| Enterprise | custom | custom | Custom rate limits, custom SLA, private datasets |
Current pricing lives at /pricing. All keys on your account share one counter — qb_pk_* and qb_sk_* are scopes for browser vs server use, not separate quotas. REST requests and MCP tools/call / resources/read / prompts/get consume from the same bucket.
Read the headers#
Every response — even 200 OK — carries IETF RateLimit-* headers:
HTTP/1.1 200 OK
RateLimit-Limit: 10
RateLimit-Remaining: 7
RateLimit-Reset: 6
RateLimit-Policy: 10;w=10
Parse once, log, alert:
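A minimal sketch of the parse-and-alert step, not an official client. The helper names `parseRateLimit` and `shouldAlert` are hypothetical, the 20% threshold is borrowed from the monitoring checklist on this page, and the sketch assumes `RateLimit-Reset` is a whole number of seconds, as in the sample response.

```typescript
// Anything with a Headers-like `get`, so `res.headers` from fetch fits.
type HeaderReader = { get(name: string): string | null };

interface RateLimitInfo {
  limit: number;        // RateLimit-Limit
  remaining: number;    // RateLimit-Remaining
  resetSeconds: number; // RateLimit-Reset
}

// Returns null when any of the three headers is missing or not an integer.
function parseRateLimit(headers: HeaderReader): RateLimitInfo | null {
  const read = (name: string): number => {
    const raw = headers.get(name);
    return raw === null ? NaN : Number.parseInt(raw, 10);
  };
  const info: RateLimitInfo = {
    limit: read('RateLimit-Limit'),
    remaining: read('RateLimit-Remaining'),
    resetSeconds: read('RateLimit-Reset'),
  };
  const complete = [info.limit, info.remaining, info.resetSeconds].every((n) => Number.isFinite(n));
  return complete ? info : null;
}

// True when less than `threshold` of the window's budget is left.
function shouldAlert(info: RateLimitInfo, threshold = 0.2): boolean {
  return info.limit > 0 && info.remaining / info.limit < threshold;
}
```

Call `parseRateLimit(res.headers)` once per response, log the result, and raise your alert when `shouldAlert` returns true.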
Idempotency#
All GET endpoints are idempotent by definition — you can retry safely. The server de-duplicates by X-Request-Id for observability, but nothing is stored twice even if you retry without an id.
Keep X-Request-Id from your first attempt in logs so you can trace retries.
Backoff with jitter#
async function quizbaseFetch<T>(
  url: string,
  key: string,
  opts: { maxAttempts?: number; logger?: Console } = {}
): Promise<T> {
  const { maxAttempts = 5, logger = console } = opts;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const res = await fetch(url, { headers: { 'X-API-Key': key } });
    if (res.ok) return res.json() as Promise<T>;
    // Permanent 4xx: don't retry
    if (res.status >= 400 && res.status < 500 && res.status !== 429) {
      const body = await res.text();
      throw new Error(`${res.status} ${res.statusText}: ${body}`);
    }
    // Last attempt failed: skip the sleep and fall through to the error below
    if (attempt === maxAttempts) break;
    // 429: respect Retry-After exactly (60s fallback if it is missing or not a number)
    // 5xx: exponential backoff with jitter, capped at 60s
    const retryAfterSec = parseInt(res.headers.get('Retry-After') ?? '', 10);
    const delayMs = res.status === 429
      ? (Number.isFinite(retryAfterSec) ? retryAfterSec : 60) * 1000
      : Math.min(60_000, 1000 * 2 ** attempt + Math.floor(Math.random() * 1000));
    logger.warn(`[quizbase] attempt ${attempt} got ${res.status}, sleeping ${delayMs}ms`);
    await new Promise((r) => setTimeout(r, delayMs));
  }
  throw new Error(`Exhausted ${maxAttempts} attempts`);
}
Pattern: cache in front#
A local cache in front of the API can absorb around 80% of your traffic. For apps that don’t need fresh data on every request:
// Simple in-memory cache for question lists (Node server)
const cache = new Map<string, { body: unknown; expires: number }>();
async function cachedQuizbase(url: string, key: string, ttlMs = 300_000) {
  const cached = cache.get(url);
  if (cached && cached.expires > Date.now()) return cached.body;
  const body = await quizbaseFetch(url, key);
  cache.set(url, { body, expires: Date.now() + ttlMs });
  return body;
}
For distributed caches, use Redis with SETEX and the URL as the key.
Pattern: queue + worker#
For large syncs (e.g. mirroring the Polish catalog):
- Put the cursor URL on a queue (BullMQ, SQS, Redis stream)
- Worker picks off one URL, fetches, persists, pushes the _links.next back
- Worker respects rate limits: if RateLimit-Remaining < 5, sleep RateLimit-Reset seconds
- Dead-letter after 5 failed attempts
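This pattern drains large backlogs over hours or days without ever hitting 429. Total throughput is still bounded by your tier’s daily limit (100,000 requests/day on Pro), so a sync of millions of requests needs either a custom Enterprise limit or a multi-day schedule.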
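The steps above can be sketched as a single worker loop. This is an illustration under assumptions, not a production worker: an in-memory array stands in for BullMQ/SQS, and `fetchPage`, `persist`, `sleep`, and the `PageResult` shape are hypothetical names you would wire to your own HTTP client and storage. Only `_links.next`, `RateLimit-Remaining`, and `RateLimit-Reset` come from this page.

```typescript
// What the worker needs from one fetched page (hypothetical shape).
interface PageResult {
  items: unknown[];
  next: string | null;  // value of _links.next, or null on the last page
  remaining: number;    // RateLimit-Remaining
  resetSeconds: number; // RateLimit-Reset
}

interface WorkerDeps {
  fetchPage: (url: string) => Promise<PageResult>;
  persist: (items: unknown[]) => Promise<void>;
  sleep: (ms: number) => Promise<void>;
}

interface Job { url: string; attempts: number }

async function drainQueue(
  startUrl: string,
  deps: WorkerDeps,
  maxAttempts = 5,
): Promise<{ deadLetters: string[] }> {
  const queue: Job[] = [{ url: startUrl, attempts: 0 }]; // stand-in for a real queue
  const deadLetters: string[] = [];
  while (queue.length > 0) {
    const job = queue.shift() as Job;
    try {
      const page = await deps.fetchPage(job.url);
      await deps.persist(page.items);
      // Push the next cursor URL back onto the queue.
      if (page.next) queue.push({ url: page.next, attempts: 0 });
      // Nearly out of budget: wait for the window to reset before the next fetch.
      if (page.remaining < 5) await deps.sleep(page.resetSeconds * 1000);
    } catch {
      const attempts = job.attempts + 1;
      if (attempts >= maxAttempts) deadLetters.push(job.url); // dead-letter after 5 failures
      else queue.push({ url: job.url, attempts });
    }
  }
  return { deadLetters };
}
```

Injecting `sleep` and `fetchPage` keeps the loop testable without real timers or network calls; with a real queue, each iteration becomes one job handler.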
Pattern: SWR on the client#
Browser apps (using qb_pk_* keys) should use Stale-While-Revalidate:
// With @tanstack/react-query (swr offers a similar pattern)
useQuery({
  queryKey: ['quiz', 'random', lang],
  queryFn: () => fetch('/api/round').then((r) => r.json()),
  staleTime: 5 * 60 * 1000, // 5 min: data counts as fresh, so no refetch (don't hammer)
  gcTime: 30 * 60 * 1000, // 30 min: unused data is then purged from memory
  retry: (count, err) => count < 3 && !isClientError(err) // isClientError is your own helper
});
Monitoring checklist#
- ✅ Log X-Request-Id on every non-2xx response: makes support tickets trivial
- ✅ Alert when RateLimit-Remaining / RateLimit-Limit < 0.2: lets you upgrade proactively
- ✅ Track a Retry-After histogram: helps decide if you need to scale horizontally or upgrade tier
- ✅ Measure your own request rate: your app’s view and our headers should match
- ❌ Don’t poll faster after 429: you’ll only make it worse
- ❌ Don’t swallow 401/403 as retryable: they indicate real config issues
FAQ#
What does a 429 response mean exactly?#
You exceeded your rate-limit window. QuizBase returns RFC 9457 Problem Details with status: 429, title: "Too Many Requests", plus IETF rate-limit headers (RateLimit-Limit, RateLimit-Remaining, RateLimit-Reset) and a Retry-After header (seconds until the next window). Honor Retry-After first — it’s the most accurate signal.
How do I implement exponential backoff with QuizBase?#
Start at the value in Retry-After (if present). On subsequent 429s, double the wait: delay = max(retryAfter, prevDelay * 2) capped at ~60s. Add ±20% jitter (delay * (0.8 + Math.random() * 0.4)) to prevent thundering-herd on tier-shared clients. Stop retrying after 5 attempts and surface the error to the caller.
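The delay rule above, written out as a small sketch. `nextDelayMs` is a hypothetical helper name, and two details are assumptions of this sketch rather than documented behavior: a 1s baseline when no Retry-After is present, and a floor so that downward jitter never undercuts Retry-After.

```typescript
// Next wait in milliseconds: max(retryAfter, prevDelay * 2), capped, then ±20% jitter.
function nextDelayMs(
  retryAfterMs: number | null, // parsed Retry-After, or null if absent
  prevDelayMs: number,         // 0 on the first retry
  random: () => number = Math.random,
  capMs = 60_000,
  baseMs = 1_000,              // assumed baseline when Retry-After is absent
): number {
  const base = Math.min(capMs, Math.max(retryAfterMs ?? baseMs, prevDelayMs * 2));
  const jittered = Math.round(base * (0.8 + random() * 0.4));
  // Never wait less than the server asked for.
  return Math.max(retryAfterMs ?? 0, jittered);
}
```

Because jitter is applied after the cap, the longest wait is about 72s, which is why the cap is described as approximate. Passing `random` as a parameter makes the function deterministic in tests.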
Do retries count against my quota?#
Yes. Every request that reaches the server counts, including the ones answered with 429. The free-tier burst limit (10 requests per 10 seconds) uses a rolling window, so backing off until older requests age out restores your burst budget. The daily budget (500 requests/day on Free) is what eventually caps you.
Can I use multiple API keys to bypass rate limits?#
No. Rate limits are per account, not per key. All keys you create share the same window. This is documented in /docs/authentication — multiple keys exist for app-level secret rotation, not capacity multiplication.
What’s the difference between burst limit and daily limit?#
Burst is a rolling 10-second window: 10 requests per 10 seconds on the Free tier. It protects against bot scrapers. Daily is a rolling 24-hour window: 500 requests/day on Free. It caps bulk-fetch jobs. Both rise with each paid tier, though not by the same factor (Indie: 30 per 10 seconds and 10,000/day; Pro: 200 per 10 seconds and 100,000/day).
Should I retry on 500 or 503 errors?#
Yes, but with capped exponential backoff. 5xx errors are server-side and usually transient (deploy, brief DB hiccup). Use the same backoff pattern as 429 but starting at a smaller baseline (1s) and giving up faster (3 retries). Don’t retry on 4xx (client error) except 429.
What about idempotency for POST requests?#
QuizBase REST is mostly GET-only. The only POST endpoint is /api/v1/report which is idempotent on (questionId, category) — multiple identical reports collapse to one server-side. You can retry safely.
How do I plan capacity before hitting rate limits?#
Estimate peak QPS = (your daily request count) / (~22 hours expressed in seconds, leaving a safety margin) × 1.5 for spikes. If your peak QPS approaches your tier’s burst limit, batch where possible (cursor pagination instead of N single fetches) or upgrade tier. The /docs/api/usage endpoint gives you current-day consumption you can graph.
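A worked version of that estimate, as a sketch. The function names are hypothetical, and the code treats 1.5 as a multiplier on the average rate (an assumption of this sketch about how the spike allowance is meant to apply). The numbers are the Pro row of the tier table.

```typescript
// Peak requests/second, assuming traffic is spread over `activeHours`
// and spikes reach `spikeFactor` times the average rate.
function estimatePeakQps(dailyRequests: number, activeHours = 22, spikeFactor = 1.5): number {
  return (dailyRequests / (activeHours * 3600)) * spikeFactor;
}

// A per-10-second burst allowance expressed as requests/second.
function burstQps(burstPer10s: number): number {
  return burstPer10s / 10;
}

// Example: 100,000 requests/day against the Pro burst limit (200 per 10 s).
const peak = estimatePeakQps(100_000); // about 1.89 requests/second
const ceiling = burstQps(200);         // 20 requests/second
console.log(`peak ${peak.toFixed(2)} rps vs burst ceiling ${ceiling} rps`);
```

In this example the estimated peak sits well below the burst ceiling, so the daily limit, not the burst limit, is the one to watch.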
Does the MCP server share the same rate limit?#
Yes. One tools/call = one REST request from the rate-limit perspective. Calling quizbase_random 100 times via MCP consumes the same 100 requests of budget as fetching /api/v1/questions/random 100 times via REST. There is no “agent discount”. See MCP server docs.
What if I’m building an embed (lead-gen widget, classroom page) where the key is visible?#
Visible publishable keys (qb_pk_*) can be revoked from your dashboard if abused. Your widget hits free-tier limits sooner than server-side calls would. For high-volume embeds, proxy through a small /api/quizbase endpoint on your domain that adds your secret key server-side — your visitors share your quota but the key isn’t exposed. Patterns: quiz lead-gen widget, Moodle/Canvas embed.
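The proxy idea can be sketched as a pure function that your /api/quizbase route calls before forwarding the request. Everything here is illustrative: `buildUpstreamRequest` is a hypothetical name, the upstream base URL is a parameter because this page does not state the API host, and the allowlist default is only an example. The `X-API-Key` header and the /api/v1/questions/random path are the ones used elsewhere on this page.

```typescript
interface UpstreamRequest {
  url: string;
  headers: Record<string, string>;
}

// Maps a request for /api/quizbase/<upstream path> on your domain to an upstream
// URL plus server-side headers. Returns null for paths you have not allowed.
function buildUpstreamRequest(
  incomingUrl: string,
  upstreamBase: string, // assumption: supply the real API origin from your config
  secretKey: string,    // your qb_sk_* key, read from server-side config
  allowedPaths: string[] = ['/api/v1/questions/random'],
): UpstreamRequest | null {
  const incoming = new URL(incomingUrl);
  const path = incoming.pathname.replace(/^\/api\/quizbase/, '');
  if (!allowedPaths.includes(path)) return null;
  // Keep the visitor's query string, drop everything else from the request.
  const upstream = new URL(path + incoming.search, upstreamBase);
  return { url: upstream.toString(), headers: { 'X-API-Key': secretKey } };
}
```

In the route handler, answer 404 when the function returns null; otherwise pass `url` and `headers` to your fetch wrapper and relay the JSON. The allowlist keeps visitors from using your secret key against endpoints the widget does not need.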
See also#
- Errors and retries — the primitives
- GET /v1/questions — cursor pagination for batch jobs