What Is Rate Limiting?
Rate limiting controls how many requests a client can make to an API within a time window. It curbs abuse, helps mitigate denial-of-service attacks, keeps usage fair across clients, and controls costs. Typical limits are 100 requests per minute per IP address or 1,000 requests per hour per API key.
How Rate Limiting Works
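A common implementation uses Redis as a fixed-window counter: on each request, increment a counter keyed by the client IP and the current time window, and let the key expire once the window has passed. If the counter exceeds the limit, return HTTP 429 Too Many Requests with a Retry-After header telling the client how long to wait. For Express.js, the express-rate-limit package provides this as middleware. Cloudflare and most API gateways offer built-in rate limiting.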
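The counter logic described above can be sketched in a few lines. This is a minimal illustration, not a production implementation: a plain in-memory dictionary stands in for Redis, and the class name FixedWindowLimiter and its check method are hypothetical. With Redis, the dictionary update would typically become an INCR on the key followed by an EXPIRE so that old windows clean themselves up.

```python
import math
import time
from typing import Dict, Optional, Tuple


class FixedWindowLimiter:
    """Fixed-window counter keyed by client IP + window number.

    Assumption: an in-memory dict replaces Redis, so old keys are never
    evicted here. In Redis, a key expiry would handle that cleanup.
    """

    def __init__(self, limit: int, window_seconds: int) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._counters: Dict[Tuple[str, int], int] = {}

    def check(
        self, client_ip: str, now: Optional[float] = None
    ) -> Tuple[int, Dict[str, str]]:
        """Count one request and return (HTTP status, response headers)."""
        now = time.time() if now is None else now
        # All requests in the same window share one counter.
        window = int(now // self.window_seconds)
        key = (client_ip, window)
        self._counters[key] = self._counters.get(key, 0) + 1

        if self._counters[key] > self.limit:
            # Tell the client how many seconds remain until the next window.
            window_end = (window + 1) * self.window_seconds
            retry_after = max(1, math.ceil(window_end - now))
            return 429, {"Retry-After": str(retry_after)}
        return 200, {}


if __name__ == "__main__":
    limiter = FixedWindowLimiter(limit=100, window_seconds=60)
    status, headers = limiter.check("203.0.113.7")
    print(status, headers)
```

One known weakness of fixed windows is that a client can send a full quota at the end of one window and another full quota at the start of the next, which is why the sliding window and token bucket approaches exist.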
Key Concepts
- Token Bucket — Algorithm that allows burst traffic up to a limit while maintaining average rate
- Sliding Window — Counts requests in a rolling time window — smoother than fixed windows
- HTTP 429 — The standard response code for rate-limited requests — include Retry-After header
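The two algorithms in the list above can be sketched as follows. This is an illustrative single-process version under stated assumptions: the class names TokenBucket and SlidingWindowLog are hypothetical, the sliding window is implemented as a timestamp log (one of several possible variants), and callers are assumed to pass timestamps from a monotonic clock such as time.monotonic().

```python
from collections import deque
from typing import Deque


class TokenBucket:
    """Allows bursts up to `capacity` while refilling at a steady rate."""

    def __init__(self, capacity: float, refill_per_second: float, now: float) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = capacity  # start full so an initial burst is allowed
        self.last = now

    def allow(self, now: float) -> bool:
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1  # each request spends one token
            return True
        return False


class SlidingWindowLog:
    """Counts requests in the rolling window ending at the current time."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._stamps: Deque[float] = deque()

    def allow(self, now: float) -> bool:
        # Drop timestamps that have fallen out of the rolling window.
        cutoff = now - self.window_seconds
        while self._stamps and self._stamps[0] <= cutoff:
            self._stamps.popleft()
        if len(self._stamps) < self.limit:
            self._stamps.append(now)
            return True
        return False


if __name__ == "__main__":
    import time

    bucket = TokenBucket(capacity=5, refill_per_second=1.0, now=time.monotonic())
    print([bucket.allow(time.monotonic()) for _ in range(7)])
```

The timestamp log is exact but stores one entry per accepted request, so it costs more memory than a counter. The token bucket stores only two numbers per client, which is one reason it is a popular choice when bursts are acceptable.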
Frequently Asked Questions
How do I set rate limits?
Base limits on your infrastructure capacity and expected usage. Start with generous limits and tighten them based on monitoring. Apply different limits to authenticated and unauthenticated clients, and to free and paid tiers.
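One way to express per-tier limits is a small lookup table. The sketch below is only an illustration: the tier names, the Limit type, the limit_for helper, and the numbers themselves are assumptions chosen for the example, and real values should come from your own capacity planning and monitoring.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Limit:
    requests: int
    window_seconds: int


# Hypothetical starting values; tune them against observed traffic.
LIMITS: Dict[str, Limit] = {
    "anonymous": Limit(requests=100, window_seconds=60),    # per IP
    "free": Limit(requests=1_000, window_seconds=3_600),    # per API key
    "paid": Limit(requests=10_000, window_seconds=3_600),   # per API key
}


def limit_for(tier: Optional[str]) -> Limit:
    """Pick a limit: no tier means unauthenticated, unknown tiers get 'free'."""
    if tier is None:
        return LIMITS["anonymous"]
    return LIMITS.get(tier, LIMITS["free"])


if __name__ == "__main__":
    print(limit_for(None))
    print(limit_for("paid"))
```

Keeping the limits in one table makes it straightforward to tighten or loosen a tier without touching the enforcement code.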