Rate limiting controls how many requests a client can make to an API within a time window. It prevents abuse, protects against DDoS attacks, ensures fair usage, and controls costs. Common examples: 100 requests per minute per IP, or 1,000 requests per hour per API key.

How Rate Limiting Works

Implementation with Redis: on each request, increment a counter keyed by IP plus the current time window, and set the key to expire after the window length so stale counters clean themselves up. If the counter exceeds the limit, return HTTP 429 Too Many Requests with a Retry-After header. Express.js has the express-rate-limit middleware, and Cloudflare and most API gateways offer built-in rate limiting.
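A minimal sketch of that fixed-window counter, using an in-memory dict in place of Redis; the function name `is_allowed` and the specific limit and window values are illustrative assumptions, not a fixed API:

```python
import time

WINDOW_SECONDS = 60   # fixed one-minute window (assumed)
LIMIT = 100           # max requests per window (assumed)
counters = {}         # (client_id, window_start) -> count; stands in for Redis

def is_allowed(client_id, now=None):
    """Return (allowed, retry_after_seconds) for one incoming request."""
    now = time.time() if now is None else now
    window_start = int(now // WINDOW_SECONDS) * WINDOW_SECONDS
    key = (client_id, window_start)
    counters[key] = counters.get(key, 0) + 1
    if counters[key] > LIMIT:
        # Reject: the caller should send HTTP 429 with this Retry-After value
        return False, max(1, int(window_start + WINDOW_SECONDS - now))
    return True, 0
```

With real Redis, the dict update becomes an INCR on a per-client, per-window key plus an EXPIRE of the window length, which also makes the counter work across multiple server processes.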

Key Concepts

  • Token Bucket — Algorithm that allows burst traffic up to a limit while maintaining average rate
  • Sliding Window — Counts requests in a rolling time window — smoother than fixed windows
  • HTTP 429 — The standard response code for rate-limited requests — include Retry-After header
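The token bucket above can be sketched in a few lines: tokens refill continuously at a fixed rate up to a capacity, a full bucket absorbs a burst, and sustained traffic is held to the average rate. The class name and parameters here are illustrative, not a particular library's API:

```python
import time

class TokenBucket:
    """Bucket holding up to `capacity` tokens, refilled at `rate` tokens/sec.

    A full bucket lets a burst of `capacity` requests through at once;
    over time, throughput cannot exceed the refill rate.
    """

    def __init__(self, rate, capacity, now=None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None):
        """Consume one token if available; use a consistent clock for `now`."""
        now = time.monotonic() if now is None else now
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A bucket with rate=1 and capacity=5 admits five back-to-back requests, then roughly one per second afterward.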

Frequently Asked Questions

How do I set rate limits?

Base limits on your infrastructure capacity and expected usage. Start generous, then tighten based on monitoring. Use different limits for authenticated vs. unauthenticated clients, and for free vs. paid tiers.
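A tiered policy can be as simple as a lookup table keyed by plan. The tier names and numbers below are made-up placeholders to show the shape, not recommendations:

```python
# Hypothetical per-tier limits, in requests per minute. Real values should
# come from capacity planning and be adjusted based on monitoring.
TIER_LIMITS = {
    "anonymous": 60,    # unauthenticated, keyed by IP
    "free": 600,        # authenticated free tier, keyed by API key
    "paid": 6000,       # paid tier
}

def limit_for(tier=None):
    """Pick the requests-per-minute limit for a client's tier."""
    if tier is None:
        return TIER_LIMITS["anonymous"]
    # Unknown tiers fall back to the free-tier limit
    return TIER_LIMITS.get(tier, TIER_LIMITS["free"])
```

The chosen limit then feeds whatever enforcement algorithm you use (fixed window, sliding window, or token bucket).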