A load balancer distributes incoming network traffic across multiple servers so that no single server is overwhelmed. It improves application availability, reliability, and performance by spreading the workload across the servers that are able to handle it.

How a Load Balancer Works

When your app gets too much traffic for one server, you add more servers behind a load balancer. The load balancer receives all incoming requests and routes them to healthy backend servers using algorithms like round-robin, least connections, or weighted distribution.
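The three algorithms named above can be sketched in a few lines of Python. This is an illustrative sketch, not how any particular load balancer implements them: the class names (RoundRobin, LeastConnections, WeightedRandom) and the server labels are hypothetical, and weighted distribution is shown here as a weighted random pick, which is only one of several ways to honor weights.

```python
import itertools
import random


class RoundRobin:
    """Hand out servers in a fixed rotating order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Send each new request to the server with the fewest open connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # Ties are broken by insertion order of the servers.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call this when the request finishes.
        self.active[server] -= 1


class WeightedRandom:
    """Pick servers at random, in proportion to their weights."""

    def __init__(self, weights, seed=None):
        self._servers = list(weights)
        self._weights = list(weights.values())
        self._rng = random.Random(seed)

    def pick(self):
        return self._rng.choices(self._servers, weights=self._weights, k=1)[0]


if __name__ == "__main__":
    rr = RoundRobin(["app-1", "app-2", "app-3"])
    print([rr.pick() for _ in range(4)])

    lc = LeastConnections(["app-1", "app-2"])
    first = lc.acquire()   # both idle, so the first server is chosen
    second = lc.acquire()  # the first is now busy, so the second is chosen
    lc.release(first)
    print(first, second, lc.acquire())
```

Round-robin suits backends with similar capacity and similar request cost. Least connections tends to help when request durations vary widely, and weights are useful when some servers are larger than others.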

Key Concepts

  • Health Checks: Load balancers periodically probe backend servers (for example with a TCP connection attempt or an HTTP request) and stop sending traffic to servers that fail the check
  • Layer 4 vs Layer 7: L4 load balancers route based on IP address and port (fast). L7 load balancers inspect HTTP headers and URLs (flexible, can route by path)
  • SSL/TLS Termination: The load balancer handles HTTPS encryption and decryption so backend servers don't have to
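The health-check idea from the list above can be sketched as follows. This is a minimal illustration under stated assumptions: HealthCheckedPool, tcp_probe, and the unhealthy_after threshold are hypothetical names, and real load balancers usually run checks on a timer and offer more options (intervals, rise/fall counts, HTTP status matching) than are shown here.

```python
import socket


def tcp_probe(address, timeout=1.0):
    """Example probe: a server is up if a TCP connection can be opened."""
    host, port = address
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class HealthCheckedPool:
    """Round-robin over the servers that are currently passing health checks."""

    def __init__(self, servers, probe, unhealthy_after=2):
        self._servers = list(servers)
        self._probe = probe                      # callable: server -> bool
        self._unhealthy_after = unhealthy_after  # consecutive failures allowed
        self._failures = {server: 0 for server in self._servers}
        self._next = 0

    def run_checks(self):
        """Probe every server once; a success resets its failure count."""
        for server in self._servers:
            if self._probe(server):
                self._failures[server] = 0
            else:
                self._failures[server] += 1

    def healthy(self):
        return [s for s in self._servers
                if self._failures[s] < self._unhealthy_after]

    def pick(self):
        candidates = self.healthy()
        if not candidates:
            raise RuntimeError("no healthy backends available")
        server = candidates[self._next % len(candidates)]
        self._next += 1
        return server


if __name__ == "__main__":
    # Hypothetical backends; in practice run_checks() would be called
    # repeatedly on a schedule rather than once.
    pool = HealthCheckedPool([("127.0.0.1", 8001), ("127.0.0.1", 8002)],
                             probe=tcp_probe)
    pool.run_checks()
    print(pool.healthy())
```

Requiring several consecutive failures before removing a server is a common design choice in this kind of sketch, since it avoids dropping a backend because of a single slow or lost probe.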

Frequently Asked Questions

What load balancer should I use?

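On AWS, use an Application Load Balancer (ALB) for HTTP/HTTPS traffic or a Network Load Balancer (NLB) for TCP/UDP traffic. For self-managed infrastructure, use Nginx or HAProxy. Cloudflare offers DNS-based load balancing. Most cloud providers include load balancing with their container orchestration services.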

Do I need a load balancer for a small app?

Not until you need multiple server instances. Start with a single server. When you need horizontal scaling or zero-downtime deployments, add a load balancer.