Load Balancing
Distribute incoming traffic across multiple servers to maximise throughput, minimise latency, and prevent overload.
Deployment platform: Kubernetes, Docker, cloud config
How it works
Load balancers sit between clients and backend servers, distributing requests to ensure no single server is overwhelmed. They operate at different OSI layers:
- →Layer 4 (Transport): routes on IP addresses and TCP ports; fast, with no HTTP awareness (HAProxy in TCP mode, AWS NLB)
- →Layer 7 (Application): routes on HTTP headers, URL, and cookies; smarter, and enables path-based routing (Nginx, AWS ALB)
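To make the difference between the two layers concrete, here is a minimal sketch. It is not how any particular product works: the pool names, addresses, and the `route_l4` / `route_l7` helpers are all hypothetical. The point is only that a Layer 4 balancer can see the connection tuple, while a Layer 7 balancer can also read the request path.

```python
import zlib
from dataclasses import dataclass

# Hypothetical backend pools; names and addresses are illustrative only.
POOLS = {
    "api": ["10.0.1.10:8080", "10.0.1.11:8080"],
    "static": ["10.0.2.10:8080"],
    "default": ["10.0.3.10:8080", "10.0.3.11:8080"],
}


@dataclass(frozen=True)
class Connection:
    """What a Layer 4 balancer can see: addresses and ports, no HTTP."""
    src_ip: str
    src_port: int
    dst_port: int


def route_l4(conn: Connection, backends: list[str]) -> str:
    """Layer 4: pick a backend by hashing the connection tuple."""
    key = f"{conn.src_ip}:{conn.src_port}:{conn.dst_port}".encode()
    return backends[zlib.crc32(key) % len(backends)]


def route_l7(path: str) -> str:
    """Layer 7: pick a pool name by inspecting the HTTP request path."""
    if path.startswith("/api/"):
        return "api"
    if path.startswith("/static/"):
        return "static"
    return "default"


if __name__ == "__main__":
    conn = Connection("203.0.113.7", 51234, 443)
    print("L4 backend:", route_l4(conn, POOLS["default"]))
    print("L7 pool for /api/users:", route_l7("/api/users"))
```

In this sketch the Layer 7 router only chooses a pool; a distribution algorithm would still pick one backend inside that pool.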
Distribution algorithms: Round Robin, Least Connections, IP Hash (sticky sessions), Weighted Round Robin, Random.
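The algorithms listed above can be sketched in a few lines each. This is a simplified illustration under stated assumptions, not a production implementation: the class and function names are hypothetical, the weighted variant simply repeats each backend by its weight (real balancers usually interleave more smoothly), and nothing here is thread-safe.

```python
import hashlib
import itertools
import random


class RoundRobin:
    """Hand out backends in a fixed rotating order."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def pick(self):
        return next(self._cycle)


class WeightedRoundRobin:
    """Naive weighting: a backend with weight 2 appears twice per rotation."""

    def __init__(self, weights):
        expanded = [b for b, w in weights.items() for _ in range(w)]
        self._cycle = itertools.cycle(expanded)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Pick the backend with the fewest in-flight requests."""

    def __init__(self, backends):
        self.active = {b: 0 for b in backends}

    def pick(self):
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        # Call when the request finishes so the count stays accurate.
        self.active[backend] -= 1


def ip_hash(client_ip, backends):
    """Map a client IP to the same backend while the backend list is unchanged."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


def pick_random(backends):
    """Uniform random choice."""
    return random.choice(backends)
```

Note that with plain modulo hashing, adding or removing a backend remaps most clients, so IP Hash gives stickiness only while the backend set is stable.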
Health checks poll backends; unhealthy instances are removed until they recover.
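The health-check loop can be sketched as follows. The `HealthChecker` class, the `tcp_probe` helper, and the fail/rise thresholds are assumptions added for illustration; the text above only says that unhealthy instances are removed until they recover. Thresholds are included because a single failed probe is often a transient blip rather than a real outage.

```python
import socket
from typing import Callable


def tcp_probe(address: str, timeout: float = 1.0) -> bool:
    """Example probe: healthy if a TCP connection to 'host:port' succeeds."""
    host, port = address.rsplit(":", 1)
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        return False


class HealthChecker:
    """Track backend health from repeated probe results."""

    def __init__(self, backends, probe: Callable[[str], bool] = tcp_probe,
                 fail_threshold: int = 3, rise_threshold: int = 2):
        self.probe = probe
        self.fail_threshold = fail_threshold  # failures before removal
        self.rise_threshold = rise_threshold  # successes before re-adding
        self.healthy = {b: True for b in backends}
        # Consecutive probe results that contradict the current state.
        self._streak = {b: 0 for b in backends}

    def poll_once(self) -> None:
        """Probe every backend once and update its state."""
        for backend, is_healthy in self.healthy.items():
            ok = self.probe(backend)
            if ok == is_healthy:
                self._streak[backend] = 0
                continue
            self._streak[backend] += 1
            needed = self.rise_threshold if ok else self.fail_threshold
            if self._streak[backend] >= needed:
                self.healthy[backend] = ok
                self._streak[backend] = 0

    def available(self):
        """Backends currently eligible to receive traffic."""
        return [b for b, up in self.healthy.items() if up]
```

A balancer would call `poll_once()` on a timer and pass `available()` to its distribution algorithm, so removed backends receive no traffic until enough consecutive probes succeed.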
Why it matters
Without load balancing, a single server is both a single point of failure and a ceiling on scalability. Load balancers are a fundamental building block for high availability.
✓ When to use
- →Any service running multiple backend replicas
- →High-traffic applications requiring horizontal scaling
- →Zero-downtime deployments using rolling updates
✗ When NOT to use
- →Single-instance development environments
- →When a service mesh already handles load distribution
Trade-offs
✓ Horizontal scalability and high availability
✗ Stateful sessions require sticky routing or an external session store
✓ Health-check-based automatic failover
✗ The load balancer itself becomes a single point of failure (SPOF) without redundancy
In production
AWS ALB handles HTTPS termination and path-based routing for millions of apps
Anycast routing + load balancing across 300+ PoPs globally
Related principles
API Gateway
Single entry point for all clients that handles routing, authentication, rate limiting, and protocol translation.
Service Mesh
Offload cross-cutting network concerns (mTLS, retries, circuit breaking, observability) to a dedicated infrastructure layer via sidecar proxies.
Kubernetes Orchestration
Automate the deployment, scaling, and self-healing of containerised applications across a cluster of nodes.
CDN Architecture
Serve content from edge nodes geographically close to users, drastically reducing latency and origin load.