Overview
Direct Answer
A load balancer is a system that distributes incoming network traffic and computational workload across multiple backend servers or resources to prevent any single server from becoming a bottleneck. It operates as an intermediary between clients and servers, routing requests based on predefined algorithms and health checks.
How It Works
The system receives all incoming requests at a single entry point and applies distribution algorithms—such as round-robin, least connections, or weighted allocation—to forward traffic to available backend servers. It continuously monitors server health through periodic probes, automatically removing unresponsive servers from the rotation and redistributing their load to maintain service availability and optimal performance.
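The rotation logic described above can be sketched in a few lines. This is a minimal illustration, not a production balancer: the server names, the `mark_down`/`mark_up` methods, and the in-memory health set are all assumptions standing in for real health probes.

```python
import itertools

class RoundRobinBalancer:
    """Minimal sketch: round-robin selection that skips backends
    which have failed a health check."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)       # all backends start healthy
        self._cycle = itertools.cycle(self.servers)

    def mark_down(self, server):
        # Called when a periodic probe fails: remove from rotation.
        self.healthy.discard(server)

    def mark_up(self, server):
        # Called when a probe succeeds again: restore to rotation.
        self.healthy.add(server)

    def next_server(self):
        # Advance the cycle, skipping unhealthy backends;
        # give up after one full pass so we never loop forever.
        for _ in range(len(self.servers)):
            server = next(self._cycle)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy backends available")

lb = RoundRobinBalancer(["app1", "app2", "app3"])
lb.mark_down("app2")
print([lb.next_server() for _ in range(4)])  # app2 is skipped in rotation
```

A least-connections or weighted variant would replace only `next_server`; the health-check bookkeeping stays the same, which is why real load balancers treat algorithm choice as a configuration option rather than a structural change.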
Why It Matters
Load balancing directly improves application uptime, reduces latency, and enables horizontal scalability by allowing organisations to add servers without redesigning infrastructure. It reduces infrastructure costs through efficient resource utilisation and prevents service degradation during traffic spikes, which is critical for maintaining user experience and revenue in production environments.
Common Applications
Web applications use load balancers to distribute HTTP requests across multiple application servers; e-commerce platforms employ them during peak traffic periods to handle transaction volume; and microservices architectures rely on them to route API calls across containerised service instances in cloud environments.
Key Considerations
Session persistence and state management can become complex when requests are distributed across servers; misconfigured health checks may cause legitimate servers to be removed from service. Load balancers themselves can become a single point of failure, requiring redundancy in their deployment.
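One common answer to the session-persistence problem is hash-based affinity: derive the backend from a stable client attribute so the same client keeps hitting the same server. The sketch below, with illustrative server names, uses the client IP; note the caveat it demonstrates, which is that removing a server from the pool remaps many clients at once.

```python
import hashlib

def sticky_server(client_ip, servers):
    """IP-hash affinity sketch: the same client IP always maps to the
    same backend, as long as the server list does not change."""
    digest = hashlib.sha256(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

servers = ["app1", "app2", "app3"]

# Stable: repeated requests from one client land on one backend.
assert sticky_server("203.0.113.7", servers) == sticky_server("203.0.113.7", servers)

# Caveat: shrinking the pool (e.g. a health check removing a server)
# changes the modulus, so many clients are remapped and lose affinity.
smaller = ["app1", "app3"]
print(sticky_server("203.0.113.7", servers), sticky_server("203.0.113.7", smaller))
```

Consistent hashing reduces that remapping to only the clients of the removed server, which is why production balancers prefer it over a plain modulus when session affinity matters.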