Overview
Direct Answer
A load balancer is a system that distributes incoming network traffic and computational workload across multiple backend servers or resources to prevent any single server from becoming a bottleneck. It operates as an intermediary between clients and servers, routing requests based on predefined algorithms and health checks.
How It Works
The system receives all incoming requests at a single entry point and applies distribution algorithms—such as round-robin, least connections, or weighted allocation—to forward traffic to available backend servers. It continuously monitors server health through periodic probes, automatically removing unresponsive servers from the rotation and redistributing their load to maintain service availability and optimal performance.
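The rotation logic described above can be sketched in a few lines. This is a minimal illustration, not a production balancer: the server names, the `mark_down`/`mark_up` methods, and the in-memory health set are all assumptions standing in for real health probes.

```python
import itertools

class RoundRobinBalancer:
    """Minimal sketch: round-robin selection that skips backends
    which have failed a health check."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)       # all backends start healthy
        self._cycle = itertools.cycle(self.servers)

    def mark_down(self, server):
        # Called when a periodic probe fails: remove from rotation.
        self.healthy.discard(server)

    def mark_up(self, server):
        # Called when a probe succeeds again: restore to rotation.
        self.healthy.add(server)

    def next_server(self):
        # Advance the cycle, skipping unhealthy backends;
        # give up after one full pass so we never loop forever.
        for _ in range(len(self.servers)):
            server = next(self._cycle)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy backends available")

lb = RoundRobinBalancer(["app1", "app2", "app3"])
lb.mark_down("app2")
print([lb.next_server() for _ in range(4)])  # app2 is skipped in rotation
```

A least-connections or weighted variant would replace only `next_server`; the health-check bookkeeping stays the same, which is why real load balancers treat algorithm choice as a configuration option rather than a structural change.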
Why It Matters
Load balancing directly improves application uptime, reduces latency, and enables horizontal scalability by allowing organisations to add servers without redesigning infrastructure. It reduces infrastructure costs through efficient resource utilisation and prevents service degradation during traffic spikes, which is critical for maintaining user experience and revenue in production environments.
Common Applications
Web applications use load balancers to distribute HTTP requests across multiple application servers; e-commerce platforms employ them during peak traffic periods to handle transaction volume; and microservices architectures rely on them to route API calls across containerised service instances in cloud environments.
Key Considerations
Session persistence and state management can become complex when requests are distributed across servers; misconfigured health checks may cause legitimate servers to be removed from service. Load balancers themselves can become a single point of failure, requiring redundancy in their deployment.
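One common answer to the session-persistence problem is hash-based affinity: derive the backend from a stable client attribute so the same client keeps hitting the same server. The sketch below, with illustrative server names, uses the client IP; note the caveat it demonstrates, which is that removing a server from the pool remaps many clients at once.

```python
import hashlib

def sticky_server(client_ip, servers):
    """IP-hash affinity sketch: the same client IP always maps to the
    same backend, as long as the server list does not change."""
    digest = hashlib.sha256(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

servers = ["app1", "app2", "app3"]

# Stable: repeated requests from one client land on one backend.
assert sticky_server("203.0.113.7", servers) == sticky_server("203.0.113.7", servers)

# Caveat: shrinking the pool (e.g. a health check removing a server)
# changes the modulus, so many clients are remapped and lose affinity.
smaller = ["app1", "app3"]
print(sticky_server("203.0.113.7", servers), sticky_server("203.0.113.7", smaller))
```

Consistent hashing reduces that remapping to only the clients of the removed server, which is why production balancers prefer it over a plain modulus when session affinity matters.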