Overview
Direct Answer
Gradient boosting is an ensemble machine learning method that builds models sequentially, with each successive model trained to correct the residual errors left by the combined predictions of all previous models. Formally, each addition is a step of gradient descent in function space, minimising a chosen loss function across iterations.
How It Works
The algorithm initialises with a simple base model (for squared-error loss, typically a constant prediction equal to the training mean), then iteratively fits new weak learners (typically shallow decision trees) to the negative gradient of the loss function evaluated at the current ensemble's predictions. Each new model is scaled and added to the ensemble, so subsequent models concentrate on the instances where the existing ensemble performs poorly. A learning rate controls the contribution of each addition, trading convergence speed against the risk of overfitting.
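The loop above can be sketched in pure Python for squared-error loss, where the negative gradient is simply the residual. This is a minimal illustration using one-dimensional decision stumps as weak learners; `fit_stump`, `gradient_boost`, and all parameter names are invented for this sketch, not any library's API:

```python
# Minimal gradient boosting for squared-error loss with 1-D decision stumps.
# Illustrative sketch only; names do not correspond to any library's API.

def fit_stump(x, residuals):
    """Fit a one-split regression stump minimising squared error on residuals."""
    best = None
    for t in sorted(set(x)):
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        if not left or not right:
            continue
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda xi: lmean if xi <= t else rmean

def gradient_boost(x, y, n_rounds=50, learning_rate=0.1):
    """Sequentially fit stumps to the residuals (the negative gradient
    of squared-error loss), each scaled by the learning rate."""
    base = sum(y) / len(y)  # initial constant model: the training mean
    stumps = []
    for _ in range(n_rounds):
        preds = [base + learning_rate * sum(s(xi) for s in stumps) for xi in x]
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stumps.append(fit_stump(x, residuals))
    return lambda xi: base + learning_rate * sum(s(xi) for s in stumps)
```

On a noiseless step function, each round's stump fits the current residual exactly, so the error shrinks geometrically as rounds accumulate.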
Why It Matters
Gradient boosting achieves state-of-the-art predictive accuracy across classification and regression tasks, often outperforming alternative ensemble methods on tabular data. Organisations deploy it for high-stakes applications requiring robust generalisation, including credit risk assessment, fraud detection, and customer churn prediction, where incremental accuracy improvements directly translate to measurable business value.
Common Applications
Applications span financial services for loan default prediction, e-commerce for demand forecasting, healthcare for patient outcome modelling, and insurance for claims assessment. XGBoost, LightGBM, and CatBoost represent widely adopted open-source implementations used across industries for competition benchmarks and production systems.
Key Considerations
The sequential training process is computationally expensive and harder to parallelise than bagging-style ensembles such as random forests, whose members can be trained independently. Practitioners must carefully tune hyperparameters, notably the learning rate, tree depth, and number of boosting rounds, to avoid overfitting: a lower learning rate generally improves generalisation but requires proportionally more rounds to converge.
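The learning-rate/iteration-count tradeoff can be made concrete under a simplifying assumption: with squared-error loss and weak learners that fit each round's residual exactly, the residual shrinks by a factor of (1 − learning rate) per round. The helper below (`rounds_to_reach`, a name invented for this sketch) counts the rounds needed to shrink the residual below a tolerance:

```python
# Under squared-error loss, if each weak learner fits the current residual
# exactly, the residual shrinks by (1 - learning_rate) per round — so a
# smaller learning rate needs proportionally more boosting rounds.
def rounds_to_reach(tol, learning_rate, start_residual=1.0):
    """Count rounds until the residual magnitude falls below tol."""
    residual, rounds = start_residual, 0
    while abs(residual) > tol:
        residual *= (1 - learning_rate)
        rounds += 1
    return rounds

print(rounds_to_reach(0.01, 0.1))   # 44 rounds at rate 0.1
print(rounds_to_reach(0.01, 0.05))  # 90 rounds at rate 0.05
```

Roughly halving the learning rate doubles the rounds required, which is why lower rates are paired with larger iteration budgets and early stopping on a validation set.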