Overview
Direct Answer
Bagging (Bootstrap Aggregating) is an ensemble method that reduces variance by training multiple models independently on random subsets of the training data drawn with replacement, then combining their predictions through averaging or voting. This approach is particularly effective for high-variance algorithms such as decision trees.
How It Works
The method generates B bootstrap samples by randomly sampling the original dataset with replacement, each typically the same size as the original. A base learner trains independently on each sample, producing B distinct models. For regression tasks, predictions are averaged across all models; for classification, a majority vote determines the final output. This independence between training iterations enables parallel computation.
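The procedure above can be sketched from scratch. This is a minimal illustration, not a production implementation: it assumes scikit-learn's `DecisionTreeRegressor` as the base learner, and the toy dataset and choice of B = 25 are illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Toy regression data (illustrative): noisy sine wave
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=200)

B = 25            # number of bootstrap samples / base models (illustrative)
n = len(X)
models = []

for _ in range(B):
    # Bootstrap sample: draw n row indices with replacement,
    # so each sample is the same size as the original dataset
    idx = rng.integers(0, n, size=n)
    tree = DecisionTreeRegressor()   # high-variance base learner
    tree.fit(X[idx], y[idx])
    models.append(tree)

# Regression: average the B models' predictions
X_test = np.linspace(-3, 3, 50).reshape(-1, 1)
y_hat = np.mean([m.predict(X_test) for m in models], axis=0)
```

Because each iteration of the loop is independent, the B fits could equally be dispatched to parallel workers; for classification, the final `np.mean` would be replaced by a majority vote over the models' predicted labels.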
Why It Matters
Bagging significantly improves model stability and generalisation performance, reducing overfitting without modifying the base algorithm. Teams benefit from lower prediction variance and more reliable uncertainty estimates, which are critical for high-stakes decisions in finance, healthcare, and risk management, where model robustness directly impacts business outcomes.
Common Applications
Random forests, a bagged ensemble of decision trees, are widely deployed in credit risk assessment, medical diagnosis support systems, and feature importance analysis. The technique also improves neural network robustness and is applied in manufacturing defect detection and customer churn prediction.
Key Considerations
Bagging reduces variance but provides minimal bias reduction; it works best with unstable learners prone to overfitting. Computational cost scales linearly with the number of models trained, and gains diminish as base learner stability increases, requiring practitioners to balance accuracy improvements against training overhead.
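The variance-reduction claim is easy to check empirically with an unstable learner. The sketch below, using scikit-learn's `BaggingRegressor` on assumed toy data (the dataset and `n_estimators=50` are illustrative), compares the test error of a single fully grown tree against a bagged ensemble of the same trees.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)

# Toy regression data (illustrative): noisy sine wave
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=500)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# A single fully grown tree overfits the noise (high variance)
single = DecisionTreeRegressor(random_state=0).fit(X_tr, y_tr)

# Bagging the same unstable learner averages away much of that variance,
# at roughly 50x the training cost
bagged = BaggingRegressor(
    DecisionTreeRegressor(), n_estimators=50, random_state=0
).fit(X_tr, y_tr)

mse_single = mean_squared_error(y_te, single.predict(X_te))
mse_bagged = mean_squared_error(y_te, bagged.predict(X_te))
```

With a stable base learner (e.g. a heavily pruned tree or linear model), the gap between the two errors shrinks, which is the accuracy-versus-training-overhead trade-off described above.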