Overview
Direct Answer
Bagging (Bootstrap Aggregating) is an ensemble method that reduces variance by training multiple models independently on random subsets of the training data drawn with replacement, then combining their predictions through averaging or voting. This approach is particularly effective for high-variance algorithms such as decision trees.
How It Works
The method generates B bootstrap samples by randomly sampling the original dataset with replacement, each typically the same size as the original. A base learner trains independently on each sample, producing B distinct models. For regression tasks, predictions are averaged across all models; for classification, a majority vote determines the final output. This independence between training iterations enables parallel computation.
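The procedure above can be sketched from scratch. This is a minimal illustration, not a production implementation: it assumes scikit-learn's `DecisionTreeRegressor` as the base learner, and the toy dataset and choice of B = 25 are illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Toy regression data (illustrative): noisy sine wave
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=200)

B = 25            # number of bootstrap samples / base models (illustrative)
n = len(X)
models = []

for _ in range(B):
    # Bootstrap sample: draw n row indices with replacement,
    # so each sample is the same size as the original dataset
    idx = rng.integers(0, n, size=n)
    tree = DecisionTreeRegressor()   # high-variance base learner
    tree.fit(X[idx], y[idx])
    models.append(tree)

# Regression: average the B models' predictions
X_test = np.linspace(-3, 3, 50).reshape(-1, 1)
y_hat = np.mean([m.predict(X_test) for m in models], axis=0)
```

Because each iteration of the loop is independent, the B fits could equally be dispatched to parallel workers; for classification, the final `np.mean` would be replaced by a majority vote over the models' predicted labels.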
Why It Matters
Bagging significantly improves model stability and generalisation performance, reducing overfitting without modifying the base algorithm. Teams benefit from lower prediction variance and more reliable uncertainty estimates, which are critical for high-stakes decisions in finance, healthcare, and risk management, where model robustness directly impacts business outcomes.
Common Applications
Random forests, a bagged ensemble of decision trees, are widely deployed in credit risk assessment, medical diagnosis support systems, and feature importance analysis. The technique also improves neural network robustness and is applied in manufacturing defect detection and customer churn prediction.
Key Considerations
Bagging reduces variance but provides minimal bias reduction; it works best with unstable learners prone to overfitting. Computational cost scales linearly with the number of models trained, and gains diminish as base learner stability increases, requiring practitioners to balance accuracy improvements against training overhead.
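The variance-reduction claim is easy to check empirically with an unstable learner. The sketch below, using scikit-learn's `BaggingRegressor` on assumed toy data (the dataset and `n_estimators=50` are illustrative), compares the test error of a single fully grown tree against a bagged ensemble of the same trees.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)

# Toy regression data (illustrative): noisy sine wave
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=500)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# A single fully grown tree overfits the noise (high variance)
single = DecisionTreeRegressor(random_state=0).fit(X_tr, y_tr)

# Bagging the same unstable learner averages away much of that variance,
# at roughly 50x the training cost
bagged = BaggingRegressor(
    DecisionTreeRegressor(), n_estimators=50, random_state=0
).fit(X_tr, y_tr)

mse_single = mean_squared_error(y_te, single.predict(X_te))
mse_bagged = mean_squared_error(y_te, bagged.predict(X_te))
```

With a stable base learner (e.g. a heavily pruned tree or linear model), the gap between the two errors shrinks, which is the accuracy-versus-training-overhead trade-off described above.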