Overview
Direct Answer
Regularisation refers to a set of mathematical techniques that impose penalties on model complexity during training, constraining weight magnitudes or feature counts to reduce overfitting. By adding a regularisation term to the loss function, models learn simpler representations that generalise better to unseen data.
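As a minimal sketch in pure Python (the weights, training loss, and penalty strength `lam` are illustrative, not from any particular model), the regularised objective is simply the training loss plus a weighted penalty on the weights:

```python
# Regularised objective: training loss plus a penalty on the weights.
# `lam` is the regularisation strength, a hyperparameter chosen by the user.

def l2_penalty(weights):
    return sum(w ** 2 for w in weights)

def regularised_loss(train_loss, weights, lam=0.1):
    return train_loss + lam * l2_penalty(weights)

# Identical training loss, but larger weights incur a larger total loss.
print(regularised_loss(0.5, [0.1, -0.2], lam=0.1))  # 0.505
print(regularised_loss(0.5, [3.0, -4.0], lam=0.1))  # 3.0
```

With the penalty included, two models that fit the training data equally well are no longer tied: the one with smaller weights wins.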
How It Works
Regularisation modifies the objective function by appending a penalty computed from the model's weights, typically the L1 norm (sum of absolute weights) or the squared L2 norm (sum of squared weights), scaled by a strength hyperparameter. During optimisation, the algorithm balances minimising training error against minimising this penalty, effectively shrinking less important weights towards zero and limiting the model's capacity to memorise noise.
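The shrinkage effect is easiest to see in ridge regression, where the L2-penalised objective has the closed-form solution w = (XᵀX + λI)⁻¹Xᵀy. A sketch assuming NumPy is available, on synthetic data:

```python
import numpy as np

# Ridge regression: minimising ||Xw - y||^2 + lam * ||w||^2 has the
# closed-form solution w = (X^T X + lam * I)^(-1) X^T y.
def ridge_fit(X, y, lam):
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=50)

# Increasing lam shrinks the fitted weights towards zero.
for lam in (0.0, 1.0, 100.0):
    w = ridge_fit(X, y, lam)
    print(lam, np.round(w, 3))
```

Running this shows the weight vector's norm decreasing as `lam` grows: at λ = 0 the fit is ordinary least squares, while very large λ drives all weights close to zero (underfitting).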
Why It Matters
Overfitted models exhibit poor performance on production data despite strong training metrics, directly reducing business value and increasing deployment risk. Regularisation significantly improves model robustness and predictive reliability in real-world scenarios where training and operational data distributions diverge, lowering the cost of model retraining and failure mitigation.
Common Applications
Regularisation is standard in credit risk assessment, customer churn prediction, and medical image classification where high accuracy on held-out test sets is critical. L2 regularisation appears ubiquitously in regression and neural network training; L1 regularisation is preferred for feature selection in high-dimensional datasets such as genomics and financial forecasting.
Key Considerations
Selecting appropriate regularisation strength requires careful tuning via cross-validation; excessively strong penalties bias models towards underfitting and reduced discriminative power. The choice between L1 and L2 depends on whether feature sparsity or smooth weight decay is desired.
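The sparsity-versus-smoothness distinction shows up in how each penalty updates a single weight: the proximal update for L1 (soft-thresholding) snaps small weights exactly to zero, while L2 weight decay only scales weights down and never eliminates them. A pure-Python sketch, with the step size folded into illustrative threshold and decay constants:

```python
def l1_shrink(w, threshold):
    # Soft-thresholding: weights within the threshold become exactly zero.
    if w > threshold:
        return w - threshold
    if w < -threshold:
        return w + threshold
    return 0.0

def l2_shrink(w, decay):
    # Weight decay: every weight is scaled towards zero but never reaches it.
    return w * (1.0 - decay)

weights = [3.0, 0.05, -0.2, 1.5]
print([round(l1_shrink(w, 0.1), 3) for w in weights])  # [2.9, 0.0, -0.1, 1.4]
print([round(l2_shrink(w, 0.1), 3) for w in weights])  # [2.7, 0.045, -0.18, 1.35]
```

Note how L1 zeroes out the 0.05 weight entirely (sparsity, hence its use for feature selection), whereas L2 merely shrinks it.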
More in Machine Learning
UMAP (Unsupervised Learning): Uniform Manifold Approximation and Projection — a dimensionality reduction technique for visualisation and general non-linear reduction.
Batch Learning (MLOps & Production): Training a machine learning model on the entire dataset at once before deployment, as opposed to incremental updates.
Decision Tree (Supervised Learning): A tree-structured model where internal nodes represent feature tests, branches represent outcomes, and leaves represent predictions.
Model Serving (MLOps & Production): The infrastructure and processes for deploying trained machine learning models to production environments for real-time predictions.
Multi-Task Learning (MLOps & Production): A machine learning approach where a model is simultaneously trained on multiple related tasks to improve generalisation.
Markov Decision Process (Reinforcement Learning): A mathematical framework for modelling sequential decision-making where outcomes are partly random and partly controlled.
Feature Engineering (Feature Engineering & Selection): The process of using domain knowledge to create, select, and transform input variables to improve model performance.
Clustering (Unsupervised Learning): Unsupervised learning technique that groups similar data points together based on inherent patterns without predefined labels.