Overview
Direct Answer
Learning rate is a hyperparameter that determines the magnitude of parameter updates during gradient descent optimisation. It directly scales the gradient signal, controlling how aggressively the model adjusts weights at each training iteration.
How It Works
During backpropagation, the model computes the loss gradient with respect to each parameter. The optimiser multiplies this gradient by the learning rate before applying the update. A rate of 0.01 means parameters shift by 1% of the gradient magnitude; 0.001 by 0.1%. This multiplicative factor fundamentally controls convergence speed and stability.
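The multiplicative update described above can be sketched in a few lines. This is an illustrative, framework-free example, not any specific library's API: it minimises the hypothetical function f(w) = (w − 3)², whose gradient is 2(w − 3).

```python
# Minimal sketch of a gradient-descent update: the optimiser multiplies
# the gradient by the learning rate before applying it to the parameter.
# Objective (hypothetical): f(w) = (w - 3)^2, so gradient(w) = 2 * (w - 3).

def gradient(w):
    return 2.0 * (w - 3.0)

learning_rate = 0.1
w = 0.0
for _ in range(100):
    w -= learning_rate * gradient(w)  # update scaled by the learning rate

print(w)  # approaches the minimum at w = 3
```

With a learning rate of 0.1, each step closes 20% of the remaining gap to the minimum; a rate of 0.01 would close only 2% per step, illustrating how the factor directly sets convergence speed.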
Why It Matters
Selecting an appropriate value directly impacts training efficiency, final model accuracy, and computational cost. Rates that are too high cause divergence or oscillation around optimal weights; rates that are too low extend training time unnecessarily, increasing infrastructure expenses and time-to-deployment.
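Both failure modes are easy to reproduce on a toy problem. The sketch below (with hypothetical rate values) minimises f(w) = w² from w = 1: a rate above 1 makes each step overshoot and grow, while a very small rate barely moves the parameter.

```python
# Effect of the learning rate on stability when minimising f(w) = w^2
# (gradient 2w), starting from w = 1. Rate values are illustrative.

def run(lr, steps=50):
    w = 1.0
    for _ in range(steps):
        w -= lr * 2.0 * w
    return abs(w)

print(run(1.1))    # too high: |w| grows every step (divergence)
print(run(0.001))  # too low: still far from the minimum after 50 steps
print(run(0.1))    # moderate: converges close to the minimum at 0
```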
Common Applications
Applied across neural network training in computer vision, natural language processing, and time series forecasting. Practitioners routinely adjust this parameter when training image classifiers, language models, and recommendation systems to balance convergence speed against solution quality.
Key Considerations
Optimal values vary significantly across datasets, model architectures, and optimiser algorithms (SGD versus Adam require different ranges). Many practitioners employ learning rate schedules or adaptive methods that adjust the rate dynamically during training rather than using static values.
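As one concrete example of a schedule, a step-decay rule lowers the rate at fixed intervals rather than keeping it static. The function and its hyperparameters below are an illustrative sketch, not a particular library's scheduler.

```python
# Step-decay learning-rate schedule (hypothetical hyperparameters):
# the rate is multiplied by `factor` every `drop_every` epochs.

def step_decay(initial_lr, epoch, drop_every=10, factor=0.5):
    return initial_lr * (factor ** (epoch // drop_every))

for epoch in (0, 10, 20, 30):
    print(epoch, step_decay(0.1, epoch))  # rate halves every 10 epochs
```

Warm-up phases, cosine annealing, and the per-parameter adaptive rates in optimisers such as Adam follow the same principle: the effective step size changes over training instead of staying fixed.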