Overview
Direct Answer
Transfer learning is a machine learning technique in which a model trained on a source domain or task is reused and fine-tuned for a target domain or task. Rather than training from random initialisation, this approach leverages the learned representations: weights, feature hierarchies, and patterns.
How It Works
A pre-trained model, typically trained on large-scale datasets, serves as a feature extractor. During the adaptation phase, either the final layers are retrained on new data whilst keeping earlier layers frozen, or all parameters undergo fine-tuning with a lower learning rate. This process preserves foundational knowledge whilst specialising the model to new target characteristics.
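The freeze-and-retrain process above can be sketched in a few lines. This is a minimal illustration, not a real pipeline: a frozen random projection stands in for the pre-trained early layers, the data are synthetic, and only the new head is trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for pre-trained early layers: a frozen projection plus ReLU.
# In practice these weights would come from a model trained on a large dataset.
W_frozen = rng.normal(size=(8, 4))   # input dim 8 -> feature dim 4

def extract_features(x):
    # Frozen feature extractor: W_frozen is never updated during adaptation.
    return np.maximum(x @ W_frozen, 0.0)

# Synthetic target-task data (for illustration only).
X = rng.normal(size=(64, 8))
y = (X.sum(axis=1) > 0).astype(float)

def log_loss(w):
    p = 1.0 / (1.0 + np.exp(-extract_features(X) @ w))
    eps = 1e-9
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

# New task head: the only trainable parameters.
w_head = np.zeros(4)
initial_loss = log_loss(w_head)

# Adapt to the target task with plain gradient descent on a logistic loss.
for _ in range(300):
    feats = extract_features(X)
    p = 1.0 / (1.0 + np.exp(-feats @ w_head))
    grad = feats.T @ (p - y) / len(y)
    w_head -= 0.1 * grad             # only the head moves; W_frozen stays fixed

final_loss = log_loss(w_head)
```

In the full fine-tuning variant described above, W_frozen would also receive gradient updates, typically at a lower learning rate, instead of staying fixed.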
Why It Matters
Organisations deploy this technique to reduce computational cost, accelerate model development cycles, and achieve higher accuracy with limited labelled data. In resource-constrained settings, leveraging pre-trained models substantially lowers training time and infrastructure requirements whilst improving convergence speed and generalisation performance.
Common Applications
Computer vision applications utilise ImageNet-pretrained models for medical image analysis, object detection, and satellite imagery classification. Natural language processing commonly applies models trained on large corpora to domain-specific tasks including sentiment analysis, named entity recognition, and document classification.
Key Considerations
Domain mismatch between source and target can degrade performance if the learned representations are insufficiently similar: features tuned to everyday photographs, for instance, may transfer poorly to radiology scans. Practitioners must balance preserving pre-trained weights against adapting to target-specific characteristics, which requires careful choices about which layers to freeze and how to set hyperparameters such as the learning rate.
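One common way to manage this trade-off is to freeze the earliest layers outright and give progressively larger learning rates toward the head, so that general-purpose features change slowly while task-specific ones adapt quickly. A minimal sketch, with hypothetical layer names and values:

```python
# Hypothetical layer names for a small network, ordered from input to output.
layers = ["conv1", "conv2", "conv3", "head"]

def learning_rates(base_lr=1e-3, decay=0.1, n_frozen=1):
    """Freeze the first n_frozen layers and geometrically shrink the
    learning rate for earlier trainable layers (discriminative fine-tuning)."""
    depth = len(layers)
    lrs = {}
    for i, name in enumerate(layers):
        if i < n_frozen:
            lrs[name] = 0.0                          # frozen: no updates at all
        else:
            lrs[name] = base_lr * decay ** (depth - 1 - i)
    return lrs

schedule = learning_rates()
# conv1 is frozen; conv2 and conv3 move slowly; the head trains at base_lr.
```

Sweeping `n_frozen` and `decay` on a validation set is one practical way to locate the balance point between preservation and adaptation for a given target domain.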