Overview
Direct Answer
Few-shot learning is a machine learning paradigm in which models learn a new task from only a small number of labelled examples—typically between two and ten instances per class. This approach differs fundamentally from traditional supervised learning, which often requires thousands of examples per class, and instead leverages transfer learning or in-context learning mechanisms to generalise from minimal data.
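In large language models, few-shot learning usually takes the form of a prompt that interleaves a handful of labelled examples before the query. The sketch below assembles such a prompt; the "Input:"/"Label:" field names are an illustrative convention chosen for this example, not a required format.

```python
def build_few_shot_prompt(task_description, examples, query):
    """Assemble a few-shot prompt: a task description, a handful of
    labelled examples, then the unlabelled query for the model to complete.
    The Input/Label field names are an illustrative convention."""
    lines = [task_description, ""]
    for text, label in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Label: {label}")
        lines.append("")  # blank line separates examples
    lines.append(f"Input: {query}")
    lines.append("Label:")  # the model is expected to continue from here
    return "\n".join(lines)
```

The model receives no parameter updates; the examples condition its next-token prediction so that the completion after the final "Label:" follows the demonstrated pattern.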
How It Works
The mechanism relies on pre-trained representations and the model's ability to recognise patterns in a handful of exemplars. In large language models, few-shot capability emerges through in-context learning, where examples are provided within the input prompt without any parameter updates. Meta-learning approaches train models explicitly to adapt quickly to new tasks, whilst metric-learning methods learn similarity functions that classify unseen data points based on their proximity to support examples.
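The metric-learning idea can be sketched as a nearest-prototype classifier in the style of prototypical networks: each class is summarised by the mean of its support embeddings, and a query is assigned to the class whose prototype is closest. This is a minimal illustration assuming embeddings already come from some pre-trained encoder; the function names are hypothetical.

```python
import numpy as np

def prototypes(support_x, support_y):
    """Compute one prototype (mean embedding) per class from the support set."""
    classes = sorted(set(support_y))
    protos = np.stack([
        np.mean([x for x, y in zip(support_x, support_y) if y == c], axis=0)
        for c in classes
    ])
    return classes, protos

def classify(query, classes, protos):
    """Assign the query embedding to the class with the nearest prototype."""
    dists = np.linalg.norm(protos - query, axis=1)  # Euclidean distance
    return classes[int(np.argmin(dists))]
```

In a full prototypical network the encoder producing the embeddings is itself meta-trained across many small tasks; here it is taken as given.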
Why It Matters
Organisations benefit significantly from reduced labelling costs, faster deployment timelines, and the ability to address long-tail problems where collecting abundant training data is infeasible. In regulated industries and specialised domains—such as medical imaging or legal document analysis—few-shot methods accelerate model development whilst maintaining data privacy and compliance requirements.
Common Applications
Applications include intent classification in customer service chatbots, rapid personalisation in recommendation systems, and medical diagnosis from limited patient records. Few-shot techniques are particularly valuable in rare disease detection, multilingual natural language processing, and content moderation where class distributions are highly imbalanced.
Key Considerations
Performance often remains lower than fully-supervised baselines, and the quality of the selected examples disproportionately influences outcomes. Practitioners must carefully curate exemplars and recognise that success depends heavily on the quality of the model's pre-training and on how closely the target task resembles the pre-training distribution.
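One common heuristic for curating exemplars is to choose a diverse subset of candidates in embedding space, so the few examples cover the task's variation rather than clustering together. The sketch below uses greedy farthest-point selection; it is one plausible strategy among many, and the function name is hypothetical.

```python
import numpy as np

def select_diverse_exemplars(embeddings, k):
    """Greedily pick k mutually distant candidate embeddings.
    Start from the example nearest the overall mean, then repeatedly
    add the example farthest from everything already chosen."""
    X = np.asarray(embeddings, dtype=float)
    chosen = [int(np.argmin(np.linalg.norm(X - X.mean(axis=0), axis=1)))]
    while len(chosen) < k:
        # distance from each candidate to its nearest already-chosen exemplar
        d = np.min(np.linalg.norm(X[:, None] - X[chosen][None], axis=2), axis=1)
        d[chosen] = -1.0  # never re-pick an already-chosen index
        chosen.append(int(np.argmax(d)))
    return chosen
```

Other curation criteria—label balance, similarity to the expected query distribution, or removing mislabelled candidates—can be layered on top of this.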