Overview
Direct Answer
Feature selection is the process of identifying and selecting a subset of input variables that are most predictive or relevant for a machine learning model, while eliminating redundant, irrelevant, or noisy attributes. This differs from dimensionality reduction in that it retains interpretable original variables rather than transforming them.
How It Works
Selection methods operate through three primary approaches: filter methods evaluate variable importance independently of any model, using statistical measures such as correlation or mutual information; wrapper methods assess candidate subsets by repeatedly training and evaluating a model; and embedded methods select features during model training itself, as regularisation-based approaches such as the lasso do by driving uninformative coefficients to zero. Each method ranks or scores variables by their contribution to predictive performance or their association with the target output.
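As a concrete illustration of the filter approach, the sketch below scores each column of a toy dataset by its absolute Pearson correlation with the target and keeps the top-k columns. It is a minimal, dependency-free example rather than a production implementation; libraries such as scikit-learn offer equivalents (e.g. `SelectKBest`) with a wider choice of scoring statistics.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def filter_select(X, y, k):
    """Filter method: rank each column of X by |correlation| with y
    and return the indices of the top-k columns."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

# Toy data: feature 0 tracks y, feature 1 is constant noise,
# and feature 2 is perfectly anti-correlated with y.
X = [[1, 5, 9], [2, 5, 7], [3, 5, 5], [4, 5, 3]]
y = [1, 2, 3, 4]
print(filter_select(X, y, 2))  # → [0, 2]  (the zero-variance feature is dropped)
```

Note that this per-feature scoring is exactly what makes filter methods cheap: each column is assessed once, with no model training in the loop.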
Why It Matters
Reducing input dimensionality decreases computational cost, training time, and model complexity whilst often improving generalisation and interpretability. In regulated industries, fewer variables simplify compliance documentation and explainability requirements. Smaller feature sets also mitigate the curse of dimensionality and reduce storage requirements in resource-constrained deployments.
Common Applications
Healthcare applications use feature selection to identify clinically relevant biomarkers from high-dimensional genomic or imaging datasets. Financial institutions apply it to credit risk models where only the most predictive variables are retained for regulatory reporting. Text classification and natural language processing tasks benefit significantly by selecting informative words or embeddings from vocabularies of millions of potential features.
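For the text-classification case, a toy sketch of vocabulary pruning is shown below: each word is scored by the absolute difference in its document frequency between two classes, and only the top-k words are kept. The scoring statistic here is a deliberately simple stand-in for the chi-squared or mutual-information scores typically used in practice, and the example documents are purely illustrative.

```python
from collections import Counter

def top_discriminative_words(pos_docs, neg_docs, k):
    """Keep the k words whose document frequency differs most
    between the positive and negative classes."""
    def doc_freq(docs):
        counts = Counter()
        for doc in docs:
            counts.update(set(doc.lower().split()))  # count each word once per document
        return {w: c / len(docs) for w, c in counts.items()}

    pos_df, neg_df = doc_freq(pos_docs), doc_freq(neg_docs)
    vocab = set(pos_df) | set(neg_df)
    scores = {w: abs(pos_df.get(w, 0) - neg_df.get(w, 0)) for w in vocab}
    return sorted(scores, key=scores.get, reverse=True)[:k]

pos = ["great film", "great acting", "loved the film"]
neg = ["boring film", "boring plot", "hated the plot"]
# The three class-exclusive words ("great", "boring", "plot") score highest.
print(top_discriminative_words(pos, neg, 3))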
Key Considerations
The optimal feature subset is often task-specific and dataset-dependent; techniques that perform well on one problem may not transfer directly to another. Over-aggressive feature removal risks discarding subtle but collectively important signals, whilst retaining too many variables undermines the efficiency and interpretability benefits.
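One practical way to navigate this trade-off is a wrapper-style search that adds features only while a chosen evaluation score keeps improving, stopping before marginal variables are admitted. The sketch below is a generic greedy forward selection; `score_fn` stands in for whatever model-evaluation routine a project actually uses (e.g. cross-validated accuracy), and the toy scorer at the bottom exists only to make the example runnable.

```python
def forward_select(X, y, score_fn, max_features):
    """Wrapper-style greedy forward selection: starting from an empty
    subset, repeatedly add the feature that most improves score_fn,
    stopping when no candidate helps or max_features is reached."""
    selected, best = [], float("-inf")
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        cand_scores = {j: score_fn(X, y, selected + [j]) for j in remaining}
        j_best = max(cand_scores, key=cand_scores.get)
        if cand_scores[j_best] <= best:
            break  # no candidate improves on the current subset
        selected.append(j_best)
        remaining.remove(j_best)
        best = cand_scores[j_best]
    return selected

def toy_score(X, y, subset):
    """Illustrative scorer: pretend feature 0 is worth 0.5, feature 2
    is worth 0.3, and feature 1 contributes nothing."""
    values = {0: 0.5, 1: 0.0, 2: 0.3}
    return sum(values[j] for j in subset)

X = [[1, 5, 9], [2, 5, 7], [3, 5, 5]]
y = [1, 2, 3]
print(forward_select(X, y, toy_score, 3))  # → [0, 2]
```

Because every candidate subset requires a full evaluation, wrapper methods are far more expensive than filters, which is why they are usually reserved for modest feature counts.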