
Decision Tree

Overview

Direct Answer

A decision tree is a supervised learning model that recursively partitions data by selecting the features that best separate classes or minimise prediction error. The result is a hierarchical structure in which each internal node represents a conditional test on a feature, each branch represents a range of feature values, and each terminal leaf outputs a class label or regression value.
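As a concrete illustration, this hierarchical structure and the root-to-leaf prediction path can be sketched in plain Python. The feature names, thresholds, and labels below are hypothetical (loosely modelled on iris-style flower data), not output from any real fitted model:

```python
# A hypothetical fitted tree, represented as nested dicts: internal nodes
# test one feature against a threshold, leaves hold the predicted label.
tree = {
    "feature": "petal_length",
    "threshold": 2.5,
    "left": {"label": "setosa"},      # taken when petal_length <= 2.5
    "right": {                         # taken when petal_length > 2.5
        "feature": "petal_width",
        "threshold": 1.75,
        "left": {"label": "versicolor"},
        "right": {"label": "virginica"},
    },
}

def predict(node, sample):
    # Follow the path from root to leaf, branching on one feature test per node.
    while "label" not in node:
        branch = "left" if sample[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["label"]

predict(tree, {"petal_length": 4.8, "petal_width": 1.6})  # "versicolor"
```

Each prediction touches only the features along its path, which is why individual decisions are easy to trace and explain.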

How It Works

The algorithm greedily selects features at each node using splitting criteria such as Gini impurity, information gain, or variance reduction, then recursively applies this process to resulting subsets until stopping conditions are met (maximum depth, minimum samples per leaf, or pure nodes). Predictions follow the path from root to leaf determined by feature comparisons at each decision point.
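The greedy split selection described above can be sketched for a single numeric feature using Gini impurity; the function names are illustrative, and a real implementation would repeat this search over every feature:

```python
from collections import Counter

def gini(labels):
    # Gini impurity: 1 minus the sum of squared class proportions.
    # 0.0 means the node is pure; 0.5 is the worst case for two classes.
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, feature_idx):
    # Try each candidate threshold for one numeric feature and return the
    # (threshold, score) pair minimising the weighted Gini of the children.
    best = (None, float("inf"))
    for t in sorted({row[feature_idx] for row in rows}):
        left = [y for row, y in zip(rows, labels) if row[feature_idx] <= t]
        right = [y for row, y in zip(rows, labels) if row[feature_idx] > t]
        if not left or not right:
            continue  # degenerate split, skip
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

best_split([[1], [2], [10], [11]], ["a", "a", "b", "b"], 0)  # (2, 0.0)
```

The recursive algorithm applies this search at every node, splits on the winning feature and threshold, and repeats on each child subset until a stopping condition is met.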

Why It Matters

Decision trees deliver interpretable models critical for regulatory compliance and stakeholder trust, require minimal data preprocessing, and handle both categorical and continuous variables natively. Their transparency makes them invaluable in high-stakes domains where explainability directly impacts business decisions and accountability.

Common Applications

Medical diagnosis systems use trees to classify patient conditions; credit institutions employ them for loan approval decisions; retailers apply trees to customer segmentation and churn prediction; manufacturing organisations use them for quality control and fault detection.

Key Considerations

Trees are prone to overfitting on training data and produce unstable models sensitive to minor data perturbations; ensemble methods such as random forests and gradient boosting substantially improve generalisation performance. Feature scaling is unnecessary, but tree depth requires careful tuning to balance bias and variance.
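How stopping hyperparameters cap tree growth can be sketched with a minimal recursive builder, assuming a single numeric feature and Gini-based splits (all names illustrative):

```python
def gini(labels):
    # Gini impurity: 1 minus the sum of squared class proportions.
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def build_tree(xs, ys, depth=0, max_depth=2, min_samples_leaf=1):
    # Stopping conditions: pure node, depth limit reached, or node too small.
    if len(set(ys)) == 1 or depth >= max_depth or len(ys) <= min_samples_leaf:
        return {"leaf": max(set(ys), key=ys.count)}  # majority-class leaf
    # Greedy split: pick the threshold minimising weighted child impurity.
    best_t, best_score = None, float("inf")
    for t in sorted(set(xs)):
        l_ys = [y for x, y in zip(xs, ys) if x <= t]
        r_ys = [y for x, y in zip(xs, ys) if x > t]
        if not l_ys or not r_ys:
            continue
        score = (len(l_ys) * gini(l_ys) + len(r_ys) * gini(r_ys)) / len(ys)
        if score < best_score:
            best_t, best_score = t, score
    if best_t is None:  # no valid split (all feature values identical)
        return {"leaf": max(set(ys), key=ys.count)}
    lx = [x for x in xs if x <= best_t]
    ly = [y for x, y in zip(xs, ys) if x <= best_t]
    rx = [x for x in xs if x > best_t]
    ry = [y for x, y in zip(xs, ys) if x > best_t]
    return {
        "threshold": best_t,
        "left": build_tree(lx, ly, depth + 1, max_depth, min_samples_leaf),
        "right": build_tree(rx, ry, depth + 1, max_depth, min_samples_leaf),
    }
```

Raising `max_depth` lets the tree keep splitting until every node is pure, which fits the training data exactly but amplifies variance; lowering it, or raising `min_samples_leaf`, trades variance for bias. Ensembles attack the same problem from another angle, averaging many high-variance trees instead of constraining each one.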

