## Growing Decision Trees

By default, `fitctree` and `fitrtree` use the standard CART algorithm [1] to create decision trees. That is, they perform the following steps:

1. Start with all input data, and examine all possible binary splits on every predictor.

2. Select the split that yields the best value of the optimization criterion.

• A split might lead to a child node with too few observations (fewer than `MinLeafSize`). To avoid this, the software chooses the split that yields the best optimization criterion subject to the `MinLeafSize` constraint.

3. Impose the split.

4. Repeat recursively for the two child nodes.
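The four steps above can be sketched as a recursive greedy search. This is a conceptual Python sketch of CART-style regression-tree growth, not MATLAB's implementation; the function names `best_split`, `grow`, and `predict` are illustrative, and only the exhaustive split search, the `MinLeafSize`-style constraint, and the recursion mirror the description above.

```python
import numpy as np

def best_split(X, y, min_leaf):
    """Steps 1-2: examine every midpoint between adjacent unique values
    of every predictor; return the split minimizing weighted child-node
    MSE, subject to a minimum-leaf-size constraint."""
    best = None  # (score, predictor index, cut point)
    for j in range(X.shape[1]):
        vals = np.unique(X[:, j])
        for cut in (vals[:-1] + vals[1:]) / 2.0:  # midpoints between unique values
            left = X[:, j] < cut
            n_l, n_r = left.sum(), (~left).sum()
            if n_l < min_leaf or n_r < min_leaf:
                continue  # would violate the leaf-size constraint
            score = (n_l * y[left].var() + n_r * y[~left].var()) / len(y)
            if best is None or score < best[0]:
                best = (score, j, cut)
    return best

def grow(X, y, min_leaf=1, min_parent=2):
    """Steps 3-4: impose the best split and recurse on the two children.
    A node becomes a leaf (predicting the mean response) when it is too
    small or no admissible split exists."""
    if len(y) < min_parent:
        return y.mean()
    split = best_split(X, y, min_leaf)
    if split is None:
        return y.mean()
    _, j, cut = split
    left = X[:, j] < cut
    return (j, cut,
            grow(X[left], y[left], min_leaf, min_parent),
            grow(X[~left], y[~left], min_leaf, min_parent))

def predict(tree, x):
    """Descend from the root, branching on each stored (predictor, cut)."""
    while isinstance(tree, tuple):
        j, cut, l, r = tree
        tree = l if x[j] < cut else r
    return tree
```

On a toy data set such as `X = [[0],[1],[2],[3]]`, `y = [0,0,1,1]`, the root split lands at the midpoint 1.5, where both children have zero MSE.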

Two items complete this description: the optimization criterion and the stopping rule.

Stopping rule: The software stops splitting when any of the following holds:

• The node is pure.

  • For classification, a node is pure if it contains only observations of one class.

  • For regression, a node is pure if the mean squared error (MSE) for the observed response in this node drops below the MSE for the observed response in the entire data, multiplied by the tolerance on quadratic error per node (`QuadraticErrorTolerance`).

• There are fewer than `MinParentSize` observations in this node.

• Any split imposed on this node produces children with fewer than `MinLeafSize` observations.

• The algorithm has already split `MaxNumSplits` nodes.
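The stopping conditions above can be collected into a single predicate. This is a hedged sketch in Python, not MATLAB's internal logic; the parameter names deliberately echo `MinParentSize`, `MaxNumSplits`, and `QuadraticErrorTolerance`, and the `MinLeafSize` check is omitted here because it belongs inside the split search, where the candidate children are known.

```python
import numpy as np

def node_is_pure(y, is_classification, total_mse=None, qe_tol=1e-6):
    """Purity test from the stopping rule: a single class (classification),
    or node MSE below qe_tol times the whole-data MSE (regression)."""
    if is_classification:
        return len(set(y)) <= 1
    return np.var(y) < qe_tol * total_mse

def should_stop(y, splits_done, is_classification,
                min_parent=10, max_num_splits=None,
                total_mse=None, qe_tol=1e-6):
    """True when any stopping condition holds for this node."""
    if node_is_pure(y, is_classification, total_mse, qe_tol):
        return True
    if len(y) < min_parent:          # fewer than MinParentSize observations
        return True
    if max_num_splits is not None and splits_done >= max_num_splits:
        return True                  # MaxNumSplits branch nodes reached
    return False
```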

Optimization criterion:

• Regression: mean-squared error (MSE). Choose a split to minimize the MSE of predictions compared to the training data.

• Classification: One of three measures, depending on the setting of the `SplitCriterion` name-value pair:

  • `'gdi'` (Gini's diversity index, the default)

  • `'twoing'`

  • `'deviance'`
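Two of these classification impurity measures are straightforward to compute from the class proportions in a node. The following Python sketch shows the standard formulas for Gini's diversity index and deviance (cross-entropy); it is illustrative only and does not reproduce MATLAB's `'twoing'` rule, which compares the class distributions of the two candidate children rather than scoring a single node.

```python
import numpy as np

def gini(y):
    """Gini's diversity index: 1 - sum_k p_k**2 over class proportions p_k.
    Zero for a pure node; larger for more mixed nodes."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def deviance(y):
    """Deviance (cross-entropy): -sum_k p_k * log(p_k).
    Also zero for a pure node."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))
```

A split is then chosen to maximize the decrease in impurity from the parent node to the observation-weighted sum over the two children.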

For details, see the More About section of the `ClassificationTree` documentation.

For alternative split predictor selection techniques, see Choose Split Predictor Selection Technique.

For a continuous predictor, a tree can split halfway between any two adjacent unique values of that predictor. For a categorical predictor with L levels, a classification tree needs to consider 2^(L–1) – 1 splits to find the optimal split. Alternatively, you can choose a heuristic algorithm to find a good split, as described in Splitting Categorical Predictors in Classification Trees.
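The 2^(L–1) – 1 count arises because each binary split sends a nonempty proper subset of the levels to the left child, and a subset and its complement describe the same split. A short Python sketch (illustrative, with a hypothetical function name) enumerates the distinct partitions by fixing one level on the left side:

```python
from itertools import combinations

def categorical_splits(levels):
    """Enumerate the 2**(L-1) - 1 distinct binary partitions of L levels.
    Fixing the first level on the left removes subset/complement duplicates;
    excluding the full set removes the trivial non-split."""
    levels = list(levels)
    anchor, rest = levels[0], levels[1:]
    splits = []
    for r in range(len(rest) + 1):
        for combo in combinations(rest, r):
            left = {anchor, *combo}
            if len(left) < len(levels):  # skip the "everything goes left" case
                splits.append(left)
    return splits
```

For example, 4 levels yield 2^3 – 1 = 7 candidate splits, which is why exhaustive search becomes expensive for high-cardinality categorical predictors and the heuristic alternatives exist.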

For dual-core systems and above, `fitctree` and `fitrtree` parallelize the training of decision trees using Intel® Threading Building Blocks (TBB). For details on Intel TBB, see https://www.intel.com/content/www/us/en/developer/tools/oneapi/onetbb.html.

## References

[1] Breiman, L., J. H. Friedman, R. A. Olshen, and C. J. Stone. Classification and Regression Trees. Boca Raton, FL: Chapman & Hall, 1984.