# resubPredict

Predict resubstitution response of tree

## Syntax

``Yfit = resubPredict(tree)``
``Yfit = resubPredict(tree,Subtrees=subtrees)``
``Yfit = resubPredict(tree,"Subtrees",subtrees)``
``[Yfit,node] = resubPredict(___)``

## Description


`Yfit = resubPredict(tree)` returns the responses that `tree` predicts for the data `tree.X`. `Yfit` contains the predictions of `tree` on the data that `fitrtree` used to create `tree`.

`Yfit = resubPredict(tree,Subtrees=subtrees)` also prunes `tree` to the level specified by `subtrees` before predicting responses. Before R2021a, use the equivalent syntax `Yfit = resubPredict(tree,"Subtrees",subtrees)`.

`[Yfit,node] = resubPredict(___)` also returns the node numbers of `tree` for the resubstituted data, using any of the input arguments in the previous syntaxes.

## Examples


Load the `carsmall` data set. Consider `Displacement`, `Horsepower`, and `Weight` as predictors of the response `MPG`.

```
load carsmall
X = [Displacement Horsepower Weight];
```

Grow a regression tree using all observations.

`Mdl = fitrtree(X,MPG);`

Compute the resubstitution MSE.

```
Yfit = resubPredict(Mdl);
mean((Yfit - Mdl.Y).^2)
```

```
ans = 4.8952
```

You can get the same result using `resubLoss`.

`resubLoss(Mdl)`
```
ans = 4.8952
```

Load the `carsmall` data set. Consider `Weight` as a predictor of the response `MPG`.

```
load carsmall
idxNaN = isnan(MPG + Weight);
X = Weight(~idxNaN);
Y = MPG(~idxNaN);
n = numel(X);
```

Grow a regression tree using all observations.

`Mdl = fitrtree(X,Y);`

Compute resubstitution fitted values for the subtrees at several pruning levels.

```
m = max(Mdl.PruneList);
pruneLevels = 1:4:m; % Pruning levels to consider
z = numel(pruneLevels);
Yfit = resubPredict(Mdl,Subtrees=pruneLevels);
```

`Yfit` is an `n`-by-`z` matrix of fitted values in which each row corresponds to an observation and each column corresponds to a subtree.

Plot several columns of `Yfit` and `Y` against `X`.

```
sortDat = sortrows([X Y Yfit],1); % Sort all data with respect to X
plot(repmat(sortDat(:,1),1,size(Yfit,2)+1),sortDat(:,2:end)) % Vectorize for efficiency
lev = num2str((pruneLevels)',"Level %d MPG");
legend(["Observed MPG"; lev])
title("In-Sample Fitted Responses")
xlabel("Weight (lbs)")
ylabel("MPG")
h = findobj(gcf);
set(h(4:end),LineWidth=3) % Widen all lines
```

The values of `Yfit` for lower pruning levels tend to follow the data more closely than those for higher pruning levels. The fits at higher pruning levels tend to be flat over large `X` intervals.

## Input Arguments


`tree`: Regression tree, specified as a `RegressionTree` object created using the `fitrtree` function.

`subtrees`: Pruning level, specified as a vector of nonnegative integers in ascending order or `"all"`.

If you specify a vector, then all elements must be at least `0` and at most `max(tree.PruneList)`. `0` indicates the full, unpruned tree and `max(tree.PruneList)` indicates the completely pruned tree (in other words, just the root node).

If you specify `"all"`, then `resubPredict` operates on all subtrees (in other words, the entire pruning sequence). This specification is equivalent to using `0:max(tree.PruneList)`.
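The equivalence described above can be checked with a short sketch. It assumes the `carsmall` sample data set used in the examples on this page; the variable names `YfitAll`, `YfitRange`, `sameResult`, and `lastColumn` are illustrative only.

```matlab
load carsmall
idxNaN = isnan(MPG + Weight);
X = Weight(~idxNaN);
Y = MPG(~idxNaN);
tree = fitrtree(X,Y);

% Predict with every subtree in the pruning sequence, two ways.
YfitAll = resubPredict(tree,Subtrees="all");
YfitRange = resubPredict(tree,Subtrees=0:max(tree.PruneList));

% Per the description above, both calls should return the same matrix.
sameResult = isequal(YfitAll,YfitRange)

% The last column comes from the completely pruned tree (root node only),
% so it should hold a single constant prediction for all observations.
lastColumn = YfitAll(:,end);
numel(unique(lastColumn))
```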

`resubPredict` prunes `tree` to each level indicated in `Subtrees`, and then estimates the corresponding output arguments. The size of `Subtrees` determines the size of some output arguments.

To use `Subtrees`, the properties `PruneList` and `PruneAlpha` of `tree` must be nonempty. In other words, either grow `tree` by setting `Prune="on"`, or prune `tree` using `prune`.
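As a rough illustration of this requirement, the sketch below grows a tree without a pruning sequence and then fills the sequence in with `prune` before calling `resubPredict` with `Subtrees`. It assumes the `carsmall` data set from the examples; `treeNoPrune`, `treePruned`, and `hasSequence` are illustrative names.

```matlab
load carsmall
idxNaN = isnan(MPG + Weight);
X = Weight(~idxNaN);
Y = MPG(~idxNaN);

% Grow the tree without estimating the pruning sequence.
treeNoPrune = fitrtree(X,Y,Prune="off");
hasSequence = ~isempty(treeNoPrune.PruneList)

% Calling prune with no other arguments returns a copy of the tree
% with its optimal pruning sequence filled in.
treePruned = prune(treeNoPrune);

% Subtrees can now be used: predict with the full tree and pruning level 1.
Yfit = resubPredict(treePruned,Subtrees=[0 1]);
size(Yfit)
```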

Data Types: `single` | `double` | `char` | `string`

## Output Arguments


`Yfit`: Predicted resubstitution response values for the training data, returned as a vector or a matrix. `Yfit` is of the same data type as the training response data `tree.Y`.

If the `Subtrees` name-value argument is a numeric scalar, then `Yfit` is returned as a column vector. Otherwise, `Yfit` is returned as a matrix with `m` columns, where `m` is the number of subtrees. Each column represents the predictions of the corresponding subtree.
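The effect of `Subtrees` on the shape of `Yfit` can be seen in a small sketch, again assuming the `carsmall` data set and a tree whose pruning sequence has at least two levels; `YfitScalar` and `YfitVector` are illustrative names.

```matlab
load carsmall
idxNaN = isnan(MPG + Weight);
X = Weight(~idxNaN);
Y = MPG(~idxNaN);
tree = fitrtree(X,Y);

YfitScalar = resubPredict(tree,Subtrees=1);       % one pruning level
YfitVector = resubPredict(tree,Subtrees=[0 1 2]); % three pruning levels

size(YfitScalar)   % column vector, one row per training observation
size(YfitVector)   % one column per requested pruning level
class(YfitScalar)  % same data type as tree.Y
```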

`node`: Node numbers of `tree` where each data row resolves, returned as a numeric vector or a numeric matrix.

If the `Subtrees` name-value argument is a numeric scalar, then `node` is returned as an `n`-element column vector, where `n` is the number of rows of `tree.X`. Otherwise, `node` is returned as a matrix of size `n`-by-`m`, where `m` is the number of subtrees. Each column represents the node predictions of the corresponding subtree.
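One way to use `node` is to relate each prediction back to the node that produced it. The sketch below assumes the `carsmall` data set and relies on the `NodeMean` property of the tree holding the mean response of each node, so that indexing it with `node` should reproduce `Yfit` for the unpruned tree. The names `maxDiff`, `leafIds`, and `leafCounts` are illustrative.

```matlab
load carsmall
idxNaN = isnan(MPG + Weight);
X = Weight(~idxNaN);
Y = MPG(~idxNaN);
tree = fitrtree(X,Y);

[Yfit,node] = resubPredict(tree);

% Assumption: tree.NodeMean(k) is the mean response of node k, so looking
% up the returned node numbers should match the predictions.
maxDiff = max(abs(Yfit - tree.NodeMean(node)))

% Count how many training observations resolve to each distinct node.
[leafIds,~,grp] = unique(node);
leafCounts = accumarray(grp,1);
table(leafIds,leafCounts)
```

A tabulation like this can help spot nodes that are supported by very few training observations.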

## Version History

Introduced in R2011a