What is the difference between oobPredict and predict with ensemble of bagged decision trees?

Question

Faranak on 26 Aug 2024

Direct link to this question

https://de.mathworks.com/matlabcentral/answers/2147934-what-is-the-difference-between-oobpredict-and-predict-with-ensemble-of-bagged-decision-trees

Commented: Faranak on 29 Aug 2024

1- I am using both functions to predict a response with a random forest, but the predict function gives a higher percentage of explained variance than oobPredict. Why is that? I think there is something fundamental that I have not yet fully grasped.

2- If these methods differ in the way they weight the trees, how can I make them homogeneous?

3- Can oobPredict be used in some way to make predictions on a new set of data?

Answer 1

Malay Agarwal on 26 Aug 2024

Direct link to this answer

https://de.mathworks.com/matlabcentral/answers/2147934-what-is-the-difference-between-oobpredict-and-predict-with-ensemble-of-bagged-decision-trees#answer_1505259

Edited: Malay Agarwal on 26 Aug 2024

Hi @Faranak,

The "oobPredict" function is meant to give a more realistic estimate of the model's performance. For each training sample, it uses only the trees for which that sample was out-of-bag, that is, the trees that did not see the sample during training. Because those trees were never fitted to the sample, their predictions are more likely to miss it, which adds to the measured error and lowers the percentage of explained variance.

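The "predict" function, on the other hand, uses all the trees to obtain a prediction for a sample. If the sample comes from the training set, some of the trees have almost certainly seen it during training (with the usual bootstrap of N draws with replacement, each tree contains a given sample with a probability of about 63%). The prediction is therefore partly fitted to that sample, and the model appears to explain more of the variance in the dataset.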

This is similar to having a training set and a validation set when training a neural network (https://en.wikipedia.org/wiki/Training,_validation,_and_test_data_sets). The network typically reports a higher error and explains less of the variance on the validation set, since it was not trained on those samples. The out-of-bag samples act as a validation set, because only the trees that have not seen a sample during training have a say in its prediction.
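To make the difference concrete, here is a minimal sketch in plain Python rather than MATLAB. It does not call TreeBagger and is not how the toolbox is implemented. As an assumption for illustration, each "tree" is replaced by a nearest-neighbour learner that memorizes its bootstrap sample, and all names (make_data, predict_all, predict_oob and so on) are hypothetical. It compares averaging over all learners on the training data, averaging over out-of-bag learners only, and averaging over all learners on freshly drawn data.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

def make_data(n):
    """Noisy samples of y = sin(2*pi*x) on [0, 1] (synthetic data, assumption)."""
    xs = [rng.random() for _ in range(n)]
    ys = [math.sin(2 * math.pi * x) + rng.gauss(0, 0.3) for x in xs]
    return xs, ys

def nearest_value(bag, xs, ys, x):
    """One stand-in 'tree': response of the closest in-bag training point."""
    j = min(bag, key=lambda i: abs(xs[i] - x))
    return ys[j]

def r_squared(y_true, y_pred):
    """Fraction of explained variance: 1 - SSE / SST."""
    mean_y = sum(y_true) / len(y_true)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - sse / sst

n_obs, n_learners = 200, 50
x_train, y_train = make_data(n_obs)

# Each learner is defined by its bootstrap sample: n_obs draws with replacement.
bags = [set(rng.randrange(n_obs) for _ in range(n_obs)) for _ in range(n_learners)]

def predict_all(x):
    """Average over every learner (the analogue of predict)."""
    return sum(nearest_value(b, x_train, y_train, x) for b in bags) / len(bags)

def predict_oob(i):
    """Average over learners whose bag excludes training point i
    (the analogue of oobPredict)."""
    oob = [b for b in bags if i not in b]
    if not oob:
        # In-bag for every learner: fall back to the training mean
        # (a hypothetical choice made for this sketch).
        return sum(y_train) / n_obs
    return sum(nearest_value(b, x_train, y_train, x_train[i]) for b in oob) / len(oob)

fit_all = [predict_all(x) for x in x_train]
fit_oob = [predict_oob(i) for i in range(n_obs)]

# Fresh data from the same process: out-of-bag for every learner by construction.
x_new, y_new = make_data(n_obs)
fit_new = [predict_all(x) for x in x_new]

r2_all = r_squared(y_train, fit_all)
r2_oob = r_squared(y_train, fit_oob)
r2_new = r_squared(y_new, fit_new)
print(f"all learners, training data: R^2 = {r2_all:.3f}")
print(f"out-of-bag learners only:    R^2 = {r2_oob:.3f}")
print(f"all learners, new data:      R^2 = {r2_new:.3f}")
```

In this sketch the first R^2 comes out higher than the out-of-bag one, because every in-bag learner returns the training response itself. The out-of-bag figure is the one to compare against performance on unseen data.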

This is explained in the documentation of "oobPredict" (https://www.mathworks.com/help/stats/treebagger.oobpredict.html#bu0qyz1-2), albeit in a less direct manner:

"For each observation that is out of bag for at least one tree, oobPredict composes the weighted mean of the class posterior probabilities by selecting the trees in which the observation is out of bag. "

I don't think there is any way to make the outputs more homogeneous, since "oobPredict" uses a different set of trees for each sample than the "predict" function does. You can try experimenting with the "TreeWeights" name-value argument, but I think that is unlikely to work: it only defines how the trees are weighted in the overall prediction and does not affect which trees take part in it.
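The distinction between weighting trees and selecting trees can be shown with a small sketch, again in plain Python and not the toolbox's code. The function ensemble_mean and all the numbers are made up for illustration. It computes a weighted mean over whichever trees are flagged for a sample.

```python
def ensemble_mean(tree_preds, tree_weights, use_tree):
    """Weighted mean of the predictions of the trees flagged in use_tree."""
    num = sum(w * p for p, w, u in zip(tree_preds, tree_weights, use_tree) if u)
    den = sum(w for w, u in zip(tree_weights, use_tree) if u)
    return num / den

tree_preds = [1.0, 2.0, 4.0]    # made-up predictions of three trees for one sample
in_bag = [True, False, False]   # tree 0 saw this sample during training

equal = [1.0, 1.0, 1.0]         # equal tree weights
skewed = [1.0, 1.0, 2.0]        # tree 2 counts double
all_trees = [True, True, True]
oob_only = [not b for b in in_bag]

print(ensemble_mean(tree_preds, equal, all_trees))    # (1 + 2 + 4) / 3
print(ensemble_mean(tree_preds, skewed, all_trees))   # (1 + 2 + 8) / 4
print(ensemble_mean(tree_preds, equal, oob_only))     # (2 + 4) / 2
print(ensemble_mean(tree_preds, skewed, oob_only))    # (2 + 8) / 3
```

Changing the weights shifts the result, but tree 0 still contributes unless it is masked out. Since the in-bag trees differ from sample to sample, a single per-tree weight vector cannot mimic that mask for every sample. One thing that may be worth checking: as far as I recall, the "predict" documentation also lists a "UseInstanceForTree" argument that takes an observation-by-tree logical matrix. I have not verified that here, so please treat it as an assumption and confirm it in the documentation for your release.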

Coming to your last question, the "oobPredict" function does not support making predictions on new data. Its purpose is only to evaluate the model's performance by giving a less biased estimate of its error. For new data, please use the "predict" function.

Hope this helps!

1 Comment

Faranak on 29 Aug 2024

Thanks a lot, Malay. Your answers made a lot of points clearer to me.
