Classification Learner APP. Cross-validation, scatter plot and confusion matrix.

Question

Giansu on 25 Jan 2021

Direct link to this question

https://de.mathworks.com/matlabcentral/answers/726258-classification-learner-app-cross-validation-scatter-plot-and-confusion-matrix

Edited: Giansu on 2 Jun 2021

I have a question regarding this app; hopefully some app experts can help me :)

I read on the website: "If you use k-fold cross-validation, then the app computes the accuracy scores using the observations in the k validation folds and reports the average cross-validation error. It also makes predictions on the observations in these validation folds and computes the confusion matrix and ROC curve based on these predictions."

That is clear for the accuracy, but if you look at the confusion matrix generated after selecting k-fold cross-validation, it contains integer values. How are they determined? It is not an average of the confusion matrices obtained from each of the k validation folds. Nor do they seem to be summed, since the sum of all the elements equals the number of trials in the training set I provided. So how is it built?

The same goes for the scatter plot after training: the figure marks correct and incorrect trials. Are they considered correct or incorrect on the basis of the average results over all k validation folds, or does the plot show the classification obtained from only one representative fold?

Thanks in advance.


Answer 1

Anshika Chaurasia on 10 Feb 2021

Direct link to this answer

https://de.mathworks.com/matlabcentral/answers/726258-classification-learner-app-cross-validation-scatter-plot-and-confusion-matrix#answer_620197

Edited: Anshika Chaurasia on 10 Feb 2021

Hi Giansu,

Let's look at how the Classification Learner app generates the scatter plot and confusion matrix under k-fold cross-validation, using the iris dataset (150 samples) and 5-fold cross-validation as an example.

Because we choose 5 folds, the app partitions the data into 5 disjoint sets (folds). In each iteration, the app trains a model using 4 folds as training data and the remaining fold (the held-out fold) as validation data.

This means that with k-fold cross-validation, each of the 150 samples is used as validation data (is part of the held-out fold) exactly once. For example, in the first iteration the 1st fold is the validation data and the remaining 4 folds are the training data; in the second iteration the 2nd fold is the validation data and the remaining 4 folds are the training data, and so on.
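The partitioning described above can be sketched in a few lines. This is only an illustration in plain Python, not the app's implementation: the helper `kfold_indices`, the shuffling, and the seed are assumptions made for the example.

```python
import random

def kfold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]

folds = kfold_indices(150, 5)
for i, validation in enumerate(folds):
    # Training data for iteration i is everything outside the held-out fold.
    training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
    print(f"iteration {i + 1}: {len(training)} training, {len(validation)} validation")

# Every sample index appears in exactly one validation fold.
all_validation = sorted(idx for fold in folds for idx in fold)
print(all_validation == list(range(150)))
```

With 150 samples and 5 folds, each iteration trains on 120 samples and validates on 30, and the 5 validation folds together cover every sample once.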

Scatter plot: Each prediction shown in the scatter plot was obtained when that particular observation was part of the held-out (validation) fold, that is, from the model trained on the other folds.

Confusion matrix: The confusion matrix shows how the model classified each observation when that observation was part of the held-out (validation) fold. Each observation is counted exactly once, so the entries are integer counts and they add up to the number of observations (150 here). In other words, the matrix is the element-wise sum of the k per-fold confusion matrices, not their average.
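To make the counting concrete, here is a small sketch of a pooled cross-validation confusion matrix. It is written in plain Python rather than MATLAB, and everything in it is an assumption for illustration: the synthetic one-feature data stands in for iris, and the nearest-centroid rule stands in for whatever model the app trains.

```python
import random

def nearest_centroid_predict(train, x):
    """Predict the class whose training mean is closest to x."""
    sums, counts = {}, {}
    for value, label in train:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return min(sums, key=lambda c: abs(x - sums[c] / counts[c]))

rng = random.Random(0)
# Synthetic stand-in data: 3 classes x 50 samples, one feature each.
data = [(rng.gauss(mu, 1.0), label)
        for label, mu in enumerate([0.0, 2.0, 4.0])
        for _ in range(50)]
rng.shuffle(data)

k = 5
folds = [data[i::k] for i in range(k)]
confusion = [[0] * 3 for _ in range(3)]  # rows: true class, columns: predicted class

for i, validation in enumerate(folds):
    training = [s for j, fold in enumerate(folds) if j != i for s in fold]
    for x, true_label in validation:
        predicted = nearest_centroid_predict(training, x)
        confusion[true_label][predicted] += 1  # one count per held-out prediction

for row in confusion:
    print(row)
print(sum(map(sum, confusion)))  # total equals the number of samples
```

Each sample contributes exactly one count, made by a model that never saw it during training, so the entries are integers and the grand total equals the size of the data set.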

Accuracy: The accuracy is computed on each of the k validation folds, and the accuracy reported for the model is the average of these k values.
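The averaging step can be sketched as follows. The per-fold counts are hypothetical values chosen for illustration, not results from this thread, and the code is plain Python rather than anything the app exposes.

```python
from statistics import mean, stdev

# Hypothetical counts of correct held-out predictions in each of 5 folds
# (illustrative values only, not results from the thread).
fold_sizes = [30, 30, 30, 30, 30]
fold_correct = [29, 30, 28, 29, 30]

fold_accuracies = [c / n for c, n in zip(fold_correct, fold_sizes)]
cv_accuracy = mean(fold_accuracies)   # average of the k per-fold accuracies
cv_spread = stdev(fold_accuracies)    # sample standard deviation across folds
pooled_accuracy = sum(fold_correct) / sum(fold_sizes)

print(fold_accuracies)
print(cv_accuracy, cv_spread, pooled_accuracy)
```

When all folds have the same size, the average of the per-fold accuracies equals the pooled accuracy (the fraction of correct counts on the diagonal of the pooled confusion matrix). With unequal fold sizes the two can differ slightly.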

These are the scatter plot and confusion matrix that I got on the iris data with 5-fold cross-validation:

Hope it helps!

Giansu on 31 May 2021

Edited: Giansu on 2 Jun 2021

Thank you very much for the detailed answer!

I have 4 related questions (sorry :)

- Is it possible to also extract the standard deviation of the accuracies in addition to the mean accuracy (or the individual fold accuracies, so that I can compute the standard deviation myself)?

- If I have a set of, for example, 58 samples and I specify a k that does not divide it evenly (e.g. k = 50, which is the largest selectable value), how are the folds chosen? Are they of equal size or not? And what is the size of the held-out group? Is it possible to obtain these values from the app?

- Is it possible to perform leave-one-out cross-validation instead?

- How are the 'decision thresholds' chosen by default?

Thank you very much.
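On the fold-size question, the arithmetic can at least be sketched under one assumption: that the folds are made as equal in size as possible. Whether the app partitions exactly this way is not confirmed anywhere in this thread, and `balanced_fold_sizes` is a hypothetical helper written in plain Python for illustration.

```python
def balanced_fold_sizes(n_samples, k):
    """Sizes of k folds that differ by at most one (an assumed scheme)."""
    base, extra = divmod(n_samples, k)
    # The first `extra` folds receive one additional sample each.
    return [base + 1 if i < extra else base for i in range(k)]

sizes = balanced_fold_sizes(58, 50)
print(sizes.count(2), "folds of 2,", sizes.count(1), "folds of 1")

# Leave-one-out is the special case k == n_samples: every fold holds one sample.
loo = balanced_fold_sizes(58, 58)
print(set(loo))
```

Under that assumption, 58 samples with k = 50 give 8 held-out folds of 2 samples and 42 folds of 1 sample, and setting k equal to the number of samples reduces k-fold cross-validation to leave-one-out.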
