Customized loss function for cross-validation

1 view (last 30 days)
Af
Af on 14 Jun 2019
Commented: Af on 27 Jun 2019
I trained a decision tree regression model with the following code:
MdlDeep = fitrtree(X,Y,'KFold',SbjNm,'MergeLeaves','off', 'MinParentSize',1,'Surrogate','on');
and customized the loss function to test the model accuracy:
LossEst(OutCnt)=kfoldLoss(CllTr{OutCnt},'LossFun',@TstLossFunIn);
the customized loss function was:
function lossvalue = TstLossFunIn(C,S,W)
% C: observed responses, S: predicted responses, W: observation weights
DffTtl = (C - S).^2;            % squared errors
DffTtl = DffTtl.*W;             % apply observation weights
SSE  = sum(DffTtl);             % weighted sum of squared errors
SSTM = mean((C - mean(C)).^2);  % mean squared deviation of the observations
lossvalue = SSE/SSTM;
end
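For illustration, the quantity this loss computes (a weighted sum of squared errors, normalized by the mean squared deviation of the observed values, i.e. an R²-style ratio) can be sketched in plain Python; the function name here is mine, not part of any toolbox:

```python
# Sketch of the custom loss: weighted SSE divided by the mean
# squared deviation of the observed values.
def normalized_sse(observed, predicted, weights):
    sse = sum(w * (c - s) ** 2
              for c, s, w in zip(observed, predicted, weights))
    mean_obs = sum(observed) / len(observed)
    sstm = sum((c - mean_obs) ** 2 for c in observed) / len(observed)
    return sse / sstm
```

A perfect fit gives 0; predicting the mean of the observations with unit weights gives a value on the order of the sample size, since the numerator is a sum while the denominator is a mean.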
This results in a reasonable loss given my problem. However, I wanted to control the cross-validation procedure myself, so I modified the code to split the training and test sets manually and check how the model performs:
for SbjCnt = 1:SbjNm
    TrnDt = X;
    TrnDt(SbjCnt,:) = [];   % remove the held-out subject's row
    TrnOut = Y;
    TrnOut(SbjCnt) = [];
    MdlDeep = fitrtree(TrnDt,TrnOut,'MergeLeaves','off','MinParentSize',1,'Surrogate','on');
    TstDt = X(SbjCnt,:);    % test on the held-out subject (original post had XS, presumably a typo)
    EstY(SbjCnt) = predict(MdlDeep,TstDt);  % keep one prediction per subject
end
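The loop above is leave-one-subject-out cross-validation. A minimal sketch of the same pattern in plain Python, using a trivial placeholder model in place of `fitrtree` (the helper names and the mean-predicting "model" are my assumptions, purely for illustration):

```python
# Sketch of leave-one-out CV: hold out row i, train on the rest,
# predict only the held-out row, and collect one prediction per row.
def mean_model_fit(train_y):
    # Placeholder "model": always predicts the training mean.
    avg = sum(train_y) / len(train_y)
    return lambda _row: avg

def leave_one_out_predictions(X, Y):
    preds = []
    for i in range(len(X)):
        train_x = X[:i] + X[i+1:]   # all rows except i
        train_y = Y[:i] + Y[i+1:]
        model = mean_model_fit(train_y)
        preds.append(model(X[i]))   # predict the held-out row only
    return preds
```

One detail worth noting: the predictions should be accumulated across all folds and the loss computed once over the full set of held-out predictions, which is how `kfoldLoss` aggregates; computing a per-fold loss on a single observation behaves very differently.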
Now I wanted to calculate the loss. The problem is that in this case the computed loss differs considerably from the loss in the first scenario, and the model no longer appears accurate at all.
Any hint as to why this happens?
Best regards,
Afshin
  1 comment
Af
Af on 27 Jun 2019
MATLAB's k-fold partitioning shuffles the samples across the training and test sets; apparently it cannot be guaranteed that one specific subject is excluded from the training set. There is a loss function which takes an input argument called "usenfort" indicating which observations in each partition should be used for testing. There one can see that the samples included in the test set are shuffled.
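The difference between the two scenarios can be sketched abstractly: a shuffled k-fold partition assigns random rows to each test fold, while leave-one-out deterministically tests exactly row i in fold i. This is purely illustrative Python, not MATLAB's actual partitioning algorithm:

```python
import random

# Illustrative contrast: randomized k-fold test folds vs.
# deterministic leave-one-out test indices.
def shuffled_kfold_test_indices(n, k, seed=0):
    idx = list(range(n))
    random.Random(seed).shuffle(idx)       # rows are shuffled first
    return [sorted(idx[f::k]) for f in range(k)]

def leave_one_out_test_indices(n):
    return [[i] for i in range(n)]         # fold i tests exactly row i
```

With a shuffled partition, which subjects land in which test fold depends on the random assignment, so fold i need not correspond to subject i; under leave-one-out the mapping is fixed.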


Answers (0)

