Monte Carlo repetitions with customized partitions
Tobias Rieker
on 1 Apr 2024
Commented: Harald
on 4 Apr 2024
k = 5; % number of partitions
c = cvpartition(Labels{2},"KFold",k,"Stratify",true);
test_idx = test(c,"all");
for ii = 1:k
    % Divide into train and test set via logical indexing
    % (k columns = k partitions). Labels 1 and 3 are always used for testing.
    testIndices(:,ii) = [true(numel(Labels{1}),1); test_idx(:,ii); true(numel(Labels{3}),1)];
end
c = cvpartition("CustomPartition",testIndices);
I want to customize the partitions for cross-validation so that some of the samples are part of the test set in every partition. Is there a way to do this?
I tried using cvpartition, but either I customize the partitions and get the error "Each observation must be present in one test set.",
or I use Monte Carlo repetitions, which allow samples to be used for testing more than once, but then I can't customize the sets anymore.
I'd be thankful for any hint.
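The condition behind the error message can be checked without any real data. The following is a minimal sketch, assuming made-up block sizes (n1, n2, n3) and a simple round-robin fold assignment as a stand-in for test(c,"all"); none of these values come from the original data. It counts how often each observation appears in a test set, which appears to be what the custom partition expects to be exactly once per observation.

```matlab
% Hypothetical sizes, chosen only for illustration
n1 = 3;   % samples that should always be tested (stand-in for Labels{1})
n2 = 10;  % samples that are split into k folds  (stand-in for Labels{2})
n3 = 2;   % samples that should always be tested (stand-in for Labels{3})
k  = 5;

% Round-robin fold assignment for the middle block
foldOfSample = mod((0:n2-1)', k) + 1;
midBlock = foldOfSample == 1:k;   % n2-by-k logical, exactly one true per row

% Same layout as testIndices in the question
testIndices = [true(n1,k); midBlock; true(n3,k)];

% Number of test sets each observation belongs to
timesTested = sum(testIndices, 2);

% Rows that are not in exactly one test set
violating = find(timesTested ~= 1);
disp(violating')
```

With this layout, the rows of the first and third block are counted k times, so they are the ones reported in violating, while every row of the middle block is counted once.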
9 Comments
Harald
on 3 Apr 2024
Duh... that makes sense. I suppose it will take some fiddling to address this.
I would try this strategy:
- To be able to determine which columns were chosen, add a row of fake data 1:numColumns on top of the x-values, and add some nonsense value that does not appear in your y-values on top of the y-values that you supply to sequentialfs.
- Identify which of the y-values passed to the function (either yTrain or yTest) contains the nonsense value. Extract the corresponding row of x-values from xTrain or xTest. This will tell you which columns were sent into the function.
- Extract the corresponding columns from SFS_xAlwaysIn and add them to the test data. Be sure to remove the fake data of the first step.
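The marker-row idea in the steps above can be tried on toy data before involving sequentialfs. This is only a sketch under stated assumptions: X, y, and the column subset chosen are invented for illustration, and the subset selection is simulated by plain indexing rather than by sequentialfs itself.

```matlab
% Toy data (hypothetical): 4 observations, 3 features
X = [10 20 30; 11 21 31; 12 22 32; 13 23 33];
y = categorical(["a"; "b"; "a"; "b"]);

% Step 1: put a marker row with the column indices on top of X
% and a marker label that does not occur in y on top of y
Xm = [1:size(X,2); X];
ym = [categorical("nonsense"); y];

% Simulate a caller that passes on only a subset of the columns
chosen = [1 3];
Xsub = Xm(:, chosen);

% Step 2: find the marker row and read off which columns were passed
idx = ym == "nonsense";
columns = Xsub(idx, :);

% Step 3: remove the marker row again before using the data
Xsub(idx, :) = [];
ysub = ym(~idx);
```

After these steps, columns holds the indices of the columns that were passed on, and Xsub and ysub contain only the real observations.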
I expect this to be somewhat tricky and would be happy to try to help, but would really need some sample data for SFS_xtrain and SFS_ytrain to play with. Perhaps I should be able to infer this, but I am not even sure of the data type of SFS_ytrain.
Best wishes,
Harald
Accepted Answer
Harald
on 4 Apr 2024
I have now tried the approach discussed in the comments with sample data based on fisheriris.mat.
%% Sample data
load fisheriris.mat
species = categorical(species);
% Shuffle data
order = randperm(length(species));
meas = meas(order,:);
species = species(order,:);
SFS_xtrain = meas(1:130,:);
SFS_ytrain = species(1:130);
SFS_xAlwaysIn = meas(131:end,:);
SFS_yAlwaysIn = species(131:end);
%% Add fake data
SFS_xtrain = [1:size(SFS_xtrain, 2); SFS_xtrain];
SFS_ytrain = ["nonsense"; SFS_ytrain];
%% Your code (for now without setting "nfeatures" and "options")
k = 5;
c = cvpartition(SFS_ytrain,"KFold", k ,"Stratify",true);
% opts = statset("UseParallel",true);
fun = @(XTrain,yTrain,XTest,yTest) callErrorFun(XTrain,yTrain, XTest, yTest, SFS_xAlwaysIn, SFS_yAlwaysIn);
[toKeep, ranking] = sequentialfs(fun,SFS_xtrain,SFS_ytrain,"cv",c);
%% A helper function
function err = callErrorFun(XTrain,yTrain, XTest, yTest, SFS_xAlwaysIn, SFS_yAlwaysIn)
    % The fake row ends up in either the training or the test data.
    % Its x-values are the indices of the columns passed in by sequentialfs.
    if sum(yTrain == "nonsense") == 1
        idx = yTrain == "nonsense";
        columns = XTrain(idx, :);
        % Remove the fake row again
        XTrain(idx,:) = [];
        yTrain(idx) = [];
    elseif sum(yTest == "nonsense") == 1
        idx = yTest == "nonsense";
        columns = XTest(idx, :);
        % Remove the fake row again
        XTest(idx,:) = [];
        yTest(idx) = [];
    else
        error("Something unexpected happened. Revisit the approach...")
    end
    % Append the always-included observations, restricted to the same columns
    XTrain = [XTrain; SFS_xAlwaysIn(:, columns)];
    yTrain = [yTrain; SFS_yAlwaysIn];
    err = errorFun(XTrain,yTrain,XTest,yTest);
end
%% Your function
function err = errorFun(XTrain,yTrain,XTest,yTest)
    % Create the model with the learning method of your choice
    classifier = fitcdiscr(XTrain,yTrain);
    % Calculate the number of test observations misclassified
    % (output is named err so that it does not shadow the built-in error function)
    ypred = predict(classifier,XTest);
    err = nnz(ypred ~= yTest);
end
I hope you'll find this to be helpful.
Best wishes,
Harald
2 Comments
Harald
on 4 Apr 2024
Glad it's working for you! If you found the answer to be helpful, please consider "accept"-ing it.
Best wishes,
Harald