
I have a dataset and want to do cross-validation. How can I divide it into training and testing data?

Hi, I have 100 folders, of which 50 are male and 50 are female. Each folder contains 6 images. I want to implement cross-validation. I know how to do this when I have just 100 distinct images, but here I have 100 different folders. How do I divide them for cross-validation? I need help.

Answers (1)

dpb on 22 Aug 2016
Just randomize over the overall list...
  2 Comments
chinnurocks on 22 Aug 2016
When I had just 100 images, one from each of 100 subjects, I was able to randomize. But I am stuck on how to randomize folders. For your reference, I am attaching my code.
if true
clc;
clear;
pngFiles = dir('*.png'); %Gets all the png files
%csvFiles = dir('*.csv');
numFiles = length(pngFiles);
mydata = cell(1,numFiles); % Creates a cell to store the images.
data= cell (numFiles,1); % Creates cell to store the features obtained.
%mydata = zeros(numFiles);
% Reads all the files into the mydata cell and gets lbp into data cell.
for k = 1:numFiles
mydata{k} = imread(pngFiles(k).name);
img = mydata{k};
data{k,:}=lbp(img,1,8,0,'hist');
%data{k,:} = data{k,:}./1000;
%csvwrite(csvFiles(k).name,J);
end
% Stacks the per-image feature vectors vertically into one matrix 'c'
% (one row per image, in the same order as pngFiles).
c = vertcat(data{:});
% Creates a cell vector 'a' of labels: the first 50 entries are 'male',
% the remaining 50 are 'female' (this relies on the file order from dir).
a = [repmat({'male'},50,1); repmat({'female'},50,1)];
groups = ismember(a,'male'); % ismember gives logic '1' if it finds male or else '0'.
%# load iris dataset
%groups = ismember(species,'setosa'); %# create a two-class problem by giving 1 if setosa is found in the species
%# number of cross-validation folds:
%# If you have 50 samples, divide them into 10 groups of 5 samples each,
%# then train with 9 groups (45 samples) and test with 1 group (5 samples).
%# This is repeated ten times, with each group used exactly once as a test set.
%# Finally the 10 results from the folds are averaged to produce a single
%# performance estimation.
p=10;
cvFolds = crossvalind('Kfold', groups, p); %# get indices of 10-fold CV of "groups" observation
cp = classperf(groups); %# init performance tracker
for i = 1:p %# for each fold
testIdx = (cvFolds == i); %# get indices of test instances
trainIdx = ~testIdx; %# get indices training instances
%# train an SVM model over training instances
svmModel = svmtrain( c(trainIdx,:), groups(trainIdx), ...
'Autoscale',true, 'Showplot',false, 'Method','QP', ...
'BoxConstraint',2e-1, 'Kernel_Function','rbf', 'RBF_Sigma',1000);
%# test using test instances
pred = svmclassify(svmModel, c(testIdx,:));
%# evaluate and update performance object
cp = classperf(cp, pred, testIdx);
end
%# get accuracy
cp.CorrectRate
end
dpb on 22 Aug 2016
Return the list of subdirectories first in an array then select randomly from that array....there are quite a number of threads with code on Answers that show how to traverse a subdirectory if that's an issue...
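A minimal sketch of that suggestion, assuming the current directory contains one subfolder per subject and each subfolder holds that subject's PNG images. The names rootDir, folderFold, imageFiles and imageFold are hypothetical, chosen only for illustration. Folds are assigned per folder, so all images of one subject land in the same fold and never appear in both the training and the test set.

```matlab
% Assumption: run from the dataset root, one subfolder per subject.
rootDir = pwd;
d = dir(rootDir);
% Keep only real subfolders (drop files and the '.' / '..' entries).
d = d([d.isdir] & ~ismember({d.name}, {'.', '..'}));
folderNames = {d.name};
numFolders = numel(folderNames);

% Randomly assign each folder (subject) to one of p folds.
p = 10;
perm = randperm(numFolders);                 % random order of the folders
folderFold = zeros(numFolders, 1);
folderFold(perm) = mod(0:numFolders-1, p) + 1;

% Expand the folder-level fold numbers to image level.
imageFiles = {};
imageFold = [];
for k = 1:numFolders
    pngs = dir(fullfile(rootDir, folderNames{k}, '*.png'));
    for j = 1:numel(pngs)
        imageFiles{end+1, 1} = fullfile(rootDir, folderNames{k}, pngs(j).name);
        imageFold(end+1, 1) = folderFold(k);
    end
end

% Use the image-level fold numbers the same way as cvFolds in the code above.
for i = 1:p
    testIdx = (imageFold == i);              % images of the held-out subjects
    trainIdx = ~testIdx;                     % images of all other subjects
    % ... extract features from imageFiles(trainIdx), train, then test ...
end
```

Note that this sketch does not balance male and female subjects within each fold. If that matters, the same fold assignment could be applied separately to the list of male folders and the list of female folders.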
