Splitting a dataset into a training set and a testing set

I have 400 images in my dataset (images). I want to split the dataset into 80% for training and 20% for testing. The code below runs, but test_idx is empty. Why?
train_idx contains 320 indices; test_idx is empty.
clc
clear
% Load Image dataset
faceDatabase = imageSet('facedatabaseatt','recursive');
%splitting into training and testing sets
N = 400; % number of images
idx = 1:N ;
PD = 0.80 ;
train_idx = idx(1:round(PD*N)); % training indices
test_idx = idx(round(PD*N)+1:end,:) ; % test indices
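
A note on why test_idx comes out empty, as far as can be told from the code above: idx is a 1-by-400 row vector, and in the two-subscript form idx(rows,:) the keyword end in the first position refers to the number of rows, which is 1. The range round(PD*N)+1:end is therefore 321:1, which is empty. A minimal sketch of the same split using a single subscript (nTrain and empty_idx are names introduced here for illustration only):

```matlab
N = 400;                 % number of images
idx = 1:N;               % 1-by-N row vector of image indices
PD = 0.80;               % fraction used for training
nTrain = round(PD*N);    % illustrative helper variable (320 here)

train_idx = idx(1:nTrain);        % indices 1..320
test_idx  = idx(nTrain+1:end);    % indices 321..400; "end" is now numel(idx)

% The original two-subscript form asks for rows 321:1, i.e. no rows at all:
empty_idx = idx(nTrain+1:end, :); % empty 0-by-400 result
```

Note that this split is not randomized: it simply takes the first 320 indices for training.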

Accepted Answer

Akira Agata on 15 Jan 2020
You can split your dataset using the partition function, for example:
[setTrain, setTest] = partition(faceDatabase, [0.8, 0.2], 'randomized');

6 Comments

I have to split the dataset manually, without using built-in functions.
OK. Then, how about the following?
N = faceDatabase.Count; % number of images
PD = 0.8;
train_idx = sort(randperm(N,round(N*PD)));
test_idx = setxor(train_idx,1:N);
setTrain = select(faceDatabase,train_idx);
setTest = select(faceDatabase,test_idx);
Thank you, sir. The above code works if I copy all 400 images into a single folder.
But the original dataset, ORL Facedatabaseatt, contains 40 folders s1, s2, ..., s40 (10 images for each of 40 persons),
so 10 x 40 = 400 images. I have to choose 8 images from each person for training and 2 images for testing.
The following error occurred:
Index exceeds the number of array elements (10).
Error in imageSet/selectProperties (line 526)
imgSet.ImageLocation = imgSet.ImageLocation(index);
Error in imageSet/select (line 394)
out(n) = selectProperties(this(end), index);
Error in split (line 12)
setTrain = select(faceDatabase,train_idx);
>>
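
The error is consistent with imageSet(...,'recursive') returning an array of 40 image sets, one per subfolder, each holding only 10 images, so any index above 10 is out of range within a set. One possible manual approach is to draw the indices per person instead of over all 400 images. A minimal sketch, assuming exactly 10 images in every folder; numPersons, perPerson, train_local and test_local are names made up for this example:

```matlab
numPersons = 40;      % folders s1..s40
perPerson  = 10;      % images per folder (assumed equal for all folders)
nTrain     = 8;       % training images per person

train_local = zeros(numPersons, nTrain);             % row p: training positions within folder p
test_local  = zeros(numPersons, perPerson - nTrain); % row p: test positions within folder p
for p = 1:numPersons
    order = randperm(perPerson);                     % random order of 1..10 for this person
    train_local(p,:) = sort(order(1:nTrain));        % 8 positions for training
    test_local(p,:)  = sort(order(nTrain+1:end));    % remaining 2 positions for testing
end

% Hypothetical use with the imageSet array (needs Computer Vision Toolbox):
% for p = 1:numPersons
%     setTrain(p) = select(faceDatabase(p), train_local(p,:));
%     setTest(p)  = select(faceDatabase(p), test_local(p,:));
% end
```

Every index stays in the range 1..10, so it is valid inside each person's image set.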
I'm trying to use this to divide my dataset and getting the following error:
[imdsTrain, imdsTest] = partition(imds, [0.8, 0.2], 'randomized');
Error using matlab.io.datastore.ImageDatastore/partition
Too many output arguments.
Any alternative suggestions?
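
This error is what one would expect if imds is an ImageDatastore rather than an imageSet: the partition method of ImageDatastore has the form subds = partition(imds,n,index) and returns a single datastore, so it is not a train/test splitter. One possible alternative for a datastore without labels is to shuffle the file list by hand and build two new datastores from the pieces. A minimal sketch, in which the stand-in file names, trainFiles and testFiles are illustrative assumptions:

```matlab
% Stand-in for imds.Files (assumption: a cell array of file paths)
files = arrayfun(@(k) sprintf('img%03d.png', k), (1:10)', 'UniformOutput', false);

n = numel(files);
order = randperm(n);                     % random permutation of 1..n
nTrain = round(0.8 * n);                 % 80% of the files go to training

trainFiles = files(order(1:nTrain));     % randomly chosen training files
testFiles  = files(order(nTrain+1:end)); % the remaining files

% With real files on disk, the two lists can be turned back into datastores:
% imdsTrain = imageDatastore(trainFiles);
% imdsTest  = imageDatastore(testFiles);
```

Datastores rebuilt this way carry no labels unless they are assigned again afterwards.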
> The original dataset, ORL Facedatabaseatt, contains 40 folders s1, s2, ..., s40 (10 images for each of 40 persons),
> so 10 x 40 = 400 images. I have to choose 8 images from each person for training and 2 images for testing.
OK. In that case, I would recommend using the imageDatastore function, for example:
dataFolder = pwd; % if your 40 folders are stored in a different folder, change this path
imgSet = imageDatastore(dataFolder,...
'IncludeSubfolders', true,...
'LabelSource', 'foldernames');
% Choose first 8 images from each folder and set them to training dataset, and 2 images for test dataset
[imgSetTrain, imgSetTest] = splitEachLabel(imgSet,0.8);
% If you want to choose 8 and 2 images from each folder randomly, please set 'randomized' option
[imgSetTrain, imgSetTest] = splitEachLabel(imgSet,0.8,'randomized');
I split my image database with the code below, but I can't find [imgSetTrain, imgSetTest]. How can I write the output to a .txt file?
clc
F = fullfile('g:','iris-recognition---pm-diseased-human-driven-bsif-main','casia 2 device1');
imds = imageDatastore(F,'IncludeSubfolders',true,'LabelSource','foldernames');
labelCount = countEachLabel (imds)
% Put the first 80% of the images in each folder into the training set and the rest into the test set
numTrainFiles = 0.8; % fraction of the files in each folder used for training
% To pick the images from each folder at random instead, add the 'randomized' option
[imgSetTrain, imgSetTest] = splitEachLabel(imds,numTrainFiles);
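
splitEachLabel returns two ImageDatastore objects in the workspace; it does not write anything to disk. One possible way to get a .txt file is to write a datastore's Files property out line by line. A minimal sketch, in which the file list and the output path are placeholders standing in for imgSetTrain.Files and a location of your choice:

```matlab
% Placeholder for imgSetTrain.Files (assumption: a cell array of file paths)
trainFiles = {'s1/1.pgm'; 's1/2.pgm'; 's2/1.pgm'};
outFile = fullfile(tempdir, 'train_files.txt');   % hypothetical output path

fid = fopen(outFile, 'w');
if fid == -1
    error('Could not open %s for writing.', outFile);
end
fprintf(fid, '%s\n', trainFiles{:});   % one file path per line
fclose(fid);
```

The same steps with imgSetTest.Files and a second file name would give the test list.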

More Answers (0)
