How to split a dataset into 3 sets with splitEachLabel using percentages such that each class appears in all 3 sets?
I have an image dataset with around 100 classes. The largest class has 59 images and the smallest has 5. I tried to split the data into training, validation, and test sets with the following statement:
[imdsTrain,imdsValidation, imdsTest] = splitEachLabel(imds,0.75,0.15,'randomize');
I got an error saying that the training and validation data must have the same labels.
I checked the resulting datastores and found that for classes with few images, such as 5, it puts 4 in training and the 1 remaining image sometimes in the validation set and sometimes in the test set. So not every class in the training set is also found in the validation and test sets.
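A check like the one described above can be sketched on toy labels. The label vectors below are made up for illustration and stand in for `imdsTrain.Labels` and `imdsValidation.Labels`. `removecats` is applied first on the assumption that a split datastore may still carry categories that have no files left in it.

```matlab
% Toy labels standing in for imdsTrain.Labels and imdsValidation.Labels.
trainLabels = categorical({'cat'; 'cat'; 'dog'; 'dog'; 'owl'});
valLabels   = categorical({'cat'; 'dog'});

% Drop unused categories, then list the classes that occur in training
% but have no file in the validation set.
trainClasses = categories(removecats(trainLabels));
valClasses   = categories(removecats(valLabels));
missingInVal = setdiff(trainClasses, valClasses);

disp(missingInVal)   % classes absent from validation, here 'owl'
```

The same comparison against the test labels shows which classes are missing there.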
I worked around it by increasing the validation proportion from 0.15 to 0.2, but that doesn't seem like a good solution.
Is there a way to split the dataset such that all classes are present in all 3 sets? Preferably I want to do this with percentages rather than integer counts, because an integer would always put the same number of images (for example 1) in the validation and test sets.
Answers (1)
Anmol Dhiman
on 3 Jul 2020
Edited: Anmol Dhiman
on 3 Jul 2020
Hi Faisal,
The second argument (0.75) in splitEachLabel is the proportion of files from each label to assign to the first output datastore, specified as a scalar in the interval (0,1) or a positive integer scalar. You can change its value for your problem.
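If changing the proportions is not enough for the smallest classes, one possible alternative is to compute the split indices per class yourself and force at least one file of every class into each set. The following is only a minimal sketch under stated assumptions: `splitIndices` is a hypothetical helper (not a toolbox function), every class is assumed to have at least 3 files, and the toy label vector stands in for `imds.Labels`.

```matlab
% Toy labels standing in for imds.Labels: one small and one large class.
labels = categorical([repmat({'small'}, 5, 1); repmat({'large'}, 59, 1)]);

[trainIdx, valIdx, testIdx] = splitIndices(labels, 0.75, 0.15);

% Per-class counts in each set.
summary(labels(trainIdx))
summary(labels(valIdx))
summary(labels(testIdx))

% With a real datastore, the indices could be used like this:
% imdsTrain = imageDatastore(imds.Files(trainIdx), 'Labels', imds.Labels(trainIdx));

function [trainIdx, valIdx, testIdx] = splitIndices(labels, pTrain, pVal)
% Hypothetical helper: proportional per-class split that keeps at least
% one file of every class in each of the three sets.
    labels = removecats(categorical(labels(:)));
    classes = categories(labels);
    trainIdx = []; valIdx = []; testIdx = [];
    for k = 1:numel(classes)
        idx = find(labels == classes{k});
        n = numel(idx);
        if n < 3
            error('Class "%s" has only %d file(s); 3 are needed.', classes{k}, n);
        end
        idx = idx(randperm(n));   % shuffle within the class
        idx = idx(:);
        % Rounded proportional counts, clamped so every set gets >= 1 file.
        nVal   = min(max(1, round(pVal * n)), n - 2);
        nTest  = min(max(1, round((1 - pTrain - pVal) * n)), n - nVal - 1);
        nTrain = n - nVal - nTest;   % training takes the remainder
        trainIdx = [trainIdx; idx(1:nTrain)];
        valIdx   = [valIdx;   idx(nTrain+1 : nTrain+nVal)];
        testIdx  = [testIdx;  idx(nTrain+nVal+1 : end)];
    end
end
```

Because the validation and test counts are raised to 1 where rounding would give 0, the training share of very small classes ends up below the requested proportion. That is the trade-off for having every class in all three sets.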
Regards,
Anmol Dhiman