Selecting the same amount of data for all categories based on different labels
Killian Flynn
on 10 Oct 2022
Answered: David Hill
on 10 Oct 2022
![](https://www.mathworks.com/matlabcentral/answers/uploaded_files/1151548/image.png)
Hello. I am trying to edit the data in the input array above so that the new_data array has the same number of bad, good and excellent quality trees for training an AI. The quality classes are given by a one-hot label, where [1; 0; 0] is bad, [0; 1; 0] is good and [0; 0; 1] is excellent. The bad trees have the least data, so I am trying to reduce the good and excellent data until all three classes are the same size.
cvs_data =csvread('spruce tree timber quality.csv',1,0)
expert_ratings = cvs_data(:,12)' %inputs
inputs = cvs_data(:,1:11)
correct_labels = zeros(3,length(expert_ratings));
bad = 0;
good=0;
excellent=0;
histogram(expert_ratings)
Skew1 = skewness(expert_ratings)
for k = 1:length(expert_ratings) %length(expert_ratings)
    quality = cvs_data(k,12)';
    if quality >= 1 && quality <= 4
        bad = bad+1;
        correct_labels(1:3, k) = [1; 0; 0];
    end
    if quality >= 5 && quality <= 6
        good = good+1;
        correct_labels(1:3, k) = [0; 1; 0];
    end
    if quality >= 7 && quality <= 10
        excellent = excellent+1;
        correct_labels(1:3, k) = [0; 0; 1];
    end
end
[good_inputs,rows_rem] = rmoutliers(inputs,"mean")
transposed_rows = rows_rem'
good_inputs_tp = good_inputs'
correct_labels(:,transposed_rows)=[];
correct_labels
%select the same amount of bad, good and excellent data
correct_labels
good_inputs'
bad
good
excellent
for column = 1:length(bad)
    if correct_labels(:,column) == [0 1 0]
        new_data(column) = inputs(column)
    end
end
new_data
I tried to do this in the last part of the code but it doesn't seem to be working. Any help would be greatly appreciated.
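A note on why the last loop probably does nothing, with a small sketch on made-up labels. `bad` is a scalar counter, so `length(bad)` is 1 and the loop runs only once. Comparing the 3-by-1 column `correct_labels(:,column)` with the 1-by-3 row `[0 1 0]` expands to a 3-by-3 logical matrix, and `if` only fires when every element is true, which cannot happen for one-hot vectors. Finally, `inputs(column)` picks a single element rather than a whole row, and after `rmoutliers` the label columns line up with `good_inputs`, not `inputs`. The toy `correct_labels` below and the names `mismatch`, `is_good` and `good_columns` are illustrative assumptions, not part of the original code.

```matlab
% Toy one-hot labels, one sample per column: bad, good, good, excellent
correct_labels = [1 0 0 0; 0 1 1 0; 0 0 0 1];

% Column-vs-row comparison expands to 3-by-3, so an `if` on it never fires
mismatch = correct_labels(:,2) == [0 1 0];

% Compare against a column vector and require all three entries to match
is_good = all(correct_labels == [0; 1; 0], 1);
good_columns = find(is_good);   % column indices of the "good" samples
```

With the column indices in hand, whole samples can be selected in one step, for example `good_inputs_tp(:, good_columns)` in the code above.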
Accepted Answer
David Hill
on 10 Oct 2022
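cvs_data = readmatrix('spruce tree timber quality.csv');
quality = cvs_data(:,12);
% split the 11 input columns by quality class
bad_data = cvs_data(quality<=4,1:11);
good_data = cvs_data(quality>=5 & quality<=6,1:11);
excellent_data = cvs_data(quality>=7,1:11);
% m is the size of the smallest class
m = min([size(bad_data,1),size(good_data,1),size(excellent_data,1)]);
% randperm(n,m) draws m distinct row indices out of n (randperm(m) alone would only shuffle the first m rows)
newdata = [bad_data(randperm(size(bad_data,1),m),:);
    good_data(randperm(size(good_data,1),m),:);
    excellent_data(randperm(size(excellent_data,1),m),:)];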
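Since the question also needs the one-hot labels to stay aligned with the balanced inputs, here is a minimal sketch of one way to carry them along. It uses random stand-in data instead of the CSV, and the names `toy_data`, `class_idx`, `keep`, `balanced_inputs` and `balanced_labels` are hypothetical. The class boundaries (<=4, 5 to 6, >=7) are taken from the code above.

```matlab
rng(0);                                   % fixed seed, only for reproducibility
% Hypothetical stand-in for the CSV: 11 feature columns plus a quality score in column 12
toy_data = [rand(30,11), randi([1 10],30,1)];
toy_data(1:3,12) = [2; 5; 9];             % make sure every class is present
quality = toy_data(:,12);

% Class index per row: 1 = bad (<=4), 2 = good (5..6), 3 = excellent (>=7)
class_idx = 1 + (quality >= 5) + (quality >= 7);

% Size of the smallest class
counts = accumarray(class_idx, 1, [3 1]);
m = min(counts);

% Pick m random rows from each class
keep = zeros(3*m, 1);
for c = 1:3
    rows = find(class_idx == c);
    keep((c-1)*m + (1:m)) = rows(randperm(numel(rows), m));
end

balanced_inputs = toy_data(keep, 1:11)';      % 11-by-3m, one sample per column
eye3 = eye(3);
balanced_labels = eye3(:, class_idx(keep));   % 3-by-3m one-hot labels
```

Selecting by row index, rather than building three separate matrices, keeps inputs and labels in the same order, so column k of `balanced_labels` always describes column k of `balanced_inputs`.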