Oversampling method (SMOTE)

Maryam Samami on 14 Aug 2017
Edited: Walter Roberson on 4 Jul 2018
Dear all, I have used SMOTE (an oversampling method for balancing a data set), but the balanced data set it returns has no label column. The number of rows increases, but the label column is not carried along. The original data set is 1000-by-25; the balanced data set is 2200-by-24, without the label column. The labels are returned separately in the "final_labels" output. It is 2200-by-1, but it contains only label 1. It should contain both labels, 1 and 2.
I would be very grateful if anyone could guide me. Any suggestion will be appreciated.
------------------------------------------------
This is my script for balancing the data set.
-----------------------------------------------------
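load creditgerman.mat
a = creditgerman;                    % 1000-by-25: 24 features plus a label column
[n, m] = size(a);
total_rows = (1:n);                  % row indices (not used below)
original_features = a(:, 1:m-1);     % all columns except the last
original_mark = a(:, m);             % last column holds the class labels
[creditgerman_balanced_SMOTE, final_labels] = SMOTE(original_features, original_mark);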
--------------------------------------------------------------------------
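Since the script passes only the 24 feature columns to SMOTE, the returned matrix has 24 columns and the labels come back as a separate vector. A minimal sketch of putting the two back together and counting the rows per class is shown here. The arrays are small made-up stand-ins, and the names `features`, `labels` and `balanced` are illustrative, not taken from creditgerman.mat.

```matlab
% Stand-ins for the two SMOTE outputs (illustrative values only)
features = [0.1 0.2; 0.3 0.4; 0.5 0.6; 0.7 0.8];   % 4-by-2 feature matrix
labels   = [1; 1; 2; 2];                            % 4-by-1 label vector

% Append the labels as the last column, matching the original data layout
balanced = [features, labels];                      % 4-by-3

% Count how many rows carry each label
classes = unique(labels);
counts  = arrayfun(@(c) sum(labels == c), classes);
disp([classes, counts]);
```

Applied to the script above, the same concatenation would be `[creditgerman_balanced_SMOTE, final_labels]`, and the class count would show directly whether label 2 is present in `final_labels`.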
And this is the SMOTE code I used.
function [final_features, final_mark] = SMOTE(original_features, original_mark)
    % Rows whose label is 2 are the candidates for oversampling
    ind = find(original_mark == 2);
    % P = candidate points
    P = original_features(ind, :);
    T = P';
    % X = Complete Feature Vector
    X = T;
    % Find the 5 nearest candidate neighbours of every candidate point
    % (6 are requested and the first column is skipped in the loop below)
    I = nearestneighbour(T, X, 'NumberOfNeighbours', 6);
    I = I';
    [r, c] = size(I);
    S = [];
    th = 0.3;   % not used below
    for i = 1:r
        for j = 2:c
            index = I(i, j);
            % Synthetic point at a random position between P(i,:) and its neighbour
            new_P = P(i, :) + ((P(index, :) - P(i, :)) * rand);
            S = [S; new_P];
        end
    end
    original_features = [original_features; S];
    [r, c] = size(S);
    % Every synthetic row is given label 1 here
    mark = ones(r, 1);
    original_mark = [original_mark; mark];
    % Cleaning step: compare each row's label with its 5 nearest neighbours
    train_incl = ones(length(original_mark), 1);
    I = nearestneighbour(original_features', original_features', 'NumberOfNeighbours', 6);
    I = I';
    for j = 1:length(original_mark)
        neighbors = I(j, 2:6);
        len = length(find(original_mark(neighbors) ~= original_mark(j, 1)));
        if (len >= 2)
            if (original_mark(j, 1) == 1)
                % Row j has label 1: drop its differently labelled neighbours
                train_incl(neighbors(original_mark(neighbors) ~= original_mark(j, 1)), 1) = 0;
            else
                % Otherwise drop row j itself
                train_incl(j, 1) = 0;
            end
        end
    end
    final_features = original_features(train_incl == 1, :);
    final_mark = original_mark(train_incl == 1, :);
end
-----------------------------------------------------------
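One thing that stands out in the function above: the candidates are selected with label 2, but the synthetic rows are appended with label 1, and the cleaning loop treats label 1 as the class to protect while dropping other rows that have two or more differently labelled neighbours. That may be why only label 1 survives. The following is a rough sketch, not a drop-in replacement, of the interpolation step with the synthetic rows labelled as the minority class. It uses made-up toy data, a hypothetical `minority_label` variable, and a brute-force neighbour search instead of `nearestneighbour`, and it leaves out the cleaning step entirely.

```matlab
% Hypothetical toy data: 6 rows of class 1, 3 rows of class 2 (minority)
features = [1 1; 1 2; 2 1; 2 2; 3 1; 3 2; 8 8; 8 9; 9 8];
labels   = [1; 1; 1; 1; 1; 1; 2; 2; 2];
minority_label = 2;   % assumption: the class to oversample
k = 2;                % neighbours per minority point (the post uses 5)

P  = features(labels == minority_label, :);
nP = size(P, 1);
S  = zeros(nP * k, size(P, 2));
row = 0;
for i = 1:nP
    % Squared Euclidean distance from P(i,:) to every minority point
    d = sum((P - repmat(P(i, :), nP, 1)).^2, 2);
    d(i) = Inf;                        % exclude the point itself
    [~, order] = sort(d);
    for j = 1:k
        nb = order(j);
        row = row + 1;
        % Random point on the segment between P(i,:) and its neighbour
        S(row, :) = P(i, :) + (P(nb, :) - P(i, :)) * rand;
    end
end

% Difference from the posted code: synthetic rows get the minority label
new_features = [features; S];
new_labels   = [labels; minority_label * ones(size(S, 1), 1)];
```

If the cleaning loop is kept, its `original_mark(j,1) == 1` test would presumably need to compare against the minority label as well, so that minority rows are the ones being protected.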

Answers (0)
