Remove duplicate variables depending on a second variable
Marty Dutch
on 21 Sep 2015
Commented: Marty Dutch
on 22 Sep 2015
Dear experts, I have a list of variables from which I need to remove duplicates. When a variable is duplicated, I want to keep the entry that has the value 1 in the second column. When several duplicates have a 1, only one of them should be kept, chosen at random. See the example below: for BG1028 I want to keep the entry where the third column is 1.3. For BG1030 I want to keep the entry with either 3.0 or 1.3 in the third column. I hope this is clear. I am puzzling over how to do this. This is the code I have come up with so far.
ppn(:,1) = {'BG1026';'BG1027';'BG1028';'BG1028';'BG1028';'BG1029';'BG1030';'BG1030';'BG1030';'BG1030'};
ppn(:,2) = {'0';'0';'1';'0';'0';'1';'1';'0';'1';'0'};
ppn(:,3) = {'1.2';'2.2';'1.3';'0.2';'8.9';'3.4';'3.0';'0.3';'1.3';'0.3'};
% find duplicates
ppn2 = ppn(:,1);
idx = find(strcmp(ppn2(1:end-1),ppn2(2:end)))+1;
%remove duplicates
ppn((idx),:) = [];
Accepted Answer
Kirby Fears
on 21 Sep 2015
Hi Marty,
Try the code below.
% Defining ppn (all at once)
ppn = [ {'BG1026';'BG1027';'BG1028';'BG1028';'BG1028';'BG1029';...
'BG1030';'BG1030';'BG1030';'BG1030'},... % start col 2
{'0';'0';'1';'0';'0';'1';'1';'0';'1';'0'},... % start col 3
{'1.2';'2.2';'1.3';'0.2';'8.9';'3.4';'3.0';'0.3';'1.3';'0.3'}];
% Storing ppn column 2 as numerical values
bPpn = str2double(ppn(:,2));
% Deleting all duplicates with 0 in bPpn
idx = strcmp(ppn(1:end-1,1),ppn(2:end,1));
delidx = ([idx;false] | [false;idx]) & ~bPpn;
ppn(delidx,:)=[];
clear bPpn idx delidx;
% Get names of remaining duplicates
chooseNames = ppn([strcmp(ppn(1:end-1,1),ppn(2:end,1));false],1);
% Loop over chooseNames and keep one at random
for j = 1:numel(chooseNames)
    dupidx = find(strcmp(chooseNames{j}, ppn(:,1)));
    dupidx(randi(numel(dupidx))) = [];
    ppn(dupidx,:) = [];
end
Hope this helps.
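One thing worth noting about the approach above: comparing each name with its neighbour only finds duplicates that sit next to each other, as they do in the example data. The following is a minimal sketch of that neighbour comparison on its own, with a sort added first in case the list is not already ordered. The unsorted `names` list and the variable names `sameAsNext` and `inDupRun` are made up for illustration and are not part of the original answer.

```matlab
% Hypothetical unsorted input (not the data from the question)
names = {'BG1030';'BG1026';'BG1028';'BG1030';'BG1028'};

% Sort so that equal names end up next to each other;
% 'order' can be used to reorder the other columns the same way
[names, order] = sort(names);

% true where a row has the same name as the row below it
sameAsNext = strcmp(names(1:end-1), names(2:end));

% true for every row that belongs to a run of equal names
inDupRun = [sameAsNext; false] | [false; sameAsNext];

disp(names(inDupRun))
```

With a full table, the same `order` vector would be applied to all columns (for example `ppn = ppn(order,:)`) before running the duplicate removal.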
More Answers (1)
the cyclist
on 21 Sep 2015
This is not the world's most efficient code, but is a very straightforward implementation of what you want (or at least my understanding of it). It displays the indices you want to keep.
It's not documented at all, but I tried to use some intuitive variable names, so maybe you can figure it out.
ppn(:,1) = {'BG1026';'BG1027';'BG1028';'BG1028';'BG1028';'BG1029';'BG1030';'BG1030';'BG1030';'BG1030'};
ppn(:,2) = {'0';'0';'1';'0';'0';'1';'1';'0';'1';'0'};
ppn(:,3) = {'1.2';'2.2';'1.3';'0.2';'8.9';'3.4';'3.0';'0.3';'1.3';'0.3'};
[unique_ppn,~,indexFromUniqueBackToAll] = unique(ppn(:,1));
number_unique_ppn = numel(unique_ppn);
indices_to_keep = [];
for np = 1:number_unique_ppn
    index_to_this_ppn = find(indexFromUniqueBackToAll==np);
    if numel(index_to_this_ppn) == 1
        indices_to_keep = [indices_to_keep; index_to_this_ppn];
    else
        remove_zero_index = ismember(ppn(index_to_this_ppn,2),'0');
        index_to_this_ppn(remove_zero_index) = [];
        random_one_to_keep = index_to_this_ppn(randi(numel(index_to_this_ppn)));
        indices_to_keep = [indices_to_keep; random_one_to_keep];
    end
end
indices_to_keep
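To go from the kept indices to the reduced list itself, the rows of `ppn` can be indexed directly. Below is a minimal, self-contained sketch along the same lines as the loop above. The names `keepIdx`, `isOne` and `ppn_filtered` are hypothetical, and the fallback for a duplicated name with no 1 in the second column (pick among all of its rows) is an assumption, since the question does not say what should happen in that case.

```matlab
% Example data from the question
ppn = [ {'BG1026';'BG1027';'BG1028';'BG1028';'BG1028';'BG1029';'BG1030';'BG1030';'BG1030';'BG1030'}, ...
        {'0';'0';'1';'0';'0';'1';'1';'0';'1';'0'}, ...
        {'1.2';'2.2';'1.3';'0.2';'8.9';'3.4';'3.0';'0.3';'1.3';'0.3'} ];

[names, ~, groupOfRow] = unique(ppn(:,1));  % group rows by name
isOne   = strcmp(ppn(:,2), '1');            % rows with a 1 in column 2
keepIdx = zeros(numel(names), 1);           % one kept row index per name

for g = 1:numel(names)
    rows    = find(groupOfRow == g);        % all rows with this name
    flagged = rows(isOne(rows));            % those that have a 1
    if ~isempty(flagged)
        rows = flagged;                     % prefer rows with a 1
    end                                     % assumption: otherwise choose among all rows
    keepIdx(g) = rows(randi(numel(rows)));  % pick one at random
end

ppn_filtered = ppn(sort(keepIdx), :);       % reduced list, original row order
disp(ppn_filtered)
```

Because of the random pick, the BG1030 row that survives can differ between runs; calling `rng` with a fixed seed beforehand makes the choice repeatable.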