Filter löschen
Filter löschen

Remove duplicate variables depending on a second variable

8 Ansichten (letzte 30 Tage)
Marty Dutch
Marty Dutch am 21 Sep. 2015
Kommentiert: Marty Dutch am 22 Sep. 2015
Dear experts, I have a list of variables where I need te remove duplicate variables. However, in case of duplicate variables I want to keep the varibles that have value 1 in the second column. In cases when there are multiple duplicates with a 1 then it needs to keep randomly only one variable. See example below: Here I want to keep the variable BG1028 where the data in the third column is 1.3. For BG1030, I want to keep the variable with 3.0 or 0.3 in the third column. I hope it is clear. Im puzzling how to do this. This is the code I came up with so far.
ppn(:,1) = {'BG1026';'BG1027';'BG1028';'BG1028';'BG1028';'BG1029';'BG1030';'BG1030';'BG1030';'BG1030'};
ppn(:,2) = {'0';'0';'1';'0';'0';'1';'1';'0';'1';'0'};
ppn(:,3) = {'1.2';'2.2';'1.3';'0.2';'8.9';'3.4';'3.0';'0.3';'1.3';'0.3'};
% find duplicates
ppn2 = ppn(:,1);
idx = find(strcmp(ppn2(1:end-1),ppn2(2:end)))+1;
%remove duplicates
ppn((idx),:) = [];

Akzeptierte Antwort

Kirby Fears
Kirby Fears am 21 Sep. 2015
Hi Marty,
Try the code below.
% Defining ppn (all at once)
ppn = [ {'BG1026';'BG1027';'BG1028';'BG1028';'BG1028';'BG1029';...
'BG1030';'BG1030';'BG1030';'BG1030'},... % start col 2
{'0';'0';'1';'0';'0';'1';'1';'0';'1';'0'},... % start col 3
{'1.2';'2.2';'1.3';'0.2';'8.9';'3.4';'3.0';'0.3';'1.3';'0.3'}];
% Storing ppn column 2 as numerical values
bPpn=cell2mat(cellfun(@(c)str2double(c),ppn(:,2),...
'UniformOutput',false));
% Deleting all duplicates with 0 in bPpn
idx = strcmp(ppn(1:end-1,1),ppn(2:end,1));
delidx = ([idx;false] | [false;idx]) & ~bPpn;
ppn(delidx,:)=[];
clear bPpn idx delidx;
% Get names of remaining duplicates
chooseNames = ppn([strcmp(ppn(1:end-1,1),ppn(2:end,1));false],1);
% Loop over chooseNames and keep one at random
if numel(chooseNames)>0,
for j=1:numel(chooseNames),
dupidx=find(strcmp(chooseNames{j},ppn(:,1)));
dupidx(randi(numel(dupidx)))=[];
ppn(dupidx,:)=[];
end,
end,
Hope this helps.
  2 Kommentare
Marty Dutch
Marty Dutch am 22 Sep. 2015
Hi Kirby,
Thanks for your response. And this works perfectly! Although I forgot to mention something... The script you've written deletes duplicates when they have a zero. In cases when there are multiple duplicates with a zero then it needs to keep randomly only one variable.
I really appreciate your time helping me! I'll have a look at your script and maybe I can adapt it on my own.
Marty Dutch
Marty Dutch am 22 Sep. 2015
Wait, it works now. I just deleted this part of your code:
% Deleting all duplicates with 0 in bPpn
idx = strcmp(ppn(1:end-1,1),ppn(2:end,1));
delidx = ([idx;false] | [false;idx]) & ~bPpn;
ppn(delidx,:)=[];
clear bPpn idx delidx;

Melden Sie sich an, um zu kommentieren.

Weitere Antworten (1)

the cyclist
the cyclist am 21 Sep. 2015
This is not the world's most efficient code, but is a very straightforward implementation of what you want (or at least my understanding of it). It displays the indices you want to keep.
It's not documented at all, but I tried to use some intuitive variable names, so maybe you can figure it out.
ppn(:,1) = {'BG1026';'BG1027';'BG1028';'BG1028';'BG1028';'BG1029';'BG1030';'BG1030';'BG1030';'BG1030'};
ppn(:,2) = {'0';'0';'1';'0';'0';'1';'1';'0';'1';'0'};
ppn(:,3) = {'1.2';'2.2';'1.3';'0.2';'8.9';'3.4';'3.0';'0.3';'1.3';'0.3'};
[unique_ppn,~,indexFromUniqueBackToAll] = unique(ppn(:,1));
number_unique_ppn = numel(unique_ppn);
indices_to_keep = [];
for np = 1:number_unique_ppn
index_to_this_ppn = find((indexFromUniqueBackToAll==np));
if numel(index_to_this_ppn) == 1
indices_to_keep = [indices_to_keep; index_to_this_ppn];
else
remove_zero_index = ismember(ppn(index_to_this_ppn,2),'0');
index_to_this_ppn(remove_zero_index) = [];
random_one_to_keep = index_to_this_ppn(randi(numel(index_to_this_ppn)));
indices_to_keep = [indices_to_keep; random_one_to_keep];
end
end
indices_to_keep

Kategorien

Mehr zu Filter Banks finden Sie in Help Center und File Exchange

Tags

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by