Filter löschen
Filter löschen

Find double repetitions in a (sorted) array.

1 Ansicht (letzte 30 Tage)
bbb_bbb
bbb_bbb am 20 Okt. 2017
Kommentiert: Andrei Bobrov am 23 Okt. 2017
Given an array submitted in a form of struct field, containing integer numbers. For convenience, let's assume that the numbers are already sorted in ascending order:
>> s.x
ans =
2
ans =
2
ans =
5
ans =
5
ans =
5
ans =
8
ans =
8
Find indexes of elements, which occur exact 2 times:
ind =
1 2 6 7
  4 Kommentare
Jan
Jan am 21 Okt. 2017
Bearbeitet: Jan am 21 Okt. 2017
The iterative growing of arrays is a standard mistake from the view point of efficiency. Simply pre-allocate:
d = diff(x);
j = 0;
ind = zeros(1, numel(d));
indi = 1;
for i=1:numel(d)
if d(i)==0
j=j+1;
else
if j==1
ind(indi) = i-1;
ind(indi+1) = i;
indi = indi + 2;
end
j=0;
end
end
ind = ind(1:indi-1);
This does not catch the case, if the last two elements are equal.
bbb_bbb
bbb_bbb am 21 Okt. 2017
This does not catch the case, if the last two elements are equal.
Adding this line repairs this:
if d(end)==0, d(end+1)=1; end
This variant seems to be the fastest.

Melden Sie sich an, um zu kommentieren.

Akzeptierte Antwort

Andrei Bobrov
Andrei Bobrov am 20 Okt. 2017
Bearbeitet: Andrei Bobrov am 23 Okt. 2017
x= [2; 2; 5; 5; 5; 8; 8; 13; 13; 13; 13];
[~,~,g] = unique(x); % OR for last versions of MATLAB: g = findgroups(x)
c = accumarray(g,1:numel(x),[],@(x){x});
out = cell2mat(c(cellfun(@numel,c) == 2));
or
[a,~,g] = unique(x);
out = find(ismember(x,a(accumarray(g,1) == 2)));
or (FIXED)
out = reshape(strfind([1,diff(x(:)')~=0,1],[1 0 1]) + [0;1],[],1);
out = reshape(bsxfun(@plus,strfind([1,diff(x(:)')~=0,1],[1 0 1]),[0;1]),[],1); % for old MATLAB
  13 Kommentare
bbb_bbb
bbb_bbb am 23 Okt. 2017
This is ok. I sligtly modified it for speed. The fastest and concisiest algorithm of all suggested!
out = strfind([true,diff(x')~=0,true],[1 0 1]);
out = reshape([out;out+1],1,[]);
Andrei Bobrov
Andrei Bobrov am 23 Okt. 2017
Thank you mister Bbb_bbb.

Melden Sie sich an, um zu kommentieren.

Weitere Antworten (4)

Rik
Rik am 20 Okt. 2017
Bearbeitet: Rik am 20 Okt. 2017
It always pays off to get rid of loops and/or pre-allocating your output.
x= [2; 2; 5; 5; 5; 8; 8; 13; 13; 13; 13];
s = struct('x', num2cell(x));
x=[s.x];
%only newer releases: 0.000778 seconds
tic
count=histcounts(x,0.5 : max(x)+0.5);
ind=find(sum(x==find(count==2)'));
toc
%should work on most releases: 0.000628 seconds
tic
count=histcounts(x,0.5 : max(x)+0.5);
count=find(count==2);
ind=find(sum(repmat(x,length(count),1)==repmat(count',1,length(x))));
toc
%your loop: 0.001100 seconds
tic
d=diff(x); j=0; ind=[];
for i=1:numel(d)
if d(i)==0
j=j+1;
else
if j==1
ind(end+1)=i-1;
ind(end+1)=i;
end
j=0;
end
end
toc
  8 Kommentare
Rik
Rik am 21 Okt. 2017
That's because x has a different shape:
x1= [2; 2; 5; 5; 5; 8; 8; 13; 13; 13; 13];
s = struct('x', num2cell(x1));
x2=[s.x];
x1 is 11x1 and x2 is 1x11
bbb_bbb
bbb_bbb am 21 Okt. 2017
This variant isn't working at big arrays:
x=randi([1,1e6],1e5,1); x=sort(x)';
count=histcounts(x,0.5 : max(x)+0.5);
count=find(count==2);
ind=find(sum(repmat(x,length(count),1)==repmat(count',1,length(x))));
Error using repmat
Maximum variable size allowed by the program is exceeded.

Melden Sie sich an, um zu kommentieren.


Image Analyst
Image Analyst am 20 Okt. 2017
You didn't tag it as homework. Is it? This will do it:
% Assignment of a struct with a field containing integer numbers
x= [2; 2; 5; 5; 5; 8; 8; 13; 13; 13; 13];
s = struct('x', num2cell(x));
numbers = [s.x]
[groupNumber, groupValue] = findgroups(numbers)
counts = histcounts(groupNumber)
ofGroupSize2 = find(counts == 2) % Find those only if they have a length of 2.
values = groupValue(ofGroupSize2)
indexes = find(ismember(numbers, values))
  2 Kommentare
bbb_bbb
bbb_bbb am 20 Okt. 2017
Bearbeitet: bbb_bbb am 21 Okt. 2017
No, its no homework - so called "just-for-fun project".
[groupNumber, groupValue] = findgroups(numbers)
Undefined function or variable 'findgroups'.
Matlab 2015a
Image Analyst
Image Analyst am 20 Okt. 2017
Bearbeitet: Image Analyst am 21 Okt. 2017
You can use regionprops() instead of findgroups() if you have an old version and have the Image Processing Toolbox. See my separate answer with demo code.

Melden Sie sich an, um zu kommentieren.


Jan
Jan am 21 Okt. 2017
Your code looks like the input is sorted. The other approaches do not have this limitation. If it is really sorted:
d = [true; diff(x) ~= 0]; % TRUE if values change
b = x(d); % Elements without repetitions
k = find([d', true]); % Indices of changes
n = diff(k);
is2 = find(n==2);
ind4 = reshape([k(is2); k(is2)+1], 1, []);
Code taken from FEX: RunLength.

Image Analyst
Image Analyst am 21 Okt. 2017
Bearbeitet: Image Analyst am 21 Okt. 2017
If you have the Image Processing Toolbox, you can use regionprops():
% Assignment of a struct with a field containing integer numbers
x= [2; 2; 5; 5; 5; 8; 8; 13; 13; 13; 13];
s = struct('x', num2cell(x));
numbers = [s.x] % A labeled "image"
% Find lengths of each run of numbers plus the indexes where they occur.
props = regionprops(numbers, 'Area', 'PixelIdxList')
% Extract from structure into one vector.
allLengths = [props.Area]
% Find those only if they have a length of 2.
ofGroupSize2 = find(allLengths == 2)
% Find indexes of those runs with length 2.
indexes = [props(ofGroupSize2).PixelIdxList]
% Shape into row vector
indexes = reshape(sort(indexes(:)), 1, [])
  1 Kommentar
bbb_bbb
bbb_bbb am 21 Okt. 2017
This works, but the variant is the longest (6.2 sec on 1e6 elements vector). The fastest is 0.03 sec.

Melden Sie sich an, um zu kommentieren.

Kategorien

Mehr zu Matrix Indexing finden Sie in Help Center und File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by