Grouping identical matrices in cell array

Karsten Paul on 31 May 2021
Commented: Karsten Paul on 31 May 2021
Consider two cell arrays (here of size 6x1), where each entry contains a matrix, e.g.
a = { [1 2; 3 4] [2 2 3; 2 2 3] [1 2; 3 4] [1 2; 3 4] [2 3 2; 3 4 5] [2 2 3; 2 2 3] }';
b = { [5 6; 7 8] [2 2 3; 2 2 3] [9 9; 9 9] [5 6; 7 8] [2 3 2; 3 4 5] [2 2 3; 2 2 3] }';
I want to find an array that assigns a group number to each of the six entries, i.e.
groups = [1 2 3 1 4 2];
Two entries i and j belong to the same group if a{i} equals a{j} and b{i} equals b{j} (exactly, or up to a tolerance). I came up with the following brute-force code
n = length(a);
groups = zeros(n,1);
counter = 0;
for i = 1:n
    if groups(i)~=0
        continue;
    end
    counter = counter + 1;
    groups(i) = counter;
    for j = i+1:n
        if groups(j)~=0
            continue;
        end
        if isequal(a{i},a{j}) && isequal(b{i},b{j})
            groups(j) = counter;
        end
    end
end
which is quite inefficient due to the for-loops. Is there a smarter way of finding these groups? Thanks :)
1 Comment
Stephen23 on 31 May 2021
"which is quite inefficient due to the for-loops"
I doubt that the loops themselves are consuming much time. Have you run the profiler?
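A minimal sketch of how the loop could be profiled, assuming small made-up test data (the `repmat` call below is only an illustration, not the asker's real data). `profile on`, `profile off`, `profile('info')` and `profile viewer` are standard MATLAB profiler commands.

```matlab
% Hypothetical test data: 1000 entries that alternate between two matrices.
a = repmat({[1 2; 3 4]; [2 2 3; 2 2 3]}, 500, 1);
b = a;

profile on                      % start collecting timing data
n = numel(a);
groups = zeros(n, 1);
counter = 0;
for i = 1:n
    if groups(i) == 0
        counter = counter + 1;
        groups(i) = counter;
        for j = i+1:n
            if groups(j) == 0 && isequal(a{i}, a{j}) && isequal(b{i}, b{j})
                groups(j) = counter;
            end
        end
    end
end
profile off                     % stop collecting
stats = profile('info');        % timing data as a struct
% profile viewer                % uncomment to open the interactive report
```

The report shows which lines take the most time, which helps decide whether the loop overhead or the `isequal` calls dominate.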


Accepted Answer

Jan on 31 May 2021
Edited: Jan on 31 May 2021
Start with a simplified version of your code:
n = numel(a);
groups = zeros(n, 1);
counter = 0;
for i = 1:n
    if groups(i) == 0
        counter = counter + 1;
        groups(i) = counter;
        for j = i+1:n
            if groups(j) == 0 && isequal(a{i}, a{j}) && isequal(b{i}, b{j})
                groups(j) = counter;
            end
        end
    end
end
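The question also mentions grouping "up to a tolerance". A minimal sketch of how the comparison in this loop could be relaxed, assuming an absolute tolerance; the helper `isClose`, the value of `tol` and the small test data are hypothetical and not part of the original code:

```matlab
% Hypothetical helper: true if x and y have the same size and all entries
% differ by at most tol (absolute tolerance, an assumed choice).
isClose = @(x, y, tol) isequal(size(x), size(y)) && all(abs(x(:) - y(:)) <= tol);

% Made-up test data: the second pair differs from the first by 1e-12 in a.
a = {[1 2; 3 4], [1 2; 3 4] + 1e-12, [2 2 3; 2 2 3]}';
b = {[5 6; 7 8], [5 6; 7 8],         [2 2 3; 2 2 3]}';
tol = 1e-9;

n = numel(a);
groups = zeros(n, 1);
counter = 0;
for i = 1:n
    if groups(i) == 0
        counter = counter + 1;
        groups(i) = counter;
        for j = i+1:n
            if groups(j) == 0 && isClose(a{i}, a{j}, tol) && isClose(b{i}, b{j}, tol)
                groups(j) = counter;
            end
        end
    end
end
disp(groups.')
```

Note that closeness within a tolerance is not transitive, so each group here is defined relative to its first member.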
Now let's assume the cell arrays a and b are huge, e.g. 1e6 elements. Then comparing each of the 1e6 elements with up to 1e6-1 others takes a lot of time. It might be cheaper to compute a hash for each pair a{k}, b{k} first and then group by the hash values:
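% GetMD5 is not a built-in function; it is a File Exchange submission.
Hash = cell(1, n);
for k = 1:n
    % One hash per pair, so equal hashes mean equal a{k} and b{k}
    Hash{k} = GetMD5({a{k}, b{k}}, 'Array', 'base64');
end
% 'stable' numbers the groups in order of first appearance
[~, ~, groups] = unique(Hash, 'stable');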
% With 1e6 elements per cell, R2018b, Win10:
% Elapsed time is 21.341034 seconds. % Original
% Elapsed time is 21.286879 seconds. % Cleaned
% Elapsed time is 6.252804 seconds. % Hashing
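If installing an extra function is not an option, the same grouping idea can be sketched with built-in functions only, using a text key instead of a hash. This is an assumption-laden variant, not the method timed above: it serializes each pair with `mat2str`, which only handles 2-D matrices, and it has not been benchmarked against the hashing approach. The variable name `keys` is hypothetical.

```matlab
a = { [1 2; 3 4] [2 2 3; 2 2 3] [1 2; 3 4] [1 2; 3 4] [2 3 2; 3 4 5] [2 2 3; 2 2 3] }';
b = { [5 6; 7 8] [2 2 3; 2 2 3] [9 9; 9 9] [5 6; 7 8] [2 3 2; 3 4 5] [2 2 3; 2 2 3] }';

n = numel(a);
keys = cell(n, 1);
for k = 1:n
    % mat2str encodes shape and values; 17 significant digits are enough
    % to distinguish different double values.
    keys{k} = [mat2str(a{k}, 17), '|', mat2str(b{k}, 17)];
end
% Equal keys get the same group number, numbered by first appearance.
[~, ~, groups] = unique(keys, 'stable');
disp(groups.')
```

For the example data from the question this reproduces groups = [1 2 3 1 4 2]. Rounding the matrices before building the key would approximate a tolerance, but values close to a rounding boundary can still end up in different groups.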
1 Comment
Karsten Paul on 31 May 2021
Great, works perfectly and considerably faster. My cell arrays are indeed quite huge, between 1e4 and 1e6 elements. Thanks :)
