Matlab find unique column-combinations in matrix and respective index

Question

Benvaulter on 22 Mar 2017

Direct link to this question:

https://de.mathworks.com/matlabcentral/answers/331309-matlab-find-unique-column-combinations-in-matrix-and-respective-index

Edited: Jan on 23 Mar 2017

I have a large matrix with many rows and a limited number of columns (more than one) containing values between 0 and 9. I would like to find an efficient way to identify the unique row-wise combinations and their indices in order to then build sums (somewhat like pivot logic). Here is an example of what I am trying to achieve:

a =

   1     2     3
   2     2     3
   3     2     1
   1     2     3
   3     2     1

uniqueCombs =

   1     2     3
   2     2     3
   3     2     1

numOccurrences =

   2
   1
   2

indices:

[1;4]
[2]
[3;5]

From matrix a, I want to first identify the unique combinations (row-wise), then count the number of occurrences and identify the row indices of each combination.

I have achieved this by generating strings with num2str and strcat, but this method appears to be very slow. Along these lines, I have tried to find a way to form a new unique number by concatenating the values horizontally (e.g. from [1;2;3] build 123), but Matlab does not seem to support this directly. Sums won't work, because different combinations can produce the same sum. Any suggestions on how best to achieve this? Thanks!
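One way to build such a number without strings is to multiply each row by a vector of powers of ten. This is only a sketch, assuming every entry is a single digit from 0 to 9 and that there are at most 15 columns, so that the result stays exactly representable as a double. The names `weights` and `key` are illustrative.

```matlab
% Example matrix: single digits only (assumption of this sketch)
A = [1 2 3
     2 2 3
     3 2 1
     1 2 3
     3 2 1];

% Powers of ten, one per column: [100; 10; 1] for three columns
weights = (10 .^ (size(A, 2) - 1 : -1 : 0)).';

% Each row becomes one number, e.g. [1 2 3] -> 123
key = A * weights;

% Unique keys and, for every row of A, the index of its key
[uniqueKeys, ~, ib] = unique(key);
```

Leading zeros do not cause collisions here, because the number of columns is fixed: [0 1 2] maps to 12 and no other row can.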


Answer 1

Guillaume on 22 Mar 2017

Direct link to this answer:

https://de.mathworks.com/matlabcentral/answers/331309-matlab-find-unique-column-combinations-in-matrix-and-respective-index#answer_259890

More or less the same as Jan's, using accumarray instead of splitapply (I'm still old school!):

A = [ 1     2     3
      2     2     3
      3     2     1
      1     2     3
      3     2     1];
[B, ~, ib] = unique(A, 'rows');      % B: unique rows; ib(k) is the row of B that matches A(k,:)
numOccurrences = accumarray(ib, 1);  % number of rows of A in each group
indices = accumarray(ib, find(ib), [], @(rows){rows});  % find(ib) simply generates (1:size(A,1))'
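Since the question mentions building sums per combination, the group index `ib` can be reused for that directly. A minimal sketch, assuming a hypothetical value vector `vals` with one entry per row of `A` (it is not part of the original question):

```matlab
A = [1 2 3
     2 2 3
     3 2 1
     1 2 3
     3 2 1];
vals = [10; 20; 30; 40; 50];       % hypothetical values, one per row of A

[B, ~, ib] = unique(A, 'rows');    % ib maps each row of A to a row of B
groupSums = accumarray(ib, vals);  % sum of vals within each unique combination
```

Here `groupSums(k)` belongs to the combination in `B(k,:)`, so the two can be shown side by side as a small pivot table.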

Guillaume on 23 Mar 2017

Edited: Guillaume on 23 Mar 2017

I suspect that accumarray will be faster, as it is built-in compiled code whereas splitapply is M-code, but I haven't run any tests.

Note: for the indices,

indices = accumarray(ib, (1:numel(ib))', [], @(rows){rows});

is probably slightly faster, just not as concise.
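Both spellings produce the same index vector, because every entry of `ib` is a positive group number and therefore nonzero. The sketch below checks this on random test data and times the two variants with `timeit`. The matrix size is an arbitrary choice and no particular timing is implied.

```matlab
A = randi([0, 9], 1e4, 3);               % arbitrary test data
[~, ~, ib] = unique(A, 'rows');

rowsViaFind  = find(ib);                 % works because ib has no zeros
rowsViaColon = (1:numel(ib))';
sameRows = isequal(rowsViaFind, rowsViaColon);

% Time both variants of the accumarray call
tFind  = timeit(@() accumarray(ib, find(ib), [], @(rows){rows}));
tColon = timeit(@() accumarray(ib, (1:numel(ib))', [], @(rows){rows}));
fprintf('find: %.4g s, colon: %.4g s\n', tFind, tColon);
```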

Jan on 23 Mar 2017

Edited: Jan on 23 Mar 2017

@Guillaume: I compare this with cellfun: older Matlab versions shipped the C sources of this Mex function. There, calling a function handle is very expensive, because it has to call back into the Matlab level. Therefore the built-in methods selected by a string are much faster: 'length', 'isclass' etc.

In that case a compiled Mex function is not a real benefit, because mexCallMATLAB has some overhead. This might apply to accumarray as well. I guess that your accumarray approach is faster than the loop, but I know that it looks very cryptic ;-)
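For reference, the two `cellfun` call styles being contrasted look like this. It is a small sketch, and the cell array `C` is made up for illustration:

```matlab
C = {1:3, 'abcd', [], magic(2)};

lenByName   = cellfun('length', C);   % legacy syntax: function selected by name
lenByHandle = cellfun(@length, C);    % function handle: evaluated per cell

sameResult = isequal(lenByName, lenByHandle);
```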

But now I can leave the speculations and run a test: With

A = randi([1, 100], 1e5, 3); % Test data

my loop takes 14.75 seconds, your accumarray approach takes 0.44 seconds. The results differ in the order of the indices. So perhaps this is wanted:

[B, iB, iA] = unique(A, 'rows');
indices     = accumarray(iA, (1:numel(iA)).', [], @(r){sort(r)});
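To check that the sorted accumarray result matches the loop, and to repeat the comparison on other hardware, something like the following could be used. It is a sketch with a smaller test matrix than above so that the loop finishes quickly. Timings will vary by machine and release.

```matlab
A = randi([1, 100], 1e4, 3);            % smaller than the test above
[B, ~, iA] = unique(A, 'rows');
n = size(B, 1);

tic;                                    % loop version
idxLoop = cell(n, 1);
for k = 1:n
  idxLoop{k} = find(iA == k);
end
tLoop = toc;

tic;                                    % accumarray version with sorted indices
idxAcc = accumarray(iA, (1:numel(iA)).', [], @(r){sort(r)});
tAcc = toc;

sameIndices = isequal(idxLoop, idxAcc);
fprintf('loop: %.3f s, accumarray: %.3f s, same result: %d\n', tLoop, tAcc, sameIndices);
```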

The result is clear: @Benvaulter, please unaccept my answer and select Guillaume's, and of course use it also to save time and energy.

Answer 2

Jan on 22 Mar 2017

Direct link to this answer:

https://de.mathworks.com/matlabcentral/answers/331309-matlab-find-unique-column-combinations-in-matrix-and-respective-index#answer_259879

Edited: Jan on 23 Mar 2017

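A = [ 1     2     3; ...
      2     2     3; ...
      3     2     1; ...
      1     2     3; ...
      3     2     1];
[B, iB, iA] = unique(A, 'rows');   % iA(k) is the row of B that matches A(k,:)
G = unique(iA);                    % group numbers 1..size(B,1)
numOccurrences = splitapply(@numel, iA, iA);  % group sizes; the grouping vector needs one entry per row of A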

I cannot currently test a method to obtain the list of indices as wanted. I assume this also works with splitapply. At the least, a simple loop does the job:

n = length(G);
indices = cell(1, n);
for k = 1:n
  indices{k} = find(iA == G(k));
end
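The index lists can indeed be collected with `splitapply` as well, by wrapping each group's row numbers in a cell. A sketch, not benchmarked here; `rowNumbers` is an illustrative name:

```matlab
A = [1 2 3
     2 2 3
     3 2 1
     1 2 3
     3 2 1];
[B, ~, iA] = unique(A, 'rows');

rowNumbers = (1:numel(iA)).';                          % 1..5 as a column
indices = splitapply(@(r){r}, rowNumbers, iA);         % one cell per unique row
numOccurrences = splitapply(@numel, rowNumbers, iA);   % group sizes
```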

[EDITED] The code is tested now. Use Guillaume's much faster solution for production work.

Benvaulter on 23 Mar 2017

Perfect solution to my problem - thanks a lot!
