How do I identify unique rows based on multiple columns and calculate the average of the rest of the columns?

Question

Leon am 19 Okt. 2021

0
Verknüpfen

Direkter Link zu dieser Frage

https://de.mathworks.com/matlabcentral/answers/1567308-how-do-i-identify-unique-rows-based-on-multiple-columns-and-calculate-the-average-of-the-rest-of-the

Kommentiert: Leon am 19 Okt. 2021

I have a matrix as below:

A = [1 4 3 8; 4 5 6 9; 1 6 3 6; 2 6 9 3; 1 5 3 7];

My goal is to identify rows with both identical Column 1 and identical Column 3 values, and then calculate the average for the rest of the columns, i.e., Column 2 and Column 4, within these duplicate rows. In this example, the duplicate rows would be Rows # 1, 3, and 5. My ending matrix would be:

B = [1 5 3 7; 4 5 6 9; 2 6 9 3];

This is a much simplified example. In reality, I have 35 columns that need to be averaged, and millions of rows. What is the most efficient way of handling this? Do I have to write a loop and process each of the unique rows individually?

Many thanks!

0 Kommentare
-2 ältere Kommentare anzeigen-2 ältere Kommentare ausblenden

Melden Sie sich an, um zu kommentieren.

Melden Sie sich an, um diese Frage zu beantworten.

Answer 1

Kevin Holly am 19 Okt. 2021

1
Verknüpfen

Direkter Link zu dieser Antwort

https://de.mathworks.com/matlabcentral/answers/1567308-how-do-i-identify-unique-rows-based-on-multiple-columns-and-calculate-the-average-of-the-rest-of-the#answer_812193

Bearbeitet: Kevin Holly am 19 Okt. 2021

In MATLAB Online öffnen

A = [1 4 3 8; 4 5 6 9; 1 6 3 6; 2 6 9 3; 1 5 3 7]
A = 5×4
   4     3     8
   5     6     9
   6     3     6
   6     9     3
   5     3     7

I am going to assume that any row that columns 1 and 3 are identical, irregardless of what pair, you want to ignore those rows in the averaging of other columns.

Here is my approach:

t = table(A(:,1),A(:,3))
t = 5×2 table
    Var1    Var2
    ____    ____

     1       3  
     4       6  
     1       3  
     2       9  
     1       3  
[C, ia, ic] = unique(t,'rows')
C = 3×2 table
    Var1    Var2
    ____    ____

     1       3  
     2       9  
     4       6  
ia = 3×1
     1
     4
     2
ic = 5×1
     1
     3
     1
     2
     1
ic==1
ans = 5×1 logical array
   1
   0
   1
   0
   1
A(ic==1,:)
ans = 3×4
     1     4     3     8
     1     6     3     6
     1     5     3     7
mean(A(ic==1,:))
ans = 1×4
     1     5     3     7
B = [mean(A(ic==1,:));A(ic~=1,:)]
B = 3×4
     1     5     3     7
     4     5     6     9
     2     6     9     3

Your answer:

B = [1 5 3 7; 4 5 6 9; 2 6 9 3]
B = 3×4
     1     5     3     7
     4     5     6     9
     2     6     9     3

Without showing work:

t = table(A(:,1),A(:,3));
[~,~,ic] = unique(t,'rows');
B = [mean(A(ic==1,:));A(ic~=1,:)]
B = 3×4
     1     5     3     7
     4     5     6     9
     2     6     9     3

1 Kommentar
-1 ältere Kommentare anzeigen-1 ältere Kommentare ausblenden

Leon am 19 Okt. 2021

Many thanks for the solution!

Melden Sie sich an, um zu kommentieren.

How do I identify unique rows based on multiple columns and calculate the average of the rest of the columns?

0 Kommentare
-2 ältere Kommentare anzeigen-2 ältere Kommentare ausblenden

Akzeptierte Antwort

1 Kommentar
-1 ältere Kommentare anzeigen-1 ältere Kommentare ausblenden

Weitere Antworten (0)

Siehe auch

Kategorien

Tags

Produkte

Version

Community Treasure Hunt

How do I identify unique rows based on multiple columns and calculate the average of the rest of the columns?

0 Kommentare -2 ältere Kommentare anzeigen-2 ältere Kommentare ausblenden

Akzeptierte Antwort

1 Kommentar -1 ältere Kommentare anzeigen-1 ältere Kommentare ausblenden

Weitere Antworten (0)

Siehe auch

Kategorien

Tags

Produkte

Version

Community Treasure Hunt

0 Kommentare
-2 ältere Kommentare anzeigen-2 ältere Kommentare ausblenden

1 Kommentar
-1 ältere Kommentare anzeigen-1 ältere Kommentare ausblenden