For Loop or function for repeating action

I have a matrix A of size 225 x 2. One column is a variable that always ranges from 1 to 5 (like grades), and the second column is also numeric. I now want to calculate the mean, median, and first and third quartiles of the second column for each grade.
The result needs to be interpretable along the lines of: mean(age) of A students is better than mean(age) of B students
              Grade 1   Grade 2   Grade 3   [etc.]
Mean
Median
1st quartile
3rd quartile
I did it all manually, which is quite a lot of work, because I have 8 hypotheses for which the calculations are almost the same (the matrix A is in reality 225 x 11, but I only need 2-3 columns per hypothesis). Now I wonder if there is a faster and more efficient way to do it, namely in a for loop,
where I can write something like:
% assuming ERM holds the grade column of A
for i = 1:5
    mean_Hyp_1(i) = nanmean(A(ERM==i, 2));
    median_Hyp_1(i) = nanmedian(A(ERM==i, 2));
    % etc.
end
Thanks in advance

Accepted Answer

Vishwas on 19 Sep. 2017

0 votes

You had the right idea. The "find" function can be used inside a loop to locate all rows where ERM == 1, 2, ..., and the statistics can then be computed on those rows.
Let me show this with an example:
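a = [1;3;2;4;5;1;2;4;3;5;3;2;1];              % grades (1-5)
b = [10;15;24;54;36;57;87;98;65;78;5;48;65];  % values to summarize
input = [a b];                                % column 1: grade, column 2: value
mean = [];
median = [];
for i = 1:5
    % rows whose grade equals i, values taken from the second column
    mean(i) = nanmean(input(find(input(:,1)==i), 2));
    median(i) = nanmedian(input(find(input(:,1)==i), 2));
end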
In the case above, we use the "find" function on the first column of input to extract the indices of all rows where input(:,1) == i, and then take the mean of the corresponding values in the second column.
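The same loop can collect all four statistics asked for in the question (mean, median, first and third quartiles) in one matrix. This is only a minimal sketch: it assumes the Statistics and Machine Learning Toolbox is available for quantile (the thread already relies on that toolbox for nanmean), it uses mean and median with the 'omitnan' flag in place of nanmean and nanmedian, and the names grades, values and stats are made up for illustration.

```matlab
% Example data from the answer: one grade and one value per observation
grades = [1;3;2;4;5;1;2;4;3;5;3;2;1];
values = [10;15;24;54;36;57;87;98;65;78;5;48;65];

nGrades = 5;
stats = nan(4, nGrades);   % rows: mean, median, 1st quartile, 3rd quartile

for g = 1:nGrades
    v = values(grades == g);             % logical indexing selects one grade
    stats(1, g) = mean(v, 'omitnan');
    stats(2, g) = median(v, 'omitnan');
    stats(3, g) = quantile(v, 0.25);     % quantile treats NaN as missing
    stats(4, g) = quantile(v, 0.75);
end

disp(stats)
```

Each column of stats then corresponds to one grade, which matches the layout sketched in the question.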

9 Comments

Vishwas,
I have a question about your answer. When I put your code in an m-file in my R2017a, the find parts have a red underline, telling me that
If 'input' is an indexed variable, performance can be increased using logical indexing instead of FIND.
If I click fix, the word find is removed (matching the answer I have given above).
Would one be better than the other in this example (and in general)?
Walter Roberson on 19 Sep. 2017
It is better to avoid using "input" as a variable name, due to conflict with the frequently-used input() function.
Walter Roberson on 19 Sep. 2017
Omitting the find() is more efficient.
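To illustrate the difference discussed here, the sketch below selects the same rows once with find and once with a logical mask. The data and the variable names (idx, mask, m_find, m_mask) are invented for the example, and mean with 'omitnan' is used in place of nanmean.

```matlab
A = [1 10; 3 15; 2 24; 1 57; 2 NaN; 3 65];   % column 1: grade, column 2: value

g = 1;
idx  = find(A(:,1) == g);    % numeric row indices: [1; 4]
mask = A(:,1) == g;          % logical mask, one element per row of A

m_find = mean(A(idx, 2), 'omitnan');
m_mask = mean(A(mask, 2), 'omitnan');

disp(isequal(m_find, m_mask))   % both forms select rows 1 and 4
```

Both forms pick the same rows; the logical mask simply skips building the intermediate index list.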
Hello Tim and Vishwas,
thank you both for answering.
I have now tried both of your codes and I get an error, because the vector B selected for A = 1, 2, etc. doesn't always have the same size.
Subscripted assignment dimension mismatch.
Error in HypotheseEins (line 54)
mean_A(i) = nanmean(A(A(:,1)==i,2:end));
AND
Subscripted assignment dimension mismatch.
Error in HypotheseEins (line 49)
mean(i) = nanmean(A(find(A(:,1)==i), 2:end))
Please don't mind the slight differences. As I said above, my matrix is actually bigger than the one I used in the example for my question. Do I first have to somehow pad the smaller vectors with zeros up to the length of the longest vector?
Thank you very much.
Tim Berk on 19 Sep. 2017
Edited: Tim Berk on 19 Sep. 2017
mean_A(i) = nanmean(A(A(:,1)==i,2:end));
This takes the mean of each of the columns 2:end, as you seem to want, so it returns multiple means (one per column).
But you are still trying to put those multiple values into mean_A(i), which is a single location in the array mean_A.
Try
mean_A(i,:) = nanmean(A(A(:,1)==i,2:end));
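A small self-contained sketch of this row-wise assignment, with invented data and a preallocated result. It uses mean with an explicit dimension and 'omitnan' instead of nanmean, so the means stay column-wise even when only one row matches a grade.

```matlab
% Illustrative data: column 1 is the grade, columns 2:end are numeric
A = [1 10 100;
     2 20 200;
     1 30 NaN;
     2 40 400;
     3 50 500];

nGrades = 3;
mean_A = nan(nGrades, size(A, 2) - 1);   % one row per grade, one column per variable

for i = 1:nGrades
    rows = A(:,1) == i;
    % dim = 1 forces column-wise means even if only one row matches
    mean_A(i, :) = mean(A(rows, 2:end), 1, 'omitnan');
end

disp(mean_A)
```

No zero padding is needed: each grade may have a different number of rows, but the result for every grade always has one value per column.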
Thank you very much! It works!
I changed it to display the grades as columns.
mean_A(:,i) = nanmean(A(A(:,1)==i,2:end));
My code looks much tidier now, and I can even put several hypotheses together in one script.
I used display(mean_A) for the results to show in a "table" form. Do you by any chance know how I can name the rows and columns of the result?
Tim Berk on 20 Sep. 2017
Have a look at the function table ( https://www.mathworks.com/help/matlab/ref/table.html )
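As a rough sketch of what that could look like, array2table accepts 'VariableNames' and 'RowNames' options. The numbers and the names Grade1, Grade2, Grade3 below are placeholders chosen for illustration, not results from the thread.

```matlab
% Illustrative result matrix: rows are statistics, columns are grades 1-3
stats = [44 53 28;
         57 48 15];

T = array2table(stats, ...
    'VariableNames', {'Grade1', 'Grade2', 'Grade3'}, ...
    'RowNames', {'Mean', 'Median'});

disp(T)

% Individual entries can then be read by row and column name
medianGrade2 = T{'Median', 'Grade2'};
```

The column names are written as valid identifiers (Grade1 rather than "Grade 1"), which older MATLAB releases require for table variable names.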
Stephen23 on 20 Sep. 2017
@Vishwas Vijaya Kumar: is there a good reason for shadowing the inbuilt input function?
José-Luis on 20 Sep. 2017
Edited: José-Luis on 20 Sep. 2017
And mean().
And median().
And that's some pretty tortured indexing.


More Answers (1)

Tim Berk on 19 Sep. 2017

1 vote

You can use the condition A(:,1) == i as indexing for which values in A(:,2) to consider, i.e.
A = [1 2 3 1 1 2 3; 4 5 6 7 8 9 0]';   % column 1: grade, column 2: value
mean_A = nan(1, 3);                     % preallocate one slot per grade
for i = 1:3
    mean_A(i) = nanmean(A(A(:,1)==i, 2));
    % etc..
end

1 Comment

Hi Tim,
Thanks to you, the code for all my hypotheses runs so much faster. I am now on my last hypothesis, which uses the same method as before with one constraint. Before I open a new question, I just wanted to see if you can help.
Matrix A has 5 columns. The first column holds grades (1-5) and the second column holds years ranging from 2008-2013. The remaining columns are again numeric.
First: "Cluster" the years 2008-2010, 2011-2013, 2014-2016
Second: Search Grades between the years 2008-2010, 2011-2013, 2014-2016
Third: Calculate the means of every column according to grade and clustered year.
The main problem I have encountered is that MATLAB doesn't let me write the expression
for i = 2008:2010 ...etc
I did it again manually (mean of each year for all variables), but I cannot put it into a loop the way your previous code showed me:
for i= 1:5
...(A(:,i)==i)..etc
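One possible way to approach this, offered only as a sketch: map each year to a cluster number with discretize, then combine the grade condition and the cluster condition in a single logical mask. The data, the bin edges and the names cluster and means are assumptions made for the example. A header such as for y = 2008:2010 is valid MATLAB on its own, so the error may come from using the year value as an array index, although the comment does not show the actual message.

```matlab
% Illustrative data: column 1 grade, column 2 year, columns 3:end numeric
A = [1 2008 10 1;
     1 2009 20 3;
     2 2012 30 5;
     1 2015 40 7;
     2 2013 50 9];

edges   = [2008 2011 2014 2017];        % bins: 2008-2010, 2011-2013, 2014-2016
cluster = discretize(A(:,2), edges);    % 1, 2 or 3 for each row

nGrades   = 5;
nClusters = numel(edges) - 1;
nVars     = size(A, 2) - 2;
means = nan(nGrades, nVars, nClusters); % stays NaN where no rows match

for c = 1:nClusters
    for g = 1:nGrades
        rows = A(:,1) == g & cluster == c;   % grade g within year cluster c
        if any(rows)
            means(g, :, c) = mean(A(rows, 3:end), 1, 'omitnan');
        end
    end
end
```

Here means(g, :, c) holds the column means for grade g in year cluster c, so the loop counters stay small integers and the years never have to be used as indices.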
