find a correlation

I have a matrix X and a vector Y:
X=[0.231914928 3.126057882 -1.752476846
-0.779092587 2.143243132 -1.944363312
-1.744892449 1.206497824 -2.267829067
-0.276817947 1.774687601 -1.768924258
-0.367233254 1.697905199 -1.508506912
-0.367233254 1.697905199 -1.508506912
-1.378240769 0.814907572 -1.700393377
-2.389248284 -0.060411815 -1.892279842
-1.333033116 0.860977013 -1.831972668
0.135041386 1.40613207 -1.333067858]
Y=[0.253549664
-0.231692981
0.768395971
2.988670669
-0.038625616
-0.038625616
-0.525155376
-1.011685136
0.961463336
3.181738034]
First, I want to calculate the correlation coefficients between all columns of X, which can be done like this:
[R]=corrcoef(X)
Then I want to see which pair of columns has the highest correlation: column 1 with 2, 1 with 3, or 2 with 3?
Finally, for a pair whose correlation is above 0.5 (say columns 1 and 2), I want to check each column's correlation with Y and report which of the two is more strongly correlated with Y.

2 Comments

Daniel Shub on 5 Sep. 2011
Mohammad, in general, I think your questions are great questions for Answers, but I often feel that I do not understand them. I think it would be easier for those of us who choose to answer questions if you could spend a little more time composing your questions.
Niki on 5 Sep. 2011
Thanks for your comment, Daniel. I will certainly do my best. Thanks.


Accepted Answer

Grzegorz Knor on 5 Sep. 2011


For highest correlation:
maxR = max(max(triu(R,1))) % highest correlation
[row,col] = find(R==maxR,1,'first')
The greatest correlation occurs between the columns row and col.
Second question:
[r w] = max([corr(X(:,row),Y) corr(X(:,col),Y)])
w equal to 1 means that X(:,row) has a higher correlation with Y than X(:,col). w equal to 2 means that X(:,col) has a higher correlation with Y than X(:,row).
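As a minimal sketch of the two steps above on the data from the question, the following uses only corrcoef, so it does not need the Statistics Toolbox function corr. The logical mask, the -Inf trick, and the variable names (isUpper, Rpairs, bestColumn) are illustrative choices, not part of the original answer; masking with -Inf instead of zeros is an assumption made so that the search still works if every pairwise correlation is negative.

```matlab
% Data from the question.
X = [ 0.231914928  3.126057882 -1.752476846
     -0.779092587  2.143243132 -1.944363312
     -1.744892449  1.206497824 -2.267829067
     -0.276817947  1.774687601 -1.768924258
     -0.367233254  1.697905199 -1.508506912
     -0.367233254  1.697905199 -1.508506912
     -1.378240769  0.814907572 -1.700393377
     -2.389248284 -0.060411815 -1.892279842
     -1.333033116  0.860977013 -1.831972668
      0.135041386  1.40613207  -1.333067858];
Y = [0.253549664; -0.231692981; 0.768395971; 2.988670669; -0.038625616; ...
     -0.038625616; -0.525155376; -1.011685136; 0.961463336; 3.181738034];

R = corrcoef(X);                     % correlation matrix of the X columns
n = size(R, 1);
isUpper = triu(true(n), 1);          % true for each pair (i,j) with i < j
Rpairs = R;
Rpairs(~isUpper) = -Inf;             % ignore the diagonal and duplicate pairs
[maxR, k] = max(Rpairs(:));          % highest pairwise correlation
[row, col] = ind2sub([n n], k);      % the two columns forming that pair

% Correlation of each of the two columns with Y
% (the off-diagonal element of a 2-by-2 correlation matrix).
cRow = corrcoef(X(:, row), Y);
cCol = corrcoef(X(:, col), Y);
rY = [cRow(1, 2), cCol(1, 2)];
[bestR, w] = max(rY);                % w == 1 -> column row, w == 2 -> column col
pair = [row, col];
bestColumn = pair(w);

fprintf('Highest pair: columns %d and %d (r = %.4f)\n', row, col, maxR);
fprintf('Column %d is more correlated with Y (r = %.4f)\n', bestColumn, bestR);
```

With the data from the question this reports columns 1 and 2 (r about 0.8171) and picks column 1 as the one more correlated with Y (r about 0.5133, versus roughly 0.19 for column 2).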

6 Comments

Niki on 5 Sep. 2011
There is one problem.
First, it only works for the single highest correlation; if several pairs qualify, it will not work. So let's say we want to find the pairs of columns in X whose correlation is higher than 0.5.
Regarding the second question:
R =
1.0000 0.8171 0.6448
0.8171 1.0000 0.1545
0.6448 0.1545 1.0000
This means the highest correlation is between columns 1 and 2,
but the command
[r w] = max([corr(X(:,row),Y) corr(X(:,col),Y)])
only calculates the correlation between column 2 and Y.
What about column 1 and Y?
Grzegorz Knor on 5 Sep. 2011
The command
[r w] = max([corr(X(:,row),Y) corr(X(:,col),Y)])
calculates the correlation between column 2 and Y and between 1 and Y. Look:
[corr(X(:,row),Y) corr(X(:,col),Y)]
Grzegorz Knor on 5 Sep. 2011
For the first question:
[row,col] = find(triu(R,1)>=0.5)
Then you can use the function corr.
Niki on 5 Sep. 2011
Please take a look at the code that I used and see if you can fix the error.
Thanks
Niki on 5 Sep. 2011
If you post the command, I can accept your answer.
Niki on 5 Sep. 2011
corr(X(:,unique([row;col])),Y)
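The threshold search from these comments can be sketched end to end as follows. The small X and Y below are hypothetical data chosen only for illustration, and thr, rY, and best are names introduced here. The sketch assumes Y is a column vector with one entry per row of X, and it only looks at positive correlations; comparing abs(Rx) against the threshold instead would also catch strong negative ones.

```matlab
% Hypothetical example data (not from the question): 5 observations, 3 columns.
X = [1  2 9
     2  4 1
     3  7 5
     4  8 2
     5 11 7];
Y = [1; 3; 2; 5; 4];
thr = 0.5;                               % correlation threshold

Rall = corrcoef([X, Y]);                 % correlations of all X columns and Y at once
p = size(X, 2);
Rx = Rall(1:p, 1:p);                     % correlations among the X columns
rY = Rall(1:p, end);                     % correlation of each X column with Y

[row, col] = find(triu(Rx, 1) > thr);    % every pair (row < col) above the threshold
best = zeros(numel(row), 1);             % for each pair, the column closer to Y
for k = 1:numel(row)
    pair = [row(k), col(k)];
    [~, w] = max(rY(pair));              % which member of the pair correlates more with Y
    best(k) = pair(w);
    fprintf('Columns %d and %d: r = %.4f, column %d is more correlated with Y\n', ...
            pair(1), pair(2), Rx(pair(1), pair(2)), best(k));
end
```

Computing corrcoef([X, Y]) once gives both the column-to-column correlations and the column-to-Y correlations, so no separate call is needed per pair.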


More Answers (2)

the cyclist on 5 Sep. 2011


It wasn't perfectly clear to me if you wanted to find correlation with Y for all the columns that had r>0.5, or only the highest. This does all of them. Maybe you could tailor this to what you need.
r = corrcoef(X); % Correlation coefficients
[i j] = find(r>0.5); % Indices of r > 0.5
indexToOffDiagonalElementsWithHighCorrelation = (i~=j); % Only use off-diagonal elements
XColumnsWithHighCorrelation = unique(i(indexToOffDiagonalElementsWithHighCorrelation))
for nx = 1:numel(XColumnsWithHighCorrelation)
rxy{nx} = corrcoef(X(:,XColumnsWithHighCorrelation(nx)),Y);
disp(rxy{nx})
end

3 Comments

Niki on 5 Sep. 2011
The first problem: when you run
>>r = corrcoef(X)
you get this:
r =
1.0000 0.8171 0.6448
0.8171 1.0000 0.1545
0.6448 0.1545 1.0000
But I would like to get this:
r =
0 0 0
0.8171 0 0
0.6448 0.1545 0
How can I do that?
Oleg Komarov on 5 Sep. 2011
Use triu(r,1) to zero out the diagonal and the lower triangle, or tril(r,-1) to zero out the diagonal and the upper triangle, as in your example.
Niki on 5 Sep. 2011
Thanks Oleg, I did not know "triu" :D
Andrei posted a command for that. Thanks.
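A short sketch of the triangular extraction discussed in these comments, using the correlation matrix printed above. The variable names lowerOnly and upperOnly are introduced here for illustration.

```matlab
% Correlation matrix as printed in the comment above (rounded to 4 decimals).
r = [1.0000 0.8171 0.6448
     0.8171 1.0000 0.1545
     0.6448 0.1545 1.0000];

lowerOnly = tril(r, -1);   % keeps the entries below the diagonal, zeros elsewhere
upperOnly = triu(r, 1);    % keeps the entries above the diagonal, zeros elsewhere

disp(lowerOnly)
disp(upperOnly)
```

Because r is symmetric, both matrices hold the same three pairwise correlations; lowerOnly matches the layout requested in the comment.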


Andrei Bobrov on 5 Sep. 2011


[v id]= max(triu(corrcoef([X,Y]),1))
Latest variant:
R = triu(corrcoef([X,Y]),1)
Rx = R(1:end-1,1:end-1)
Rx05 = Rx.*(Rx>.5)
[ix jx] = find(Rx05==max(Rx05(:)))
cYX = R([ix,jx],4)
[vXY xi]= max(cYX)

7 Comments

Niki on 5 Sep. 2011
Andrei,
when I use your command
[v id]= max(triu(corrcoef([X,Y]),1))
v =
0 0.8171 0.6448 0.5133
How should I read the output?
The first value is 0.
The second is 0.8171 (between columns 1 and 2).
The third is 0.6448 (between columns 1 and 3).
The last is 0.5133 (???)
Andrei Bobrov on 5 Sep. 2011
The last one is between X(:,1) and Y, because id(4) = 1.
Niki on 5 Sep. 2011
It is 1 because the highest correlation was between columns 1 and 2.
So if I also want the correlation between X(:,2) and Y, what should I do?
Andrei Bobrov on 5 Sep. 2011
R = triu(corrcoef([X,Y]),1)
Rx = R(1:end-1,1:end-1)
Rx05 = Rx.*(Rx>.5)
[ix jx] = find(Rx05==max(Rx05(:)))
cYX = R([ix,jx],4)
[vXY xi]= max(cYX)
Andrei Bobrov on 5 Sep. 2011
X(:,2) and Y -> R(2,4)
Niki on 5 Sep. 2011
I don't think this command gives the answer. Please check the following example:
>> X=rand(10);
>> Y=rand(10,1);
First we compute
>>[R]=corrcoef(X);
and then
>>[v id]= max(triu(corrcoef([X,Y]),1))
v is different from R; I can see only one value that matches. Could you please tell me what this command does?
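As a hedged illustration of what that command returns: with two outputs, max works on each column of a matrix separately, so v holds one value per column of the strict upper triangle and id holds the row where it occurs. v is therefore a row vector of per-column maxima, not a copy of R. The small X and Y below are hypothetical data, not taken from the thread, and Y is assumed to be a column vector so that [X, Y] can be concatenated.

```matlab
% Hypothetical example data: 5 observations, 3 columns, Y as a column vector.
X = [1  2 9
     2  4 1
     3  7 5
     4  8 2
     5 11 7];
Y = [1; 3; 2; 5; 4];

U = triu(corrcoef([X, Y]), 1);   % strict upper triangle; the last column pairs X with Y
[v, id] = max(U);                % column-wise maxima and the rows where they occur

% v(1) is always 0, because column 1 has nothing above the diagonal.
% v(j) for j = 2..size(X,2) is the largest correlation of X(:,j) with an earlier column.
% v(end) is the largest correlation between any X column and Y, found in X(:,id(end)).
% A column whose correlations are all negative reports 0, picked up from the zeroed part.
disp(v)
disp(id)

% A single correlation, such as X(:,2) with Y, is read directly from the matrix.
r2Y = U(2, end);
```

In this example column 3 correlates negatively with columns 1 and 2, so v(3) is 0 rather than a real correlation, which is one reason the per-column maxima can look unrelated to R.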
Niki on 5 Sep. 2011
Andrei, I like your comment very much. Thanks.
