Why is the xcorr 'coeff' option used for correlation coefficients?
In reviewing questions and material on xcorr, it appears that for autocorrelation or cross-correlation coefficients, most responses suggest using the 'coeff' option in xcorr. While this does give you a value between -1 and 1, I am not sure why this option is calculated as xcorr(a,b)/(norm(a)*norm(b)), where a and b are column vectors. In most cases, I would think an unbiased correlation should be used.
In my limited understanding, it seems the correlation values should be unbiased, then normalized...
For example, for autocorrelation: xcorr(a,'unbiased')./var(a)
To illustrate my point, if I autocorrelate a sine function, I would expect the lagged correlation coefficient to vary between 1 and -1 every time the cycle realigns itself. But the 'coeff' option consistently decreases the correlation coefficient with lag. I realize this is because of how it is calculated, but I don't understand why it is calculated this way. Shouldn't the unbiased approach be used?
A simple example to illustrate this question:
t=0:500;
n=length(t);
ts=5*sin(2*pi*(t./12));
lags=-250:250;
test1=xcov(ts,250,'coeff');
test2=xcov(ts,250,'unbiased')./var(ts);
figure; plot(t,ts); xlabel('time'); ylabel('amplitude');
figure; plot(lags,test2,'r'); hold on; plot(lags,test1);
Accepted Answer
More Answers (4)
Brian
on 5 Dec 2011
1 vote
Wayne King
on 5 Dec 2011
Hi Brian, 'coeff' is helpful because it gives you a convenient scale to interpret the results. It's the same reason why correlation in statistics is often more useful than covariance.
If I tell you that the maximum cross-correlation between two vectors is 4500, for example, it's hard to interpret what that means. It might mean that the two vectors are nearly perfectly correlated at that lag, or it might mean that their correlation is quite small (near zero), because the value depends on the units and scale of the input vectors. The 'coeff' option makes the result easier to interpret: if I tell you that the maximum correlation is 0.9, then you know there is a very strong correlation at that lag.
To keep it in the sine wave context, note:
x = cos(pi/4*(0:99));
y = 4*cos(pi/4*(0:99)-pi/2);
[c,lags] = xcorr(y,x,10);
stem(lags,c);
Note that the maximum correlation, at lag 2, is just under 200. Again, it is very hard to know what that means without knowing more about the signals.
But:
x = cos(pi/4*(0:99));
y = 4*cos(pi/4*(0:99)-pi/2);
[c,lags] = xcorr(y,x,10,'coeff');
stem(lags,c);
Now you see exactly what it means: the peak at lag 2 is close to 1, so the two signals are basically perfectly correlated at that lag.
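The scaling that 'coeff' applies can be checked by hand. The following is a minimal sketch, assuming the Signal Processing Toolbox is available for xcorr; the variable names cRaw, cCoeff, cManual and maxDiff are only illustrative. It divides the unscaled result by norm(x)*norm(y), the formula quoted in the question, and compares that with what 'coeff' returns.

```matlab
% Sketch: reproduce the 'coeff' scaling manually (needs xcorr from the
% Signal Processing Toolbox).
x = cos(pi/4*(0:99));
y = 4*cos(pi/4*(0:99)-pi/2);

[cRaw,lags] = xcorr(y,x,10);           % unscaled cross-correlation
cCoeff      = xcorr(y,x,10,'coeff');   % scaled by xcorr itself

% Manual scaling: divide by the product of the Euclidean norms.
cManual = cRaw/(norm(x)*norm(y));

% The two scaled sequences should agree to round-off.
maxDiff = max(abs(cManual - cCoeff));
[peak,idx] = max(cCoeff);
fprintf('max difference = %g\n', maxDiff);
fprintf('peak = %.3f at lag %d\n', peak, lags(idx));
```

Because the factor of 4 in y appears in both the numerator and norm(y), it cancels, which is why the scaled peak no longer depends on the amplitude of either signal.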
Wayne King
on 5 Dec 2011
Hi Brian, that is because you have fewer and fewer terms that enter the autocorrelation sum as the lag increases. The normalization in the denominator is based on all the data in the sequence, as is the autocorrelation at zero lag. That is not the case as you increase the lag.
That's why with:
[c,lags] = xcorr(ts);
stem(lags,c);
You see the autocorrelation decay. With 'coeff' you don't use a different normalization term at different lags, which you would have to do to get 1s or -1s at all your periods, as you suggest.
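To make the counting argument concrete, here is a minimal sketch that computes the lagged sums directly, without xcorr, for the sine wave from the question. The names rRaw, nTerms, rFixed and rPerLag are hypothetical and chosen only for this illustration. rFixed uses one denominator for every lag, in the spirit of 'coeff'; rPerLag first divides each sum by the number of terms that went into it, in the spirit of 'unbiased'.

```matlab
% Sketch: one fixed normalization versus a per-lag normalization.
t  = 0:500;
ts = 5*sin(2*pi*(t./12));
N  = numel(ts);

lags = 0:250;                      % non-negative lags suffice by symmetry
rRaw = zeros(size(lags));
for k = 1:numel(lags)
    m = lags(k);
    % Only N-m sample pairs overlap at lag m.
    rRaw(k) = sum(ts(1+m:N).*ts(1:N-m));
end

nTerms  = N - lags;                     % number of products in each sum
rFixed  = rRaw/rRaw(1);                 % same denominator at every lag
rPerLag = (rRaw./nTerms)/(rRaw(1)/N);   % average per term, then scale by lag 0

% Compare the two at whole periods of the sine (period = 12 samples).
for m = [12 120 240]
    fprintf('lag %3d: fixed = %.3f, per-lag = %.3f\n', ...
            m, rFixed(m+1), rPerLag(m+1));
end
```

With the fixed denominator the peaks shrink roughly in proportion to (N-m)/N, while the per-lag version stays close to 1 at every full period. Plotting rFixed and rPerLag against lags shows the same behavior as the two curves in the question.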
Brian
on 5 Dec 2011
0 votes