Expanding Sample Covariance Matrix

6 Ansichten (letzte 30 Tage)
Lemar DeSalis
Lemar DeSalis am 21 Aug. 2011
Hello!
I need to calculate the mean vector and the covariance matrix for sampled data. E.g. I have matrix with NumFeatures colums and NumSamples rows. I can then easily use "mean(MyMatrix)" and "cov(MyMatrix)".
However, what should I do if I want to extend the covariance matrix I got through the method described above?
So I have a covariance matrix calculated from the old samples, how can I add the influence of the new samples?
Is there an ease MATLAB-way to do that?
Thanks in advance!
  1 Kommentar
Oleg Komarov
Oleg Komarov am 21 Aug. 2011
The terminology you're using is not clear. Could you give an example.
For reference: http://www.mathworks.com/matlabcentral/answers/6200-tutorial-how-to-ask-a-question-on-answers-and-get-a-fast-answer

Melden Sie sich an, um zu kommentieren.

Antworten (2)

Lemar DeSalis
Lemar DeSalis am 22 Aug. 2011
% MyMatrix is a Matrix containing samples, in this case random data:
MyMatrix = rand( [NumSamples NumFeatures] );
% I need the mean vector and the covariance matrix:
MyMean = mean(MyMatrix);
MyCov = cov(MyMatrix);
% Now I got some new data:
MyLargerMatrix = vertcat(MyMatrix, SomeNewData);
% Calculate new values:
MyMean_New1 = mean(MyLargerMatrix)
MyCov_New1 = cov(MyLargerMatrix);
%%%%HERE IS MY QUESTION:
% But what to do, when the old data is not available anymore?
clear MyLargerMatrix, MyMatrix;
MyCov_New2 = ... ?
% How to update the covariance matrix, if you only have the old
% covariance matrix "MyMean", the number of old samples "NumSamples"
% and the new samples "SomeNewData"?
%
% MyCov_New2 should be identical to MyCov_New1, but MyCov_New2
% should be computed WITHOUT access to the old data.
% For the mean vector, this is easily possible, but how to do so for the covariance matrix?

Oleg Komarov
Oleg Komarov am 22 Aug. 2011
% Example inputs
A = rand(100,2);
B = randn(20,2);
C = [A;B];
% Sample covariances (normalized by N-1)
c1 = cov(A);
c2 = cov(B);
c3 = cov(C);
% Means
m1 = mean(A);
m2 = mean(B);
m3 = mean(C);
% Number of samples
nA = size(A,1);
nB = size(B,1);
nC = nA + nB;
% The question is: how to get c3 having only c1, c2, m1, m2?
% Keep in mind that:
  • cov(x,y) = E(xy) - E(x)E(y)
  • m3 = (m1*nA + m2*nB)/nC
  • same with E(xy)
  • cov is the sample covariance, thus we have to adjust for N-1
  • the following formula is valid for covariance only for covariance
ExEy12 = prod((m1*nA + m2*nB)/nC);
adj = nC/(nC-1);
(c1*(nA-1) + c2*(nB-1) + prod(m1)*nA + prod(m2)*nB)/nC*adj - ExEy12 * adj
c3
How to derive the variance is up to you. But you really just need paper and pencil.
  1 Kommentar
Lemar DeSalis
Lemar DeSalis am 23 Aug. 2011
Thanks, I was able to find a solution based on your code!

Melden Sie sich an, um zu kommentieren.

Kategorien

Mehr zu Creating and Concatenating Matrices finden Sie in Help Center und File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by