'pca' vs 'svd' or 'eig' functions

12 Ansichten (letzte 30 Tage)
Pranav Aggarwal
Pranav Aggarwal am 16 Mär. 2021
Kommentiert: Pranav Aggarwal am 18 Mär. 2021
Hi,
I am trying to generate the principal components from a set of data. However, i get an entirely different result when i use the 'pca' function compared to the 'eig' function. The 'eig' function gives the same results as the 'svd' function for my data.
I am using the raw data as input into the 'pca' function.
For 'eig' - I am calculating the correlation matrix and then using that as input into the 'eig' function.
I am very puzzled on why i get different results and would be grateful for your help! Code below:
testmat = rand(20,5);
testcorrelMat = corr(testmat);
testeig = eig(testcorrelMat);
testsvd = svd(testcorrelMat);
[testcoeff, ~, testlatent] = pca(testmat);
[sort(testsvd), sort(testeig), sort(testlatent)]

Akzeptierte Antwort

the cyclist
the cyclist am 16 Mär. 2021
You will get the same result from pca() if you standardize the input data first:
rng default
testmat = rand(20,5);
% Standardize the data
testmat = (testmat - mean(testmat))./std(testmat);
testcorrelMat = corr(testmat);
testeig = eig(testcorrelMat);
testsvd = svd(testcorrelMat);
[testcoeff, ~, testlatent] = pca(testmat);
[sort(testsvd), sort(testeig), sort(testlatent)]
ans = 5×3
0.2238 0.2238 0.2238 0.6422 0.6422 0.6422 0.8504 0.8504 0.8504 1.4606 1.4606 1.4606 1.8229 1.8229 1.8229
  2 Kommentare
Steven Lord
Steven Lord am 16 Mär. 2021
To normalize the data you can use the normalize function to normalize by 'zscore' (which is the default normalization method.)
rng default
testmat = rand(20,5);
% Standardize the data
testmat = normalize(testmat);
testcorrelMat = corr(testmat);
testeig = eig(testcorrelMat);
testsvd = svd(testcorrelMat);
[testcoeff, ~, testlatent] = pca(testmat);
results = [sort(testsvd), sort(testeig), sort(testlatent)]
results = 5×3
0.2238 0.2238 0.2238 0.6422 0.6422 0.6422 0.8504 0.8504 0.8504 1.4606 1.4606 1.4606 1.8229 1.8229 1.8229
format longg
results - results(:, 1)
ans = 5×3
0 1.11022302462516e-16 -1.94289029309402e-16 0 4.44089209850063e-16 -9.99200722162641e-16 0 -1.11022302462516e-16 3.33066907387547e-16 0 -1.33226762955019e-15 -1.55431223447522e-15 0 0 -8.88178419700125e-16
Looks pretty good to me.
Pranav Aggarwal
Pranav Aggarwal am 18 Mär. 2021
Thanks Steven and 'the cyclist' - solved!

Melden Sie sich an, um zu kommentieren.

Weitere Antworten (0)

Kategorien

Mehr zu Dimensionality Reduction and Feature Extraction finden Sie in Help Center und File Exchange

Tags

Produkte


Version

R2017b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by