Calculating principal component scores from principal component coefficients of the new data

Hi all,
I performed a PCA on a dataset using the function
[coeff,score,latent,~,explained,mu]=pca(TrainingSet.X);
Then I generated new shapes (in Cartesian space) using a reduced number of principal components. Now I need to find the principal component scores for these new shapes, but I can't figure out how!
Based on the fact that the original centered training data can be retrieved using
centeredData= score*coeff'
I used the following statements, which did not generate relevant results.
for i= 1:newShapesNum
newShapeScore(i,:)=newShape(i,:)*pinv(coeff(:,1:shapeModesNum)'); % i is the counter of new (generated) observations.
newSvalid=newShapeScore(i,:)*coeff(:,1:shapeModesNum)';
end
UPDATE
I also tried running pca on the new instances and requested score and coeff. The mean shape looked good, but the centeredData formula above did not regenerate the original shape, and I don't understand why.
I'd appreciate your help in finding the principal component scores for the new shapes.
Many thanks
Amin
2 Comments
Aditya Patil on 11 May 2021
Can you elaborate on the issue? Are you trying to convert new data as per the pca transformation? Or is the issue that pca transformation of new data is leading to poor results?
Amin Kassab-Bachi on 11 May 2021
Thanks for responding. Actually, the new instances I'm creating are of good quality, but this is my first time working with PCA, so I'm not familiar with the terms. The new instances (in Cartesian space) are created from randomly generated standard deviation values. I'm trying to recover their scores in principal component space because I need to correlate the scores with some output from another analysis later on. After many tests I finally concluded that the scores are the standard deviation values I used. So for each principal component, for each new instance, I saved the generated SD [i.e. a random weight×sqrt(latent)]. Hopefully you can confirm this is correct.
Thanks
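The conclusion in this comment can be checked with a short derivation. The notation here is an assumption for illustration and does not appear in the thread: c_k is column k of coeff, \lambda_k is latent(k), \mu is the row vector mu, b_k is the random weight for mode k, and m is the number of retained modes (shapeModesNum in the question). The derivation relies only on the columns of coeff being orthonormal.

```latex
% Generated shape from m modes, each weighted by b_k standard deviations
x = \mu + \sum_{k=1}^{m} b_k \sqrt{\lambda_k}\, c_k^{\top}

% Score on mode j: subtract the mean, then project onto c_j
s_j = (x - \mu)\, c_j
    = \sum_{k=1}^{m} b_k \sqrt{\lambda_k}\, c_k^{\top} c_j
    = \begin{cases} b_j \sqrt{\lambda_j}, & j \le m \\ 0, & j > m \end{cases}
\qquad \text{since } c_k^{\top} c_j = \delta_{kj}
```

So, under these assumptions, the score on each retained mode is the saved weight times sqrt(latent), and the scores on the discarded modes are zero.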


Accepted Answer

Aditya Patil on 12 May 2021
To get the scores for new data, you first need the outputs mu and coeff from running pca on the training data.
X = rand(100, 5);
XTrain = X(1:75, :)
XTrain = 75×5
    0.1441    0.3071    0.3775    0.8840    0.6683
    0.8057    0.3544    0.5524    0.7381    0.9861
    0.7959    0.0033    0.3544    0.6425    0.4665
    0.9191    0.7689    0.0454    0.1116    0.5821
    0.7176    0.1236    0.6015    0.8224    0.3409
    0.2391    0.1492    0.9006    0.5579    0.6631
    0.1738    0.4541    0.5185    0.6817    0.8653
    0.6194    0.2851    0.5203    0.8938    0.2486
    0.0550    0.3670    0.9562    0.1952    0.4238
    0.2783    0.3371    0.4914    0.6739    0.2944
XTest = X(76:100,:)
XTest = 25×5
    0.4050    0.8916    0.0311    0.9368    0.4693
    0.4280    0.2849    0.0614    0.1172    0.3371
    0.9347    0.9498    0.3593    0.3842    0.0361
    0.6781    0.4363    0.2563    0.5025    0.2534
    0.6973    0.2147    0.0580    0.2153    0.6004
    0.9774    0.1824    0.5365    0.0387    0.3407
    0.6281    0.8394    0.6062    0.0771    0.7966
    0.1263    0.8900    0.5766    0.7521    0.1489
    0.4293    0.8312    0.9448    0.5362    0.1901
    0.4643    0.9553    0.6214    0.8245    0.4738
[coeff,scoreTrain,~,~,explained,mu] = pca(XTrain);
Now, to apply the same transformation to new data (that is, to get its scores), subtract the training mean mu and project onto the retained columns of coeff:
idx = 3; % Keep 3 principal components
scoreTest = (XTest-mu)*coeff(:,1:idx)
scoreTest = 25×3
    0.1243    0.3578    0.3699
    0.2510   -0.1932   -0.3583
    0.5351   -0.2519    0.0646
    0.1803   -0.2631    0.0597
    0.3561   -0.1946   -0.0985
    0.3395   -0.6057   -0.2079
    0.3735    0.2247   -0.2527
   -0.2488    0.1930   -0.0451
   -0.1706   -0.0489   -0.1127
   -0.0553    0.2642    0.2388
For more details, see the Apply PCA to New Data and Generate C/C++ Code documentation.
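As a rough check of the projection above, the following sketch (not part of the original answer) maps the scores back to the original space. It assumes a MATLAB release with implicit expansion (R2016b or later) and the Statistics and Machine Learning Toolbox for pca. The seed and the error variables (errTrain, errFull, errTrunc) are illustrative choices only.

```matlab
% Sketch: project held-out rows with the training mean and coefficients,
% then map the scores back to the original space.
rng(0);                                  % arbitrary seed, only for repeatability
X      = rand(100, 5);
XTrain = X(1:75, :);
XTest  = X(76:100, :);

[coeff, scoreTrain, ~, ~, ~, mu] = pca(XTrain);

% Training data: the scores returned by pca equal (XTrain - mu)*coeff.
errTrain = norm((XTrain - mu)*coeff - scoreTrain, 'fro');

% All 5 components kept: coeff is square and orthonormal, so the
% round trip recovers XTest up to floating-point error.
scoreFull = (XTest - mu) * coeff;
errFull   = norm(scoreFull*coeff' + mu - XTest, 'fro');

% Only k components kept: the reconstruction is an approximation.
k         = 3;
scoreTest = (XTest - mu) * coeff(:, 1:k);
XTestHat  = scoreTest * coeff(:, 1:k)' + mu;
errTrunc  = norm(XTestHat - XTest, 'fro');

fprintf('train: %.2e, full: %.2e, truncated (k = %d): %.2e\n', ...
        errTrain, errFull, k, errTrunc);
```

The first two errors should be at the level of rounding noise, while the truncated reconstruction differs from XTest by whatever variation lies in the discarded components.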
1 Comment
Amin Kassab-Bachi on 12 May 2021 (edited 12 May 2021)
Thank you very much. This also confirmed what I calculated was correct. When testing my results previously I did not include mu, so the results did not look like anything useful! But now it's all starting to make more sense. Thanks.
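The effect of leaving out mu can be illustrated with a small sketch. It is a hypothetical reconstruction of the workflow described in this thread, not the poster's actual code: nNew, b, targetScore and the +10 offset on the training data are made up for illustration. It also shows that the pinv expression from the question reduces to coeff(:,1:k), because the columns of coeff are orthonormal, so the only missing step there was the mean subtraction.

```matlab
% Sketch: generate shapes from random weights, then recover their scores.
rng(1);                                  % arbitrary seed, only for repeatability
XTrain = rand(75, 5) + 10;               % offset so the mean is clearly nonzero
[coeff, ~, latent, ~, ~, mu] = pca(XTrain);

k    = 3;                                % retained modes (shapeModesNum in the question)
nNew = 4;                                % number of generated shapes (hypothetical)
b    = randn(nNew, k);                   % random weights, in standard deviations

% Intended scores: weight x sqrt(latent), one column per mode.
targetScore = b .* sqrt(latent(1:k))';
newShape    = mu + targetScore * coeff(:, 1:k)';

Ck = coeff(:, 1:k);

% Orthonormal columns: the pseudoinverse of Ck' is Ck itself.
errPinv = norm(pinv(Ck') - Ck, 'fro');

scoreWithMu    = (newShape - mu) * Ck;   % mean subtracted before projecting
scoreWithoutMu = newShape * Ck;          % mean not subtracted

errWithMu    = norm(scoreWithMu - targetScore, 'fro');
errWithoutMu = norm(scoreWithoutMu - targetScore, 'fro');
```

With the mean subtracted, the recovered scores match weight×sqrt(latent) to rounding error. Without it, every score is shifted by mu*Ck, which is why the earlier results looked meaningless.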
