Massive slowdown for Apple Silicon in computing SVD

Gregory on 5 Dec 2024
Edited: Mike Croucher on 29 Jul 2025
I recently noticed an extreme slowdown in my version of MATLAB when computing an SVD once the size of the matrix crosses some threshold. I came up with the following example that demonstrates the issue:
N = [10000 11000 12000 13000];
for i = 1:4
A = randn(N(i),3);
tic;
[U,S,V] = svd(A,0);
toc;
end
When I run this in MATLAB R2024b (macOS Apple silicon), the output is:
Elapsed time is 0.000396 seconds.
Elapsed time is 0.000275 seconds.
Elapsed time is 0.000264 seconds.
Elapsed time is 0.083150 seconds.
Of course the exact numbers vary trial to trial, but the speed for the last run (where N = 13000) is consistently orders of magnitude slower.
When I run this same code in MATLAB R2024b (Intel processor) on the same computer, the slowdown does not happen. I was able to replicate the issue on two different Macs (one with an M1 and another with an M3) and in different versions of MATLAB (going back to R2023b).
Any idea why this might be happening in the Apple silicon version?
Edit: I'm running macOS 15.1.1

Accepted Answer

Mike Croucher on 5 Dec 2024
Edited: Mike Croucher on 29 Jul 2025
Update: This has now been fixed as of R2025a Update 1.
Your script on my M2 in R2025a Update 1:
>> slowSVD
Elapsed time is 0.000368 seconds.
Elapsed time is 0.000253 seconds.
Elapsed time is 0.000261 seconds.
Elapsed time is 0.000290 seconds.
Compared to R2024b:
Elapsed time is 0.000408 seconds.
Elapsed time is 0.000397 seconds.
Elapsed time is 0.000278 seconds.
Elapsed time is 0.145339 seconds.
In both cases I ran the script twice and reported the 2nd runtime in order to ensure I'm not including first run costs.
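A minimal sketch of one way to exclude first-run costs more systematically: timeit calls the function several times and returns the median of the measurements. The variable names are illustrative, and the 3 passed to timeit requests three outputs so that U, S and V are all computed, as in the original script.

```matlab
% Sketch: time the economy-size SVD with timeit instead of a single tic/toc.
N = [10000 11000 12000 13000];   % same sizes as the original script
t = zeros(size(N));              % median time per size, in seconds
for i = 1:numel(N)
    A = randn(N(i), 3);
    % Request 3 outputs so that U, S and V are all computed
    t(i) = timeit(@() svd(A, 0), 3);
    fprintf('N = %5d: %.6f s\n', N(i), t(i));
end
```

On an affected release one would expect the last entry of t to stand out from the others; the actual numbers depend on the machine and release.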
Thanks again for reporting this.
[My original response is below]
Hi
I have reproduced your times on my M2 MacBook Pro in R2024b, using both the default BLAS and the Apple Silicon BLAS as described in my blog post https://blogs.mathworks.com/matlab/2023/12/13/life-in-the-fast-lane-making-matlab-even-faster-on-apple-silicon-with-apple-accelerate/
I am not sure what causes this but have reported it to development.
Thanks for the report.
Mike

More Answers (1)

Heiko Weichelt on 21 Dec 2024
Thanks for reporting this.
We identified the problem and are working on improving this in a future release.
As a temporary workaround, we recommend replacing:
[U,S,V]=svd(A,0);
with
[Q,R]=qr(A,"econ"); [U,S,V]=svd(R); U=Q*U;
In general, this step is not needed because the SVD performs the QR factorization internally. However, the LAPACK library currently used on Apple Silicon had suboptimal tuning parameters for this case.
On my machine, the time for the largest example improved as follows:
>> tic; [U,S,V]=svd(A,0); toc
Elapsed time is 0.086851 seconds.
>> tic; [Q,R]=qr(A,"econ"); [U,S,V]=svd(R); U=Q*U; toc
Elapsed time is 0.000977 seconds.
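A rough sketch for checking that the workaround returns an equivalent factorization, assuming a random 13000-by-3 test matrix like the one in the question. The seed and the variable names are illustrative. Singular vectors from the two routes may differ in sign, so the check compares singular values, the reconstruction of A, and the orthonormality of U rather than the vectors themselves.

```matlab
% Sketch: compare the direct economy-size SVD with the QR-then-SVD workaround.
rng(0);                          % fixed seed, only for repeatability
A = randn(13000, 3);             % the size that triggered the slowdown

[U0, S0, V0] = svd(A, 0);        % direct economy-size SVD

[Q, R] = qr(A, "econ");          % Q is 13000-by-3, R is 3-by-3
[Ur, S1, V1] = svd(R);           % SVD of the small triangular factor
U1 = Q * Ur;                     % map left singular vectors back to A's size

% Relative difference in singular values between the two routes
errS = norm(diag(S0) - diag(S1)) / norm(diag(S0));
% Relative reconstruction error of the workaround
errA = norm(A - U1*S1*V1', "fro") / norm(A, "fro");
% Departure of U1 from having orthonormal columns
errOrth = norm(U1'*U1 - eye(3));

fprintf('errS = %.2e, errA = %.2e, errOrth = %.2e\n', errS, errA, errOrth);
```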
3 Comments
Heiko Weichelt on 26 Mar 2025
For the initial example, we also compute U, which has the same dimensions as A, i.e., tall and skinny. Your solution isn't computing that yet.
Furthermore, the condition number of A.'*A might be as bad as the square of the condition number of A itself, which can cause additional trouble for EIG. So I wouldn't advise this workaround as a general solution.
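The conditioning concern can be illustrated with a small sketch. It builds a hypothetical tall matrix with prescribed singular values (1, 1e-5 and 1e-10, chosen purely for illustration) and compares how well the smallest one is recovered by svd(A) and by the square roots of the eigenvalues of A.'*A. All names and sizes here are assumptions made for the example.

```matlab
% Sketch: forming A.'*A squares the condition number and loses small singular values.
rng(1);                                  % fixed seed, only for repeatability
[Q1, ~] = qr(randn(2000, 3), "econ");    % orthonormal columns, 2000-by-3
[Q2, ~] = qr(randn(3));                  % 3-by-3 orthogonal matrix
s = [1; 1e-5; 1e-10];                    % prescribed singular values, cond(A) is about 1e10
A = Q1 * diag(s) * Q2';

sSvd = svd(A);                           % singular values computed directly

G = A.' * A;                             % Gram matrix, cond(G) is about cond(A)^2
G = (G + G.') / 2;                       % enforce exact symmetry before eig
lam = sort(eig(G), 'descend');
sEig = sqrt(max(lam, 0));                % clamp tiny negative eigenvalues to zero

% Relative error in the smallest singular value for each route
relErrSvd = abs(sSvd(3) - s(3)) / s(3);
relErrEig = abs(sEig(3) - s(3)) / s(3);
fprintf('relErrSvd = %.2e, relErrEig = %.2e\n', relErrSvd, relErrEig);
```

For a well-conditioned A, such as a small set of clearly independent vectors, the two routes should agree much more closely.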
P Jeffrey Ungar on 26 Mar 2025
Edited: P Jeffrey Ungar on 26 Mar 2025
I neglected part of the solution. Yours is more complete, but the condition number consideration is hardly a problem for a small number of vectors, even very long ones. My application is to get an orthonormal basis for a small set of long vectors that are guaranteed to be linearly independent. They are, in fact, a set of eigenvectors for a degenerate eigenvalue already obtained by the likes of eigs(). These are not guaranteed to be orthogonal. The timings below show that finishing the work still gives much faster performance.
The performance of svd() right now on R2025a (prerelease) makes it virtually unusable. For laughs give it a single vector of length 1000000 and watch it take 12 seconds on M4 Max! I sincerely hope this problem is addressed properly by the time it is released.
>> A = randn(10000000,10);
tm = tic(); [V,S] = eig(A.'*A); U = A*V./sqrt(diag(S).'); delt=toc(tm)
delt =
0.1727
>> tm = tic(); [Q,R] = qr(A,"econ"); [U,~,~] = svd(R); U = Q*U; toc(tm);
Elapsed time is 0.390255 seconds.
>>
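As a rough sketch of how the two routes timed above could be compared for quality as well as speed, the following checks that both produce orthonormal columns spanning the same space. It uses a smaller random matrix than the timing run (100000-by-10, an arbitrary choice) and illustrative variable names.

```matlab
% Sketch: compare the Gram/eig route with the QR-then-SVD route.
rng(2);                                  % fixed seed, only for repeatability
A = randn(100000, 10);                   % small set of long, independent vectors

% Route 1: eigendecomposition of the Gram matrix
G = A.' * A;
G = (G + G.') / 2;                       % enforce exact symmetry before eig
[V, D] = eig(G);
U1 = A*V ./ sqrt(diag(D).');             % scale each column to unit norm

% Route 2: economy-size QR followed by an SVD of the small factor
[Q, R] = qr(A, "econ");
[Ur, ~, ~] = svd(R);
U2 = Q * Ur;

% Departure from orthonormality for each route
orthErr1 = norm(U1.'*U1 - eye(10));
orthErr2 = norm(U2.'*U2 - eye(10));
% Part of U1 lying outside the column space of U2 (zero if the spans match)
spanErr = norm(U1 - U2*(U2.'*U1));
fprintf('orthErr1 = %.2e, orthErr2 = %.2e, spanErr = %.2e\n', ...
    orthErr1, orthErr2, spanErr);
```

Note that the column order differs between the routes: eig returns eigenvalues of a symmetric matrix in ascending order, while svd returns singular values in descending order.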


Version

R2024b
