Why does MATLAB become 100 times slower when using a matrix instead of vectors?

Yu Ma on 9 Nov 2020
Edited: per isakson on 21 Nov 2020
Hello,
I'm simulating interactions between particles. The 2D coordinates of the N particles are stored in an N-by-2 matrix "pos", and the forces are stored in a matrix of the same size. However, the program runs about 100 times slower with these matrices than with separate x, y and Forcex, Forcey vectors. For example, for a small system with 100 particles and 50 loops, the matrix version takes 0.2 s, while the vector version takes only 0.002 s.
I wonder what the reason is, and how to solve it while keeping the code compact. Thanks a lot!
Matrix code (100 times slower), took about 8 s:
clear;
N = 100;
D = 0.2;
L = 0.8;
pos = zeros(N,2);
Force = zeros(N,2);
pos = rand(N,2);
tic
for n = 1:1000
    for I = 1:N
        for J = I+1:N
            dpos = pos(J,:) - pos(I,:);
            dpos = dpos - round(dpos/L)*L;
            d = sqrt(dpos * dpos');
            if d < D
                F = -100 * (D/d-1);
                Force(I,:) = Force(I,:) + F * dpos;
                Force(J,:) = Force(J,:) - F * dpos;
            end
        end
    end
end
toc
Vector code (complex but fast), took about 0.2 s:
clear
N = 100;
D = 0.2;
L = 0.8;
x = zeros(N,1);
y = zeros(N,1);
Forcex = zeros(N,1);
Forcey = zeros(N,1);
x = rand(N,1);
y = rand(N,1);
tic
for n = 1:1000
    for I = 1:N
        for J = I+1:N
            dy = y(I)-y(J);
            dy = dy-round(dy/L)*L;
            if (abs(dy) < D)
                dx = x(I)-x(J);
                dx = dx-round(dx/L)*L;
                d = dx^2+dy^2;
                if (d < D^2)
                    d = sqrt(d);
                    F = -100*(D/d-1);
                    Forcex(J) = Forcex(J)+F*dx;
                    Forcex(I) = Forcex(I)-F*dx;
                    Forcey(J) = Forcey(J)+F*dy;
                    Forcey(I) = Forcey(I)-F*dy;
                end
            end
        end
    end
end
toc
  4 Comments
Yu Ma on 9 Nov 2020
Edited: Yu Ma on 9 Nov 2020
The problem can be reduced even further: why does code 1 take 100 times longer than code 2, and how can I fix it?
% code 1, took 5 s
clear;
N = 100;
pos = zeros(N,2);
pos = rand(N,2);
dpos = zeros(N,2);
tic
for n = 1:1000
    for I = 1:N
        for J = I+1:N
            dpos = pos(J,:) - pos(I,:);
        end
    end
end
toc
% code 2, took 0.05 s
clear
N = 100;
x = zeros(N,1);
y = zeros(N,1);
x = rand(N,1);
y = rand(N,1);
tic
for n = 1:1000
    for I = 1:N
        for J = I+1:N
            dy = y(I)-y(J);
            dx = x(I)-x(J);
        end
    end
end
toc
Yu Ma on 9 Nov 2020
The problem can be even more severe for an n-by-2 matrix when n is much larger than the other dimension. For example:
% 300 times slower
clear;
N = 1000;
pos = zeros(N,2);
tic
for n = 1:10
    for I = 1:N
        for J = 1:N
            pos(J,:)+1;
        end
    end
end
toc
% vector version for comparison
clear
N = 1000;
x = zeros(N,1);
y = zeros(N,1);
tic
for n = 1:10
    for I = 1:N
        for J = 1:N
            dy = y(I)+1;
            dx = x(I)+1;
        end
    end
end
toc


Answers (2)

per isakson on 9 Nov 2020
Edited: per isakson on 10 Nov 2020
I've edited my answer:
  • chosen better names
  • rerun the tests
  • deleted a profiler result, which I cannot reproduce.
I don't know. It's weird. R2018b, Win10, i7, 32GB.
I put your code into two functions, f2D and fXY. (I removed the clear statements.) Functions used to be much faster than scripts, but scripts are catching up.
>> fXY,f2D
Elapsed time is 0.148192 seconds.
Elapsed time is 1.219929 seconds.
>> fXY,f2D
Elapsed time is 0.138769 seconds.
Elapsed time is 1.214929 seconds.
And I created two scripts, sXY and s2D.
>> sXY, s2D
Elapsed time is 0.148508 seconds.
Elapsed time is 9.024698 seconds.
>> sXY, s2D
Elapsed time is 0.155476 seconds.
Elapsed time is 11.371648 seconds.
Something about the 2D code in a script causes MATLAB problems. Tech support might be interested.
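For readers who want to repeat the function-versus-script comparison, here is a minimal sketch of what a function wrapper around the 2D loop might look like. The thread does not show the body of f2D, so the name f2D_sketch, its arguments N and nrep, and the returned elapsed time are all assumptions made for illustration. It is meant to be saved as f2D_sketch.m.

```matlab
function elapsed = f2D_sketch(N, nrep)
% F2D_SKETCH  Run the N-by-2 matrix loop inside a function and time it.
% Hypothetical reconstruction: the thread does not show the real f2D.
D = 0.2;
L = 0.8;
pos = rand(N, 2);
Force = zeros(N, 2);
t0 = tic;
for n = 1:nrep
    for I = 1:N
        for J = I+1:N
            dpos = pos(J,:) - pos(I,:);
            dpos = dpos - round(dpos/L)*L;   % wrap with period L
            d = sqrt(dpos * dpos');
            if d < D
                F = -100 * (D/d - 1);
                Force(I,:) = Force(I,:) + F * dpos;
                Force(J,:) = Force(J,:) - F * dpos;
            end
        end
    end
end
elapsed = toc(t0);   % seconds spent in the loops only
end
```

Calling f2D_sketch(100, 1000) from the command window would correspond to the problem size used in the question. The timings will of course depend on the release and the machine.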
One more thing:
I don't think it matters much in this case, but setting the seed so that every run uses identical random numbers should make the timings easier to compare.
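A minimal sketch of the seeding idea, assuming the positions are generated with rand as in the question. The seed value 0 and the variable name posAgain are arbitrary choices for illustration.

```matlab
% Fix the seed so every timing run sees the same particle positions.
rng(0);                  % seed value 0 is an arbitrary choice
N = 100;
pos = rand(N, 2);

rng(0);                  % resetting the seed restarts the same random stream
posAgain = rand(N, 2);

disp(isequal(pos, posAgain));   % displays 1 when both draws are identical
```

Putting the rng call right before the rand calls in each script or function should be enough for both versions to work on the same kind of input from run to run.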
Added a day later:
I repeated my timing test with the minimal working example from the OP's comment
(I added a left-hand side: dpos = pos(J,:)+1;)
>> fXY_mwe, f2D_mwe
Elapsed time is 0.006180 seconds.
Elapsed time is 0.292015 seconds.
>> fXY_mwe, f2D_mwe
Elapsed time is 0.005649 seconds.
Elapsed time is 0.292584 seconds.
and scripts
>> sXY_mwe, s2D_mwe
Elapsed time is 0.019874 seconds.
Elapsed time is 6.370809 seconds.
>> sXY_mwe, s2D_mwe
Elapsed time is 0.027477 seconds.
Elapsed time is 5.744652 seconds.
My results confirm the OP's results.
With the XY_mwe code, the function is about four times faster than the script, and with the 2D_mwe code about twenty times faster.
  3 Comments
Yu Ma on 10 Nov 2020
Thanks for your test. Any clue as to the reason for such a huge difference?
per isakson on 21 Nov 2020
Edited: per isakson on 21 Nov 2020
MATLAB has an Execution Engine, which uses just-in-time (JIT) compilation. The inner workings of this engine are a secret of The MathWorks. Occasionally they give presentations, e.g. Run Code Faster With the New MATLAB Execution Engine.
They recommend that we don't try to optimize our code based on experiments with the engine. What appears to be a good idea in one release might not be so good in the next. (Or something like that; I couldn't find a reference.)
There are examples of huge differences.
The MathWorks calls for slow real-life code examples that would benefit significantly from better performance of the engine.
No, I don't have a clue, but I can speculate that the vector and the 2D cases use different code branches, and that The MathWorks doesn't see the performance difference as a big problem.



Bruno Luong on 9 Nov 2020
Edited: Bruno Luong on 9 Nov 2020
I guess the for-loop with vectors benefits from MATLAB's JIT accelerator.
But here is a faster method if you insist on using a matrix.
function benchtorusforce
N = 100;
D = 0.2;
L = 0.8;
Force = zeros(N,2);
pos = rand(N,2);
ntest = 1000;
% Original method: explicit loop over all pairs I < J
tic
for n = 1:ntest
    for I = 1:N
        for J = I+1:N
            dpos = pos(J,:) - pos(I,:);
            dpos = dpos - round(dpos/L)*L;
            d = sqrt(dpos * dpos');
            if d < D
                F = -100 * (D/d-1);
                Force(I,:) = Force(I,:) + F * dpos;
                Force(J,:) = Force(J,:) - F * dpos;
            end
        end
    end
end
toc % Elapsed time is 1.213926 seconds.
% New method: all pairwise differences at once in an N-by-N-by-2 array
tic
for n = 1:ntest
    pos1 = reshape(pos,[N 1 2]);
    pos2 = reshape(pos,[1 N 2]);
    dpos = pos1-pos2;
    dpos = dpos - round(dpos/L)*L;
    d = sqrt(sum(dpos.^2,3));
    F = -100 * (D./d-1);
    F(d >= D) = 0;       % no force beyond the cutoff distance D
    F = triu(F,1);       % keep each pair once (I < J), drop the diagonal
    Fd = F.*dpos;
    Force = -permute(sum(Fd,2),[2 1 3])+sum(Fd,1);
end
Force = reshape(Force,[N 2]);
toc % Elapsed time is 0.292761 seconds.
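A quick way to convince yourself that the two methods compute the same forces is to run a single pass of each on the same positions and compare. The sketch below does that under a few assumptions: the names ForceLoop, ForceVec, dposAll, dAll, FAll and maxDiff are introduced here for illustration, the seed is arbitrary, and the array version relies on implicit expansion (R2016b or later).

```matlab
% Compare one pass of the pairwise loop with one pass of the array version.
rng(1);                          % arbitrary seed, for repeatability only
N = 50; D = 0.2; L = 0.8;
pos = rand(N, 2);

% Reference: explicit double loop over pairs I < J.
ForceLoop = zeros(N, 2);
for I = 1:N
    for J = I+1:N
        dpos = pos(J,:) - pos(I,:);
        dpos = dpos - round(dpos/L)*L;
        d = sqrt(dpos * dpos');
        if d < D
            F = -100 * (D/d - 1);
            ForceLoop(I,:) = ForceLoop(I,:) + F * dpos;
            ForceLoop(J,:) = ForceLoop(J,:) - F * dpos;
        end
    end
end

% Array version: all pairwise differences in an N-by-N-by-2 array.
dposAll = reshape(pos, [N 1 2]) - reshape(pos, [1 N 2]);
dposAll = dposAll - round(dposAll/L)*L;
dAll = sqrt(sum(dposAll.^2, 3));
FAll = -100 * (D./dAll - 1);
FAll(dAll >= D) = 0;             % no force beyond the cutoff
FAll = triu(FAll, 1);            % each pair once, diagonal removed
Fd = FAll .* dposAll;
ForceVec = reshape(-permute(sum(Fd, 2), [2 1 3]) + sum(Fd, 1), [N 2]);

% The two results should agree up to floating-point round-off.
maxDiff = max(abs(ForceLoop(:) - ForceVec(:)));
fprintf('max abs difference: %g\n', maxDiff);
```

Note that this compares a single pass only. In the benchmark, the loop accumulates into Force over all ntest repetitions, while the array version overwrites Force on every repetition.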
  3 Comments
Bruno Luong on 10 Nov 2020
I know that vecnorm has bad performance. I also rarely use dot for the same reason.
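For reference, here is a small sketch of the three ways of computing row lengths that this comment contrasts: the manual sqrt(sum(.^2)) form, vecnorm, and dot. The matrix A and the names n1, n2, n3 are made up for illustration. The sketch only shows that the three forms give the same numbers; it makes no claim about which is faster, which is best measured on your own release.

```matlab
% Three ways to get the Euclidean length of each row of an N-by-2 matrix.
A = [3 4; 5 12; 8 15];

n1 = sqrt(sum(A.^2, 2));      % manual form
n2 = vecnorm(A, 2, 2);        % vecnorm (R2017b or later): 2-norm along dim 2
n3 = sqrt(dot(A, A, 2));      % dot product of each row with itself

disp([n1 n2 n3]);             % one column per method
```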
Yu Ma on 10 Nov 2020
Nice program, big hug!!
