Improve the speed of nested for loops through vectorization or similar methods

My MATLAB code has a function that is called 10^3 to 10^7 times. I'm curious whether I can improve the speed of the function through vectorization or a similar method.
clc; clear all;
% Test data for function
u = rand(32,33);
Nx = 32;
Nz = 32;
Dz = rand(Nx+1,Nz+1);
u_z = zeros(Nx,Nz+1);
u_z_2 = zeros(Nx,Nz+1);
g = zeros(Nz+1,1);
% Method 1 - Original Implementation with double for loop
tic
for j=1:Nx
    for ell=0:Nz
        g(ell+1) = u(j,ell+1);
    end
    u_z(j,:) = (2.0)*Dz*g;
end
toc
% Method 2 - Remove one for loop
tic
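for j=1:Nx
    g=u(j,:)';
    u_z_2(j,:) = (2.0)*Dz*g;
end
toc
% Compare the two results outside the timed region
% (named err rather than diff to avoid shadowing the built-in diff)
err = norm(u_z - u_z_2,inf);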
Repeating these for loops 10,000 times gives
clc; clear all;
u = rand(32,33);
Nx = 32;
Nz = 32;
Dz = rand(Nx+1,Nz+1);
u_z = zeros(Nx,Nz+1);
u_z_2 = zeros(Nx,Nz+1);
g = zeros(Nz+1,1);
tic
for rep=1:10000
    for j=1:Nx
        for ell=0:Nz
            g(ell+1) = u(j,ell+1);
        end
        u_z(j,:) = (2.0)*Dz*g;
    end
end
toc
tic
for rep=1:10000
    for j=1:Nx
        g=u(j,:)';
        u_z_2(j,:) = (2.0)*Dz*g;
    end
end
toc
err = norm(u_z - u_z_2,inf); % named err to avoid shadowing the built-in diff
Here the original implementation is slightly faster; the code above returns
Elapsed time is 0.771755 seconds.
Elapsed time is 1.079783 seconds.
Could the speed be improved by implementing vectorization or a similar method?

Accepted Answer

DGM
DGM on 18 Jul 2021
Edited: DGM on 18 Jul 2021
One big speed improvement is to move the scalar multiplication of Dz outside the loop, but if you don't use a loop at all, it doesn't really matter.
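A minimal sketch of the first suggestion, assuming the question's sizes and variable names. `Dz2` is a name introduced here for illustration, and `Dz` is sized (Nz+1)-by-(Nz+1), which matches the question's `rand(Nx+1,Nz+1)` because Nx = Nz there. Since `(2.0)*Dz*g` is evaluated left to right, the loop in the question rescales the whole matrix on every pass; scaling it once up front avoids that work.

```matlab
% Sketch: hoist the constant factor out of the row loop.
Nx = 32;
Nz = 32;
u   = rand(Nx,Nz+1);
Dz  = rand(Nz+1,Nz+1);     % assumption: square, as in the question when Nx = Nz
u_z = zeros(Nx,Nz+1);

Dz2 = 2*Dz;                % scale the matrix once instead of Nx times (Dz2 is hypothetical)
for j = 1:Nx
    % (Nz+1)-by-1 product stored into row j, as in Method 2 of the question
    u_z(j,:) = Dz2*u(j,:).';
end
```

Whether this is worth doing depends on whether the loop is kept at all.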
% Test data for function (I'm using bigger arrays)
Nx = 320;
Nz = 320;
u = zeros(Nx,Nz+1); % note: with an all-zero u every method returns zeros, so the equality checks below pass trivially
ntests = 100; % number of test iterations to average exec time
Dz = rand(Nx+1,Nz+1);
u_z = zeros(Nx,Nz+1);
u_z_2 = zeros(Nx,Nz+1);
g = zeros(Nz+1,1);
% Method 1 - Original Implementation with double for loop
tic
for N = 1:ntests
    for j=1:Nx
        for ell=0:Nz
            g(ell+1) = u(j,ell+1);
        end
        u_z(j,:) = (2.0)*Dz*g;
    end
end
toc/ntests
ans = 0.0257
% Method 2 - Remove one for loop
tic
for N = 1:ntests
    for j=1:Nx
        g=u(j,:)';
        u_z_2(j,:) = (2.0)*Dz*g;
    end
end
toc/ntests
ans = 0.0227
immse(u_z,u_z_2) % result is identical
ans = 0
% simplified
tic
for N = 1:ntests
    uuu = (2*Dz*u.').';
end
toc/ntests
ans = 9.1907e-04
immse(u_z,uuu) % result is identical
ans = 0
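The one-line form works because row j of the result is `2*Dz*u(j,:).'`, so all rows together are the columns of `2*Dz*u.'`, transposed back. The sketch below checks this on small random data and also tries the algebraically equivalent `2*u*Dz.'`. The sizes, the square `Dz`, and the names `u_loop`, `u_vec1`, `u_vec2`, `err1`, `err2` are assumptions made for illustration. The comparison uses a tolerance because a matrix-matrix product and a sequence of matrix-vector products need not round identically.

```matlab
% Sketch: confirm that the row loop and the single matrix product agree.
Nx = 8;
Nz = 8;
u  = rand(Nx,Nz+1);
Dz = rand(Nz+1,Nz+1);      % assumption: square, as in the question when Nx = Nz

% Loop form, one row at a time (Method 2 of the question)
u_loop = zeros(Nx,Nz+1);
for j = 1:Nx
    u_loop(j,:) = 2*Dz*u(j,:).';
end

% One-line form from the answer, and an equivalent form with a single transpose
u_vec1 = (2*Dz*u.').';
u_vec2 = 2*u*Dz.';

% Largest elementwise deviation from the loop result
err1 = max(abs(u_loop(:) - u_vec1(:)));
err2 = max(abs(u_loop(:) - u_vec2(:)));
fprintf('max abs difference: %g and %g\n', err1, err2);
```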
When you're trying to find out how to make things fast, how you scale the test to emphasize the execution time can matter. Increasing the number of iterations or the size of the inputs may reveal different things. It all depends on what you expect to do.
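One way to see the effect of scale is to repeat the comparison at several input sizes. The following is only a sketch: the size list, the repetition count, and the names `sizes`, `t_loop`, `t_vec`, and `u_vec` are assumptions, and the timings it prints will vary by machine and MATLAB release.

```matlab
% Sketch: time the row loop and the one-line product at a few problem sizes.
sizes  = [32 128 320];     % hypothetical sizes; adjust to the real workload
ntests = 20;               % repetitions to average over

for N = sizes
    u   = rand(N,N+1);
    Dz  = rand(N+1,N+1);
    u_z = zeros(N,N+1);

    % Row-by-row product, as in Method 2 of the question
    tic
    for rep = 1:ntests
        for j = 1:N
            u_z(j,:) = 2*Dz*u(j,:).';
        end
    end
    t_loop = toc/ntests;

    % Single matrix product, as in the simplified version
    tic
    for rep = 1:ntests
        u_vec = (2*Dz*u.').';
    end
    t_vec = toc/ntests;

    fprintf('N = %4d: loop %.3e s, vectorized %.3e s\n', N, t_loop, t_vec);
end
```

The built-in `timeit`, which takes a function handle and manages repetition itself, is an alternative to hand-rolled `tic`/`toc` averaging.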

Version

R2019a
