# Suggestions for vectorizing double/triple for loops in Matlab

3 views (last 30 days)
Matthew Kehoe on 13 Jul 2021
Edited: Jan on 13 Jul 2021
My Matlab code has multiple double/triple for loops which I believe cause a bottleneck when the index summation variables reach large values. I've tried to implement vectorization instead of double/triple for loops with some small performance gains. I'm interested in understanding best practices for changing double/triple for loops into vectorized Matlab code.
My real code includes 10-15 functions that implement variations of double/triple for loops. I'm curious on how I could improve or rewrite these for loops as vectorized code. I've included two examples below of a double for loop and a triple for loop. I'm interested in seeing if these could be improved. I plan to rewrite all of the for loops as vectorized code (if possible).
First example: This is a double for loop. Implementating the builtin vectorization in Matlab appears to increase the computational speed. I'm not sure if this could be done more efficiently.
close all;
clear all;
clc;
% Test 1: Small M,N
M = 50;
N = 50;
c = rand(N,M)+i*rand(N,M);
eps = .3;
delt = .4;
tStart = tic;
polyv = 0;
for n = 1:N
for m = 1:M
polyv = polyv + c(n,m)*eps^n*delt^m;
end
end
tEnd = toc(tStart);
fprintf('Testing against a double for loop \n');
fprintf('First test with small N,M\n');
fprintf('Total time by double for loop is %f seconds \n', tEnd);
% This may be a sloppy way of vectorizing the code. I'm not sure if there
% is a more efficient method
tStart = tic;
[nval,mval] = ndgrid(1:N,1:M);
vander = eps.^nval.*delt.^mval;
polyv1 = sum(sum(c.*vander));
tEnd = toc(tStart);
fprintf('Total time by vectorization is %f seconds \n', tEnd);
diff = norm(polyv1 - polyv,inf); % should be tiny
% Test 2: "Large" M,N
M = 1000;
N = 1000;
c = rand(N,M)+i*rand(N,M);
eps = .3;
delt = .4;
tStart = tic;
polyv = 0;
for n = 1:N
for m = 1:M
polyv = polyv + c(n,m)*eps^n*delt^m;
end
end
tEnd = toc(tStart);
fprintf('Second test with large N,M\n');
fprintf('Total time by double for loop is %f seconds \n', tEnd);
tStart = tic;
[nval,mval] = ndgrid(1:N,1:M);
vander = eps.^nval.*delt.^mval;
polyv1 = sum(sum(c.*vander));
tEnd = toc(tStart);
fprintf('Total time by vectorization is %f seconds \n', tEnd);
diff2 = norm(polyv1 - polyv,inf); % should be tiny
These two test return
Testing against a double for loop
First test with small N,M
Total time by double for loop is 0.008831 seconds
Total time by vectorization is 0.007447 seconds
Second test with large N,M
Total time by double for loop is 0.393363 seconds
Total time by vectorization is 0.163277 seconds
So there seems to be a small performance gain with the second method.
Second example: This is a triple for loop. I have created test data (my actual Matlab code code is far more complicated but I'm really only interested in improving performance).
close all;
clear all;
clc;
% Test data
N = 20;
M = 20;
N_Eps = 100;
N_delta = 100;
Nx = 32;
ru = zeros(N_Eps,N_delta);
rl = zeros(N_Eps,N_delta);
alpha = 0;
B = zeros(Nx,1);
C = zeros(Nx,1);
alpha_p = ones(Nx,1);
k_u = 2;
k_w = 1;
tStart = tic;
for ell=1:N_delta
for j=1:N_Eps
for p=1:N
% Real code is similar - this is just test data
if(alpha_p(p)^2 < k_u^2)
B(p) = 1;
end
if(alpha_p(p)^2 < k_w^2)
C(p) = 0;
end
end
ru(j,ell) = sum(B);
rl(j,ell) = sum(C);
end
end
tEnd = toc(tStart);
fprintf('Total time by triple for loop is %f seconds \n', tEnd);
% This has more meaning in my real Matlab code. I'm only inserting in test
% data to improve performance.
ee = 1.0 - ru - rl;
This returns
Total time by triple for loop is 0.013758 seconds
I'm not sure how to vectorize this code through a method similar to the first example.
Do you recommend (in general) vectorization over writing double/triple for loops? Should you always try to use vectorization and avoid writing any double/triple for loops? I'm curious what more experienced Matlab users do.

Jan on 13 Jul 2021
Edited: Jan on 13 Jul 2021
The power operation is very expensive, so avoid it whenever it is possible:
M = 50;
N = 50;
c = rand(N,M) + i*rand(N,M);
epsilon = 0.3; % Do not shadow the importan function "eps"
delt = 0.4;
tic;
for rep = 1:1000
polyv = 0;
for n = 1:N
for m = 1:M
polyv = polyv + c(n,m) * epsilon^n * delt^m;
end
end
end
toc
Elapsed time is 0.545244 seconds.
tic
for rep = 1:1000
polyv = 0;
c1 = 1;
for n = 1:N
c1 = c1 * epsilon;
c2 = 1;
for m = 1:M
c2 = c2 * delt;
polyv = polyv + c(n,m) * c1 * c2;
end
end
end
toc
Elapsed time is 0.218668 seconds.
tic
for rep = 1:1000
[nval,mval] = ndgrid(1:N,1:M);
vander = epsilon.^nval .* delt.^mval;
polyv1 = sum(sum(c .* vander));
end
toc
Elapsed time is 0.348249 seconds.
It has some severe drawbacks: It contains a lot of power operations, but they are called for the same inputs repeatedly. Avoid ndgrid, but calculate the powers of the vectors and create the matrix afterwards:
for rep = 1:1000
vander = (epsilon .^ (1:N).') .* (delt .^ (1:M));
polyv1 = sum(sum(c .* vander));
end
toc
Elapsed time is 0.421993 seconds.
But you can avoid the power operation by using cumprod():
tic
for rep = 1:1000
c1 = cumprod(repmat(epsilon, N, 1));
c2 = cumprod(repmat(delt, 1, M));
polyv = sum(c .* c1 .* c2, 'all');
end
toc
Elapsed time is 0.049699 seconds.
Please run this locally again to check the timings.
In your 2nd example, the repeated squaring of alpha_p, k_u and k_u is a waste of time. Do this once before the loops.
For real values, a^2 < b^2 is equivalent to abs(a) < abs(b).
A vectorized version of:
% Original:
for p=1:N
if(alpha_p(p)^2 < k_u^2)
B(p) = 1;
end
if(alpha_p(p)^2 < k_w^2)
C(p) = 0;
end
end
% Vectorized, alpha_p_2 = alpha_p .^ 2 etc. defined before the loops
B(alpha_p_2 < k_u_2) = 1;
C(alpha_p_2 < k_w_2) = 0;

R2020a

### Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!