Efficient way of Vectorization

Nadeem Ahmed on 29 Nov 2022
Commented: Matt J on 30 Nov 2022
Hello, I have searched everywhere for a clear explanation of how to vectorize efficiently. I would like to know how vectorization can be applied efficiently to a problem like this one:
clc
clear
close all
n=1000;
C1=zeros(n,n);
C2=zeros(n,n);
A=rand(n,n);
B=rand(n,n);
tic
for i=2:n-1
    for j=2:n-1
        C1(i,j) = (A(i,j)*B(i,j-1) + A(i-1,j)*B(i+1,j-1))/(A(i,j+1)*B(i+1,j));
    end
end
toc   % Elapsed time is 0.049266 seconds.
% VECTORIZATION
tic
C2(2:n-1,2:n-1)=(A(2:n-1,2:n-1).*B(2:n-1,1:n-2) + A(1:n-2,2:n-1).*B(3:n,1:n-2))./(A(2:n-1,3:n).*B(3:n,2:n-1));
toc   % Elapsed time is 0.014871 seconds.
norm(C1-C2)   % ans = 0
This is a very basic example. Vectorization does improve on the loop here, but not by much. If I add more divisions and multiplications to the same expression, the vectorized version becomes even slower than the for loop. If anybody has any suggestions, they would be very helpful.
8 Comments
Mike Croucher on 30 Nov 2022
Thanks. So for N,M=50, the code runs in 0.01 seconds on my machine.
Increasing to N,M=100, the code runs in 0.22 seconds.
Trying N,M=200, I run out of memory on my 32 GB laptop.
What values of N and M are you interested in, and how fast do you need the code to be?
Nadeem Ahmed on 30 Nov 2022
I will use this function for N,M>80, and I need to call it thousands of times, so each call should take negligible time. Any suggestions are welcome.


Accepted Answer

Matt J on 30 Nov 2022
Edited: Matt J on 30 Nov 2022
Unfortunately, this is a situation where the for loop is the fastest option. This is because the vectorized solution does much more memory allocation than it should. I have raised this issue with MathWorks staff, but I am not sure what is being done about it.
function test
n=1000;
C1=zeros(n,n);
C2=zeros(n,n);
A=rand(n,n);
B=rand(n,n);
timeit(@()method1)   % ans = 0.0161
timeit(@()method2)   % ans = 0.0210
    function method1
        for i=2:n-1
            for j=2:n-1
                C1(i,j) = (A(i,j)*B(i,j-1) + A(i-1,j)*B(i+1,j-1))/(A(i,j+1)*B(i+1,j));
            end
        end
    end
    function method2
        C2(2:n-1,2:n-1)=(A(2:n-1,2:n-1).*B(2:n-1,1:n-2) + A(1:n-2,2:n-1).*B(3:n,1:n-2))./(A(2:n-1,3:n).*B(3:n,2:n-1));
    end
end
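To put a rough number on the allocation cost described in this answer, the sketch below measures the size of one interior block and multiplies it by the number of intermediate arrays the vectorized expression would create if every slice and every arithmetic step were materialized separately. That count is an assumption made for illustration: MATLAB's execution engine may reuse or avoid some of these buffers, so the total is an upper-bound estimate, not a measurement. The names `nSlices`, `nArith`, and `totalMB` are made up for this example.

```matlab
n = 1000;
A = rand(n,n);

% One interior block, the shape used six times in the vectorized expression.
S = A(2:n-1, 2:n-1);
info = whos('S');
blockBytes = info.bytes;          % (n-2)^2 doubles at 8 bytes each

% Assumed worst case: every slice and every arithmetic step gets its own array.
nSlices = 6;                      % the A(...) and B(...) sub-blocks
nArith  = 5;                      % three products, one sum, one quotient
totalMB = (nSlices + nArith) * blockBytes / 1e6;

fprintf('one block: %d bytes, assumed worst case per call: %.1f MB\n', blockBytes, totalMB);
```

For n=1000 each block is about 8 MB, so under this assumption a single evaluation could touch on the order of tens of megabytes of temporary storage, while the loop version writes only into the preallocated C1.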
9 Comments
Bruno Luong on 30 Nov 2022
Edited: Bruno Luong on 30 Nov 2022
I don't think the problem is allocating memory, but actually indexing with truncation index, which requires elements in memory to be rearranged.
I'm not surprised that making vectorized code as fast as the for loop would require major development of the internal engine (for instance, using metadata that describes a subarray of an array without copying the data).
Indexing is always the bottleneck in MATLAB.
Matt J on 30 Nov 2022
"I don't think the problem is allocating memory, but actually indexing with truncation index"
I'm not sure what a "truncation index" refers to here. In any case, the subsref operations are definitely to blame: when we revise the test so that the indexing is done ahead of time, outside the timed functions, the vectorized version is much more competitive with the loops:
function test
n=1000;
C1=zeros(n,n);
C2=zeros(n,n);
A=rand(n,n);
B=rand(n,n);
[Q1,Q2,Q3,Q4,Q5,Q6]=...
    deal( A(2:n-1,2:n-1) , B(2:n-1,1:n-2), A(1:n-2,2:n-1),...
    B(3:n,1:n-2), A(2:n-1,3:n), B(3:n,2:n-1) );
timeit(@()method1)   % ans = 0.0149
timeit(@()method2)   % ans = 0.0051
    function method1
        for i=2:n-1
            for j=2:n-1
                C1(i,j) = (A(i,j)*B(i,j-1) + A(i-1,j)*B(i+1,j-1))/(A(i,j+1)*B(i+1,j));
            end
        end
    end
    function method2
        C2(2:n-1,2:n-1)=(Q1.*Q2 + Q3.*Q4)./(Q5.*Q6);
    end
end
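One way to check how much of the vectorized run time goes to sub-array extraction is to time the slicing and the arithmetic separately. The sketch below is not from the thread: the handle names `sliceOnly` and `arithOnly` are made up, and the timings will vary with the machine and MATLAB release.

```matlab
n = 1000;
A = rand(n,n);
B = rand(n,n);

% Extraction only: build the six sub-blocks, do no arithmetic.
sliceOnly = @() {A(2:n-1,2:n-1), B(2:n-1,1:n-2), A(1:n-2,2:n-1), ...
                 B(3:n,1:n-2),   A(2:n-1,3:n),   B(3:n,2:n-1)};

% Arithmetic only: the blocks are extracted once, outside the timed handle.
Q = sliceOnly();
arithOnly = @() (Q{1}.*Q{2} + Q{3}.*Q{4}) ./ (Q{5}.*Q{6});

tSlice = timeit(sliceOnly);
tArith = timeit(arithOnly);
fprintf('slicing: %.4f s, arithmetic: %.4f s\n', tSlice, tArith);
```

Precomputing the blocks, as the revised test does with Q1 to Q6, only removes the slicing cost when A and B stay fixed between calls. If the inputs change on every call, the extraction has to be repeated each time.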


More Answers (1)

Mike Croucher on 30 Nov 2022
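Switch the order of the loops around. MATLAB stores arrays in column-major order, so with i (the row index) in the inner loop you operate on the matrices column-wise and access memory contiguously, which is faster.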
test
loops
ans = 0.0884
loops 2
ans = 0.0203
vector
ans = 0.0928
function test
n=2000;
C1=zeros(n,n);
C2=zeros(n,n);
A=rand(n,n);
B=rand(n,n);
disp('loops')
timeit(@()loops)
disp('loops 2')
timeit(@()loops2)
disp('vector')
timeit(@()vector)
    function loops
        for i=2:n-1
            for j=2:n-1
                C1(i,j) = (A(i,j)*B(i,j-1) + A(i-1,j)*B(i+1,j-1))/(A(i,j+1)*B(i+1,j));
            end
        end
    end
    function loops2
        for j=2:n-1
            for i=2:n-1
                C1(i,j) = (A(i,j)*B(i,j-1) + A(i-1,j)*B(i+1,j-1))/(A(i,j+1)*B(i+1,j));
            end
        end
    end
    function vector
        C2(2:n-1,2:n-1)=(A(2:n-1,2:n-1).*B(2:n-1,1:n-2) + A(1:n-2,2:n-1).*B(3:n,1:n-2))./(A(2:n-1,3:n).*B(3:n,2:n-1));
    end
end
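The column-wise point can be seen on a tiny matrix. MATLAB stores arrays in column-major order, so moving down a column steps through consecutive linear indices, while moving along a row jumps by the number of rows. The matrix `M` and the other variable names below are made up for this illustration.

```matlab
% Fill a 3-by-4 matrix with 1..12 in memory (linear index) order.
M = reshape(1:12, 3, 4);

firstColumn = M(:, 1).';            % neighbours in memory
firstRow    = M(1, :);              % elements 3 positions apart in memory

% Linear index of the element below vs. the element to the right of M(1,1).
idxDown  = sub2ind(size(M), 2, 1);
idxRight = sub2ind(size(M), 1, 2);

disp(firstColumn)                   % 1 2 3
disp(firstRow)                      % 1 4 7 10
fprintf('down: %d, right: %d\n', idxDown, idxRight);   % down: 2, right: 4
```

With `for j` outside and `for i` inside, as in loops2, consecutive iterations touch neighbouring elements of each array, which is the access pattern this layout favours.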
4 Comments
Nadeem Ahmed on 30 Nov 2022
Yes, you are right, because I changed all my for loops but still didn't get any improvement.
Dyuman Joshi on 30 Nov 2022
This is neat, @Mike Croucher! Learned something new today :D


Release: R2022a
