Is vectorized code always faster than loops? Any exceptions?
[EDIT: 20110727 09:35 CDT - reformat - WDR]
I have a critical chunk of code with six nested for-loops. I vectorized the innermost three, and the vectorized version (with exactly the same configuration otherwise, on the same computer) takes more than twice the run time. I ran each version a few times; the results are below. Any light on this behaviour is appreciated. Thanks.
% fem_nought is the file with loops. fem_optimized is the one with the vectorized equivalent of the innermost 3 loops.
>>fem_optimized
Elapsed time is 10.073242 seconds.
>> fem_optimized
Elapsed time is 9.588474 seconds.
>> fem_optimized
Elapsed time is 9.872822 seconds.
>> fem_nought
Elapsed time is 4.047568 seconds.
>> fem_nought
Elapsed time is 3.678311 seconds.
>> fem_nought
Elapsed time is 3.672811 seconds.
Trimmed versions of both codes are below (the declarations of many variables have been removed):
LOOPS version:
for k=1:nel
for ri=1:8
for si=1:8
for mn=1:4
for nm=1:4
for km=1:4
r=.5*(a*p(mn)+r1+r2);
s=.5*(b*p(nm)+s3+s2);
t=.5*(c*p(km)+t1+t5);
a1=-.02*s+0.5*r*(1-r^2)+.05*t;
a2=-.05*t-.5*s;
%...............SHAPE FUNCTION..........................
N(1)=((r-r2)/(r1-r2))*((s-s4)/(s1-s4))*((t-t5)/(t1-t5));
N(2)=((r-r1)/(r2-r1))*((s-s3)/(s2-s3))*((t-t6)/(t2-t6));
N(3)=((r-r4)/(r3-r4))*((s-s2)/(s3-s2))*((t-t7)/(t3-t7));
N(4)=((r-r3)/(r4-r3))*((s-s1)/(s4-s1))*((t-t8)/(t4-t8));
N(5)=((r-r6)/(r5-r6))*((s-s8)/(s5-s8))*((t-t1)/(t5-t1));
N(6)=((r-r5)/(r6-r5))*((s-s7)/(s6-s7))*((t-t2)/(t6-t2));
N(7)=((r-r8)/(r7-r8))*((s-s6)/(s7-s6))*((t-t3)/(t7-t3));
N(8)=((r-r7)/(r8-r7))*((s-s5)/(s8-s5))*((t-t4)/(t8-t4));
Nr(1)=(1/(r1-r2))*((s-s4)/(s1-s4))*((t-t5)/(t1-t5));
Nr(2)=(1/(r2-r1))*((s-s3)/(s2-s3))*((t-t6)/(t2-t6));
Nr(3)=(1/(r3-r4))*((s-s2)/(s3-s2))*((t-t7)/(t3-t7));
Nr(4)=(1/(r4-r3))*((s-s1)/(s4-s1))*((t-t8)/(t4-t8));
Nr(5)=(1/(r5-r6))*((s-s8)/(s5-s8))*((t-t1)/(t5-t1));
Nr(6)=(1/(r6-r5))*((s-s7)/(s6-s7))*((t-t2)/(t6-t2));
Nr(7)=(1/(r7-r8))*((s-s6)/(s7-s6))*((t-t3)/(t7-t3));
Nr(8)=(1/(r8-r7))*((s-s5)/(s8-s5))*((t-t4)/(t8-t4));
Ns(1)=((r-r2)/(r1-r2))*(1/(s1-s4))*((t-t5)/(t1-t5));
Ns(2)=((r-r1)/(r2-r1))*(1/(s2-s3))*((t-t6)/(t2-t6));
Ns(3)=((r-r4)/(r3-r4))*(1/(s3-s2))*((t-t7)/(t3-t7));
Ns(4)=((r-r3)/(r4-r3))*(1/(s4-s1))*((t-t8)/(t4-t8));
Ns(5)=((r-r6)/(r5-r6))*(1/(s5-s8))*((t-t1)/(t5-t1));
Ns(6)=((r-r5)/(r6-r5))*(1/(s6-s7))*((t-t2)/(t6-t2));
Ns(7)=((r-r8)/(r7-r8))*(1/(s7-s6))*((t-t3)/(t7-t3));
Ns(8)=((r-r7)/(r8-r7))*(1/(s8-s5))*((t-t4)/(t8-t4));
Nt(1)=((r-r2)/(r1-r2))*((s-s4)/(s1-s4))*(1/(t1-t5));
Nt(2)=((r-r1)/(r2-r1))*((s-s3)/(s2-s3))*(1/(t2-t6));
Nt(3)=((r-r4)/(r3-r4))*((s-s2)/(s3-s2))*(1/(t3-t7));
Nt(4)=((r-r3)/(r4-r3))*((s-s1)/(s4-s1))*(1/(t4-t8));
Nt(5)=((r-r6)/(r5-r6))*((s-s8)/(s5-s8))*(1/(t5-t1));
Nt(6)=((r-r5)/(r6-r5))*((s-s7)/(s6-s7))*(1/(t6-t2));
Nt(7)=((r-r8)/(r7-r8))*((s-s6)/(s7-s6))*(1/(t7-t3));
Nt(8)=((r-r7)/(r8-r7))*((s-s5)/(s8-s5))*(1/(t8-t4));
p1(ri,si,k)=a1*N(ri)*Ns(si)*w(mn)*w(nm)*w(km)*.125*a*b*c;
p2(ri,si,k)=a2*N(ri)*Nt(si)*w(mn)*w(nm)*w(km)*.125*a*b*c;
%Elemental Stiffness Matrix......................
ke(ri,si,k) = ke(ri,si,k) + p1(ri,si,k) + p2(ri,si,k);
end
end
end
end
end
end
VECTORIZED VERSION
for k=1:nel
r=.5*(a*p(mn)+r1+r2);
s=.5*(b*p(nm)+s3+s2);
t=.5*(c*p(km)+t1+t5);
Nr = zeros(4,4,4,8);
N = zeros(4,4,4,8);
Ns = zeros(4,4,4,8);
Nt = zeros(4,4,4,8);
for ri=1:8
for si=1:8
%...............SHAPE FUNCTION..........................
Nr(:,:,:,1)=(1/(r1-r2))*((s-s4)/(s1-s4)).*((t-t5)/(t1-t5));
Nr(:,:,:,2)=(1/(r2-r1))*((s-s3)/(s2-s3)).*((t-t6)/(t2-t6));
Nr(:,:,:,3)=(1/(r3-r4))*((s-s2)/(s3-s2)).*((t-t7)/(t3-t7));
Nr(:,:,:,4)=(1/(r4-r3))*((s-s1)/(s4-s1)).*((t-t8)/(t4-t8));
Nr(:,:,:,5)=(1/(r5-r6))*((s-s8)/(s5-s8)).*((t-t1)/(t5-t1));
Nr(:,:,:,6)=(1/(r6-r5))*((s-s7)/(s6-s7)).*((t-t2)/(t6-t2));
Nr(:,:,:,7)=(1/(r7-r8))*((s-s6)/(s7-s6)).*((t-t3)/(t7-t3));
Nr(:,:,:,8)=(1/(r8-r7))*((s-s5)/(s8-s5)).*((t-t4)/(t8-t4));
N(:,:,:,1) = (r-r2).*Nr(:,:,:,1);
N(:,:,:,2) = (r-r1).*Nr(:,:,:,2);
N(:,:,:,3) = (r-r4).*Nr(:,:,:,3);
N(:,:,:,4) = (r-r3).*Nr(:,:,:,4);
N(:,:,:,5) = (r-r6).*Nr(:,:,:,5);
N(:,:,:,6) = (r-r5).*Nr(:,:,:,6);
N(:,:,:,7) = (r-r8).*Nr(:,:,:,7);
N(:,:,:,8) = (r-r7).*Nr(:,:,:,8);
Ns(:,:,:,1) = N(:,:,:,1)./(s-s4);
Ns(:,:,:,2) = N(:,:,:,2)./(s-s3);
Ns(:,:,:,3) = N(:,:,:,3)./(s-s2);
Ns(:,:,:,4) = N(:,:,:,4)./(s-s1);
Ns(:,:,:,5) = N(:,:,:,5)./(s-s8);
Ns(:,:,:,6) = N(:,:,:,6)./(s-s7);
Ns(:,:,:,7) = N(:,:,:,7)./(s-s6);
Ns(:,:,:,8) = N(:,:,:,8)./(s-s5);
Nt(:,:,:,1) = N(:,:,:,1)./(t-t5);
Nt(:,:,:,2) = N(:,:,:,2)./(t-t6);
Nt(:,:,:,3) = N(:,:,:,3)./(t-t7);
Nt(:,:,:,4) = N(:,:,:,4)./(t-t8);
Nt(:,:,:,5) = N(:,:,:,5)./(t-t1);
Nt(:,:,:,6) = N(:,:,:,6)./(t-t2);
Nt(:,:,:,7) = N(:,:,:,7)./(t-t3);
Nt(:,:,:,8) = N(:,:,:,8)./(t-t4);
kem = .125*a*b*c * N(:,:,:,ri).*w(mn).*w(nm).*w(km) ...
.* ( (-.02*s+0.5*r.*(1-r.^2)+.05*t).*Ns(:,:,:,si) ...
+ (-.05*t-.5*s).*Nt(:,:,:,si));
ke(ri,si,k) = sum(kem(:));
%
end
end
end
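A note on the vectorized listing: the assignments that fill Nr, N, Ns and Nt do not reference ri or si, yet they sit inside both of those loops, so they are rebuilt on each of the 64 passes per element. Whether this accounts for the slowdown was not measured here. The sketch below only illustrates the general idea of hoisting loop-invariant work, using hypothetical stand-in data (`r`, `c`, `n` and the formula for `N` are assumptions, not the original FEM variables):

```matlab
% Hypothetical stand-in data, not the original FEM variables
n = 8;
r = rand(4,4,4);             % stand-in for the quadrature-point coordinates
c = linspace(1.5, 2.5, n);   % stand-in for the nodal constants

% Version A: the invariant array N is rebuilt on every (ri, si) pass
keA = zeros(n, n);
for ri = 1:n
    for si = 1:n
        N = zeros(4,4,4,n);
        for j = 1:n
            N(:,:,:,j) = (r - c(j)) ./ (1 + c(j));
        end
        kem = N(:,:,:,ri) .* N(:,:,:,si);
        keA(ri,si) = sum(kem(:));
    end
end

% Version B: N is built once, before the (ri, si) loops
N = zeros(4,4,4,n);
for j = 1:n
    N(:,:,:,j) = (r - c(j)) ./ (1 + c(j));
end
keB = zeros(n, n);
for ri = 1:n
    for si = 1:n
        kem = N(:,:,:,ri) .* N(:,:,:,si);
        keB(ri,si) = sum(kem(:));
    end
end

% Both versions should produce the same matrix
maxDiff = max(abs(keA(:) - keB(:)));
```

Version B does the same arithmetic for the result but builds `N` once instead of 64 times.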
Accepted Answer
More Answers (2)
Daniel Shub
on 27 Jul 2011
I am not sure if vectorization is always faster, but loops are not as expensive as they used to be, thanks to the JIT accelerator. I would guess there are examples where loops are faster, but I cannot think of one off the top of my head.
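One way to check this on a particular machine is to time both forms of the same computation with tic/toc. The sketch below is only an illustration on a hypothetical toy problem (a sum of squares over a small vector, with assumed names `x` and `nRep`), not the FEM code from the question. Which form comes out ahead can depend on the MATLAB release, the array size and the hardware.

```matlab
x = rand(1, 64);   % hypothetical small input
nRep = 1e4;        % repeat the work so the elapsed time is measurable

% Loop form: accumulate the sum of squares element by element
tic
for rep = 1:nRep
    sLoop = 0;
    for i = 1:numel(x)
        sLoop = sLoop + x(i)^2;
    end
end
tLoop = toc;

% Vectorized form: one element-wise power followed by a sum
tic
for rep = 1:nRep
    sVec = sum(x.^2);
end
tVec = toc;

fprintf('loop: %.4f s, vectorized: %.4f s\n', tLoop, tVec);
```

Running the script several times, as in the question, gives a feel for the run-to-run variation before drawing conclusions.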
2 Comments
cr
on 27 Jul 2011
Daniel Shub
on 27 Jul 2011
I am not the best person to answer that. I would suggest asking it as a new question to get a good answer.
cr
on 27 Jul 2011