Vectorized code slower than loops?
Alex Kurek
on 26 Aug 2016
Edited: per isakson
on 5 Sep 2016
This question is something of an offshoot of another one. I have the following two versions of the same code:
maxN = 100;
levels = maxN+1;
xElements = 101;
umn = complex(zeros(levels, levels)); % cleaning
bessels = ones(1201, 1201, 101); % 1.09 GB
negMcontainer = ones(1201, 1201, 100);
posMcontainer = negMcontainer;
% Version 1: innermost loop over m vectorized
tic
for j = 1 : xElements
    for i = 1 : xElements
        for n = 1 : 2 : maxN
            nn = n + 1;
            mm = 1;
            m = 1:2:n;
            numOfEl = ceil(n/2);
            umn(nn, mm:mm+numOfEl-1) = bessels(i, j, nn) * posMcontainer(i, j, m);
        end
    end
end
toc
% Version 2: all loops explicit
tic
for j = 1 : xElements
    for i = 1 : xElements
        for n = 1 : 2 : maxN
            nn = n + 1;
            mm = 1;
            for m = 1 : 2 : n
                umn(nn, mm) = bessels(i, j, nn) * posMcontainer(i, j, m);
                mm = mm + 1;
            end
        end
    end
end
toc
It turns out that the loop version is more than 2x faster. Why is that? I know that this can happen if vectorization requires large temporary variables, but (it seems) that is not the case here.
And generally, what (other than parfor) can I do to speed up this code?
Best regards, Alex
1 Comment
Alexandra Harkai
on 2 Sep 2016
Not sure about the speedup possibilities just yet, but regarding the vectorisation, this may be helpful in seeing where the vector/loop implementations make a difference: http://www.matlabtips.com/matlab-is-no-longer-slow-at-for-loops/
Accepted Answer
per isakson
on 2 Sep 2016
Edited: per isakson
on 3 Sep 2016
Given that
- MATLAB stores matrices in column-major order, and
- bessels and posMcontainer are both large,
the transfer of data between memory and the CPU may be more efficient (the caches will work better) if
umn(nn, mm:mm+numOfEl-1) = bessels(i, j, nn) * posMcontainer(i, j, m);
were replaced by
umn(mm:mm+numOfEl-1,nn) = bessels(nn, i, j) * posMcontainer(m, i, j);
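This requires umn, bessels and posMcontainer to be stored with their dimensions permuted accordingly, so that the index that varies in the inner loop comes first. The same should apply to the all-for-loop case.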
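The reordering can be tried out on a small scale. The following is only a sketch under assumptions: the sizes are reduced (maxN = 20 and a hypothetical grid size nx = 15) so it runs quickly, random data stands in for the real arrays, and permute is used to build the reordered copies besselsP and posMP (names made up for this illustration). It checks that both layouts give the same result; it does not measure speed.

```matlab
% Reduced sizes (assumption); the arrays in the question are 1201x1201x101.
maxN = 20;
nx = 15;                                   % hypothetical grid size
bessels       = rand(nx, nx, maxN+1);
posMcontainer = rand(nx, nx, maxN);

% Move the dimension indexed by nn and m to the front, so that the elements
% read together are adjacent in memory (column-major storage).
besselsP = permute(bessels,       [3 1 2]);    % (maxN+1) x nx x nx
posMP    = permute(posMcontainer, [3 1 2]);    % maxN x nx x nx

umnA = zeros(maxN+1);                      % original layout:   umn(nn, mm)
umnB = zeros(maxN+1);                      % transposed layout: umn(mm, nn)
for j = 1:nx
    for i = 1:nx
        for n = 1:2:maxN
            nn = n + 1;
            m = 1:2:n;
            k = numel(m);
            % original indexing: strided reads along dimension 3
            umnA(nn, 1:k) = bessels(i, j, nn) * reshape(posMcontainer(i, j, m), 1, []);
            % reordered indexing: reads along dimension 1
            umnB(1:k, nn) = besselsP(nn, i, j) * posMP(m, i, j);
        end
    end
end
sameResult = isequal(umnA, umnB.');        % both layouts hold the same values
```

Note that permute itself copies the data, so in real code it would make more sense to create the arrays in the reordered layout from the start than to permute them afterwards.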
 
result = runperf('NestedLoops.m');
fullTable = vertcat(result.Samples);
varfun(@mean, fullTable, 'InputVariables', 'MeasuredTime', ...
    'GroupingVariables', 'Name')
ans =
           Name           GroupCount    mean_MeasuredTime
    __________________    __________    _________________
    NestedLoops/test          4              1.3266
    NestedLoops/test_1        4              0.88148
    NestedLoops/test_2        4              0.49775
where NestedLoops.m contains
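% ii outer, jj inner; each slice X(ii,jj,:) is strided in memory
X=rand(100,100,2000);
for ii=1:100
    for jj=1:100
        X(ii,jj,:)=10*X(ii,jj,:);
    end
end

% jj outer, ii inner; same strided slices, visited in memory order
X=rand(100,100,2000);
for jj=1:100
    for ii=1:100
        X(ii,jj,:)=10*X(ii,jj,:);
    end
end

% long dimension first; each slice X(:,ii,jj) is contiguous in memory
X=rand(2000,100,100);
for jj=1:100
    for ii=1:100
        X(:,ii,jj)=10*X(:,ii,jj);
    end
end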
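The differences between the cases are actually larger than the table suggests, since each measured time includes the creation of X, which on its own takes about 0.36 s: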
>> tic, X=rand(100,100,2000);, toc
Elapsed time is 0.355542 seconds.
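One way to keep the allocation out of the measurement is to start the timer only after the array exists. A minimal sketch, assuming reduced array sizes (not the ones benchmarked above) and plain tic/toc instead of runperf; the variable names are illustrative, and a single run like this is noisy, so the numbers should be treated as a rough indication only:

```matlab
% Reduced sizes (assumption); the benchmark above uses 100x100x2000.
n1 = 50; n2 = 50; n3 = 400;

X = rand(n1, n2, n3);                      % allocated before the timer starts
tStart = tic;
for jj = 1:n2
    for ii = 1:n1
        X(ii, jj, :) = 10 * X(ii, jj, :);  % strided access along dimension 3
    end
end
tStrided = toc(tStart);

Y = rand(n3, n1, n2);                      % long dimension first
tStart = tic;
for jj = 1:n2
    for ii = 1:n1
        Y(:, ii, jj) = 10 * Y(:, ii, jj);  % contiguous access along dimension 1
    end
end
tContiguous = toc(tStart);

fprintf('strided: %.4f s, contiguous: %.4f s\n', tStrided, tContiguous);
```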
6 Comments
per isakson
on 3 Sep 2016
Edited: per isakson
on 5 Sep 2016
Thanks, but TLNR.
Neither do I. However, I get the impression that Coder switches the order of the loops to account for the difference in major order.
"slowed down a bit in .mex": Now I believe that one should code for column-major order in MATLAB and that Coder adapts the C code to row-major order. However, it puzzles me that the difference in C is only "a bit", since in MATLAB it is significant.
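For readers unsure what column-major order means in practice, here is a small illustration (the array and variable names are made up for this example). Elements that are neighbours along the first dimension are neighbours in memory, while a step along the last dimension jumps over a whole page of the array:

```matlab
A = reshape(1:6, 2, 3);                % A = [1 3 5; 2 4 6]
memoryOrder = A(:).';                  % 1 2 3 4 5 6: columns are stored one after another

B = zeros(4, 5, 6);
% Linear-index distance between neighbouring elements along each dimension
stride1 = sub2ind(size(B), 2, 1, 1) - sub2ind(size(B), 1, 1, 1);   % 1
stride2 = sub2ind(size(B), 1, 2, 1) - sub2ind(size(B), 1, 1, 1);   % 4  = size(B,1)
stride3 = sub2ind(size(B), 1, 1, 2) - sub2ind(size(B), 1, 1, 1);   % 20 = size(B,1)*size(B,2)
```

By the same arithmetic, for an array of size 1201x1201x101 as in the question, consecutive elements of bessels(i, j, :) are 1201*1201 elements apart in memory.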