GPU performance with short vectors
MatlabNinja
on 30 Mar 2016
Edited: Joss Knight
on 20 Apr 2016
Hello - I see GPU computation underperforming when used for vector manipulation with short lengths.
>> a = rand(1000000, 100,'gpuArray');
>> b = gather(a);
>> tic; for i=1:100 ; eval('q = zeros(1000000,1);for i = 1:100; q = b(:,i)+q;end') ; end;toc
Elapsed time is 45.489811 seconds.
>> tic; for i=1:100 ; eval('qq = zeros(1000000,1);for i = 1:100; q = a(:,i)+q;end') ; end;toc
Elapsed time is 0.875140 seconds.
When I run the same test with short vectors, the GPU underperforms:
>> a = rand(200, 100,'gpuArray');
>> b = gather(a);
>> tic; for i=1:100 ; eval('q = zeros(200,1);for i = 1:100; q = b(:,i)+q;end') ; end;toc
Elapsed time is 0.021727 seconds.
>> tic; for i=1:100 ; eval('qq = zeros(200,1);for i = 1:100; q = a(:,i)+q;end') ; end;toc
Elapsed time is 0.833865 seconds.
Any insight will be appreciated.
Thank you.
Accepted Answer
Joss Knight
on 20 Apr 2016
Edited: Joss Knight
on 20 Apr 2016
Computation in a GPU core is significantly slower than in a modern CPU core. It makes up for that by having a lot of them - thousands. If you don't give it thousands of things to do at once, you're never going to beat the CPU.
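One way to see this effect is to time a single element-wise addition for a few vector lengths on both devices. The following is only a rough sketch, assuming Parallel Computing Toolbox and a supported GPU; the lengths and variable names are illustrative and not taken from the question, and the crossover point will depend on your hardware.

```matlab
% Sketch: time one element-wise addition on CPU and GPU for several lengths.
lengths = [200 20000 2000000];           % illustrative sizes (assumption)
for n = lengths
    xCpu = rand(n, 1);
    yCpu = rand(n, 1);
    xGpu = gpuArray(xCpu);               % copy the same data to the GPU
    yGpu = gpuArray(yCpu);
    tCpu = timeit(@() xCpu + yCpu);      % CPU timing
    tGpu = gputimeit(@() xGpu + yGpu);   % GPU timing, waits for the GPU to finish
    fprintf('n = %8d   CPU %.3g s   GPU %.3g s\n', n, tCpu, tGpu);
end
```

For short vectors the GPU time is typically dominated by fixed per-operation overhead, so it barely changes as n shrinks, while the CPU time keeps dropping.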
In your simple computation above you are unnecessarily using a loop. This may have been for illustrative purposes, but if it reflects your actual code, you will gain back your performance by removing the loop, i.e.
q = sum(a, 2);
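As a quick check that the vectorized form gives the same result as the loop, something like the following could be used. This is a minimal sketch, assuming Parallel Computing Toolbox and a supported GPU; the names qLoop, qVec, maxDiff and tVec are made up for illustration.

```matlab
% Sketch: compare the column-by-column loop with a single vectorized sum.
a = rand(200, 100, 'gpuArray');          % short columns, as in the question

% Loop version: 100 small GPU operations, one per column.
qLoop = zeros(200, 1, 'gpuArray');
for k = 1:size(a, 2)
    qLoop = qLoop + a(:, k);
end

% Vectorized version: one reduction along dimension 2.
qVec = sum(a, 2);

% The two results should agree up to floating-point rounding.
maxDiff = gather(max(abs(qLoop - qVec)));
fprintf('max difference: %g\n', maxDiff);

% gputimeit waits for the GPU to finish before reporting a time.
tVec = gputimeit(@() sum(a, 2));
fprintf('vectorized sum: %g s\n', tVec);
```

Timing with gputimeit rather than tic/toc avoids measuring only the time to queue the work, since GPU operations can run asynchronously.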