For loop indexing gpuArray
I am attempting to run a for loop around an array. For each index of the loop, I extract a vector of data, apply some operations and place it into a result matrix. A compressed example of what I am doing is below.
function test_system
ll = 2^16;
ww = 256;
ol = ll-ww+1;
%Run process on CPU.
data = rand(1,ll);
out = zeros(ww,ol);
for ii = 1:ol
subdata = data(ii:ii+ww-1);
out(:,ii) = fft(subdata);
end
%Run process on GPU
out2 = gpuArray(zeros(ww,ol));
gdata = gpuArray(data);
for ii = 1:ol
subdata = gdata(ii:ii+ww-1); % index the gpuArray copy so that fft runs on the GPU
out2(:,ii) = fft(subdata);
end
%Run process on GPU using array fun.
out3 = arrayfun(@(x) execgpu(gdata,x),1:ol,'UniformOutput',false);
end
function ret = execgpu( data,idx )
ret = fft(data(idx:idx+256-1).');
end
- The CPU code executed in 2s.
- The GPU code executed in 17s.
- The arrayfun code executed in 350s.
This seems off. I only have the arrayfun version because, when I searched for this issue, a few links on the MathWorks website recommended that approach.
Note: the above is a representative subset of what I am attempting to do, not the complete code. The other operations being done require the above approach.
Answers (1)
Edric Ellis
on 24 May 2016
Edited: Edric Ellis on 24 May 2016
I think the best approach here is to vectorise your code so that you're neither calling fft in a loop nor indexing the gpuArray in a loop. (It's often relatively slow to index gpuArray data.) In this case, you can vectorise by forming a matrix on which you can call fft to operate down the columns, like so:
% Parameters
ll = 2^16;
ww = 256;
ol = ll-ww+1;
% Build the input data
dataGpu = gpuArray.rand(1, ll);
% Create an index matrix that we're going to use with dataGpu
idxMat = bsxfun(@plus, (1:ww)', 0:(ol-1));
% Index dataGpu to form a matrix where each column is a sub-vector
% of dataGpu
dataGpuXform = dataGpu(idxMat);
% Make a single vectorised call to fft
out = fft(dataGpuXform);
On my rather old Tesla C2070 GPU, the fft call completes in 0.09 seconds.
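To see why the index matrix reproduces the loop, it can help to run both versions at a size small enough to inspect. The following is a minimal sketch on the CPU, assuming illustrative sizes (ll = 8, ww = 4) that are not from the original question; the names outLoop, outVec and maxDiff are likewise only for this example.

```matlab
% Small sizes so the matrices are easy to inspect (illustrative values only).
ll = 8;            % signal length
ww = 4;            % window length
ol = ll - ww + 1;  % number of overlapping windows

data = rand(1, ll);

% Reference: the loop from the question, one window per iteration.
outLoop = zeros(ww, ol);
for ii = 1:ol
    outLoop(:, ii) = fft(data(ii:ii+ww-1));
end

% Vectorised: column k of idxMat holds the indices k, k+1, ..., k+ww-1.
idxMat = bsxfun(@plus, (1:ww)', 0:(ol-1));
outVec = fft(data(idxMat));   % fft operates down each column

% The two results should agree to within rounding error.
maxDiff = max(abs(outLoop(:) - outVec(:)));
fprintf('Largest difference: %g\n', maxDiff);
```

Wrapping data in gpuArray before the indexing step should give the GPU form shown above. Note that at the sizes in the question, idxMat has 256 x 65281 entries, roughly 134 MB in double precision, so the vectorised form trades memory for speed.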