Slice into gpuArray and perform functions on the GPU with arrayfun

Hamad on 11 Jan 2015
Commented: Joss Knight on 23 Feb 2015
I would like to know how I can index into a given matrix to make pairwise combinations of column-vectors, and perform operations on these vectors - all on the GPU. So consider the simple function below:
function out = sum2Vecs(in1,in2) %in1 and in2 are (n x 1) vectors.
out = sum(in1,1) + sum(in2,1); %Output is a scalar "double".
end
Quick example: an array such as
fullMatrix = rand(3000,100);
Now I choose all pairwise column-vector combinations of "fullMatrix":
idxArray = nchoosek(1:100,2); %All possible pairwise index combinations of "fullMatrix".
nCombinations = size(idxArray,1); %Number of pairs (rows); length() would return the largest dimension instead.
And a simple for-loop performs the "sum2Vecs" function on each combination of two-column vectors:
outArray = zeros(1,nCombinations); %Preallocate the result vector.
for idx = 1 : nCombinations
outArray(idx) = sum2Vecs( fullMatrix(:,idxArray(idx,1)) , fullMatrix(:,idxArray(idx,2)) );
end
Also, a parfor-loop with slicing works fine:
parfor idx = 1 : nCombinations,
in1 = fullMatrix(:,idxArray(idx,1));
in2 = fullMatrix(:,idxArray(idx,2));
outArray(idx) = sum2Vecs(in1,in2);
end
My goal is to perform this loop on the GPU using, e.g., "arrayfun". I am relatively inexperienced with this, so I would appreciate any helpful pointers. In particular, I would like to learn how to index into an array like "fullMatrix" efficiently and send parts of it to each GPU worker.
Thanks very much. Hamad.

Answers (1)

Matt J on 11 Jan 2015
Edited: Matt J on 11 Jan 2015
In the generality that you've described, that kind of computation doesn't look well suited to the GPU. The GPU is for situations where you have lots of parallel tasks, each involving a small chunk of data. The chunks in your example, two 3000x1 vectors, likely wouldn't be small enough unless the operation can be subdivided further.
For that specific example, I would probably try to vectorize on the GPU as follows:
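idxArray = gpuArray( nchoosek(1:100,2).' ); %2 x nCombinations: each column holds one pair of column indices.
A = gpuArray(fullMatrix);
m = size(A,1);
%A(:,idxArray) places the two columns of each pair side by side; reshape stacks each pair into one (2*m x 1) column.
outArray = sum( reshape(A(:,idxArray), 2*m, []), 1 ); %1 x nCombinations row of pair sums.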
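For this particular sum, the result for a pair is just the sum of two per-column totals, so the large gathered temporary can be avoided. The following is only a sketch under that assumption: it works because sum2Vecs splits into one reduction per column, and it would not carry over to an operation that genuinely mixes the two vectors. The name colSums is introduced here for illustration.

```matlab
% Sketch: pairwise column sums without building an (m x 2*nCombinations) temporary.
% Assumes the per-pair operation is sum(in1,1) + sum(in2,1), as in sum2Vecs.
fullMatrix = rand(3000,100);
A = gpuArray(fullMatrix);                  % Copy the data to the GPU once.
idxArray = nchoosek(1:size(A,2),2);        % nCombinations x 2 index pairs (on the CPU).
colSums = sum(A,1);                        % 1 x 100 row: one total per column, computed on the GPU.
% Look up the two totals of every pair and add them (1 x nCombinations row).
outArray = colSums(idxArray(:,1).') + colSums(idxArray(:,2).');
result = gather(outArray);                 % Bring the result back to the CPU if needed.
```

Each column is reduced once rather than once per pair it appears in, so the work on the GPU drops from 2*nCombinations column reductions to one per column.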
  4 Comments
Joss Knight on 23 Feb 2015
arrayfun can take a user-defined function, as long as that function carries out scalar operations. You can also index into arrays in that function as long as the array is passed in as an upvalue (a variable that a nested function reads from its enclosing function's workspace) - see for instance here, the Mandelbrot example on this page and the Monte Carlo example here.
You need to remember that GPU cores are not like parallel workers. Individually, they cannot perform complex vector operations; only taken together do they do so. In Parallel Computing Toolbox (PCT), a large number of complex algorithms have been implemented in such a way as to take maximum advantage of the GPU. If you are having trouble formulating your problem in a data-parallel way, then post your real code and we can have a look at whether it is inherently parallelisable. The example you gave - summing vectors - is easily vectorizable, as Matt showed above.
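As a rough illustration of the upvalue approach, the pairwise sum could be written with a nested function that reads the matrix from its parent workspace and does only scalar work per element. This is a minimal sketch, not a recommended implementation: the function names pairSumsArrayfun and pairSum are hypothetical, the input is assumed to be a double matrix, and it relies on GPU arrayfun allowing read-only scalar indexing into parent-workspace variables from a nested function. It needs to be saved in its own function file, and it will likely be slower than the vectorized version above.

```matlab
function outArray = pairSumsArrayfun(fullMatrix)
% Sketch: sum every pairwise combination of columns using GPU arrayfun.
% Assumes fullMatrix is a double matrix with at least two columns.
A = gpuArray(fullMatrix);                  % Upvalue: read by the nested function below.
m = size(A,1);                             % Upvalue: number of rows to loop over.
idxArray = nchoosek(1:size(A,2),2);        % nCombinations x 2 index pairs.
c1All = gpuArray(idxArray(:,1));           % First column index of each pair.
c2All = gpuArray(idxArray(:,2));           % Second column index of each pair.

% arrayfun calls pairSum once per pair; each call sees two scalar indices.
outArray = arrayfun(@pairSum, c1All, c2All);   % nCombinations x 1 gpuArray.

    function s = pairSum(c1, c2)
        % Scalar-only body: accumulate both columns element by element.
        s = 0;
        for r = 1:m
            s = s + A(r,c1) + A(r,c2);     % Read-only scalar indexing into the upvalue A.
        end
    end
end
```

A call such as outArray = gather(pairSumsArrayfun(rand(3000,100))); would return the 4950 pair sums as a column vector on the CPU.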
