Parallel Computing Toolbox (parfor slower than for, GPU slower than CPU)
1. CPU/parfor: How can I measure the data transfer time to the workers?
More precisely, when running a parfor loop like:
how can I measure the transfer time of the array A to each worker? Here A is a 200x200x100 array.
Communication overhead: The specified variable appears inside a loop within different indexing expressions. Because the indices are inconsistent across the uses of the array created by the parfor loop, MATLAB sends the entire array to each worker, resulting in high data communication overhead. For example, the following code elicits this message for c, because there are two different indexing expressions for it.
2. GPU: Are there any functions like randn, randi, randsample for the GPU?
Example: I randomly select some indices:
index = (randi([1 Nsources],1,Nsources_mut));
pob_out(index,1,2) = poblacion(index,1,2)
pagefun performs this operation in parallel on the GPU:
(Here Apage is a 200x200x100 gpuArray and Fs is 200x200.) In this way the GPU does 100 multiplications in parallel.
How can I perform this operation in parallel on the GPU?
This operation is equivalent to
Are the functions trace or sum(sum(.)) available in pagefun? Operations (I) and (II) are too slow.
3. If I do all these calculations in MATLAB 2021, will they run in 2017? Some of my colleagues only have that version.
Joss Knight on 10 Apr 2021
1) CPU/parfor: How can I measure the data transfer time when using parfor (since parfor is slower than for when accessing part of an array)?
Your snippet of code indexes variable A like this: A(:,:,i). Because i is the loop variable, this should result in only the correct slices of A going to each worker. So the premise you state that the whole array is copied is (should be) incorrect.
There isn't a way that I know of to measure the data transfer overhead independently in a parfor, since data transfer and loop execution are interleaved. You can probably infer it from wall clock timings measured by tic and toc - perhaps someone else has some tricks up their sleeve.
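As a rough sketch of the wall-clock approach, the snippet below times the same loop with for and with parfor, and also uses ticBytes/tocBytes to report how many bytes went to and from each worker. The array contents and the loop body are hypothetical stand-ins, since the original loop is not shown, and the byte counts only hint at the transfer cost rather than giving a transfer time.

```matlab
% Hypothetical stand-ins for the question's data; the real loop body is not shown.
A = rand(200, 200, 100);
n = size(A, 3);
out = zeros(1, n);

pool = gcp();                 % start or reuse a parallel pool

% Serial baseline.
tSerial = tic;
for i = 1:n
    out(i) = sum(sum(A(:, :, i)));
end
serialTime = toc(tSerial);

% Same work in parfor. A(:,:,i) is sliced, so each worker receives only its pages.
ticBytes(pool);
tPar = tic;
parfor i = 1:n
    out(i) = sum(sum(A(:, :, i)));
end
parTime = toc(tPar);
bytes = tocBytes(pool);       % one row per worker: bytes sent to, received from

fprintf('for: %.4f s, parfor: %.4f s\n', serialTime, parTime);
disp(bytes);
```

If the byte counts per worker are close to the size of the whole array rather than a share of it, that suggests the array is being broadcast instead of sliced. Run the parfor loop twice and time the second run, because the first run may include pool start-up cost.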
2) GPU: Are there any functions like randn, randi, randsample for the GPU? I need them to randomly select some coordinates of an array in each loop iteration.
Yes. Use 'gpuArray' as an optional argument to rand, randn or randi.
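A minimal sketch of that syntax, reusing the variable names from the question. The sizes Nsources and Nsources_mut and the contents of poblacion are assumptions for illustration, since the question does not give them.

```matlab
% Sizes are hypothetical; the question does not state Nsources or Nsources_mut.
Nsources = 200;
Nsources_mut = 20;

% Random integer indices generated directly on the GPU.
index = randi([1 Nsources], 1, Nsources_mut, 'gpuArray');

% Uniform and normal samples created on the GPU.
u = rand(Nsources, 1, 'gpuArray');
z = randn(Nsources, 1, 'gpuArray');

% Use the GPU indices on gpuArray data, mirroring the assignment in the question.
poblacion = rand(Nsources, 1, 2, 'gpuArray');
pob_out = zeros(Nsources, 1, 2, 'gpuArray');
pob_out(index, 1, 2) = poblacion(index, 1, 2);
```

Creating the indices on the GPU avoids a host-to-device copy in every iteration.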
3) pagefun: Includes matrix multiplication. But how can I perform sum(sum(A)) in parallel on the GPU?
sum(sum(A)) works on the GPU. Also sum(A,'all') or sum(A,[1 2]). It's a good idea to actually try things before asking a question!
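For illustration, here is a sketch of per-page sums on a gpuArray of the size given in the question, with random data as a stand-in. The nested sum form should also work in older releases; as far as I know the vector-dimension and 'all' syntaxes arrived in R2018b, so treat that as something to check for colleagues on 2017. The per-page trace at the end is one possible loop-free formulation using reshape, not a built-in pagefun operation.

```matlab
Apage = rand(200, 200, 100, 'gpuArray');   % size taken from the question

% One sum per page using nested sums over dimensions 1 and 2.
s1 = sum(sum(Apage, 1), 2);     % 1x1x100
s1 = s1(:);                     % 100x1

% Equivalent result with the vector-dimension syntax (newer releases).
s2 = sum(Apage, [1 2]);
s2 = s2(:);

% Trace of every page without a loop: sum the diagonal entries of each page.
n = size(Apage, 1);
pages = reshape(Apage, n*n, []);         % each column holds one page
tr = sum(pages(1:n+1:end, :), 1).';      % 100x1, rows 1, n+2, 2n+3, ... are the diagonal
```

All three results stay on the GPU; call gather only when the values are needed on the host.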