Calculating the mean and standard deviation of a struct array (containing 7000 1*2000 arrays): GPU arrayfun computation is much slower than CPU
I created an array (size 7000 * 2000) and want to calculate the mean and standard deviation of each row (not of the whole array). My desired output is therefore two arrays that contain the mean and SD of each row, not a single value.
I first transform it into a GPU array, and then turn the data array into a struct array. Each element of OneStream holds a 1*2000 array in its field f1.
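For reference, the desired row-wise output can be sketched with the dimension arguments of mean and std. This is only a minimal CPU sketch, not the code timed in this post: the names rowMean and rowSD are illustrative, and randn is swapped in for normrnd so that the Statistics Toolbox is not needed.

```matlab
% Synthetic stand-in for the 7000-by-2000 array described above.
rng(0);
data = 20 + randn(7000, 2000);

% Operate along dimension 2 so that each row yields one value;
% both outputs are 7000-by-1 column vectors.
rowMean = mean(data, 2);
rowSD   = std(data, 0, 2);   % 0 selects the default N-1 normalization

% Cross-check the first row against the per-row formulation.
assert(abs(rowMean(1) - mean(data(1, :))) < 1e-10);
assert(abs(rowSD(1)   - std(data(1, :)))  < 1e-10);
```

Both calls also accept a gpuArray input, so the same two lines should apply if data has been moved to the GPU first.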
function OneStream = DisassembleArray(data)
    % Store each row of data in field f1 of one struct array element.
    tic
    numRows = size(data, 1);
    for count = 1 : numRows
        OneStream(count).f1 = data(count, :);
    end
    fprintf("Disassemble Array timing: %.5f seconds \n", toc)
end
The mean and SD are then calculated on the GPU data with arrayfun, as shown below.
clear all
data = normrnd(20, 1, [7000, 2000]);
% Move the array to the GPU
tic
data = gpuArray(data);
fprintf("Moving array into GPU timing: %.3f seconds \n", toc)
OneStream = DisassembleArray(data);
% Mean of each row
tic
dataMetrics.mean = arrayfun(@(x) mean(x.f1), OneStream);
fprintf("GPU Arrayfun mean timing: %.5f seconds \n", toc)
% Standard deviation of each row
tic
dataMetrics.SD = arrayfun(@(x) std(x.f1), OneStream);
fprintf("GPU Arrayfun SD timing: %.5f seconds \n", toc)
I use the tic and toc functions to measure the time needed for the computation:
Moving array into GPU timing: 0.536 seconds
Disassemble Array timing: 0.22666 seconds
GPU Arrayfun mean timing: 1.75950 seconds
GPU Arrayfun SD timing: 3.54295 seconds
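One caveat when reading timings like these: GPU operations in MATLAB can execute asynchronously, so a plain tic/toc pair may stop the clock before the device has finished. A hedged sketch of two alternatives follows, using gputimeit and wait(gpuDevice) from Parallel Computing Toolbox. It assumes R2019b or later for canUseGPU, falls back to timeit on the CPU when no GPU is available, and uses illustrative names (gdata, tMean, tSD) that are not part of the code above.

```matlab
rng(0);
data = 20 + randn(7000, 2000);   % randn stands in for normrnd

if canUseGPU
    gdata = gpuArray(data);

    % gputimeit calls the function several times and waits for the GPU
    % to finish before recording the time.
    tMean = gputimeit(@() mean(gdata, 2));
    tSD   = gputimeit(@() std(gdata, 0, 2));

    % Manual alternative: synchronize with the device before reading toc.
    dev = gpuDevice;
    tic
    rowMean = mean(gdata, 2);
    wait(dev);
    fprintf("Synchronized tic/toc mean timing: %.5f seconds \n", toc)
else
    % No supported GPU: time the same calls on the CPU instead.
    tMean = timeit(@() mean(data, 2));
    tSD   = timeit(@() std(data, 0, 2));
end

fprintf("Row-wise mean timing: %.5f seconds \n", tMean)
fprintf("Row-wise SD timing: %.5f seconds \n", tSD)
```

Because gputimeit and timeit repeat the call internally, the reported value is a typical time per call and is less sensitive to one-off startup costs than a single tic/toc measurement.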
I also have a CPU version of the code, which gives the same output.
clear all
data = normrnd(20, 1, [7000, 2000]);
tic;
% Preallocate one output entry per row
dataMetrics.mean = zeros(7000, 1);
dataMetrics.SD = zeros(7000, 1);
for countEnsembleSize = 1 : 7000
    dataMetrics.mean(countEnsembleSize) = mean(data(countEnsembleSize, :));
    dataMetrics.SD(countEnsembleSize) = std(data(countEnsembleSize, :));
end
fprintf("\n****Finding mean and std****\nFor loop timing: %.5f seconds \n", toc)
This is the time needed for the CPU computation:
Disassemble Array timing: 0.05250 seconds
CPU mean timing: 0.01380 seconds
CPU SD timing: 0.03785 seconds
The GPU is supposed to be much faster than the CPU, but in my case the GPU is about 100 times slower (0.05 s vs. more than 5 s).
I think there are some errors in my code, but I have no clue where.
Would you mind helping me? I am very grateful for your help.