gpu memory code optimization
Dear Wizes,
I would appreciate it if you could help me crack this. My code performs gpuArray operations inside a for loop; the relevant portion is here:
% allocate gpu memory:
A=GPUArray.eye(x,'single'); B=GPUArray.zeros(y,x,'single'); C=GPUArray.zeros(x,y,'single'); % x>>>y
for n=1:t %for loop begins
    ... % not relevant, B and C are 'filled' by specific matrix multiplications
    D=B*A; % size(D)= (y,x)
    E=C*D; % size(E)= (x,x)
    A=A-E;
    clear E D
    ...
end
Note that A, B, C, D and E all change on every iteration of the for loop, since they are reused.
The problem is that x is large, so A and E are huge (2 to 7 GB, depending on x) and exhaust my GPU's memory. I got it to run, albeit slowly, by never forming E in full and instead performing steps 6-7 above (E=C*D and A=A-E) one row of A at a time:
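for i=1:size(A,1)
    E=C(i,:)*D;        % row i of E, size (1,x)
    A(i,:)=A(i,:)-E;   % update only row i of A
end
clear E D              % D is needed by every row, so clear it only after the loop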
1. This works, but it is very slow. Is there a way to do the same calculation for blocks of n rows at once instead of one row at a time (with n chosen according to what the GPU can hold, where x=kn+p and p<n), or by using mtimesx-like bsxfun routines for the matrix multiplication?
2. It would be great if A itself could be split into blocks of rows or columns, or processed one row or column at a time. However, this is beyond me, given that A is the right-hand multiplier in step 5 (D=B*A). This would let me use larger values of x.
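For question 1, the row-wise loop can be generalised to blocks of n rows. The following is a minimal sketch, not a tested GPU implementation: the sizes, the block height n, the test data and the reference Aref are all made up for illustration, and plain CPU arrays are used so that it runs anywhere. The assumption is that the same indexing works unchanged when A, B and C are gpuArray inputs.

```matlab
% Hypothetical small sizes; the post has x >> y and much larger x.
x = 10; y = 3;
n = 4;                                  % block height (assumed knob): x = k*n + p, p < n

% Deterministic made-up data; on a GPU these would be gpuArray inputs.
A = eye(x, 'single');
B = single(reshape(1:y*x, y, x)) / (y*x);
C = single(reshape(1:x*y, x, y)) / (x*y);

Aref = A - C*(B*A);                     % full-size result, only to check the sketch

D = B*A;                                % size (y,x): small, computed once from the old A
for r0 = 1:n:x
    rows = r0:min(r0+n-1, x);           % the last block holds the p leftover rows
    % Only an n-by-x slice of E = C*D exists at any one time.
    A(rows,:) = A(rows,:) - C(rows,:)*D;
end

maxErr = max(abs(A(:) - Aref(:)));      % difference from the unblocked update
```

Because D is computed before the loop, overwriting rows of A in place does not change the result. The largest temporary is n-by-x rather than x-by-x, so n can be raised until the GPU runs out of room.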
Thank you, as always, Octavio
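For question 2, one observation may help: the update A = A - C*(B*A) acts on each column of A independently, since column j of the result depends only on column j of the old A. Under the assumption that B and C are already available when the update runs, A could therefore stay in host memory and be streamed through the GPU in column blocks. The sketch below uses made-up sizes and data, a hypothetical block width m, and CPU arrays; the comments mark where gpuArray and gather would go.

```matlab
% Hypothetical small sizes; the post has x >> y and much larger x.
x = 10; y = 3;
m = 4;                                  % block width (assumed knob)

% Deterministic made-up data. In the GPU version B and C would be gpuArrays,
% while the full A stays on the host.
Ahost = eye(x, 'single');
B = single(reshape(1:y*x, y, x)) / (y*x);
C = single(reshape(1:x*y, x, y)) / (x*y);

Aref = Ahost - C*(B*Ahost);             % full-size result, only to check the sketch

for c0 = 1:m:x
    cols = c0:min(c0+m-1, x);           % the last block may be narrower than m
    Ablk = Ahost(:, cols);              % on a GPU: Ablk = gpuArray(Ahost(:, cols));
    Dblk = B*Ablk;                      % size (y,m): the matching columns of D
    Ablk = Ablk - C*Dblk;               % size (x,m): updated columns of A
    Ahost(:, cols) = Ablk;              % on a GPU: Ahost(:, cols) = gather(Ablk);
end

maxErr = max(abs(Ahost(:) - Aref(:)));  % difference from the unblocked update
```

With this layout neither A nor E is ever held on the GPU in full, at the price of a host-to-device transfer per block, so whether it is faster than the row-wise loop would need to be measured.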
6 Comments
Matt J
on 15 Dec 2014
Are none of these matrices sparse? I know that the GPU doesn't support sparse matrices, but if they are sparse, maybe the CPU is better?
Answers (0)