Parfor and data copy to workers

Giovanni De Luca on 11 Dec 2013
Edited: Matt J on 11 Dec 2013
Hello,
I read this brief explanation about when MATLAB does a copy of data when working with the PCT in
but I need a clearer answer, if possible. Suppose I have 2 parallel workers available and want to apply a time-expensive function fnc to a large sparse matrix A of size [10^5x10^5] and a matrix B of size [10^5x10^2], given a vector vec of scalar values with size [1x1000]. The function returns a positive scalar scal:
function scal=fnc(A,B,elem)
X=(elem*A)\B; % most expensive routine
% do something on X
scal=...
end
Then,
parfor i=1:size(vec,2) % 1000 cycles
scal(1,i)=fnc(A,B,vec(1,i));
end
s_max=max(scal(1,:));
My question is: how many times is the data copied, given 2 available workers and 1000 loop iterations? The point is that if I split vec into 2 parts (one per available worker), i.e.
vec_new=[vec(1,1:500);vec(1,501:1000)]; % [2x500] matrix
and I slightly modify the function fnc, creating a new function fnc_new:
function scal=fnc_new(A,B,vec_elem)
scal=0;
for i=1:size(vec_elem,2)
X=(vec_elem(1,i)*A)\B; % most expensive routine
% do something on X
scal_temp=...
scal=max(scal,scal_temp); % running maximum over this chunk of vec
end
end
and
parfor i=1:size(vec_new,1) % 2 cycles
scal(1,i)=fnc_new(A,vec_new(i,:));
end
s_max=max(scal(1,:));
The two approaches (fnc plus the first parfor, and fnc_new plus the second parfor) give the same final result s_max, but I observed a further speedup with the second one. I wonder whether it's a data-copy issue. I hope I was clear. Thank you in advance.
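One way to check whether data transfer differs between the two approaches is to measure it. The sketch below is only an illustration under stated assumptions: the problem sizes are small stand-ins, the Frobenius norm is a placeholder for the unspecified "do something on X" step, and ticBytes/tocBytes are assumed to be available (they ship with recent releases of the Parallel Computing Toolbox).

```matlab
% Small stand-ins for A, B and vec (sizes are assumptions, not the real ones).
n   = 200;
A   = 10*speye(n) + sprandn(n, n, 0.01);   % sparse, diagonally dominant
B   = rand(n, 5);
vec = linspace(1, 2, 20);

pool = gcp();                              % start or reuse a parallel pool

% Approach 1: one parfor iteration per element of vec.
scal1 = zeros(1, numel(vec));
ticBytes(pool);
parfor i = 1:numel(vec)
    X = (vec(i) * A) \ B;
    scal1(i) = norm(X, 'fro');             % placeholder for "do something on X"
end
bytes1 = tocBytes(pool);                   % one row per worker: [sent, received]

% Approach 2: one parfor iteration per chunk of vec.
vec_new = [vec(1:10); vec(11:20)];         % [2 x 10], one row per chunk
scal2 = zeros(1, size(vec_new, 1));
ticBytes(pool);
parfor i = 1:size(vec_new, 1)
    chunk = vec_new(i, :);
    s = 0;
    for j = 1:numel(chunk)
        X = (chunk(j) * A) \ B;
        s = max(s, norm(X, 'fro'));        % running maximum over the chunk
    end
    scal2(i) = s;
end
bytes2 = tocBytes(pool);

disp(sum(bytes1, 1));                      % total bytes, approach 1
disp(sum(bytes2, 1));                      % total bytes, approach 2
```

Comparing the two totals shows whether the chunked loop really moves less data; if they are close, the speedup probably comes from something else, such as per-iteration scheduling overhead.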
1 Comment
Matt J on 11 Dec 2013
Edited: Matt J on 11 Dec 2013
It looks quite sub-optimal to be doing
X=(elem*A)\B
repeatedly in a loop for the same A and B. You should really be pre-computing X=A\B once and passing that in. Because elem is a scalar, (elem*A)\B equals (A\B)/elem, so inside fnc() you only need
function scal=fnc(X,elem)
X=X/elem; % cheap rescaling, replaces the expensive solve
% do something on X
scal=...
end
Once you do this, the rest of your question might be irrelevant, since you noted that X=(elem*A)\B is your bottleneck anyway. That said, I can't see from the detail your code provides why the second version would be faster.
Other than that, you appear to have a typo where you call
scal(1,i)=fnc_new(A,vec_new(i,:));
with only two arguments. Shouldn't it be
scal(1,i)=fnc_new(A,B,vec_new(i,:));
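The pre-computation idea can be sketched as follows. It relies on the identity (elem*A)\B = (A\B)/elem for a nonzero scalar elem. The problem sizes, the name X0, and the Frobenius norm standing in for "do something on X" are assumptions made for illustration only.

```matlab
% Small stand-ins for A, B and vec (sizes are assumptions, not the real ones).
n   = 200;
A   = 10*speye(n) + sprandn(n, n, 0.01);   % sparse, diagonally dominant
B   = rand(n, 5);
vec = linspace(1, 2, 20);                  % nonzero scalars

X0 = A \ B;                                % the only linear solve

scal = zeros(1, numel(vec));
parfor i = 1:numel(vec)
    X = X0 / vec(i);                       % same as (vec(i)*A)\B, without a solve
    scal(i) = norm(X, 'fro');              % placeholder for "do something on X"
end
s_max = max(scal);

% Sanity check of the identity against a direct solve for one element.
X_ref   = (vec(3) * A) \ B;
rel_err = norm(X_ref - X0 / vec(3), 'fro') / norm(X_ref, 'fro');
```

With this change only X0 (size [n x 5] here) has to reach the workers instead of A and B, and each iteration is a cheap rescaling, so the loop may no longer be worth parallelizing at all.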


Answers (0)
