How to use parallel.pool.Constant to save on overhead for a large array that is constant across all calls to parfeval?
I want to use parallel.pool.Constant to cut down on overhead when calling parfeval in a for loop. However, it seems that it makes no difference. To show this, I use an example that is similar to the example here, but I don't slice the data. Similar results hold if the data is sliced.
N = 80; % number of iterations to run
ppool = parpool("Processes"); % mine defaults to 8 workers
data = rand(100); % data to use at every iteration
C = parallel.pool.Constant(data);
ticBytes(ppool); % only count transfer during for loop
F(1:N) = parallel.FevalFuture;
for i = 1:N
F(i) = parfeval(ppool,@(x) sum(x,"all"),1,C.Value); % just do something with the data
end
A = fetchOutputs(F); % collect it just because
tocBytes(ppool);
This is the result from tocBytes:

Now, if I do the same but pass in data instead of the parallel constant I get the following:
N = 80; % number of iterations to run
ppool = gcp('nocreate'); % reuse the pool if one is already open
if isempty(ppool)
ppool = parpool("Processes"); % mine defaults to 8 workers
end
data = rand(100); % data to use at every iteration
ticBytes(ppool); % only count transfer during for loop
F(1:N) = parallel.FevalFuture;
for i = 1:N
F(i) = parfeval(ppool,@(x) sum(x,"all"),1,data); % just do something with the data
end
A = fetchOutputs(F); % collect it just because
tocBytes(ppool);
Result from tocBytes:

The total bytes sent to the workers are identical, but I was expecting a ~10-fold change since each worker should be called about 10 times. Am I just missing the purpose of parallel.pool.Constant? Is there some other tool I should use to reduce this overhead?
2 Comments
Daniel Bergman
on 26 May 2023
Walter Roberson
on 26 May 2023
Suppose you are using parfor instead of parfeval(), and you have
data = rand(100);
parfor i = 1:N
% some calculation involving data
% some other calculation
% third calculation involving data
end
then does data get sent to all of the workers once at initialization time, or does data get sent to each worker the first time it needs data but never again for the same worker? Or does data get sent each iteration? Or does it get sent multiple times per iteration?
Using parallel.pool.Constant brings some certainty into this: the data value is transmitted to each worker the first time the worker needs data and then gets held in memory on the worker.
Now imagine that the code appears to modify data. parfor can see the constant-ness and flag the modification as inconsistent.
Now imagine that the code uses clear. To be honest, I do not know what will happen in that case.
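As a rough illustration of the pattern described above, here is a minimal parfor sketch. It assumes Parallel Computing Toolbox is available and that a pool is open or can be created automatically; the variable name total and the "+ i" term are made up for the example.

```matlab
% Sketch only: wrap the array once, then read it through C.Value on the workers.
N = 80;                              % number of iterations
data = rand(100);                    % array reused by every iteration
C = parallel.pool.Constant(data);    % build the Constant on the client
total = zeros(1, N);                 % hypothetical output, one value per iteration
parfor i = 1:N
    % C.Value is evaluated on the worker, which holds on to its copy
    total(i) = sum(C.Value, "all") + i;
end
```

The loop body never refers to data directly, only to C.Value, so the array reaches the workers through the Constant.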
Accepted Answer
More Answers (1)
Walter Roberson
on 26 May 2023
0 votes
Version history
R2023a: Constant objects no longer automatically transferred to workers
MATLAB will no longer automatically transfer Constant objects from your current MATLAB session to workers in a parallel pool. MATLAB will send the Constant object to workers only if the object is required to execute your code.
===
To me that implies that each worker needs to actively "pull" the value the first time it needs it. Since different workers would need it at different times, that implies to me that whatever happened historically, the entire contents of the Constant are now being copied to the workers individually. Hypothetically, in the past, there might have been some kind of "broadcast" mode that was able to send the information to all of the workers at the same time without duplicating it for each worker.
The current documentation wording does not completely rule out the possibility that the constants might be deposited into shared memory, but I doubt that is happening.
4 Comments
Daniel Bergman
on 26 May 2023
Edric Ellis
on 30 May 2023
That's not quite how things work in practice. The intention of transferring Constant values to the workers only when necessary is to avoid sending values to pools that don't need the value. With the advent of backgroundPool, a single MATLAB client can have multiple pool-type-entities active at any given time - so it is advantageous to send Constant contents only to pools that need the value.
The only time you ought to notice any difference is if you were relying on observable side-effects of the parallel.pool.Constant constructor function.
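For the parfeval case in the question, one thing worth checking is where C.Value gets evaluated. Writing C.Value in the argument list of parfeval evaluates it on the client, so the plain array is what gets sent with every call. The sketch below instead passes the Constant object itself and reads c.Value inside the function on the worker. It assumes a process-based pool (ticBytes and tocBytes need one); the byte counts are not claimed here because I have not measured them.

```matlab
% Sketch only: pass the Constant object, not its Value, to parfeval.
pool = gcp;                          % reuse the current pool or start the default one
N = 80;                              % number of calls
data = rand(100);                    % array reused by every call
C = parallel.pool.Constant(data);    % build the Constant on the client
ticBytes(pool);                      % count transfer during the loop only
F(1:N) = parallel.FevalFuture;
for i = 1:N
    % The anonymous function dereferences c.Value on the worker
    F(i) = parfeval(pool, @(c) sum(c.Value, "all"), 1, C);
end
A = fetchOutputs(F);                 % N-by-1 vector, one sum per call
tocBytes(pool)
```

Comparing the tocBytes output of this version against the two runs in the question should show whether the Constant is being reused per worker.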
Daniel Bergman
on 30 May 2023
Edric Ellis
on 30 May 2023
@Daniel Bergman um... I think that example could use correcting. Thanks for pointing that out.