Parpool time to launch excessive

jgg on 28 Feb 2016
Answered: Seth on 9 Apr 2018
Hello everyone
I'm working on a large server that is running a large computation. I've been trying to speed this up using the Parallel Computing Toolbox, which means we need to launch and connect to a parallel pool. We've tried several ways of doing this. Right now, the code looks like this:
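cluster = parcluster('local');        % cluster object for the local profile
cluster.NumWorkers = parpool_size;    % raise the worker limit on this object
parpool(cluster, parpool_size)        % pass the object so the limit above is used
p = gcp('nocreate');                  % get the current pool; do not create a new one
if isempty(p)
    a = 0
else
    a = p.NumWorkers
end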
The key line is the parpool command. The size is 32 right now, but we're hoping to scale it up to 200 or so if this works well.
Unfortunately, we cannot get the parpool command to complete. It does not throw an error or crash, but it takes more than 4 hours to execute (at which point the job times out). Does anyone have any idea why this might be the case, or any suggestions for improving the execution time?
If it is relevant: to improve speed, we are running this under the 2014b MATLAB Compiler on a Linux-based system (v83), but we see identical problems with 2015b (v90) as well.

Accepted Answer

jgg on 30 Mar 2016
I'm going to answer this question for myself, approximately.
It seems the issue was not specific to the MATLAB 2014b compiler; we replicated it with 2015b and 2013b (although the number of processors was much smaller).
It seems the issue had to do with the data loading. Basically, our workflow was like this:
setUpParameters();
loadData();
createParallelPool();
doEstimation();
In this case, the loadData() stage was very large: it loaded several gigabytes of data into memory, including a set of anonymous functions. This was fine hardware-wise, but it seems to have made the subsequent parallel pool creation very slow. We believe (but are not sure) that pool creation was so slow because the memory was being replicated several hundred times. Precisely why this was slow, I am not sure.
However, we were able to resolve the problem by moving the createParallelPool() command to the beginning of our function, so the revised workflow looked like:
createParallelPool();
setUpParameters();
loadData();
doEstimation();
Because no jobs were assigned to the pool before the doEstimation() stage, the revised order did not run into the problem we had earlier. Essentially, we dramatically reduced the amount of memory that had to be transferred to the other instances of MATLAB.
As an aside, we ended up using MATLAB 2015b because the 2014b version has great difficulty with anonymous functions in a parallel environment; it would quickly use far too much memory and then start paging, which crashed the program.
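The createParallelPool() step is not shown in this answer. A minimal sketch of what such a helper might look like follows, assuming the default 'local' profile; the function body and the nWorkers argument are illustrative assumptions, not the original code:

```matlab
function pool = createParallelPool(nWorkers)
% Hypothetical helper (not from the original post): open a local pool,
% intended to be called before any large data is loaded into the client.
pool = gcp('nocreate');              % reuse an existing pool if there is one
if isempty(pool)
    c = parcluster('local');         % cluster object for the local profile
    c.NumWorkers = nWorkers;         % raise the worker limit on this object
    pool = parpool(c, nWorkers);     % start the pool from that object
end
end
```

Calling createParallelPool(32) as the first statement of the main function, before setUpParameters() and loadData(), reproduces the revised ordering described above.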
Lessons learned:
  • Keep as little data in RAM as possible when doing parallel computations, and especially avoid complex data types. The matfile command is very useful.
  • Declare parallel pools as early as possible in your program to reduce overhead.
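As an illustration of the first point, here is a minimal sketch of reading a large variable in blocks with matfile instead of loading it whole. The file name estimationData.mat, the variable X, and the block size are assumptions made up for this example; the small random matrix only stands in for real data:

```matlab
% Create a small stand-in data set (hypothetical file and variable names).
X = rand(2500, 4);
save('estimationData.mat', 'X', '-v7.3');   % v7.3 format supports partial loading

m = matfile('estimationData.mat');          % opens the file without loading X
[nRows, nCols] = size(m, 'X');              % size query that does not read the data
blockSize = 1000;                           % rows per block, chosen arbitrarily
total = zeros(1, nCols);
for firstRow = 1:blockSize:nRows
    lastRow = min(firstRow + blockSize - 1, nRows);
    block = m.X(firstRow:lastRow, :);       % reads only these rows from disk
    total = total + sum(block, 1);          % accumulate column sums
end
```

Note that size(m, 'X') is used rather than size(m.X), because the latter would load the entire variable into memory first.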

More Answers (2)

Jon Russo on 25 Oct 2017
I am running MATLAB R2017a, and when I try to start a parallel pool of size 4 it takes about 30 minutes. Any ideas what might be going on?

Seth on 9 Apr 2018
The very slow parallel pool startup times we have seen were mostly related to MATLAB license-checking operations that run when the parpool command is called. Specifically, we found that changing whether MATLAB checks network license servers, local license files, or both (redundantly) helped reduce parpool() startup times; however, startup is still much slower than it should be.
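To see how much a configuration change actually helps, pool startup can be timed directly. This is a rough sketch, assuming the default 'local' profile; the pool sizes are arbitrary example values, and it does not by itself show whether license checking is the cause:

```matlab
c = parcluster('local');                      % default local profile
sizes = unique(min([1 2 4], c.NumWorkers));   % example sizes, capped at the profile limit
startupSeconds = zeros(size(sizes));
delete(gcp('nocreate'));                      % close any pool that is already open
for k = 1:numel(sizes)
    t = tic;
    pool = parpool(c, sizes(k));              % start a pool of this size
    startupSeconds(k) = toc(t);               % elapsed startup time in seconds
    delete(pool);                             % shut it down before the next trial
end
disp(table(sizes(:), startupSeconds(:), ...
    'VariableNames', {'Workers', 'StartupSeconds'}));
```

Running this before and after a change to the license setup gives a like-for-like comparison of startup times.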
