Is opening multiple Matlab engines slow?

I am running C++ code on a Linux server with 48 CPUs. In C++, I create up to 45 threads in a for loop using "#pragma omp parallel for". Each thread creates a pointer to a Matlab engine, passes in some variables, runs a Matlab function, frees the memory, and closes the engine.
If I run the snippet of code with a single thread, it takes <20 seconds to run. With 45 threads (ideally running on 45 separate CPUs), I would expect it to also take ~20 seconds to run. Instead, it takes ~10 minutes to run.
As I watch the gnome-system-monitor, I see bursts of activity when all cores are running at full throttle. (I've set up print statements to show that it is actually getting work done in Matlab as well.) But in between these bursts of activity, there are long stretches of time (i.e. minutes) where all CPUs report 0% activity.
In the C++ code below, I've set up print statements to see where the threads are when the CPUs report 0% activity. Sometimes it happens during engOpen(). More often, it happens near engClose(). When the CPUs appear to hang in engOpen(), the first thread might complete the call after a few seconds whereas the last one might take several minutes. For engClose(), the first thread might complete after anywhere from a few seconds to several minutes, and the last one can take up to 10 minutes. Throughout these "down times", nearly all the CPUs report 0% activity.
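The print statements described above could be replaced by per-phase timings, which make it easier to see how long each thread spends in engOpen() and engClose(). The following is only a minimal sketch: seconds_taken and phase_report are hypothetical helper names, not part of the MATLAB Engine API.

```cpp
#include <chrono>
#include <sstream>
#include <string>

// Hypothetical helper: runs fn once and returns the wall-clock time it took, in seconds.
template <typename Fn>
double seconds_taken(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// Hypothetical helper: builds one complete log line, so that a thread can
// write it with a single stream insertion.
std::string phase_report(int thread_id, const std::string& phase, double seconds) {
    std::ostringstream line;
    line << "thread " << thread_id << " " << phase << " took " << seconds << " s";
    return line.str();
}
```

Inside the loop, a call such as seconds_taken([&] { ep = engOpen(startcmd); }) would wrap the engine start, and the same pattern would apply to engClose(ep).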
Why is it taking so long? Is there something inherently bad with opening up multiple Matlab engines?
A few other relevant notes:
  • I open up multiple engines because the variable names used in the function are the same for each thread. If each thread used the same Matlab engine, their variables would conflict.
  • The Matlab function is simple. It checks to see if a file exists. If it does, it opens it and checks to see if the data dimensions are correct. Then it returns.
  • I've watched the RAM during this process. Usage stays low (around 10%), so I don't suspect that is the problem.
  • I've attempted to force Matlab to use a single thread (since I multithread in C++). I've tried doing this two different ways:
  1. I call engOpen("") with no input parameters and then put this line in the Matlab function: "maxNumCompThreads(1)"
  2. I call engOpen("matlab -nodesktop -nosplash -singleCompThread") and comment out the "maxNumCompThreads(1)" command in the Matlab function.
When I do this, I watch "top". I see the program spawn 45 Matlab programs...but interestingly, the "nTh" (numThreads) for each of these Matlab programs reports anywhere from a few to dozens (e.g. 5 - 60). It makes me wonder if those commands are working as intended.
Here's the C++ code snippet:
#pragma omp parallel for
for (int m = 0; m < numSimEdges; m++) {
    Engine *ep; // matlab engine pointer
    const char *startcmd = "matlab -nodesktop -nosplash -singleCompThread";
    if (!(ep = engOpen(startcmd))) {
        std::cout << "ERROR! Can't start MATLAB engine!" << std::endl;
        continue; // skip this iteration: ep is NULL and must not be used
    }
    // create matlab variables
    mxArray *matlab_videoDir1 = mxCreateString(video_dir1.c_str());
    mxArray *matlab_videoDir2 = mxCreateString(video_dir2.c_str());
    mxArray *matlab_numSampledFrames = mxCreateDoubleScalar((double) numSampledFrames);
    mxArray *matlab_top_k = mxCreateDoubleScalar((double) top_k);
    mxArray *matlab_n_orientations = mxCreateDoubleScalar((double) n_orientations);
    // place the variables into the matlab workspace
    engPutVariable(ep, "videoDir1", matlab_videoDir1);
    engPutVariable(ep, "videoDir2", matlab_videoDir2);
    engPutVariable(ep, "numSampledFrames", matlab_numSampledFrames);
    engPutVariable(ep, "top_k", matlab_top_k);
    engPutVariable(ep, "n_orientations", matlab_n_orientations);
    // evaluate a function
    engEvalString(ep, "computed = check_similarity_between_videos(videoDir1, videoDir2, numSampledFrames, top_k, n_orientations)");
    // get the computed variable
    mxArray *matlab_computed = NULL;
    if ((matlab_computed = engGetVariable(ep, "computed")) == NULL) {
        std::cout << "Variable 'computed' doesn't exist in matlab session." << std::endl;
        alreadyComputed = 0;
    } else {
        alreadyComputed = mxGetScalar(matlab_computed);
    }
    // Free memory
    mxDestroyArray(matlab_videoDir1);
    mxDestroyArray(matlab_videoDir2);
    mxDestroyArray(matlab_numSampledFrames);
    mxDestroyArray(matlab_top_k);
    mxDestroyArray(matlab_n_orientations);
    mxDestroyArray(matlab_computed);
    // close MATLAB engine
    //engClose(ep);
}
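One detail of the snippet above: alreadyComputed is not declared inside the loop, so if it lives outside (an assumption, since its declaration is not shown) every thread writes to the same variable. A minimal sketch of keeping one result per iteration instead is shown below; check_all and check_edge are hypothetical names, and check_edge stands in for the engine calls.

```cpp
#include <functional>
#include <vector>

// Hypothetical sketch: each iteration m writes only to its own slot computed[m],
// so no two threads ever write to the same element.
std::vector<int> check_all(int numSimEdges,
                           const std::function<int(int)>& check_edge) {
    std::vector<int> computed(numSimEdges, 0);
    #pragma omp parallel for
    for (int m = 0; m < numSimEdges; m++) {
        computed[m] = check_edge(m); // stand-in for the per-edge MATLAB check
    }
    return computed;
}
```

Without OpenMP enabled at compile time the pragma is ignored and the loop simply runs serially, with the same results.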
Here's the Matlab function:
function computed = check_similarity_between_videos(videoDir1, videoDir2, numSampledFrames, top_k, n_orientations)
    % this sets the # of threads to 1 since we do the multithreading in c++
    % maxNumCompThreads(1);
    computed = 1;
    [~, video1] = fileparts(videoDir1);
    [root_dir, video2] = fileparts(videoDir2);
    simi_dir = [root_dir, '/similarity'];
    simi_file = [simi_dir, '/', video1, '__', video2, '.mat'];
    % check if file exists
    if ~exist(simi_file, 'file')
        computed = 0;
        return;
    end
    % load precomputed similarity
    load(simi_file, 'simi');
    % extract the dimensions
    [numOrientationsClip1, numProposalsClip2, numProposalsClip1, numFramesClip2, numFramesClip1] = size(simi);
    % check if dimensions are correct
    if numFramesClip1~=numSampledFrames || numFramesClip2~=numSampledFrames || ...
            numProposalsClip1 < top_k || numProposalsClip2 < top_k || ...
            numOrientationsClip1<n_orientations
        computed = 0;
    end
end
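The first half of this function (building the file name and checking that the file exists) does not strictly need MATLAB. As a rough sketch, the same path logic could be written in C++ so that an engine is only needed when the file is actually present. The names split_path, similarity_file and similarity_file_exists are hypothetical, the path handling only approximates fileparts for '/'-separated paths, and the stream check only approximates exist(..., 'file').

```cpp
#include <fstream>
#include <string>

// Hypothetical stand-in for the two fileparts outputs used above.
struct FileParts {
    std::string dir;   // everything before the last '/'
    std::string name;  // last path component without its extension
};

// Approximates MATLAB's fileparts for '/'-separated paths.
FileParts split_path(const std::string& path) {
    FileParts parts;
    const std::string::size_type slash = path.find_last_of('/');
    const std::string base =
        (slash == std::string::npos) ? path : path.substr(slash + 1);
    if (slash != std::string::npos) {
        parts.dir = path.substr(0, slash);
    }
    const std::string::size_type dot = base.find_last_of('.');
    parts.name = (dot == std::string::npos) ? base : base.substr(0, dot);
    return parts;
}

// Mirrors: simi_file = [root_dir, '/similarity/', video1, '__', video2, '.mat']
std::string similarity_file(const std::string& videoDir1,
                            const std::string& videoDir2) {
    const FileParts v1 = split_path(videoDir1);
    const FileParts v2 = split_path(videoDir2);
    return v2.dir + "/similarity/" + v1.name + "__" + v2.name + ".mat";
}

// True if the similarity file can be opened for reading.
bool similarity_file_exists(const std::string& videoDir1,
                            const std::string& videoDir2) {
    std::ifstream f(similarity_file(videoDir1, videoDir2));
    return f.good();
}
```

The dimension check on simi would still require reading the .mat file, which this sketch does not attempt.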

5 Comments

Saurabh Gupta on 1 Aug 2017
Hi Jared, I don't have an answer to your question as such, but a counter-question. Is there a specific reason for performing this task using multiple single-threaded MATLAB instances instead of using a parfor loop in one multi-threaded MATLAB instance?
Jared Johansen on 3 Aug 2017
I suppose I could do what you are proposing. It would just take refactoring a bunch of code.
(There is another Matlab function call that is in my real code, but not in the example above. For that function, it was trickier to pass in the variables. I could refactor that too. I guess I was hoping to keep things in the current format (since I know the code works)...but find the reason for the slow run time.)
Saurabh Gupta on 4 Aug 2017
I see your point. It may be worth refactoring the code to make use of the parallelization implemented in MATLAB to run MATLAB functionality.
If you really want to control the execution from C/C++, one option you could explore is generating C/C++ code from MATLAB code using MATLAB Coder product, and then calling those functions directly. This will eliminate the need to invoke MATLAB in your use case.
Another alternative you could consider is running the MATLAB scripts/functions in "batch mode" instead of executing them after invoking individual MATLAB instances. The following posts may be helpful in this regard.
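As a rough illustration of the batch-mode idea, the command line for one non-interactive run could be assembled in C++ as sketched below. The helper names matlab_quote and build_batch_command are hypothetical. The sketch assumes the -r startup option is used to pass the statement, and that paths contain no double quotes or other shell metacharacters. How the result gets back to the C++ side (for example via a file written by the MATLAB code) is left open.

```cpp
#include <string>

// Hypothetical helper: quote a value as a MATLAB char literal by wrapping it
// in single quotes and doubling any embedded single quote.
std::string matlab_quote(const std::string& s) {
    std::string out = "'";
    for (char c : s) {
        if (c == '\'') {
            out += "''";
        } else {
            out += c;
        }
    }
    out += "'";
    return out;
}

// Hypothetical helper: build a command line that starts MATLAB, runs the
// check once, and exits. Assumes the arguments are safe inside a
// double-quoted shell string.
std::string build_batch_command(const std::string& videoDir1,
                                const std::string& videoDir2,
                                int numSampledFrames,
                                int top_k,
                                int n_orientations) {
    const std::string call =
        "computed = check_similarity_between_videos(" +
        matlab_quote(videoDir1) + ", " + matlab_quote(videoDir2) + ", " +
        std::to_string(numSampledFrames) + ", " + std::to_string(top_k) + ", " +
        std::to_string(n_orientations) + "); exit";
    return "matlab -nodesktop -nosplash -singleCompThread -r \"" + call + "\"";
}
```

Each such command still starts a full MATLAB process, so this changes how the work is launched rather than removing the startup cost.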
Mark Matusevich on 7 Aug 2017
I don't have experience with the MATLAB engine, but I see similarities between your question and my own experience with MATLAB Compiler R2009b.
MATLAB Compiler Runtime (MCR) allows only one instance per process. Each call from C code into the MCR (i.e. a call to my MATLAB function, mxCreateString, mxGetScalar, etc.) locks this instance while the command executes. Any call from a different thread waits during this time to acquire the lock. This is true even if you have a "Parallel Toolbox" license. I also see bursts of 100% CPU usage, which are due to a few MATLAB functions that are multithreaded by default (e.g. matrix multiplication).
P.S.: You should check engOpenSingleUse...
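The locking behaviour described in this comment can be illustrated without MATLAB. The sketch below is a simulation only: runtime_lock is a hypothetical stand-in for a runtime-wide lock, and whether the Engine API behaves the same way as the MCR is not confirmed in this thread. It shows that when every call must hold one shared lock, at most one thread is ever inside a call, however many threads exist.

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

// Runs n_threads workers that each make one simulated "runtime call".
// With use_global_lock set, every call holds the same mutex.
// Returns the largest number of workers that were inside a call at once.
int max_concurrent_calls(int n_threads, bool use_global_lock) {
    std::mutex runtime_lock;      // hypothetical stand-in for the runtime's lock
    std::atomic<int> inside{0};   // workers currently inside a call
    std::atomic<int> peak{0};     // highest value of 'inside' observed

    auto call = [&] {
        const int now = ++inside;
        int seen = peak.load();
        // Raise the recorded peak if this worker observed a higher count.
        while (now > seen && !peak.compare_exchange_weak(seen, now)) {
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(10)); // simulated work
        --inside;
    };

    std::vector<std::thread> workers;
    for (int i = 0; i < n_threads; ++i) {
        workers.emplace_back([&] {
            if (use_global_lock) {
                std::lock_guard<std::mutex> guard(runtime_lock);
                call();
            } else {
                call();
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return peak.load();
}
```

With the lock enabled the function returns 1, which matches the pattern of threads waiting on each other rather than running side by side.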
Jared Johansen on 7 Aug 2017
Thank you Saurabh. Generating C/C++ code from Matlab code is an interesting idea that might solve the problem. I will look into that.
Thank you, Mark, for what you shared. This is essentially what I wanted to confirm (if there was something going on under the hood that was preventing this type of multithreading from working). Your experience with MCR seems to indicate that there may be. Thanks for sharing.


Answers (0)


Asked: 28 Jul 2017

Commented: 7 Aug 2017
