In a parfor-loop, can I call a multi-threaded MEX function and get some speed-up?

I learned the concept of multi-threaded MEX from undocumentedmatlab. (It seems this website is inaccessible now ...)
I am wondering if I can call a multi-threaded MEX function in a parfor-loop.
My current code looks like
parfor k=1:1e6
result(k) = mex_wrapper(data(k));
end
mex_wrapper.c looks like
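#include "mex.h"

double calculate(void)
{
    int i, N = 50;
    double result = 0.0;
    for (i = 0; i < N; i++)
    {
        /* ... independent iterations ... */
    }
    return result;
}

/* MEX gateway: returns the value computed by calculate(). */
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    plhs[0] = mxCreateDoubleScalar(calculate());
}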
The iterations inside calculate() are independent, so I want to change the sub-routine calculate() to support multithreading.
Although I am running parfor in a process-based environment, I am not sure whether a multi-threaded MEX function would conflict with parfor.
So can I use a multi-threaded MEX function in parfor? And would I get some speed-up by doing so?

Accepted Answer

Edric Ellis on 4 Jan 2021
You should be able to run a multi-threaded MEX file correctly inside a parfor loop. However, you will be oversubscribing your machine. For example, if your machine has 6 cores, your parfor loop will run 6 copies of your MEX function simultaneously. If each of those uses 6 threads, you will have 36 threads active on your machine. This should work, but it will probably be less efficient than having a single-threaded MEX function. (A multithreaded MEX function inside parfor can be more useful when you have a cluster of machines: there, you might run a single worker process per machine, and have each machine run the multithreaded MEX function.)
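The arithmetic in the example above can be written out as a small sketch. The function names `total_threads` and `oversubscription` are hypothetical helpers introduced only for illustration; they are not part of MATLAB or the MEX API.

```c
/* Hypothetical helper: total runnable threads when every parfor worker
   runs one copy of the MEX function with its own set of threads. */
int total_threads(int workers, int threads_per_mex)
{
    return workers * threads_per_mex;
}

/* Hypothetical helper: ratio of runnable threads to hardware cores.
   A value above 1.0 means the node is oversubscribed. */
double oversubscription(int workers, int threads_per_mex, int cores)
{
    return (double)total_threads(workers, threads_per_mex) / (double)cores;
}
```

With 6 workers and 6 threads per MEX call on a 6-core machine, `total_threads(6, 6)` gives 36 and the oversubscription factor is 6. With a single worker per machine, as in the cluster case, `oversubscription(1, 6, 6)` is 1, so the threads match the cores.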
3 Comments
Edric Ellis on 4 Jan 2021
Yes, "node" and "machine" are the same thing.
Generally, hyperthreading doesn't actually offer much practical benefit for most MATLAB computations, which is why maxNumCompThreads returns the number of physical cores. This is the number of computational threads MATLAB uses for operations like fft. It's also the default number of processes parpool('local') will launch.
"Oversubscription" is the key concept here - essentially this is about how many operations you're asking a given node to perform simultaneously (i.e. how many threads are running on the node). If you ask a node to perform more operations simultaneously than it has hardware cores, then some of those operations must wait. So, if you run parpool('local'), each worker process will (by default) have a single computational thread. This will fully occupy the CPU of that node. If your MEX file runs multiple threads, then you will have more active threads on the node than the node can support simultaneously, and so those threads will have to share the cores on the node. As I said, this should work, but it will not get you any additional performance, because the node's hardware was already being fully occupied by the single-threaded version.
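One way to keep the thread count under control is to make it a parameter of the computation, so the same routine can run with one thread under a local pool and with more threads when there is one worker per node. The following is only a sketch under assumptions: `calculate_mt` and its `nthreads` argument are hypothetical, the loop body is a placeholder for the real work, and OpenMP is just one possible threading mechanism (it must be enabled at compile time, for example with `-fopenmp` on GCC; otherwise the pragma is ignored and the loop runs serially).

```c
/* Hypothetical multithreaded variant of calculate().
   x        : input value (placeholder for the real data)
   nthreads : requested number of threads; values below 1 are treated as 1 */
double calculate_mt(double x, int nthreads)
{
    const int N = 50;
    double sum = 0.0;
    int i;

    if (nthreads < 1) {
        nthreads = 1;
    }

    /* The iterations are independent, so each thread accumulates a private
       partial sum and OpenMP combines them at the end of the loop. */
    #pragma omp parallel for reduction(+:sum) num_threads(nthreads)
    for (i = 0; i < N; i++) {
        sum += x * (double)i;   /* placeholder work */
    }
    return sum;
}
```

Under this scheme, a call from a worker in a local pool would pass `nthreads = 1`, which matches the single computational thread each worker has by default, while a worker that has a whole node to itself could pass the node's core count.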
Xingwang Yong on 4 Jan 2021
Thanks, Edric, I got your points. I tried calling maxNumCompThreads() inside a parfor-loop, but it returns nothing, so this is infeasible.
