MATLAB Answers

Parallel computing with shared variables, problem with struct

Hi all,
I need to parallelize code that has four nested for-loops, inside which a script (tau_calc) runs and calls other scripts (such as tau_ADP_v2) depending on the input. These scripts need access to the whole workspace, which holds around 30 variables plus a large struct 'state_ID' (2 to 3 GB).
I should parallelize over the id_E index, or over [id_E,id_n], but I cannot figure out how to pass everything to the parfor loop, especially the large struct, or how to save temporary variables so that I can write the state_ID struct afterwards. I understand that the separate workers cannot write to it from inside a parfor loop. The two scripts I attach work correctly in the serial version.
I am at an impasse and cannot get out of it. I would really appreciate some support.



Accepted Answer

Edric Ellis on 18 Jul 2019
I must admit I didn't look at your code in great detail - but I did get the distinct impression that there's a lot going on there. The script tau_calc_short has a very high degree of "cyclomatic complexity" - in other words, it has lots of deeply nested control structures. The script tau_ADP_v2 has quite a few copies of near-identical computations which again are highly complex.
Now, none of that means that you can't run that stuff as one giant parfor loop, but it isn't going to make life easy. In particular, parfor needs to be able to prove that your loop iterations are independent. The parfor machinery doesn't care about the complexity of your code - but if it refuses to run your loop, it will probably be difficult for you to follow its reasoning.
Therefore, my main advice to you is: try to restructure your code into more self-contained functions. Done correctly, this will let you compartmentalise the complexity, so that the high-level computation is more digestible to the human reader. Once this is done, it will be much more feasible to work out how to apply parfor, since it will be more obvious where the independent (and thus parallelisable) portions are. Sorry that there aren't any simple answers for this sort of case.
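To illustrate the restructuring advice above, here is a minimal sketch and not the original tau_calc. The function tau_sketch, the struct params and its fields E and mass are all hypothetical stand-ins. The point is only that once each iteration is a function of its indices plus a read-only input struct, parfor can slice the output by row without difficulty.

```matlab
% Hypothetical read-only inputs, standing in for the real workspace variables.
params.E    = linspace(0.1, 1.0, 8);   % assumed energy grid
params.mass = [1.0, 2.0, 3.0];         % assumed per-band quantity

nE     = numel(params.E);
nBands = numel(params.mass);
taus   = zeros(nE, nBands);

parfor id_E = 1:nE
    row = zeros(1, nBands);            % temporary, private to this iteration
    for id_n = 1:nBands
        row(id_n) = tau_sketch(id_E, id_n, params);
    end
    taus(id_E, :) = row;               % sliced output: one row per iteration
end

% Self-contained stand-in for the real computation (hypothetical formula).
function tau = tau_sketch(id_E, id_n, params)
    tau = params.mass(id_n) / params.E(id_E);
end
```

Because params is only read inside the loop, it is sent to the workers as a broadcast variable. For a struct of several GB that transfer is itself costly, which is a separate concern from getting the loop accepted by parfor.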


Patrizio Graziosi on 30 Jul 2019
Hi Edric,
thank you for your suggestions.
I ended up saving the workspace (around 1 GB) and attaching it to the parallel pool.
tau_calc then becomes one big function that loads the workspace, and scripts like tau_ADP become subfunctions. I realise this is a rough solution that still needs polishing, but it works on my PC (4 workers).
poolobj = gcp;
WorkersConstant = parallel.pool.Constant('WorkSpace.mat'); % holds the file name (a char vector); not used below
parfor id_E = 1:nE
    for id_n = 1:n_bands_transp
        % the big tau_calc routine (the actual tau_calc in the serial version); it loads 'WorkSpace.mat' itself
        [tau_temp, tau_matth_temp, tau_IIS_temp] = tau_calc_funct_v3(id_E, id_n, 'WorkSpace.mat');
        taus(id_E,id_n) = tau_temp;
        taus_matth(id_E,id_n) = tau_matth_temp;
        if strcmp(IIS,'yes')
            taus_IIS(id_E,id_n) = tau_IIS_temp;
        end
    end
end
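For comparison, a hedged sketch of another way to use parallel.pool.Constant: passing it a function handle, so that each worker evaluates load once and keeps the result, rather than reloading the file on every call. The file contents here (scale, offsets) are invented for illustration, and the sketch assumes every worker can read the MAT-file path, which on a cluster usually means a shared filesystem or an attached file.

```matlab
% Write a small MAT-file standing in for WorkSpace.mat (hypothetical contents).
scale   = 2;
offsets = [10 20 30];
matFile = [tempname '.mat'];
save(matFile, 'scale', 'offsets');

% The function handle is evaluated on the workers; C.Value is the struct returned by load.
C = parallel.pool.Constant(@() load(matFile));

nE     = 5;
nBands = numel(offsets);
out    = zeros(nE, nBands);

parfor id_E = 1:nE
    ws = C.Value;                               % already in worker memory, no reload
    out(id_E, :) = ws.scale * id_E + ws.offsets;
end

delete(matFile);                                % clean up only after the loop has finished
```

With this pattern the per-iteration function would receive ws (or the fields it needs) as an argument instead of a file name.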
The issue now is that when I run it on a cluster, a number of workers abort:
Warning: A worker aborted during execution of the parfor loop. The parfor loop will now run again on the remaining workers.
> In distcomp.remoteparfor/handleIntervalErrorResult (line 234)
  In distcomp.remoteparfor/getCompleteIntervals (line 364)
  In parallel_function>distributed_execution (line 745)
  In parallel_function (line 577)
  In tau_calc_parallel_VOMBATO_v3 (line 326)
1) Does the parfor loop start again from the beginning, or does it continue with the remaining workers?
2) Can you help me with this? Should I open a new question?
Edric Ellis on 1 Aug 2019
Whether parfor starts from the complete beginning again depends on the release of MATLAB. (I can't remember when we changed that to only re-run the failing portions - but it might well be pretty recent, i.e. R2019a or R2018b). If your workers are crashing like that, hopefully there are some crash dumps around which will help you diagnose things further.
