Parallel Workers Sometimes Can't Access Function

Daniel Flisek on 2 Aug 2024
Commented: Daniel Flisek on 9 Aug 2024
I'm building an app that uses the Parallel Computing Toolbox to run Simulink models in parallel. I create an array of FevalFuture objects for the workers, and assign their tasks with parfeval on a function I wrote.
It was working at one point, but now it takes 3 tries to successfully run.
After the pool is first opened, none of the tasks run at all. They all immediately return a MATLAB:class:MethodRestricted error saying they cannot access the function I'm sending them.
The second time I run my app, with the pool still open, only some of them fail with the same error. The rest run successfully.
The third time and then on, everything runs perfectly with no errors.
Does anyone have any idea why parallel workers would sometimes be unable to access a function?
EDIT
Here is a partial transcription of the code. I am unable to share portions of it, but I don't think that those parts contribute to the problem anyway. The abbreviated parts are marked by '%%' and the relevant variable is described.
% start the pool
pool = parpool('Processes');
% pre-load Simulink on the workers
parfevalOnAll(@start_simulink, 0);
% build a list of Simulink models to load on the workers, and transfer any
% existing cache files for those models to them
%% cacheFiles = vector of cache file .slxc filepaths
addAttachedFiles(pool, cacheFiles);
% build a list of tasks & divide them up evenly among workers
%% workerTasks = a cell array of table objects, which contain rows for each simulation task
% create a dataQueue to update UI components
dq = parallel.pool.DataQueue;
afterEach(dq, @progressFunc); % progressFunc simply updates a label in the app
% suppress warnings from workers
parfevalOnAll(@warning, 0, 'off', 'all');
numIterations = numel(workerTasks);
% create global futures object for collecting intermediate results
app.futures(1:numIterations) = parallel.FevalFuture;
% assign tasks to workers
for p = 1:numIterations
    app.futures(p) = parfeval(@myApp.parallelSimulationTask, 1, ...
        workerTasks{p}, dq);
end
% collect results as they come in
numCompleted = 0;
while numCompleted < numIterations
    try
        [completedIdx, results] = fetchNext(app.futures);
        if ~isempty(completedIdx)
            workerTasks{completedIdx} = results;
        end
    catch ex
        % this is where MATLAB:class:MethodRestricted error occurs, at the
        % fetchNext command
    end
    numCompleted = numCompleted + 1;
end
The full error text from ex is:
ex.identifier = 'MATLAB:parallel:future:FetchNextFutureErrored';
ex.message = 'The function evaluation completed with an error.';
ex.cause.identifier = 'MATLAB:class:MethodRestricted';
ex.cause.message = 'Cannot access method parallelSimulationTask in class myApp';
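Because fetchNext wraps the worker's error, it can help to read each future's own Error property to see which tasks failed and why. The following is a minimal sketch, not part of the original app: the function name reportFailedFutures is hypothetical, and it assumes the futures array is the one stored in app.futures.

```matlab
% reportFailedFutures.m (hypothetical helper, not from the original app)
function numFailed = reportFailedFutures(futures)
    % Block until every future in the array has finished.
    wait(futures);
    numFailed = 0;
    for k = 1:numel(futures)
        err = futures(k).Error;   % empty when the task completed without error
        if ~isempty(err)
            numFailed = numFailed + 1;
            fprintf('Task %d failed: %s (%s)\n', k, err.message, err.identifier);
        end
    end
end
```

Calling reportFailedFutures(app.futures) after submitting the tasks should show whether the MethodRestricted failures come from every task or only some of them.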
5 Comments
Madheswaran on 9 Aug 2024
Hi Daniel,
This error occurs when the method you are trying to call (parallelSimulationTask) is not accessible from the calling context, most likely because it is declared private or protected. You also mentioned that it worked at one point but now takes three tries to run successfully, so I suspect there might be a race condition in parallelSimulationTask. Posting more information about the method would help in looking into the issue.
Daniel Flisek on 9 Aug 2024
I stumbled across that public/private solution shortly before you posted, @Madheswaran, thank you!
The race condition is interesting, I hadn't heard of that term before. The inputs to parallelSimulationTask are all broadcast variables, so each worker should get their own copy. They also receive their own copies of the Simulink cache files. The Simulink model files, however, aren't passed directly. Just the file locations. I assumed each worker would run its own copy of the models, but maybe they don't. Should I explicitly pass the model files with addAttachedFiles?


Accepted Answer

Daniel Flisek on 9 Aug 2024
As @Madheswaran suggested, method access was the cause. I changed the block of static methods that includes parallelSimulationTask from private to public, and it works.
It's strange to me that it used to work fine, and those methods have been private since Day 1. But public it is, I guess.
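For illustration, here is a minimal sketch of the change described above. Only the class name myApp and the method name parallelSimulationTask come from the thread; the superclass, the method body, and the argument names tasks and dq are assumptions standing in for the real app code.

```matlab
% myApp.m (sketch; only the parts relevant to method access are shown)
classdef myApp < handle
    properties
        futures   % parallel.FevalFuture array, as in the question
    end

    methods (Static, Access = public)
        function results = parallelSimulationTask(tasks, dq)
            % Public static entry point, so code outside the class
            % (including a pool worker) is allowed to call it.
            results = tasks;            % placeholder for the real simulation work
            send(dq, height(tasks));    % report progress back to the client
        end
    end
end
```

If the remaining static helpers should stay private, one option is to leave them in a separate private methods block and expose only this entry point publicly.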

Release

R2023b