Curious error during parfor loop involving getCompleteIntervals

I am consistently getting the error below while executing a parfor loop:
Error using distcomp.remoteparfor/getCompleteIntervals (line 406)
An unexpected error occurred during PARFOR: Error in remote execution of parfor: Ignoring interval for loop ID: 58689 because executing loop ID:-1
Strangely, the error only occurs sometimes, and it occurs at different points in the run (sometimes when the code is 30% complete, other times at 50%, and so on). The loop ID (58689 in this example) is also different each time I receive the error.
As for the code, I have a series of parfor loops all within a larger for loop. The error can occur within any of the parfor loops (it is not always the same). As the code is thousands of lines, I have provided some pseudocode that I hope will help in identifying the issue.
I am using MATLAB R2021b on a 2021 MacBook Pro.
% Pseudocode: newfunc, myfunc2 and myfunc3 are placeholders for the real functions
nt = 1e6;
npts = 2e6;

for i = 1:nt
    in1 = rand(npts,1);
    in2 = rand(npts,1);

    % Function with a parfor loop inside
    a = myfunc1(in1,in2);

    % Second parfor loop with a function called within the parfor loop
    b = zeros(npts,1);
    parfor j = 1:npts
        b(j,1) = myfunc2(a(j));
    end

    % Third parfor loop with a function called within the parfor loop
    c = zeros(npts,1);
    parfor j = 1:npts
        c(j,1) = myfunc3(a(j),b(j));
    end
end

% Sample function (in R2021b, local functions must come at the end of a script)
function [a] = myfunc1(in1,in2)
    npts = length(in1);
    a = zeros(npts,1);
    parfor j = 1:npts
        a(j,1) = newfunc(in1,in2);
    end
end

5 Comments

Is it possible that you are running out of memory?
Perhaps, but I do not think so. That was my first thought as well, so I reduced the number of workers from 10 (the maximum) to 8. I still receive the error with 8 workers. While the code is running, I am only using about 23 of 32 GB of memory.
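For reference, here is a minimal sketch of one way to cap the number of workers for a single loop, using the optional second argument of parfor. The variable name maxWorkers and the sqrt placeholder are assumptions for illustration only. They are not taken from the original code, and this is not necessarily how the worker count was reduced in the test described above.

```matlab
% Sketch: limit one parfor loop to at most maxWorkers workers.
npts = 1000;                 % small size, for illustration only
b = zeros(npts, 1);
maxWorkers = 8;              % assumed cap, matching the 8 workers mentioned above

parfor (j = 1:npts, maxWorkers)
    b(j) = sqrt(j);          % placeholder for the real per-element work
end

% Report how many workers the current pool has (empty if no pool is open).
pool = gcp('nocreate');
if ~isempty(pool)
    fprintf('Pool has %d workers\n', pool.NumWorkers);
end
```

The cap applies only to the loop it is attached to. The size of the pool itself is set when the pool is started, for example with parpool(8).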
That's an internal error occurring because the workers are getting into an inconsistent state somehow. The loop ID is simply a counter that increments each time a new parfor loop starts at your MATLAB client. The error you're seeing is essentially because a worker somehow ends up thinking it should not be executing any parfor loop when some work arrives.
I suggest you contact MathWorks support who can help you collect some diagnostic logs from the workers which hopefully will shed some light as to how you're ending up in this unfortunate state.
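While waiting on support, one possible stopgap is to catch the error, restart the pool, and retry the failed step. The sketch below is only a workaround under assumptions, not a fix for the underlying worker state. The helper name runWithPoolRetry and its arguments are hypothetical, and it assumes the wrapped step can safely be run again from scratch.

```matlab
function out = runWithPoolRetry(fcn, maxAttempts)
% RUNWITHPOOLRETRY  Hypothetical helper: call fcn(), which may contain parfor
% loops. If it throws, restart the parallel pool and try again, up to
% maxAttempts times in total.
    for attempt = 1:maxAttempts
        try
            out = fcn();
            return
        catch err
            fprintf('Attempt %d failed: %s\n', attempt, err.message);
            if attempt == maxAttempts
                rethrow(err);         % give up and surface the last error
            end
            delete(gcp('nocreate'));  % shut down the current pool, if any
            parpool;                  % start a fresh pool with the default profile
        end
    end
end
```

With the pseudocode from the question, a call might look like a = runWithPoolRetry(@() myfunc1(in1,in2), 3). Restarting a pool takes several seconds, so this only makes sense if the failure is rare.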
Jake on 27 Apr 2022
Edited: Jake on 27 Apr 2022
Edric - Thank you very much for your response! I have contacted support.
Hello Jake. I am facing the same problem. Have you found any solution?


Answers (0)

Release: R2021b

Asked: 22 Apr 2022

Commented: 1 Mar 2023
