Working around segmentation violations in parfor loops
I am currently working on a project where I have to run multiple repetitions of a time-consuming MATLAB function in parallel. For the purposes of this question, let's refer to the function as myfunc.
myfunc uses a MEX file and hits a random segmentation violation roughly once every 3 hours. I cannot diagnose the segmentation fault since it originates from a proprietary API that I did not write myself. However, I do know that it occurs within the MEX file, and I also know that it is not deterministically related to any settings I can change.
I would like to work around the segmentation violation, and I would ideally also like to keep using the parfor function in MATLAB. My idea right now is to use a try/catch block within the parfor loop as follows:
% create an output cell to store nreps of output from 'myfunc'
output = cell(1, nreps);
% create a logical vector to track which runs finish successfully
successfulrun = false(1, nreps);
% run myfunc in parallel
parfor i = 1:nreps
    try
        % call signature assumed; the arguments of myfunc are not shown
        output{i} = myfunc();
        successfulrun(i) = true;
    catch
        % leave successfulrun(i) false so that this run is repeated
    end
end
% rerun experiments that did not end up successfully
while sum(successfulrun) < nreps
    % count # of experiments to rerun
    % initialize variables to store new results
    reps_to_rerun = find(~successfulrun);
    nreps_to_rerun = numel(reps_to_rerun);
    newoutput = cell(1, nreps_to_rerun);
    newsuccessfulrun = false(1, nreps_to_rerun);
    % rerun experiments
    parfor i = 1:nreps_to_rerun
        try
            newoutput{i} = myfunc();
            newsuccessfulrun(i) = true;
        catch
            % leave newsuccessfulrun(i) false; the while loop retries it
        end
    end
    % transfer contents to larger loop
    for i = 1:nreps_to_rerun
        rerun_index = reps_to_rerun(i);
        successfulrun(rerun_index) = newsuccessfulrun(i);
        if newsuccessfulrun(i)
            output{rerun_index} = newoutput{i};
        end
    end
end
My questions are:
1. Is it OK to keep running more repetitions like this after a segmentation violation has occurred within the MEX file? Or should I clear the memory / restart the matlabpool? I'm assuming this shouldn't be a problem since the segmentation violation was in C.
2. Is there any way to "break" out of a parfor loop?
5 Comments
Kaustubha Govind
on 13 Apr 2012
I don't think try-catch will help w.r.t. SegV's, because they usually end the process that is being run. In other words, MATLAB will be shut down due to the SegV.
Ken Atwell
on 14 Apr 2012
Kaustubha is correct -- try/catch will trap a MATLAB error, but not a segfault triggered by the OS.
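To illustrate the distinction, here is a minimal sketch of what try/catch does handle: an error raised at the MATLAB level. The identifier 'demo:failure' and the message are made up for this illustration and are not part of the original code.

```matlab
% Minimal sketch: try/catch traps an error raised inside MATLAB.
% The identifier 'demo:failure' is hypothetical, chosen for this example.
caught = false;
try
    error('demo:failure', 'Simulated MATLAB-level failure.');
catch err
    % Control reaches this block because the failure is a MATLAB error
    caught = true;
    fprintf('Caught: %s\n', err.identifier);
end
% A segmentation violation inside a MEX file is different: the process
% running the MEX code is typically terminated, so no catch block runs.
```

A segfault in the MEX file never reaches the catch block, so the flag-based bookkeeping in the question would not get the chance to record the failed run.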
Berk Ustun
on 23 Apr 2012
Sean de Wolski
on 23 Apr 2012
Does it occur with a regular for-loop?
Berk Ustun
on 23 Apr 2012
Answers (0)