R2024b parpool crashing when being activated with 24 workers.
Update: These crashes seem to happen at random, regardless of the number of workers used.
Dear all,
Whenever I try to start a parpool with more than 20 workers on the Processes profile, an error occurs and the parallel pool shuts down automatically. I have tried validating the profile in the Cluster Profile Manager, and any value above 20 workers produces this error, even though my CPU has 24 cores. I never had this problem in MATLAB R2024a, where I could always start parallel pools with up to 24 workers.
Is there a known fix for this? It has only been happening since I updated to MATLAB R2024b. My CPU is an Intel Core i9-14900KF.
Thanks in advance. I have attached the error below in case it is useful, along with a few screenshots of the Cluster Profile Manager validations.
Command window output:
Starting parallel pool (parpool) using the 'Processes' profile ...
Error using parpool (line 133)
Parallel pool failed to start with the following error. For more detailed information,
validate the profile 'Processes' in the Cluster Profile Manager.
Error in parallel.internal.ui.PoolHelper.startPool (line 12)
            parpool();
            ^^^^^^^^^
Caused by:
    Error using parallel.internal.pool.AbstractInteractiveClient>@()checker.checkState()
    (line 121)
    The parallel pool job errored with the following message: MATLAB worker shut down
    unexpectedly with status 1 during task execution.
Parallel pool using the 'Processes' profile is shutting down.
This parallel pool has been shut down.
Caused by:
    The client lost connection to worker 2 (Task 2; Host: localhost), potentially due to
    network issues or errors during the interactive communicating job.
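One way to narrow down a failure like the one above is to try several pool sizes in a loop and record which ones start. The following is only a minimal sketch: the list of sizes is an illustrative assumption based on the numbers in the question, and it only detects failures that `parpool` reports while the pool is starting, not workers that crash later.

```matlab
% Sketch: probe which pool sizes start on the 'Processes' profile.
c = parcluster('Processes');            % profile name taken from the question
fprintf('Profile allows up to %d workers\n', c.NumWorkers);

sizesToTry = [16 20 21 22 24];          % illustrative values, adjust as needed
sizesToTry = sizesToTry(sizesToTry <= c.NumWorkers);
started = false(size(sizesToTry));

for k = 1:numel(sizesToTry)
    delete(gcp('nocreate'));            % close any pool that is still open
    try
        pool = parpool(c, sizesToTry(k));
        started(k) = (pool.NumWorkers == sizesToTry(k));
    catch err
        % Record the failure and continue with the next size.
        fprintf('%d workers failed: %s\n', sizesToTry(k), err.message);
    end
end
delete(gcp('nocreate'));                % leave no pool running afterwards

disp(table(sizesToTry(:), started(:), 'VariableNames', {'Workers', 'Started'}));
```

Running the loop a few times may help show whether the failures depend on the worker count or occur at random.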
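With 16 workers (same output when using 20): [screenshot of the Cluster Profile Manager validation]
With 24 workers: [screenshot of the Cluster Profile Manager validation]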
Answers (1)
  Sergio E. Obando
 on 25 Sep 2024
While not exactly the same error, this post covers some good troubleshooting steps: "Validation Fails".
If you prefer, or if those steps do not resolve your issue, I would highly recommend contacting Technical Support.
8 Comments
  Raffael
 on 2 Jan 2025
Same here:
Running a simulation with more than 60 workers crashed with R2024b on several machines.
The same simulation runs fine with R2024a using 700 cores/MATLAB workers.
I have no idea why R2024b crashes; it also crashes when running the SPMD validation test.
In the job log there is only a "Matlab crashed on worker XXX" message and no other useful information.
Raffael-
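For anyone who wants to collect the job log mentioned above before contacting support, here is a rough sketch that runs a trivial function as a batch job with a pool and then reads the job's debug log. It assumes the 'Processes' profile with at least two workers available; the test function and the pool size are placeholders, not part of the original simulation.

```matlab
% Sketch: run a trivial batch job with a pool and read its debug log.
c = parcluster('Processes');
nPoolWorkers = c.NumWorkers - 1;        % batch needs one extra worker besides the pool

% Placeholder workload; the pool is still started for the job.
job = batch(c, @(n) sum(1:n), 1, {10}, 'Pool', nPoolWorkers);
wait(job);                              % returns once the job has finished or failed

finalState = job.State;
fprintf('Job state: %s\n', finalState);
if ~isempty(job.Tasks(1).ErrorMessage)
    fprintf('Task 1 error: %s\n', job.Tasks(1).ErrorMessage);
end

debugLog = getDebugLog(c, job);         % scheduler and worker output for the job
disp(debugLog);
delete(job);                            % remove the job data from the cluster storage
```

If the log contains nothing beyond the crash message, as described above, the job state and task error message are still worth including in a support request.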
  Sergio E. Obando
 on 2 Jan 2025
Raffael, please reach out to Technical Support. They can help you debug this issue and check whether the root cause is similar to the one in the original post.