MATLAB R2024b parallel pool not working above 32 cores

Filippo Ambrosino on 7 Nov 2024 at 9:29
Commented: sidik on 7 Nov 2024 at 13:05
Hi everyone,
I have a ThinkStation P8 with an AMD Ryzen Threadripper 7985WX, running Windows 11 with MATLAB R2024b installed. The processor has 64 physical cores and 128 logical ones. When I try to validate the local cluster profile for parallel processing with more than 32 workers, the validation fails at the "SPMD job test" stage and returns the following error:
Error Report: Job errored or did not reach the state 'finished'. MATLAB worker shut down unexpectedly with status -4 during task execution.
The error status is not always the same: it varies among -1, -2 and -4.
Any suggestions for fixing this issue? I didn't have this problem with R2023b...
Best regards,
Filippo

Answers (2)

sidik on 7 Nov 2024 at 10:00
Try the following:
Step 1: Reduce the Number of Cores Used
  1. Open MATLAB.
  2. Go to Home > Parallel > Manage Cluster Profiles.
  3. In the Cluster Profile Manager window, select local from the list of cluster profiles (recent releases list this profile as Processes).
  4. Click on Edit at the bottom right.
  5. In the NumWorkers field, set the number to 32 (or a lower number if you want to test gradually).
  6. Click Done to save the changes.
  7. Close the Cluster Profile Manager window.
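
The same setting can also be changed from the command line. This is only a minimal sketch: it assumes Parallel Computing Toolbox is installed and that the built-in local profile is named Processes (older releases call it local), so adjust the name to whatever the Cluster Profile Manager shows.

```matlab
% Load the built-in local cluster profile (assumed name: 'Processes').
c = parcluster('Processes');

% Cap the number of workers at 32, as in Step 1.
c.NumWorkers = 32;

% Write the change back to the stored profile so it persists across sessions.
saveProfile(c);

% Show the value now held by the cluster object.
disp(c.NumWorkers)
```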
Step 2: Test the Cluster Profile
  1. Go back to Parallel and click on Validate.
  2. Let MATLAB validate the cluster profile. If the test still fails, try decreasing NumWorkers (to 16 or 8) and validate again to see if a lower number of cores resolves the issue.
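
The gradual test described above can be roughly scripted. The sketch below is not the Validate button itself; it is a stand-in that opens a pool of each size and runs a trivial spmd block, which is the kind of job that fails in the question. The profile name Processes and the variable names are assumptions for illustration.

```matlab
c = parcluster('Processes');      % assumed name of the local profile
sizesToTry = [32 16 8];           % worker counts suggested in Step 2
workingSize = 0;                  % stays 0 if every attempt fails

for n = sizesToTry
    delete(gcp('nocreate'));      % close any pool left over from a previous attempt
    try
        pool = parpool(c, n);     % errors if the workers cannot start
        spmd
            idx = spmdIndex;      % each worker records its own index
        end
        workingSize = pool.NumWorkers;
        fprintf('Pool with %d workers ran an spmd block.\n', workingSize);
        break
    catch err
        fprintf('Pool with %d workers failed: %s\n', n, err.message);
    end
end

delete(gcp('nocreate'));          % clean up the last pool
```

Note that `parpool(c, n)` also errors when `n` exceeds the profile's NumWorkers, so a failure at 32 may only mean the profile is capped lower.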
Step 3: Create a Custom Cluster Profile (if needed)
  1. If validation continues to fail, go back to Manage Cluster Profiles and click on New Profile.
  2. Name the new profile (e.g., CustomProfile).
  3. In NumWorkers, try a reasonable number (such as 16 or 24).
  4. Save by clicking Done.
  5. Set this new profile as the active profile by checking the box next to its name.
  6. Validate the profile by clicking on Validate.
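
A custom profile like the one in Step 3 can also be created programmatically. This is a sketch under assumptions: the built-in profile is taken to be named Processes, and CustomProfile is just the example name used above.

```matlab
profileName = 'CustomProfile';    % example name from Step 3

c = parcluster('Processes');      % assumed name of the built-in local profile
c.NumWorkers = 24;                % a value in the range suggested above

% Save a copy under the new name, but only if that name is not already
% in the profile list, to avoid clashing with an existing profile.
if ~ismember(profileName, parallel.listProfiles)
    saveAsProfile(c, profileName);
end

% Make the new profile the default for subsequent parpool calls.
parallel.defaultProfile(profileName);
```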
If all of the above steps fail, I suggest contacting support and opening a support ticket.
Don't hesitate to ask if you're still stuck.

Filippo Ambrosino on 7 Nov 2024 at 11:01
Hello @sidik,
Thanks for your help. I tried decreasing the number of cores and it works properly. Unfortunately, I need to work with at least 64 cores; otherwise my routines would take a long time to run.
Best regards,
Filippo
  7 Comments
Filippo Ambrosino on 7 Nov 2024 at 12:46
@sidik many thanks for your help.
Filippo



Version

R2024b
