Parpool Fail 2015a HPC
Hi,
We recently purchased an HPC machine for simulation research. It has 64 cores (4 AMD Opteron CPUs) and 256 GB of RAM. The OS is CentOS 6 Linux.
We have installed 64-bit MATLAB R2015a on the machine.
I am trying to open 64 parallel workers with the parpool(64) command, but it gives an error, and the cluster profile cannot be validated either. When I reduce the number to 25 it works, but any number above 25 produces the error. The error output is attached below. I would really appreciate any help with this. It is worth mentioning that on the same machine under Windows 10 I can use all 32 cores and 32 hyperthreads (Windows only detects 2 physical CPUs), so in Windows I can open up to 64 workers.
Stage: SPMD job test (createCommunicatingJob)
Status: Failed
Description:The job errored or did not reach state finished.
Command Line Output:(none)
Error Report:(none)
Debug Log:
LOG FILE OUTPUT:
[14] < M A T L A B (R) >
[14] Copyright 1984-2015 The MathWorks, Inc.
[14] R2015a (8.5.0.197613) 64-bit (glnxa64)
[14] February 12, 2015
[21] < M A T L A B (R) >
[21] Copyright 1984-2015 The MathWorks, Inc.
[21] R2015a (8.5.0.197613) 64-bit (glnxa64)
[21] February 12, 2015
[6] < M A T L A B (R) >
[6] Copyright 1984-2015 The MathWorks, Inc.
[6] R2015a (8.5.0.197613) 64-bit (glnxa64)
[6] February 12, 2015
[16] < M A T L A B (R) >
[16] Copyright 1984-2015 The MathWorks, Inc.
[16] R2015a (8.5.0.197613) 64-bit (glnxa64)
[16] February 12, 2015
[30] < M A T L A B (R) >
[30] Copyright 1984-2015 The MathWorks, Inc.
[30] R2015a (8.5.0.197613) 64-bit (glnxa64)
[30] February 12, 2015
[7] < M A T L A B (R) >
[7] Copyright 1984-2015 The MathWorks, Inc.
[7] R2015a (8.5.0.197613) 64-bit (glnxa64)
[7] February 12, 2015
[24] < M A T L A B (R) >
[24] Copyright 1984-2015 The MathWorks, Inc.
[24] R2015a (8.5.0.197613) 64-bit (glnxa64)
[24] February 12, 2015
[2] < M A T L A B (R) >
[2] Copyright 1984-2015 The MathWorks, Inc.
[2] R2015a (8.5.0.197613) 64-bit (glnxa64)
[2] February 12, 2015
[12] < M A T L A B (R) >
[12] Copyright 1984-2015 The MathWorks, Inc.
[12] R2015a (8.5.0.197613) 64-bit (glnxa64)
[12] February 12, 2015
[0] < M A T L A B (R) >
[0] Copyright 1984-2015 The MathWorks, Inc.
[0] R2015a (8.5.0.197613) 64-bit (glnxa64)
[0] February 12, 2015
[20] < M A T L A B (R) >
[20] Copyright 1984-2015 The MathWorks, Inc.
[20] R2015a (8.5.0.197613) 64-bit (glnxa64)
[20] February 12, 2015
[31] < M A T L A B (R) >
[31] Copyright 1984-2015 The MathWorks, Inc.
[31] R2015a (8.5.0.197613) 64-bit (glnxa64)
[31] February 12, 2015
[9] < M A T L A B (R) >
[9] Copyright 1984-2015 The MathWorks, Inc.
[9] R2015a (8.5.0.197613) 64-bit (glnxa64)
[9] February 12, 2015
[10] < M A T L A B (R) >
[10] Copyright 1984-2015 The MathWorks, Inc.
[10] R2015a (8.5.0.197613) 64-bit (glnxa64)
[10] February 12, 2015
[28] < M A T L A B (R) >
[28] Copyright 1984-2015 The MathWorks, Inc.
[28] R2015a (8.5.0.197613) 64-bit (glnxa64)
[28] February 12, 2015
[18] < M A T L A B (R) >
[18] Copyright 1984-2015 The MathWorks, Inc.
[18] R2015a (8.5.0.197613) 64-bit (glnxa64)
[18] February 12, 2015
[13] < M A T L A B (R) >
[13] Copyright 1984-2015 The MathWorks, Inc.
[13] R2015a (8.5.0.197613) 64-bit (glnxa64)
[13] February 12, 2015
[23] < M A T L A B (R) >
[23] Copyright 1984-2015 The MathWorks, Inc.
[23] R2015a (8.5.0.197613) 64-bit (glnxa64)
[23] February 12, 2015
[25] < M A T L A B (R) >
[25] Copyright 1984-2015 The MathWorks, Inc.
[25] R2015a (8.5.0.197613) 64-bit (glnxa64)
[25] February 12, 2015
[15] < M A T L A B (R) >
[15] Copyright 1984-2015 The MathWorks, Inc.
[15] R2015a (8.5.0.197613) 64-bit (glnxa64)
[15] February 12, 2015
[26] < M A T L A B (R) >
[26] Copyright 1984-2015 The MathWorks, Inc.
[26] R2015a (8.5.0.197613) 64-bit (glnxa64)
[26] February 12, 2015
[5] < M A T L A B (R) >
[5] Copyright 1984-2015 The MathWorks, Inc.
[5] R2015a (8.5.0.197613) 64-bit (glnxa64)
[5] February 12, 2015
[11] < M A T L A B (R) >
[11] Copyright 1984-2015 The MathWorks, Inc.
[11] R2015a (8.5.0.197613) 64-bit (glnxa64)
[11] February 12, 2015
[17] < M A T L A B (R) >
[17] Copyright 1984-2015 The MathWorks, Inc.
[17] R2015a (8.5.0.197613) 64-bit (glnxa64)
[17] February 12, 2015
[19] < M A T L A B (R) >
[19] Copyright 1984-2015 The MathWorks, Inc.
[19] R2015a (8.5.0.197613) 64-bit (glnxa64)
[19] February 12, 2015
[27] < M A T L A B (R) >
[27] Copyright 1984-2015 The MathWorks, Inc.
[27] R2015a (8.5.0.197613) 64-bit (glnxa64)
[27] February 12, 2015
[8] < M A T L A B (R) >
[8] Copyright 1984-2015 The MathWorks, Inc.
[8] R2015a (8.5.0.197613) 64-bit (glnxa64)
[8] February 12, 2015
[22] < M A T L A B (R) >
[22] Copyright 1984-2015 The MathWorks, Inc.
[22] R2015a (8.5.0.197613) 64-bit (glnxa64)
[22] February 12, 2015
[29] < M A T L A B (R) >
[29] Copyright 1984-2015 The MathWorks, Inc.
[29] R2015a (8.5.0.197613) 64-bit (glnxa64)
[29] February 12, 2015
[3] < M A T L A B (R) >
[3] Copyright 1984-2015 The MathWorks, Inc.
[3] R2015a (8.5.0.197613) 64-bit (glnxa64)
[3] February 12, 2015
[4] < M A T L A B (R) >
[4] Copyright 1984-2015 The MathWorks, Inc.
[4] R2015a (8.5.0.197613) 64-bit (glnxa64)
[4] February 12, 2015
[1] < M A T L A B (R) >
[1] Copyright 1984-2015 The MathWorks, Inc.
[1] R2015a (8.5.0.197613) 64-bit (glnxa64)
[1] February 12, 2015
[21]
[14]
[6]
[21]To get started, type one of these: helpwin, helpdesk, or demo.
[21]For product information, visit www.mathworks.com.
[21]
[14]To get started, type one of these: helpwin, helpdesk, or demo.
[14]For product information, visit www.mathworks.com.
[14]
[16]
[6]To get started, type one of these: helpwin, helpdesk, or demo.
[6]For product information, visit www.mathworks.com.
[6]
[30]
[2]
[16]To get started, type one of these: helpwin, helpdesk, or demo.
[16]For product information, visit www.mathworks.com.
[16]
[30]To get started, type one of these: helpwin, helpdesk, or demo.
[30]For product information, visit www.mathworks.com.
[30]
[0]
[20]
[9]
[28]
[2]To get started, type one of these: helpwin, helpdesk, or demo.
[2]For product information, visit www.mathworks.com.
[2]
[7]
[31]
[10]
[0]To get started, type one of these: helpwin, helpdesk, or demo.
[0]For product information, visit www.mathworks.com.
[0]
[24]
[25]
[12]
[23]
[9]To get started, type one of these: helpwin, helpdesk, or demo.
[9]For product information, visit www.mathworks.com.
[9]
[26]
[20]To get started, type one of these: helpwin, helpdesk, or demo.
[20]For product information, visit www.mathworks.com.
[20]
[28]To get started, type one of these: helpwin, helpdesk, or demo.
[28]For product information, visit www.mathworks.com.
[28]
[17]
[31]To get started, type one of these: helpwin, helpdesk, or demo.
[31]For product information, visit www.mathworks.com.
[31]
[5]
[13]
[7]To get started, type one of these: helpwin, helpdesk, or demo.
[7]For product information, visit www.mathworks.com.
[7]
[11]
[18]
[10]To get started, type one of these: helpwin, helpdesk, or demo.
[10]For product information, visit www.mathworks.com.
[10]
[19]
[8]
[15]
[25]To get started, type one of these: helpwin, helpdesk, or demo.
[24]To get started, type one of these: helpwin, helpdesk, or demo.
[25]For product information, visit www.mathworks.com.
[25]
[24]For product information, visit www.mathworks.com.
[24]
[12]To get started, type one of these: helpwin, helpdesk, or demo.
[12]For product information, visit www.mathworks.com.
[12]
[29]
[23]To get started, type one of these: helpwin, helpdesk, or demo.
[26]To get started, type one of these: helpwin, helpdesk, or demo.
[23]For product information, visit www.mathworks.com.
[23]
[26]For product information, visit www.mathworks.com.
[26]
[17]To get started, type one of these: helpwin, helpdesk, or demo.
[17]For product information, visit www.mathworks.com.
[17]
[27]
[13]To get started, type one of these: helpwin, helpdesk, or demo.
[13]For product information, visit www.mathworks.com.
[13]
[19]To get started, type one of these: helpwin, helpdesk, or demo.
[19]For product information, visit www.mathworks.com.
[19]
[11]To get started, type one of these: helpwin, helpdesk, or demo.
[11]For product information, visit www.mathworks.com.
[11]
[5]To get started, type one of these: helpwin, helpdesk, or demo.
[5]For product information, visit www.mathworks.com.
[5]
[18]To get started, type one of these: helpwin, helpdesk, or demo.
[3]
[18]For product information, visit www.mathworks.com.
[18]
[22]
[8]To get started, type one of these: helpwin, helpdesk, or demo.
[8]For product information, visit www.mathworks.com.
[8]
[15]To get started, type one of these: helpwin, helpdesk, or demo.
[15]For product information, visit www.mathworks.com.
[15]
[29]To get started, type one of these: helpwin, helpdesk, or demo.
[29]For product information, visit www.mathworks.com.
[29]
[1]
[4]
[27]To get started, type one of these: helpwin, helpdesk, or demo.
[27]For product information, visit www.mathworks.com.
[27]
[3]To get started, type one of these: helpwin, helpdesk, or demo.
[3]For product information, visit www.mathworks.com.
[3]
[22]To get started, type one of these: helpwin, helpdesk, or demo.
[22]For product information, visit www.mathworks.com.
[22]
[1]To get started, type one of these: helpwin, helpdesk, or demo.
[1]For product information, visit www.mathworks.com.
[1]
[4]To get started, type one of these: helpwin, helpdesk, or demo.
[4]For product information, visit www.mathworks.com.
[4]
[14] Academic License
[21] Academic License
[6] Academic License
[14]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[21]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[0] Academic License
[16] Academic License
[14]2015-11-07 17:01:29 | Enter distcomp_evaluate_filetask_core
[14]2015-11-07 17:01:29 | Enter distcomp_evaluate_filetask_core/iSetup
[14]2015-11-07 17:01:29 | This process will exit on any fault.
[21]2015-11-07 17:01:29 | Enter distcomp_evaluate_filetask_core
[21]2015-11-07 17:01:29 | Enter distcomp_evaluate_filetask_core/iSetup
[30] Academic License
[14]2015-11-07 17:01:29 | This process will exit when its parent process dies.
[21]2015-11-07 17:01:29 | This process will exit on any fault.
[14]2015-11-07 17:01:29 | About to initialize MPI.
[21]2015-11-07 17:01:29 | This process will exit when its parent process dies.
[21]2015-11-07 17:01:29 | About to initialize MPI.
[6]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[9] Academic License
[2] Academic License
[13] Academic License
[0]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[28] Academic License
[12] Academic License
[31] Academic License
[10] Academic License
[6]2015-11-07 17:01:29 | Enter distcomp_evaluate_filetask_core
[6]2015-11-07 17:01:29 | Enter distcomp_evaluate_filetask_core/iSetup
[6]2015-11-07 17:01:29 | This process will exit on any fault.
[29] Academic License
[6]2015-11-07 17:01:29 | This process will exit when its parent process dies.
[16]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[26] Academic License
[6]2015-11-07 17:01:30 | About to initialize MPI.
[19] Academic License
[30]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[5] Academic License
[0]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[0]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[9]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[0]2015-11-07 17:01:30 | This process will exit on any fault.
[0]2015-11-07 17:01:30 | This process will exit when its parent process dies.
[2]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[25] Academic License
[0]2015-11-07 17:01:30 | About to initialize MPI.
[7] Academic License
[13]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[20] Academic License
[16]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[16]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[8] Academic License
[18] Academic License
[16]2015-11-07 17:01:30 | This process will exit on any fault.
[30]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[28]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[31]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[30]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[24] Academic License
[12]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[16]2015-11-07 17:01:30 | This process will exit when its parent process dies.
[30]2015-11-07 17:01:30 | This process will exit on any fault.
[16]2015-11-07 17:01:30 | About to initialize MPI.
[23] Academic License
[10]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[9]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[15] Academic License
[30]2015-11-07 17:01:30 | This process will exit when its parent process dies.
[9]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[17] Academic License
[29]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[9]2015-11-07 17:01:30 | This process will exit on any fault.
[30]2015-11-07 17:01:30 | About to initialize MPI.
[9]2015-11-07 17:01:30 | This process will exit when its parent process dies.
[2]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[2]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[9]2015-11-07 17:01:30 | About to initialize MPI.
[2]2015-11-07 17:01:30 | This process will exit on any fault.
[26]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[19]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[2]2015-11-07 17:01:30 | This process will exit when its parent process dies.
[11] Academic License
[13]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[13]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[2]2015-11-07 17:01:30 | About to initialize MPI.
[5]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[13]2015-11-07 17:01:30 | This process will exit on any fault.
[28]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[28]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[31]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[31]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[28]2015-11-07 17:01:30 | This process will exit on any fault.
[12]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[13]2015-11-07 17:01:30 | Unexpected error setting up process monitor. Error returned:
[13]Unexpected Standard exception from MEX file.
[13]What() is:boost::thread_resource_error
[13]..
[13]Error in distcomp_evaluate_filetask_core>iSetupProcessMonitoringThreads (line 622)
[13] dct_psfcns('pidwatch', pidToWatch)
[13]Error in distcomp_evaluate_filetask_core>iMaybeSetupProcessMonitoringThreads (line 256)
[13] iSetupProcessMonitoringThreads;
[13]Error in distcomp_evaluate_filetask_core>iSetup (line 506)
[13]iMaybeSetupProcessMonitoringThreads();
[13]Error in distcomp_evaluate_filetask_core (line 25)
[13] runprop = iSetup(handlers, mdceDebugEnabled, outputWriterStack, isSyncTaskEvaluation, varargin);
[12]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[13]2015-11-07 17:01:30 | About to exit with code: 1
[31]2015-11-07 17:01:30 | This process will exit on any fault.
[28]2015-11-07 17:01:30 | This process will exit when its parent process dies.
[12]2015-11-07 17:01:30 | This process will exit on any fault.
[10]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[10]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[28]2015-11-07 17:01:30 | About to initialize MPI.
[31]2015-11-07 17:01:30 | This process will exit when its parent process dies.
job aborted:
rank: node: exit code[: error message]
0: 127.0.0.1: -2
1: 127.0.0.1: -2
2: 127.0.0.1: -2
3: 127.0.0.1: -2
4: 127.0.0.1: -2
5: 127.0.0.1: -2
6: 127.0.0.1: -2
7: 127.0.0.1: -2
8: 127.0.0.1: -2
9: 127.0.0.1: -2
10: 127.0.0.1: -2
11: 127.0.0.1: -2
12: 127.0.0.1: -2
13: 127.0.0.1: -2: process 13 exited without calling init while other processes have called init
14: 127.0.0.1: -2
15: 127.0.0.1: -2
16: 127.0.0.1: -2
17: 127.0.0.1: -2
18: 127.0.0.1: -2
19: 127.0.0.1: -2
20: 127.0.0.1: -2
21: 127.0.0.1: -2
22: 127.0.0.1: -2
23: 127.0.0.1: -2
24: 127.0.0.1: -2
25: 127.0.0.1: -2
26: 127.0.0.1: -2
27: 127.0.0.1: -2
28: 127.0.0.1: -2
29: 127.0.0.1: -2
30: 127.0.0.1: -2
31: 127.0.0.1: -2
Stage: Pool job test (createCommunicatingJob)
Status: Skipped
Description:Validation skipped due to previous failure.
Command Line Output:(none)
Error Report:(none)
Debug Log:(none)
Stage: Parallel pool test (parpool)
Status: Skipped
Description:Validation skipped due to previous failure.
Command Line Output:(none)
Error Report:(none)
Debug Log:(none)
Answers (1)
Edric Ellis
on 9 Nov 2015
This looks like your machine ran out of resources while trying to start up the workers. Do you have any ulimit in effect?
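One way to see which limits are in effect is to query them from the account that launches MATLAB. The following is a minimal sketch, not a definitive diagnosis: the helper names (`format_limit`, `describe_limit`, `worker_relevant_limits`) are hypothetical, and the choice of limits to inspect is an assumption. The `boost::thread_resource_error` in the log means a thread could not be created, and on Linux threads count toward the per-user process limit (`ulimit -u`), so that limit is the first one worth looking at.

```python
import resource


def format_limit(value):
    """Render a raw rlimit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def describe_limit(label, rlimit_id):
    """Return a one-line summary of the soft and hard value of one limit."""
    soft, hard = resource.getrlimit(rlimit_id)
    return f"{label}: soft={format_limit(soft)} hard={format_limit(hard)}"


def worker_relevant_limits():
    """Summarise the limits assumed here to matter when starting many workers."""
    limits = [
        ("max user processes and threads (ulimit -u)", resource.RLIMIT_NPROC),
        ("max open files (ulimit -n)", resource.RLIMIT_NOFILE),
        ("max address space (ulimit -v)", resource.RLIMIT_AS),
    ]
    return [describe_limit(label, rlimit_id) for label, rlimit_id in limits]


if __name__ == "__main__":
    # Run this as the same user, and from the same kind of shell, that starts MATLAB.
    for line in worker_relevant_limits():
        print(line)
```

If the soft value for max user processes is small (on CentOS 6 it is often 1024 for non-root users, typically set in /etc/security/limits.d/90-nproc.conf, though this should be checked on the machine itself), a few dozen workers with many threads each could plausibly exhaust it.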
4 Comments
Darwin
on 17 Oct 2016
I manage MATLAB on Linux HPC machines, and with parpool I can use as many workers as there are cores on one node. Hyperthreading does not work properly under CentOS.