How do I allocate CPU resources to a batch job?

sebrz on 3 Jun 2022
Edited: sebrz on 8 Jun 2022
I want to run several batch jobs in the background. Each job runs a different script, but every script calls 'mpi -np 16 "someApplication"', which runs in parallel on 16 physical CPU cores. This core count is fixed.
Do I need to set up a pool of 16+1 workers for each batch job, or one worker with 16 CPUs? What would be the best way to run multiple jobs in the background on a server whose core count is a multiple of 16? Can a worker access multiple CPUs if necessary?
Thanks in advance!

Answers (1)

Raymond Norris on 3 Jun 2022
It would help if you could provide an example of the script and how you're running the job.
Let's assume you're using PBS, maybe it looks something like
#!/bin/sh
#PBS -l ... (request nodes, ppn, etc.)
module load matlab
mpirun -np 16 matlab -r someApplication
* You wrote mpi -np, but I'm assuming you meant mpiexec/mpirun. mpirun should be smart enough that you don't even need to specify -np 16.
Allocating 16 cores to MATLAB means MATLAB will see 16 cores, but by default it doesn't start a "pool" of workers. Rather, MATLAB will use the cores for implicit multithreading (e.g., fft).
If you start a local pool, you'd want to keep it to a maximum of 16 workers (which is really 17 processes including the MATLAB client). In this case, the workers start MATLAB in single-threaded mode by default. A worker can access multiple CPUs if you tell the pool to start with more threads. For example:
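local = parcluster("local");   % cluster object for the local machine
local.NumThreads = 2;          % let each worker use 2 computational threads
pool = local.parpool(8);       % 8 workers x 2 threads = 16 cores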
Again, if you can provide a sample batch script and high-level MATLAB code, it'll be easier to guide you.
1 Comment
sebrz on 8 Jun 2022
Edited: sebrz on 8 Jun 2022
Hi Raymond,
Thanks for your answer and the clarification about pools and workers. And yes, I meant mpirun.
The MATLAB script looks like this right now:
% myScript.m
% set up
x0 = somevalue;
b.a = anothervalue;
...
[xmin, fmin] = optimizer('someFunction', x0, b);
The function is defined in the same directory as myScript.m and calls an external application/module:
% someFunction.m, defined in the same directory as myScript.m
function f = someFunction(a, b)
doStuffInDirectory;
% system returns the exit status of the command as its first output
f = system('mpirun -np 16 externalApplication');
Let's say I want to do this with Slurm, and a node has 48 CPUs.
In the first scenario, I have different scripts that call different optimizers or have different objectives, constraints, etc.:
#!/bin/bash
...
#SBATCH --nodes=1
#SBATCH --tasks-per-node=3
#SBATCH --cpus-per-task=16
MCRMODULE=MATLAB
module rm matlab
module load $MCRMODULE
module load externalApplication
matlab -nodisplay -singleCompThread -r "myScript;"
Do I run the different batch jobs from one Slurm script, or do I write multiple Slurm scripts?
matlab -nodisplay -singleCompThread -r "myScript1;"
matlab -nodisplay -singleCompThread -r "myScript2;"
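A minimal sketch of the single-script variant, assuming the two commands above should run side by side rather than one after the other. Each MATLAB run is started in the background and the script waits for all of them. The names `launch`, `run_all`, and `DRY_RUN` are hypothetical helpers, the `exit` appended to each command is an addition so that MATLAB quits when the script finishes, and whether each run really gets its own 16 cores depends on how Slurm and MPI are configured on the cluster.

```shell
#!/bin/bash
# Sketch only: start several MATLAB scripts side by side from one job script.
# DRY_RUN, launch and run_all are hypothetical names, not Slurm or MATLAB features.
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the commands, 0 = really start MATLAB

launch() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"          # show what would be executed
    else
        "$@"               # execute the command as given
    fi
}

run_all() {
    for script in "$@"; do
        # Background each MATLAB instance so the next one can start immediately.
        launch matlab -nodisplay -singleCompThread -r "$script; exit" &
    done
    wait                   # block until every background run has finished
}

run_all myScript1 myScript2
```

With `DRY_RUN=1` (the default here) the script only prints the two `matlab` command lines, which makes it easy to check before submitting it with `DRY_RUN=0`.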
Can I just write a MATLAB script like this and call all jobs with one Slurm script? Like this:
%myBatchScript
job1 = batch('myScript1')
job2 = batch('myScript2')
...
On my PC, I have seen that if I run a batch like this several times in the Command Window:
job = batch('myScript')
it works without problems, and I do not have to set up a pool or workers.
But since I am moving to a bigger cluster, I was wondering what the best option would be.
In the second scenario, I want to evaluate multiple optimizations in parallel, with different inputs but the same optimizer, constraints, etc.
Unfortunately, "someFunction" is not written in a way that lets me evaluate it in parallel while it calls the external application, because it writes files into the directory.
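One common way around files colliding in a shared directory is to give every run its own working directory. The sketch below illustrates that idea under the assumption that the external application only writes into its current directory; `run_isolated` and the `run_XXXXXX` directory pattern are hypothetical names, and the `sh -c` commands stand in for the real `mpirun` call.

```shell
#!/bin/bash
# Sketch only: run a command inside its own scratch directory so that
# concurrent runs cannot overwrite each other's files.
# run_isolated is a hypothetical helper name.
run_isolated() {
    workdir="$(mktemp -d "${TMPDIR:-/tmp}/run_XXXXXX")" || return 1
    # The subshell keeps the cd from leaking into the caller; the command's
    # own output goes to run.log inside the private directory.
    ( cd "$workdir" && "$@" > run.log 2>&1 )
    status=$?
    echo "$workdir"        # report where the results of this run live
    return "$status"
}

# Two stand-in runs in parallel; both write out.txt, but in different directories.
run_isolated sh -c 'echo first > out.txt' &
run_isolated sh -c 'echo second > out.txt' &
wait
```

In a real run the stand-in commands would be replaced by something like `mpirun -np 16 externalApplication`, and the printed directory would be where the objective value is read back from.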
Ideally, my MATLAB code would look like this:
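% myIdealScript
param = zeros(N, M);   % placeholder for the N-by-M input matrix, one column per run
for i = 1:size(param, 2)
    f = myScriptAsAFunction(param(:, i));   % pass the i-th column only
end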
and be evaluated with one Slurm script. Do I just request 16*N/48 nodes and leave 3 tasks per node?
matlab -nodisplay -singleCompThread -r "myIdealScript;"
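The node count in the question can be worked out with a ceiling division, since 16*N/48 is not a whole number for every N. A small sketch of that arithmetic, assuming each 16-core run stays on a single node and all runs execute at the same time; `nodes_needed` is a hypothetical helper name:

```shell
#!/bin/bash
# Sketch only: number of nodes needed for a given number of concurrent
# 16-core runs on 48-core nodes. nodes_needed is a hypothetical helper.
nodes_needed() {
    runs=$1                    # number of concurrent runs
    cores_per_run=${2:-16}     # cores used by each mpirun call
    cores_per_node=${3:-48}    # cores available on one node
    per_node=$(( cores_per_node / cores_per_run ))   # whole runs that fit on a node
    echo $(( (runs + per_node - 1) / per_node ))     # ceiling division
}

nodes_needed 3    # prints 1
nodes_needed 7    # prints 3
```

With 48-core nodes and 16-core runs, three runs fit on each node, so 3 tasks per node stays the same and only the node count grows with the number of runs.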

Version

R2021b
