Hi Marcus,

Is something like staskfarm of any use for your needs?

https://github.com/paddydoyle/staskfarm
https://www.tchpc.tcd.ie/node/1127

Sorry if not.
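A task farm of this kind is typically driven by a plain text file with one command per line. As a rough sketch only, the file for your 20 test scripts could be generated as below. The function name `make_task_file` and the file name `tasks.txt` are made up for illustration, the script paths are the placeholders from your mail, and the one-command-per-line format is an assumption that should be checked against the staskfarm README.

```shell
#!/bin/bash
# Sketch: write one command per line to a task file, one line per script.
# Assumptions (not from the staskfarm docs): scripts are named
# test_prog1.sh .. test_progN.sh under a common directory, and the
# task farm accepts one shell command per line.

make_task_file() {
    local script_dir=$1 count=$2 out=$3
    local i
    : > "$out"    # create the file, or truncate it if it already exists
    for ((i = 1; i <= count; i++)); do
        # Each line is the full path of one independent script.
        printf '%s/test_prog%d.sh\n' "$script_dir" "$i" >> "$out"
    done
}

# Hypothetical usage: 20 scripts under /path/to, written to tasks.txt.
make_task_file /path/to 20 tasks.txt
```

The resulting file would then be handed to the task farm from inside the batch script; the exact invocation is described in the README linked above.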
Regards
Sean

On Fri, Nov 05, 2021 at 10:42:32AM +0100, Marcus Peders?n wrote:
> Hi all,
>
> I have set up a basic Slurm system and have been testing out
> a number of things.
> The latest thing I started to test is the parallel parts.
> What I have is about 70 independent scripts that would be
> ideal to run in parallel.
> For testing purposes I have created 20 dummy scripts
> that print the script name and hostname, sleep for one minute,
> and then print the number of minutes.
>
> The way I want to run this is to allocate 2 nodes
> and run all of the 20 scripts in parallel, each one of them
> in one process.
> My idea is that the first node will be filled up with 12 processes,
> each process running one script, and the second node will run
> the rest of the processes/scripts (8 scripts on 8 processes).
> I have read a couple of tutorials and looked at the documentation
> for different parts of Slurm.
> But whatever flags I use for both sbatch and srun, I do not seem to
> be able to accomplish what I want.
> All nodes have 6 cores with 2 threads each.
>
> The closest I have come is with this small sbatch script:
>
> #!/bin/bash
> #SBATCH --job-name=TestParallel
> #SBATCH --nodes=2
> #SBATCH --ntasks-per-node=1
> #SBATCH --ntasks=2
> #SBATCH --cpus-per-task=12
> #SBATCH --nodelist=node1,node2
> #SBATCH --output="%x-%4j-%N.out"
> #SBATCH --mail-user=my@mail
> #SBATCH --mail-type=ALL
>
> echo
> date +%Y-%m-%d" "%H-%M-%S
>
> for i in {1..20}
> do
>     srun --nodes=1 --ntasks=1 --ntasks-per-node=1 --cpus-per-task=1 \
>         --exclusive --job-name=Testp-$i --output=/path/to/test_prog$i.log \
>         /path/to/test_prog$i.sh &
> done
>
> date +%Y-%m-%d" "%H-%M-%S
>
> wait
>
>
> sacct gives the following output:
>
> 505          TestParal+  all  marcus  24  RUNNING  node[1-2]  0:0
> 505.batch    batch                    12  RUNNING  node1      0:0
> 505.0        Testp-3                   1  RUNNING  node1      0:0
> 505.1        Testp-6                   1  RUNNING  node2      0:0
> 505.2        Testp-2                   1  RUNNING  node1      0:0
> 505.3        Testp-13                  1  RUNNING  node1      0:0
> 505.4        Testp-9                   1  RUNNING  node1      0:0
> 505.5        Testp-11                  1  RUNNING  node1      0:0
> 505.6        Testp-16                  1  RUNNING  node1      0:0
> 505.7        Testp-12                  1  RUNNING  node1      0:0
> 505.8        Testp-20                  1  RUNNING  node1      0:0
> 505.9        Testp-4                   1  RUNNING  node1      0:0
> 505.10       Testp-19                  1  RUNNING  node1      0:0
> 505.11       Testp-10                  1  RUNNING  node1      0:0
> 505.12       Testp-5                   1  RUNNING  node1      0:0
>
>
> Slurm only uses one process on node2, and of course I want all of the
> last 8 processes to run on node2.
>
> I have tried a number of other options, usually ending up running the
> same script multiple times, and that is not what I want.
>
> I feel a bit stuck and cannot get my head around this.
>
> I would really appreciate some help!!
>
> Many thanks in advance!!
>
> Best Regards
> Marcus
>
> ---
> E-mailing SLU will result in SLU processing your personal data. For more
> information on how this is done, click here
> <https://www.slu.se/en/about-slu/contact-slu/personal-data/>
>

-- 
Sean McGrath M.Sc

Systems Administrator
Trinity Centre for High Performance and Research Computing
Trinity College Dublin

sean.mcgr...@tchpc.tcd.ie

https://www.tcd.ie/
https://www.tchpc.tcd.ie/

+353 (0) 1 896 3725