There are a couple of options here. Neither is exactly convenient, but both will get the job done:

1. Use a job array, with `-N 1 -w <nodename>` defined for each array task. You can do the same without an array by using a for loop to submit a separate sbatch job for each node.

2. Use `scontrol reboot` and set the reboot program to do the job. By definition, `scontrol reboot` runs only once for each node. It does require exclusive access to the node, so it might not be convenient for your use case.
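
The for-loop variant of option 1 could look roughly like the sketch below. The node names and `run_test.sh` are placeholders I made up, and `SUBMIT` defaults to `echo sbatch` so the loop only prints the commands; set `SUBMIT=sbatch` to actually submit.

```shell
#!/bin/bash
# Submit one single-node job per node, each pinned to its node with -w.
# NODES and run_test.sh are placeholders, not names from a real cluster.
NODES="${NODES:-node01 node02 node03}"
# Dry run by default: prints the sbatch commands instead of submitting them.
SUBMIT="${SUBMIT:-echo sbatch}"

submit_per_node() {
    local node
    # $NODES is left unquoted on purpose so it splits into one word per node.
    for node in $NODES; do
        $SUBMIT -N 1 -w "$node" --job-name="test-$node" run_test.sh
    done
}

submit_per_node
```

If the nodes are given as a hostlist expression, something like `NODES="$(scontrol show hostnames 'node[01-20]')"` should expand it into individual names. Since each job is pinned to a different node and submitted exactly once, no node runs the test twice.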


For (2), see https://slurm.schedmd.com/slurm.conf.html#OPT_RebootProgram and https://slurm.schedmd.com/scontrol.html#OPT_reboot


For (1), you could also use a single `sbatch -N <number of nodes> -w <nodelist>` with multiple `srun -N 1-1 -w <nodename> ... &` inside the batch script. Make sure to put '&' at the end of each srun line so they run concurrently, and end the script with `wait` so the batch job does not exit before the steps finish.
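
A rough sketch of such a batch script follows. The node names and `./run_test.sh` are placeholders, and `LAUNCH` defaults to `echo srun` so the script can be tried outside of Slurm; set `LAUNCH=srun` when running it under sbatch.

```shell
#!/bin/bash
#SBATCH -N 3
#SBATCH -w node01,node02,node03
# The node names above and ./run_test.sh are placeholders.

# Dry run by default: prints the srun commands instead of launching steps.
LAUNCH="${LAUNCH:-echo srun}"

run_on_each_node() {
    local node
    for node in "$@"; do
        # The trailing '&' starts each step in the background,
        # so the steps on different nodes run concurrently.
        $LAUNCH -N 1-1 -w "$node" ./run_test.sh &
    done
    # Block until every background step has finished; otherwise the
    # batch script would exit while the steps are still running.
    wait
}

run_on_each_node node01 node02 node03
```

Inside a real job, the hard-coded names in the last line could be replaced by `$(scontrol show hostnames "$SLURM_JOB_NODELIST")`, so the script follows whatever nodes the job was actually given.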


Hope that helps,

--Dani_L.




On 19/02/2025 5:33, Shunran Zhang via slurm-users wrote:
Assuming all nodes need to run the same task once...

How about -n num_of_nodes --ntasks-per-node=1 ?

Otherwise, if it is more deployment-related, I would use Ansible to do that.

S. Zhang

On 2025/02/19 2:37, John Hearns via slurm-users wrote:
I am running single node tests on a cluster.

I can select named nodes using the -w flag with sbatch.
However, if I want to submit perhaps 20 test jobs, is there any smart way to run the test only once on each node?
I know I could touch a file with the hostname and test for that file.
I am just wondering if there is a smarter way to do this.

I should point out that the tests take a few minutes to run, so if a node finishes a test run it could become idle and run the tests again.




-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
