I don’t think that’s true. Others have already shared documentation on interactive jobs and the S commands that explains exactly how this works, and it seems to have been ignored.
[novosirj@amarel2 ~]$ salloc -n1
salloc: Pending job allocation 83053985
salloc: job 83053985 queued and waiting for resources
salloc: job 83053985 has been allocated resources
salloc: Granted job allocation 83053985
salloc: Waiting for resource configuration
salloc: Nodes slepner012 are ready for job

This is the behavior I’ve always seen. If I include a command at the end of the line, it simply runs in the “new” shell that salloc creates (which, you’ll notice, you can leave via Ctrl-D or exit):

[novosirj@amarel2 ~]$ salloc -n1 hostname
salloc: Pending job allocation 83054458
salloc: job 83054458 queued and waiting for resources
salloc: job 83054458 has been allocated resources
salloc: Granted job allocation 83054458
salloc: Waiting for resource configuration
salloc: Nodes slepner012 are ready for job
amarel2.amarel.rutgers.edu
salloc: Relinquishing job allocation 83054458

You can, however, tell it to srun something in that shell instead:

[novosirj@amarel2 ~]$ salloc -n1 srun hostname
salloc: Pending job allocation 83054462
salloc: job 83054462 queued and waiting for resources
salloc: job 83054462 has been allocated resources
salloc: Granted job allocation 83054462
salloc: Waiting for resource configuration
salloc: Nodes node073 are ready for job
node073.perceval.rutgers.edu
salloc: Relinquishing job allocation 83054462

When you use salloc, it starts an allocation and sets up the environment:

[novosirj@amarel2 ~]$ env | grep SLURM
SLURM_NODELIST=slepner012
SLURM_JOB_NAME=bash
SLURM_NODE_ALIASES=(null)
SLURM_MEM_PER_CPU=4096
SLURM_NNODES=1
SLURM_JOBID=83053985
SLURM_NTASKS=1
SLURM_TASKS_PER_NODE=1
SLURM_JOB_ID=83053985
SLURM_SUBMIT_DIR=/cache/home/novosirj
SLURM_NPROCS=1
SLURM_JOB_NODELIST=slepner012
SLURM_CLUSTER_NAME=amarel
SLURM_JOB_CPUS_PER_NODE=1
SLURM_SUBMIT_HOST=amarel2.amarel.rutgers.edu
SLURM_JOB_PARTITION=main
SLURM_JOB_NUM_NODES=1

If you subsequently run srun, it runs on the compute node, but a regular command runs right where you are:

[novosirj@amarel2 ~]$ srun hostname
slepner012.amarel.rutgers.edu
[novosirj@amarel2 ~]$ hostname
amarel2.amarel.rutgers.edu

Again, I’d advise Mahmood to read the documentation that was already provided. It doesn’t really matter what behavior is requested; that is not what this command does. If you want to run a script on a compute node, the correct command is sbatch (there’s a minimal sketch at the end of this message, below the quoted thread). I’m not sure what advantage salloc with srun has; I assume it’s so you can open an allocation and then occasionally send srun commands over to it (also sketched below).

--
 ____
|| \\UTGERS,     |---------------------------*O*---------------------------
||_// the State  | Ryan Novosielski - novos...@rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\    of NJ  | Office of Advanced Research Computing - MSB C630, Newark
     `'

> On Jan 2, 2019, at 12:20 PM, Terry Jones <te...@jon.es> wrote:
>
> I know very little about how SLURM works, but this sounds like a configuration issue: the cluster hasn’t been configured in a way that indicates the login nodes cannot also be used as compute nodes. When I run salloc on the cluster I use, I *always* get a shell on a compute node, never on the login node where I ran salloc.
>
> Terry
>
> On Wed, Jan 2, 2019 at 4:56 PM Mahmood Naderan <mahmood...@gmail.com> wrote:
> Currently, users run "salloc --spankx11 ./qemu.sh", where qemu.sh is a script that runs a qemu-system-x86_64 command.
> When user (1) runs that command, the qemu process runs on the login node, since the user is accessing the login node.
> When user (2) runs that command, his qemu process also runs on the login node, and so on.
>
> That is not what I want! I expected Slurm to dispatch the jobs to compute nodes.
>
> Regards,
> Mahmood
>
> On Wed, Jan 2, 2019 at 7:39 PM Renfro, Michael <ren...@tntech.edu> wrote:
> I’m not sure what the reasons are behind “have to manually ssh to a node”, but salloc and srun can be used to allocate resources and run commands on the allocated resources.
>
> Before allocation, regular commands run locally, and no Slurm-related variables are present:
>
> =====
>
> [renfro@login ~]$ hostname
> login
> [renfro@login ~]$ echo $SLURM_TASKS_PER_NODE
>
> =====
>
> After allocation, regular commands still run locally, Slurm-related variables are present, and srun runs commands on the allocated node (my prompt change inside a job is a local thing, not done by default):
>
> =====
>
> [renfro@login ~]$ salloc
> salloc: Granted job allocation 147867
> [renfro@login(job 147867) ~]$ hostname
> login
> [renfro@login(job 147867) ~]$ echo $SLURM_TASKS_PER_NODE
> 1
> [renfro@login(job 147867) ~]$ srun hostname
> node004
> [renfro@login(job 147867) ~]$ exit
> exit
> salloc: Relinquishing job allocation 147867
> [renfro@login ~]$
>
> =====
>
> Lots of people get interactive shells on a reserved node with some variant of ‘srun --pty $SHELL -I’, which doesn’t require explicitly running salloc or ssh, so what are you trying to accomplish in the end?
>
> --
> Mike Renfro, PhD / HPC Systems Administrator, Information Technology Services
> 931 372-3601 / Tennessee Tech University
>
> > On Jan 2, 2019, at 9:24 AM, Mahmood Naderan <mahmood...@gmail.com> wrote:
> >
> > I want to know if there is any way to push the node-selection part onto Slurm, rather than it being a manual thing done by the user.
> > Currently, I have to manually ssh to a node and try to "allocate resources" using salloc.
> >
> > Regards,
> > Mahmood
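As promised above, here is a minimal sbatch sketch for the qemu case. To be clear about assumptions: the #SBATCH values (task count, CPUs, memory, time limit) are placeholders to adjust for the VM, the job-script file name is made up, and qemu.sh is the script from Mahmood’s message.

=====

#!/bin/bash
#SBATCH --job-name=qemu       # name shown in squeue
#SBATCH --ntasks=1            # one task: the qemu process
#SBATCH --cpus-per-task=2     # assumption; match the VM's CPU count
#SBATCH --mem=4G              # assumption; match the VM's memory
#SBATCH --time=04:00:00       # assumption; adjust to the expected run time

# sbatch runs this whole script on a compute node that Slurm picks,
# so nothing below ever executes on the login node.
./qemu.sh

=====

Each user submits it with "sbatch qemu-job.sh" (again, the file name is just an example) from the login node, and Slurm handles the node selection; nobody has to ssh anywhere.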
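And here is roughly what I meant about opening an allocation and occasionally sending srun commands over to it; the -n4 and the step scripts are hypothetical:

=====

[novosirj@amarel2 ~]$ salloc -n4      # wait in the queue once, get 4 tasks
[novosirj@amarel2 ~]$ srun hostname   # runs on the allocated node(s)
[novosirj@amarel2 ~]$ srun ./step1.sh # hypothetical script, same allocation
[novosirj@amarel2 ~]$ srun ./step2.sh # no additional queue wait
[novosirj@amarel2 ~]$ exit            # release the allocation

=====

Every srun issued inside that shell lands on the already-allocated resources, so you pay the scheduling wait once rather than once per command.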