Thanks Davide,

It's true that srun will create an allocation if you aren't inside a job, but if you are inside a job and request more resources than that job has, srun will just fail. That is exactly the issue I'm trying to avoid.
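One workaround I'm considering (an untested sketch, and it assumes srun decides whether it is already inside an allocation purely from the SLURM_* environment variables) is to scrub those variables before calling srun, so it behaves as if it were launched from a login node and requests a fresh allocation. The resource flags and script name below are just placeholders:

    # Untested sketch: hide the parent job's SLURM_* environment so srun
    # asks for a new allocation instead of reusing the current one.
    for v in $(compgen -v | grep '^SLURM_'); do unset "$v"; done

    # srun keeps stdin/stdout/stderr attached, forwards signals to the
    # remote tasks, and only returns once the step has finished.
    srun --mem=4G --cpus-per-task=2 python my_script.py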
On Sat, Apr 5, 2025 at 11:48 AM Davide DelVento <davide.quan...@gmail.com> wrote:

> The plain srun is probably the best bet, and if you really need the thing
> to be started from another slurm job (rather than the login node) you will
> need to exploit the fact that
>
> > If necessary, srun will first create a resource allocation in which to
> > run the parallel job.
>
> AFAIK, there is no option to force the "create a resource allocation" even
> if it's not necessary. But you may try to request something that is "above
> and beyond" what the current allocation provides, and that might solve your
> problem.
> Looking at the srun man page, I could speculate that --clusters
> or --cluster-constraint might help in that regard (but I am not sure).
>
> Have a nice weekend
>
> On Fri, Apr 4, 2025 at 6:27 AM Michael Milton via slurm-users
> <slurm-users@lists.schedmd.com> wrote:
>
>> I'm helping with a workflow manager that needs to submit Slurm jobs. For
>> logging and management reasons, the job (e.g. srun python) needs to be run
>> as though it were a regular subprocess (python):
>>
>> - stdin, stdout and stderr for the command should be connected to the
>> process inside the job
>> - signals sent to the command should be sent to the job process
>> - We don't want to use the existing job allocation, if this is run
>> from a Slurm job
>> - The command should only terminate when the job is finished, to
>> avoid us needing to poll Slurm
>>
>> We've tried:
>>
>> - sbatch --wait, but then SIGTERM'ing the process doesn't kill the job
>> - salloc, but that requires a TTY process to control it (?)
>> - salloc srun seems to mess with the terminal when it's killed,
>> likely because of being "designed to be executed in the foreground"
>> - Plain srun re-uses the existing Slurm allocation, and specifying
>> resources like --mem will just request them from the current job rather
>> than submitting a new one
>>
>> What is the best solution here?
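For completeness, the sbatch route quoted above might also be salvageable with a small wrapper, though I haven't tested this either. The idea is to keep sbatch --wait for the blocking behaviour but run it in the background, so a trapped SIGTERM/SIGINT can scancel the job (bash only runs traps while it is sitting at the wait builtin, not while a foreground child is running). The job name, resources and payload are placeholders, and note this still doesn't give you connected stdout/stderr, which go to the batch output file:

    name="wf-$$"
    # --wait blocks until the batch job finishes; run it in the background
    # so the trap below can fire while we wait.
    sbatch --wait --job-name="$name" --mem=4G --wrap "python my_script.py" &
    sbatch_pid=$!

    # Cancel the job (by name) if the wrapper itself is terminated,
    # then reap the backgrounded sbatch.
    trap 'scancel --name="$name"; wait "$sbatch_pid"' TERM INT

    wait "$sbatch_pid"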
--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com