On Tuesday, 28 August 2018 8:15:55 AM AEST Priedhorsky, Reid wrote:

> I am trying to figure out how to advise users on starting worker daemons in
> their allocations using srun. That is, I want to be able to run “srun foo”,
> where foo starts some child process and then exits, and the child
> process(es) persist and wait for work.

That won't happen on a well-configured Slurm system, as it is Slurm's role
to clean up any processes left around by a job once that job exits. This is
why cgroups and pam_slurm_adopt are so useful: they let you track those
processes and kill them off far more easily.

If you want processes to stick around, you either need to ask for enough
time in the job and ensure that the script doesn't exit (and thus signal the
end of the job) until those daemons are done, or you will need to find a way
to do it outside of Slurm.

One possible way to do the latter would be to configure something like
systemd to allow specific users to run daemons as themselves. Then you could
let them submit a job where they do:

    systemctl start --user mydaemon.service

to start it up (and check that it has started successfully before exiting).

There's a bit about how to do this here (I've just started using it for a
side radio-astronomy project at the observatory I volunteer at):

https://www.brendanlong.com/systemd-user-services-are-amazing.html

Hope this helps!

All the best,
Chris
-- 
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
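
As a rough illustration of the first option above (keeping the job script
alive until the daemons are done), here is a minimal sketch of a batch
script. The worker function, the worker count and the time limit are all
hypothetical placeholders rather than anything from a real site; in an
actual job the workers would more likely be launched with something like
"srun ... &" instead of a local shell function.

```shell
#!/bin/bash
#SBATCH --time=01:00:00   # assumption: long enough to cover the workers' lifetime

# Hypothetical stand-in for a real worker daemon; replace with your own command.
worker() {
    sleep 1
}

# Start the workers in the background and remember their PIDs.
pids=()
for i in 1 2; do
    worker &
    pids+=("$!")
done

# Block until every worker has exited, so that the batch script (and thus
# the job) does not end while they are still running.
status=0
for pid in "${pids[@]}"; do
    wait "$pid" || status=1
done

echo "all workers finished, status=$status"
```

The important part is the wait loop: as long as the script is blocked there,
the job has not ended and Slurm has no reason to clean up the background
processes.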