Excuse me, how can I tell Slurm not to terminate the job until all steps (tasks) have finished?

Regards,
Mahmood

On Fri, May 18, 2018 at 10:35 AM, Mahmood Naderan <mahmood...@gmail.com> wrote:
> OK, I understand that. However, there is an issue with ntasks=1.
> Assume a user wants to launch an application that takes the number of
> cores as a command-line argument. Keeping in mind that the CPU limit for
> the partition is 20 cores, the following example
>
> [mahmood@rocks7 ~]$ srun --x11 -A y8 -p RUBY --mem=8GB --pty bash
> [mahmood@compute-0-6 ~]$ /state/partition1/scfd/sc -t10
>
> raises two problems:
>
> 1- Slurm assumes that the user's job is using only one core. That means
> a user can create 20 interactive sessions and, in each session, launch
> the program with 10 threads, thereby bypassing the core limit I set
> before.
>
> 2- A user can start the session with ntasks=1 (or without specifying
> it) and then cheat the system by launching the program with more
> threads than the CPU limit (for example, -t50).
>
> Any ideas?
>
> Regards,
> Mahmood
>
> On Thu, May 17, 2018 at 11:40 PM, Matthieu Hautreux
> <matthieu.hautr...@gmail.com> wrote:
>>
>> It means what it says: your job is terminated because 9 tasks out of 10
>> exited more than 60s before.
>>
>> The logic behind the 60 seconds (configurable) is described in the srun man
>> page. You should look at it closely.
>>
>> You should also look at the FAQ here: https://slurm.schedmd.com/faq.html
>>
>> You should set --ntasks=1, if I guess your goal correctly.
>>
>> HTH
>>