On 27 May 2018 at 18:23, Nadav Toledo <nadavtol...@cs.technion.ac.il> wrote:
> Hello forum,
>
> I have been trying to deal with idle sessions for some time, and haven't
> found a solution I am happy with.
> The scenario is as follows: users run jupyter-lab through srun (which is
> fine and even encouraged by me) on an image processing cluster with GPUs.
>
> The problem is, I am trying to find some way to email them or cancel
> their job if their session is idle for X hours.
>
> The w command or xprintidle cannot be used, since they both work with ssh
> but not with slurm (checked that).
>
> Writing a script is not as easy as one might think. If I run a script in
> admin user scope, I later need to figure out which idle GPU belongs to
> which slurm job.
> Running a script in the user scope is probably a better idea, but in which
> way? crontab runs even when the user is not logged in, so how can I force
> users to run something only when the job starts?
>
> Perhaps some combination of sreport and tres?
>

Hmm. We address this with accounting.

A tight default walltime (40 minutes) means that most jobs run without anyone having to worry about walltime, though some will need to set it explicitly.

The accounting system keeps people honest by making "hogging" of resources bad for a user's job priority, in that their next job will be deprioritised. Once people know that their next job will be deprioritised if they waste resources, we find our users behave responsibly.

L.
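
To illustrate how usage-based accounting can push a "hogging" user's next job down the queue, here is a minimal Python sketch. It borrows the classic 2^(-usage/share) fair-share form described for Slurm's multifactor priority plugin. Whether the site above uses that formula is an assumption, and the function name and the numbers are hypothetical.

```python
def fairshare_factor(usage_fraction: float, share_fraction: float) -> float:
    """Return a priority factor in (0, 1]; a higher value means better priority.

    usage_fraction: fraction of recent cluster usage consumed by the user
                    (hypothetical input; a real scheduler also decays this over time).
    share_fraction: fraction of the cluster the user is entitled to.
    """
    if share_fraction <= 0:
        raise ValueError("share_fraction must be positive")
    if usage_fraction < 0:
        raise ValueError("usage_fraction must be non-negative")
    # Classic fair-share shape: the factor halves for each full "share" consumed.
    return 2.0 ** (-usage_fraction / share_fraction)


# Hypothetical user entitled to 10% of the cluster.
share = 0.10
careful = fairshare_factor(0.05, share)  # used half the share: roughly 0.71
hog = fairshare_factor(0.30, share)      # used three times the share: roughly 0.125

print(f"careful user factor: {careful:.3f}")
print(f"hogging user factor: {hog:.3f}")
```

A user who leaves an idle jupyter-lab session holding a GPU still accrues usage, so under a scheme of this shape their factor drops and their next job waits longer.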