Re: [slurm-users] Providing users with info on wait time vs. run time

2022-09-16 Thread Ümit Seren
On Fri, Sep 16, 2022 at 3:43 PM Sebastian Potthoff <s.potth...@uni-muenster.de> wrote:
> Hi Hermann,
>
> So you both are happily(?) ignoring this warning in the "Prolog and Epilog Guide", right? :-)
>
> "Prolog and Epilog scripts [...] should not call Slurm commands (e.g. squeue, scontrol, s

Re: [slurm-users] Providing users with info on wait time vs. run time

2022-09-16 Thread Paul Edmon
We also call scontrol in our scripts (as little as we can manage) and we run at the scale of 1500 nodes. It hasn't really caused many issues, but we try to limit it as much as we possibly can.
-Paul Edmon-
On 9/16/22 9:41 AM, Sebastian Potthoff wrote:
Hi Hermann, So you both are happily(?) i

Re: [slurm-users] Providing users with info on wait time vs. run time

2022-09-16 Thread Sebastian Potthoff
Hi Hermann,
>> So you both are happily(?) ignoring this warning in the "Prolog and Epilog Guide", right? :-)
>>
>> "Prolog and Epilog scripts [...] should not call Slurm commands (e.g. squeue, scontrol, sacctmgr, etc)."
>
> We have probably been doing this since before the warning was add

Re: [slurm-users] Providing users with info on wait time vs. run time

2022-09-16 Thread Loris Bennett
Hi Hermann,
Hermann Schwärzler writes:
> Hi Loris,
> hi Sebastian,
>
> thanks for the information on how you are doing this.
> So you both are happily(?) ignoring this warning in the "Prolog and Epilog Guide", right? :-)
>
> "Prolog and Epilog scripts [...] should not call Slurm commands (e.g.

Re: [slurm-users] Providing users with info on wait time vs. run time

2022-09-16 Thread Hermann Schwärzler
Hi Loris, hi Sebastian,
thanks for the information on how you are doing this. So you both are happily(?) ignoring this warning in the "Prolog and Epilog Guide", right? :-)
"Prolog and Epilog scripts [...] should not call Slurm commands (e.g. squeue, scontrol, sacctmgr, etc)."
May I ask how big

Re: [slurm-users] Providing users with info on wait time vs. run time

2022-09-16 Thread Loris Bennett
Hi Sebastian,
Sebastian Potthoff writes:
> Hi Loris
>
> We do something similar. At the end of our script pointed to by EpilogSlurmctld we have
>
> Using EpilogSlurmctld only works if the slurmctld user is root (or slurm with root privileges), right? I opted for the normal Epilog since w

Re: [slurm-users] Providing users with info on wait time vs. run time

2022-09-16 Thread Sebastian Potthoff
Hi Loris
> We do something similar. At the end of our script pointed to by EpilogSlurmctld we have
Using EpilogSlurmctld only works if the slurmctld user is root (or slurm with root privileges), right? I opted for the normal Epilog since we wanted to avoid running slurm as root and I don’t h

Re: [slurm-users] Providing users with info on wait time vs. run time

2022-09-16 Thread Loris Bennett
Hi Hermann,
Sebastian Potthoff writes:
> Hi Hermann,
>
> I happened to read along this conversation and was just solving this issue today. I added this part to the epilog script to make it work:
>
> # Add job report to stdout
> StdOut=$(/usr/bin/scontrol show job=$SLURM_JOB_ID | /usr/bin/grep

Re: [slurm-users] Providing users with info on wait time vs. run time

2022-09-15 Thread Ole Holm Nielsen
Hi Hermann,
On 9/15/22 18:07, Hermann Schwärzler wrote:
Use the "smail" tool from the slurm-contribs RPM and set this in slurm.conf:
MailProg=/usr/bin/smail
Maybe I am missing something but from what I can tell smail sends an email and does *not* change or append to the .out file of a job..

Re: [slurm-users] Providing users with info on wait time vs. run time

2022-09-15 Thread Sebastian Potthoff
Hi Hermann,
I happened to read along this conversation and was just solving this issue today. I added this part to the epilog script to make it work:
# Add job report to stdout
StdOut=$(/usr/bin/scontrol show job=$SLURM_JOB_ID | /usr/bin/grep StdOut | /usr/bin/xargs | /usr/bin/awk 'BEGIN { FS
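For readers following along: the snippet above is cut off in the archive, so here is a minimal sketch of the same general idea (look up the job's StdOut path with scontrol, then append a seff report to that file). It is not Sebastian's actual script. The helper name extract_stdout_path, the variable out_file and the banner text are made up for illustration, and the sketch assumes that scontrol and seff are on the PATH of the epilog and that the epilog user may write to the job's output file. Note the warning discussed elsewhere in this thread: the "Prolog and Epilog Guide" advises against calling Slurm commands from these scripts.

```shell
#!/bin/bash
# Hypothetical epilog fragment: append a seff report to a job's stdout file.
# All names below are illustrative assumptions, not taken from the thread.

# Read `scontrol show job` output on stdin and print the value of the
# StdOut= field (the path of the job's output file), if present.
extract_stdout_path() {
    awk '/^[ \t]*StdOut=/ { sub(/^[ \t]*StdOut=/, ""); print; exit }'
}

# Only act when run by Slurm and when both tools are available;
# otherwise this fragment does nothing.
if [ -n "${SLURM_JOB_ID:-}" ] \
        && command -v scontrol >/dev/null 2>&1 \
        && command -v seff >/dev/null 2>&1; then
    out_file=$(scontrol show job "$SLURM_JOB_ID" | extract_stdout_path)
    # Append only if the output file exists on this node.
    if [ -n "$out_file" ] && [ -f "$out_file" ]; then
        {
            echo "---- job efficiency report ----"
            seff "$SLURM_JOB_ID"
        } >> "$out_file"
    fi
fi
```

Keeping the parsing in a small function means it can be tried by hand on saved scontrol output, without a running cluster.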

Re: [slurm-users] Providing users with info on wait time vs. run time

2022-09-15 Thread Hermann Schwärzler
Hi Ole,
On 9/15/22 5:21 PM, Ole Holm Nielsen wrote:
On 15-09-2022 16:08, Hermann Schwärzler wrote:
Just out of curiosity: how do you insert the output of seff into the out-file of a job?
Use the "smail" tool from the slurm-contribs RPM and set this in slurm.conf:
MailProg=/usr/bin/smail

Re: [slurm-users] Providing users with info on wait time vs. run time

2022-09-15 Thread Ole Holm Nielsen
On 15-09-2022 16:08, Hermann Schwärzler wrote:
Just out of curiosity: how do you insert the output of seff into the out-file of a job?
Use the "smail" tool from the slurm-contribs RPM and set this in slurm.conf:
MailProg=/usr/bin/smail
/Ole

Re: [slurm-users] Providing users with info on wait time vs. run time

2022-09-15 Thread Hermann Schwärzler
Hi Loris, we try to achieve the same (I guess) - which is nudging the users in the direction of using scarce resources carefully - by using goslmailer (https://github.com/CLIP-HPC/goslmailer) and a (not yet published - see https://github.com/CLIP-HPC/goslmailer/issues/20) custom connector to

Re: [slurm-users] Providing users with info on wait time vs. run time

2022-09-15 Thread Ole Holm Nielsen
On 9/15/22 12:02, Loris Bennett wrote:
Today I spotted a job which requested an entire node, then had to wait for around 16 hours and finally ran, apparently successfully, for less than 4 minutes. As it currently seems in general fashionable for users round here to request the maximum number of

[slurm-users] Providing users with info on wait time vs. run time

2022-09-15 Thread Loris Bennett
Hi,
Today I spotted a job which requested an entire node, then had to wait for around 16 hours and finally ran, apparently successfully, for less than 4 minutes. As it currently seems in general fashionable for users round here to request the maximum number of cores available on a node without d
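As a rough illustration of the comparison this thread is about, the sketch below computes a job's wait time (submit to start) and run time (start to end) in seconds. Nothing here comes from the thread itself: the function names wait_and_run and job_wait_and_run are hypothetical, GNU date is assumed for the -d option, and the timestamps are treated as UTC purely to get a stable difference, so results can be off by an hour for jobs that span a daylight-saving change.

```shell
#!/bin/bash
# Hypothetical helpers for comparing wait time with run time.
# Assumes GNU date (for -d) and, in job_wait_and_run, a working sacct.

# Print "<wait_seconds> <run_seconds>" for the given submit, start and
# end timestamps (format as printed by sacct, e.g. 2022-09-15T10:00:00).
wait_and_run() {
    local submit start end
    submit=$(date -u -d "$1" +%s) || return 1
    start=$(date -u -d "$2" +%s) || return 1
    end=$(date -u -d "$3" +%s) || return 1
    echo "$((start - submit)) $((end - start))"
}

# Look up the three timestamps of a finished job via sacct and pass
# them to wait_and_run. Fails if a timestamp is not a date
# (for example "Unknown" for a job that has not started yet).
job_wait_and_run() {
    local line submit rest start end
    # -X: allocation only, -n: no header, -P: fields separated by "|"
    line=$(sacct -X -n -P -j "$1" --format=Submit,Start,End) || return 1
    submit=${line%%|*}
    rest=${line#*|}
    start=${rest%%|*}
    end=${rest#*|}
    wait_and_run "$submit" "$start" "$end"
}
```

With made-up timestamps matching the case described above (a 16-hour wait followed by a 4-minute run), `wait_and_run 2022-09-14T18:00:00 2022-09-15T10:00:00 2022-09-15T10:04:00` prints `57600 240`.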