On Fri, Sep 16, 2022 at 3:43 PM Sebastian Potthoff <s.potth...@uni-muenster.de> wrote:
> Hi Hermann,
>
> So you both are happily(?) ignoring this warning in the "Prolog and Epilog
> Guide", right? :-)
>
> "Prolog and Epilog scripts [...] should not call Slurm commands (e.g.
> squeue, scontrol, sacctmgr, etc)."
We also call scontrol in our scripts (as little as we can manage) and we
run at the scale of 1500 nodes. It hasn't really caused many issues,
but we try to limit it as much as we possibly can.
-Paul Edmon-
On 9/16/22 9:41 AM, Sebastian Potthoff wrote:
Hi Hermann,
So you both are happily(?) ignoring this warning in the "Prolog and Epilog Guide", right? :-)
Hi Hermann,
>> So you both are happily(?) ignoring this warning in the "Prolog and Epilog
>> Guide", right? :-)
>>
>> "Prolog and Epilog scripts [...] should not call Slurm commands (e.g.
>> squeue, scontrol, sacctmgr, etc)."
>
> We have probably been doing this since before the warning was added
Hi Hermann,
Hermann Schwärzler writes:
> Hi Loris,
> hi Sebastian,
>
> thanks for the information on how you are doing this.
> So you both are happily(?) ignoring this warning in the "Prolog and Epilog
> Guide", right? :-)
>
> "Prolog and Epilog scripts [...] should not call Slurm commands (e.g.
> squeue, scontrol, sacctmgr, etc)."
Hi Loris,
hi Sebastian,
thanks for the information on how you are doing this.
So you both are happily(?) ignoring this warning in the "Prolog and Epilog
Guide", right? :-)
"Prolog and Epilog scripts [...] should not call Slurm commands (e.g.
squeue, scontrol, sacctmgr, etc)."
May I ask how big
Hi Sebastian,
Sebastian Potthoff writes:
> Hi Loris
>
> We do something similar. At the end of our script pointed to by
> EpilogSlurmctld we have
>
> Using EpilogSlurmctld only works if the slurmctld user is root (or slurm with
> root privileges), right? I opted for the normal Epilog since w
Hi Loris
> We do something similar. At the end of our script pointed to by
> EpilogSlurmctld we have
Using EpilogSlurmctld only works if the slurmctld user is root (or slurm with
root privileges), right? I opted for the normal Epilog since we wanted to avoid
running slurm as root and I don’t h
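For reference, the two hooks Sebastian contrasts are set in slurm.conf roughly as below (the script paths are purely illustrative, not anyone's actual setup). Epilog is executed by slurmd on the compute node as root, while EpilogSlurmctld is executed on the controller host as the SlurmUser, which is why a non-root SlurmUser may not be allowed to write into the user's output file:

# slurm.conf (sketch, illustrative paths)
Epilog=/etc/slurm/epilog.sh                    # runs on the compute node, executed by slurmd as root
EpilogSlurmctld=/etc/slurm/epilog_slurmctld.sh # runs on the slurmctld host as SlurmUser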
Hi Hermann,
Sebastian Potthoff writes:
> Hi Hermann,
>
> I happened to be following this conversation and was just solving this
> issue today. I added this part to the epilog script to make it work:
>
> # Add job report to stdout
> StdOut=$(/usr/bin/scontrol show job=$SLURM_JOB_ID | /usr/bin/grep
Hi Hermann,
On 9/15/22 18:07, Hermann Schwärzler wrote:
Use the "smail" tool from the slurm-contribs RPM and set this in
slurm.conf:
MailProg=/usr/bin/smail
Maybe I am missing something, but from what I can tell smail sends an email
and does *not* change or append to the .out file of a job.
Hi Hermann,
I happened to be following this conversation and was just solving this issue
today. I added this part to the epilog script to make it work:
# Add job report to stdout
StdOut=$(/usr/bin/scontrol show job=$SLURM_JOB_ID | /usr/bin/grep StdOut |
/usr/bin/xargs | /usr/bin/awk 'BEGIN { FS
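Sebastian's snippet is truncated above. A minimal sketch of a node Epilog along those lines might look like the following; the awk field separator, the seff call and the guard conditions are assumptions filled in here, not necessarily his exact script:

#!/bin/bash
# Sketch of a node Epilog that appends a seff report to the job's stdout file.
# slurmd runs this as root on the compute node after the job finishes.

# Single scontrol call to find out where the job's stdout file lives.
StdOut=$(/usr/bin/scontrol show job="$SLURM_JOB_ID" | /usr/bin/grep StdOut |
         /usr/bin/xargs | /usr/bin/awk 'BEGIN { FS="=" } { print $2 }')

# Only batch jobs have a stdout file; skip interactive jobs and missing files.
if [ -n "$StdOut" ] && [ -f "$StdOut" ]; then
    {
        echo ""
        echo "---------- job efficiency report (seff) ----------"
        /usr/bin/seff "$SLURM_JOB_ID"
    } >> "$StdOut"
fi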
Hi Ole,
On 9/15/22 5:21 PM, Ole Holm Nielsen wrote:
On 15-09-2022 16:08, Hermann Schwärzler wrote:
Just out of curiosity: how do you insert the output of seff into the
out-file of a job?
Use the "smail" tool from the slurm-contribs RPM and set this in
slurm.conf:
MailProg=/usr/bin/smail
On 15-09-2022 16:08, Hermann Schwärzler wrote:
Just out of curiosity: how do you insert the output of seff into the
out-file of a job?
Use the "smail" tool from the slurm-contribs RPM and set this in slurm.conf:
MailProg=/usr/bin/smail
/Ole
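For completeness: with smail in place the seff report arrives by e-mail rather than in the .out file (which is the point raised above). A generic example of the two pieces involved, with an illustrative mail address:

# slurm.conf on the controller
MailProg=/usr/bin/smail

# in the user's job script
#SBATCH --mail-type=END
#SBATCH --mail-user=jane.doe@example.com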
Hi Loris,
we try to achieve the same (I guess) - which is nudging the users in the
direction of using scarce resources carefully - by using goslmailer
(https://github.com/CLIP-HPC/goslmailer) and a (not yet published - see
https://github.com/CLIP-HPC/goslmailer/issues/20) custom connector to
On 9/15/22 12:02, Loris Bennett wrote:
Today I spotted a job which requested an entire node, then had to wait
for around 16 hours and finally ran, apparently successfully, for less
than 4 minutes.
As it currently seems to be generally fashionable for users round here to
request the maximum number of
Hi,
Today I spotted a job which requested an entire node, then had to wait
for around 16 hours and finally ran, apparently successfully, for less
than 4 minutes.
As it currently seems to be generally fashionable for users round here to
request the maximum number of cores available on a node without d