Sorry, William, for taking so long to reply (almost exactly a year!).  Your note 
ended up in my spam folder, and I lost access to that cluster, so it became 
less of a concern.

I recently got access to another system and hit the same issue, even with a 
local epilog containing just /bin/true.  This time I found a big clue in the 
slurmd.log on one of the nodes:

[2023-03-24T18:43:11.525] debug:  Finished wait for job 134161's prolog to 
complete
[2023-03-24T18:43:56.573] Warning: Note very large processing time from 
slurm_getpwuid_r: usec=45048016 began=18:43:11.525
[2023-03-24T18:43:56.573] debug:  [job 134161] attempting to run epilog 
[/tmp/epilog.sh]
[2023-03-24T18:43:56.581] Warning: Note very large processing time from 
prep_g_epilog: usec=45055597 began=18:43:11.525
[2023-03-24T18:43:56.581] epilog for job 134161 ran for 45 seconds

Note that almost the entire time is spent in that slurm_getpwuid_r call.  Both the 
last cluster and this one use a single NIS server to serve the user accounts.  
The resolution for my system was to make the account info local to each node.  
‘Real’ systems will probably want to spread the load across multiple NIS 
servers instead, but I’m fine with local account information on my system.
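
If anyone else wants to check whether they are in the same boat, something like 
the following on a compute node should show it (the username is just a 
placeholder, and your nsswitch ordering may differ):

# See which sources are consulted for passwd lookups (e.g. "passwd: files nis")
grep '^passwd' /etc/nsswitch.conf

# Time a lookup for the job's invoking user; against a single busy NIS server
# this can take tens of seconds, matching the slurm_getpwuid_r warning above
time getent passwd someuser

# After adding the account to the node-local /etc/passwd, the same lookup
# should return essentially instantly
time getent passwd someuser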

Can anyone shed some light on why slurm looks up the passwd entry for the 
invoking user if the system epilog is going to be run as root anyway?  Maybe 
that lookup is there for the case where the user has their own epilog?

Thanks,

Brent

PS: Kudos to whoever put in the wrapper that checks the duration of the 
slurm_getpwuid_r call!


From: slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] On Behalf Of 
William Brown
Sent: Friday, April 1, 2022 12:33 PM
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: Re: [slurm-users] nodes lingering in completion

To process the epilog, a Bash process must be created, so perhaps look at .bashrc.

Try timing the epilog yourself on a compute node.  I presume it is owned by an 
account local to the compute nodes, not a directory service account?
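
Something like this on one of the affected compute nodes, for example (I am 
guessing at the /tmp/epilog.sh path from your description, and someuser stands 
in for whichever account submitted the job):

# Time the epilog by hand, once as root and once as the job's user, to see
# whether the delay is in the script itself or in the account lookup
time bash /tmp/epilog.sh
sudo -u someuser bash -c 'time bash /tmp/epilog.sh'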

William

On Fri, 1 Apr 2022, 17:25 Henderson, Brent <brent.hender...@hpe.com> wrote:
Hi slurm experts -

I’ve gotten temporary access to a cluster with 1k nodes - so of course I set up 
slurm on it (v20.11.8).  ☺  Small jobs are fine and the nodes go back to idle 
rather quickly.  For jobs that use all the nodes, some nodes ‘linger’ in the 
completing state for over a minute while others take less time - but the delay 
is still noticeable.

Reading some older posts, I see that the epilog is a typical cause for this, so 
I removed it from the config file and indeed, the nodes go back to the idle 
state very quickly after the job completes.  I then created an epilog on each 
node in /tmp that contained just the bash shebang line and exit 0, and changed 
my run to just: ‘salloc -N 1024 sleep 10’.  Even with this very simple command 
and epilog, the nodes exhibit the ‘lingering’ behavior before returning to idle.
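
For reference, the test epilog was nothing more than the shebang and an exit - 
a sketch of what went into /tmp on each node (the exact path just has to match 
the Epilog setting in slurm.conf):

#!/bin/bash
# Minimal test epilog: do no work at all, just report success
exit 0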

Looking in the slurmd log for one of the nodes that took >60s to go back to 
idle, I see this:

[2022-03-31T20:57:44.158] Warning: Note very large processing time from 
prep_epilog: usec=75087286 began=20:56:29.070
[2022-03-31T20:57:44.158] epilog for job 43226 ran for 75 seconds

I tried upping the debug level on the slurmd side but didn’t see anything 
useful.
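
By ‘upping the debug level’ I mean something along these lines - raising 
SlurmdDebug in slurm.conf and having the daemons re-read the config (debug3 is 
just an example level):

# slurm.conf on the compute nodes
SlurmdDebug=debug3

# then have the daemons pick up the change
scontrol reconfigure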

So, I guess I have a couple of questions:
- Has anyone seen this behavior before and know of a fix?  :)
- Might this issue be resolved in 21.08?  (I didn’t see anything in the release 
notes that mentioned the epilog.)
- Any thoughts on how to collect some additional information on what might be 
happening on the system to slow down the epilog?

Thanks,

Brent
