Re: [slurm-users] Extract job information after completion

2021-04-27 Thread O'Grady, Paul Christopher
On Apr 27, 2021, at 10:44 AM, slurm-users-requ...@lists.schedmd.com wrote: In slurm.conf set: EpilogSlurmctld=/etc/slurm/slurm.epilogslurmctld Which does a number of things, including the following: root@pople01:/etc/slurm # tail -6 slurm.epilog

[slurm-users] Extract job information after completion

2021-04-27 Thread O'Grady, Paul Christopher
Sometimes when a Slurm job fails I want to see what a user did, getting the command/workdir/stdout/stderr information. I can see that with "scontrol show job <jobid>". However, after the job is done that command doesn't seem to work anymore, saying "invalid job id". I try to use sacct, which seems t

Re: [slurm-users] [EXTERNAL] --no-alloc breaks mpi?

2021-03-08 Thread O'Grady, Paul Christopher
On Mar 8, 2021, at 1:35 PM, slurm-users-requ...@lists.schedmd.com wrote: What's happening is that there's no SLURM_JOBID (my speculation since I don't have perms to use --no-alloc) is set, but SLURM_NODELIST may be set, so it's confusing ORTE. Coul

[slurm-users] --no-alloc breaks mpi?

2021-03-08 Thread O'Grady, Paul Christopher
Hi, I’m having an issue with srun's --no-alloc flag with mpi which I can reproduce with a fairly simple example. When I run a simple one-core mpi test program as “slurmUser” (the account that has the --no-alloc privilege) it succeeds: srun -p psfehq -n 1 -o logs/test.log -w psana1507 python ~