Hi Jason,

What happens when you run that loginctl command by hand on the node for one
of the affected users? Is its exit status 0?

For example, on my servers, where lingering is masked, I get:

[root@thespian-gpgpu001 ~]# loginctl enable-linger scrosby
Could not enable linger: Unit is masked.
[root@thespian-gpgpu001 ~]# echo $?
1
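
If the command does fail for those users, a debugging variant of the prolog
along these lines might help capture why. This is only a rough sketch: the
log path, the LINGER_CMD override and the enable_linger function name are my
own assumptions for illustration, not anything Slurm or systemd provides.

```shell
#!/bin/bash
# Hypothetical debugging prolog: record what loginctl says and returns.
# LINGER_CMD and LOG_FILE are illustrative names, overridable for testing.
LINGER_CMD=${LINGER_CMD:-loginctl enable-linger}
LOG_FILE=${LOG_FILE:-/tmp/prolog-linger.log}

enable_linger() {
    local user=$1 output status
    # LINGER_CMD is deliberately unquoted so it splits into command + arguments.
    if output=$($LINGER_CMD "$user" 2>&1); then
        status=0
    else
        status=$?
    fi
    # Append one line per job so failures can be matched to users afterwards.
    printf '%s user=%s status=%d output=%s\n' \
        "$(date '+%FT%T')" "$user" "$status" "$output" >> "$LOG_FILE"
    return "$status"
}

# Only act when Slurm has set the job user; a non-zero status still fails the prolog.
if [ -n "${SLURM_JOB_USER:-}" ]; then
    enable_linger "$SLURM_JOB_USER"
fi
```

The log file should then show the exact error text and exit status for the
two users whose jobs are failing.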

Sean

--
Sean Crosby | Senior DevOpsHPC Engineer and HPC Team Lead
Research Computing Services | Business Services
The University of Melbourne, Victoria 3010 Australia



On Wed, 8 Jul 2020 at 01:14, Jason Simms <sim...@lafayette.edu> wrote:

> Hello all,
>
> Two users on my system experience job failures every time they submit a
> job via sbatch. When I run their exact submission script, or when I create
> a local system user and launch from there, the jobs run fine. Here is an
> example of what I see in the slurmd log:
>
> [2020-07-06T15:02:41.284] task_p_slurmd_batch_request: 1421
> [2020-07-06T15:02:41.284] task/affinity: job 1421 CPU input mask for node: 0x00000F0000
> [2020-07-06T15:02:41.284] task/affinity: job 1421 CPU final HW mask for node: 0x00000F0000
> [2020-07-06T15:02:41.295] _run_prolog: prolog with lock for job 1421 ran for 0 seconds
> [2020-07-06T15:02:41.295] error: [job 1421] prolog failed status=1:0
> [2020-07-06T15:02:41.295] Job 1421 already killed, do not launch batch job
>
> The prolog file is simply:
>
> #!/bin/bash
> loginctl enable-linger $SLURM_JOB_USER
>
> Certain users seem to hit this every time, but I can't figure out why.
> Their accounts are no different from anyone else's (not in a different
> group, etc.), so I don't think permissions are an issue.
>
> Anyway, the job failure immediately puts the node into a DRAINED/DRAINING
> state (which is expected). But for now, these users cannot submit any jobs
> at all.
>
> Any insights would be welcomed!
>
> Warmest regards,
> Jason
>
> --
> *Jason L. Simms, Ph.D., M.P.H.*
> Manager of Research and High-Performance Computing
> XSEDE Campus Champion
> Lafayette College
> Information Technology Services
> 710 Sullivan Rd | Easton, PA 18042
> Office: 112 Skillman Library
> p: (610) 330-5632
>
