We had this issue recently. Some googling led me to the NERSC FAQs,
which state:
> _is_a_lwp is a function called internally for Slurm job accounting. The
> message indicates a rare error situation with a function call. But the error
> shouldn't affect anything in the user job. Please ignore t
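For what it's worth, here is a rough sketch of how a check like this can hit "No such file or directory". This is not Slurm's code: the function name `is_lightweight_process` and the `proc_root` parameter are made up for illustration, and I am assuming the check works by comparing the `Tgid` field in `/proc/<pid>/status` against the pid. If the process exits between being listed and being inspected, the open simply fails.

```python
import os


def is_lightweight_process(pid, proc_root="/proc"):
    """Hypothetical helper, not Slurm's implementation.

    Returns True if pid looks like a thread (Tgid differs from pid),
    False if it is a thread-group leader, and None if the status file
    is missing or has no Tgid line.
    """
    path = os.path.join(proc_root, str(pid), "status")
    try:
        with open(path) as fh:
            for line in fh:
                if line.startswith("Tgid:"):
                    # Line looks like "Tgid:\t12345"
                    return int(line.split()[1]) != pid
    except (FileNotFoundError, ProcessLookupError):
        # The process went away before or while we were reading it.
        return None
    return None
```

Calling it with a pid that has already exited returns `None` instead of raising, which matches the idea that the message by itself is harmless for the job.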
We have a user who keeps encountering this error with one type of job.
Sometimes her jobs are cancelled, and other times they run fine.
slurmstepd: error: _is_a_lwp: open() /proc/195420/status failed: No such file or directory
slurmstepd: error: *** JOB 17534 ON pe2dc5-0007 CANCELLED AT
2