Re: [slurm-users] SLURM upgrade from 20.11.3 to 20.11.9 misidentification of job steps

2022-05-19 Thread John DeSantis
From: slurm-users On Behalf Of John DeSantis
Sent: 18 May 2022 15:39
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] SLURM upgrade from 20.11.3 to 20.11.9 misidentification of job steps

Hello, It also appears that random jobs are being identified as using too much memory, d

Re: [slurm-users] SLURM upgrade from 20.11.3 to 20.11.9 misidentification of job steps

2022-05-19 Thread Luke Sudbery
work on Monday.

-----Original Message-----
From: slurm-users On Behalf Of John DeSantis
Sent: 18 May 2022 15:39
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] SLURM upgrade from 20.11.3 to 20.11.9 misidentification of job steps

Hello, It also appears that random jobs are being identified as

Re: [slurm-users] SLURM upgrade from 20.11.3 to 20.11.9 misidentification of job steps

2022-05-18 Thread John DeSantis
Hello, It also appears that random jobs are being identified as using too much memory, despite being well within limits. For example, a running job requested 2048 MB per CPU, and all of its processes are within that limit. However, the job is flagged as over the limit even though it isn't. Please
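
To make the "within limits" check above concrete, here is a minimal sketch of the arithmetic, not Slurm's actual accounting logic. It assumes the job limit is the per-CPU request multiplied by the number of allocated CPUs, and that usage is the sum of resident memory across the job's processes. The function names, PIDs, CPU count and RSS figures are hypothetical; only the 2048 MB per CPU figure comes from the message.

```python
def job_memory_limit_mb(mem_per_cpu_mb: int, cpus: int) -> int:
    """Total job limit, assuming limit = per-CPU request x allocated CPUs."""
    return mem_per_cpu_mb * cpus


def is_over_limit(rss_mb_by_pid: dict, mem_per_cpu_mb: int, cpus: int) -> bool:
    """True if the summed RSS of all processes exceeds the assumed job limit."""
    total_rss_mb = sum(rss_mb_by_pid.values())
    return total_rss_mb > job_memory_limit_mb(mem_per_cpu_mb, cpus)


if __name__ == "__main__":
    # Hypothetical job: 4 CPUs at 2048 MB per CPU, two processes.
    usage = {101: 1500, 102: 1800}  # PID -> RSS in MB (made-up values)
    print(job_memory_limit_mb(2048, 4))
    print(is_over_limit(usage, 2048, 4))
```

If a job's summed usage sits below this figure and it is still reported as over limit, that is consistent with the misidentification described in this thread; how Slurm actually gathers and compares usage depends on the site's accounting and task plugin configuration.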

[slurm-users] SLURM upgrade from 20.11.3 to 20.11.9 misidentification of job steps

2022-05-18 Thread John DeSantis
Hello, Due to the recent CVE posted by Tim, we upgraded from SLURM 20.11.3 to 20.11.9. Today, I received a ticket from a user whose output files are populated with the "slurmstepd: error: Exceeded job memory limit" message. However, the jobs are still running and it seems that the controller