Hello,
It also appears that random jobs are being identified as using too much memory,
despite being well within limits.
For example, a job is currently running that requested 2048 MB per CPU, and all
of its processes are within that limit. Yet the job is still flagged as being
over the limit when it isn't. Please
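In case it helps to reproduce the check, what the job actually requested can be
confirmed with something like the following (12345 is a placeholder job ID):

  scontrol show job 12345 | grep -i mem

which should show the 2048 MB per-CPU request the job was submitted with.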
Hello,
Due to the recent CVE posted by Tim, we upgraded from SLURM 20.11.3 to
20.11.9.
Today I received a ticket from a user whose output files are filled with the
"slurmstepd: error: Exceeded job memory limit" message. However, the jobs are
still running, and it seems that the controller
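To compare the limit against what the job steps are actually using, something
along these lines should work (again a placeholder job ID; add .batch to look
at the batch step):

  sstat -j 12345 --format=JobID,MaxRSS,MaxVMSize

MaxRSS and MaxVMSize are the peak values the accounting plugin has sampled for
each step, which makes it easy to see whether anything really went past the
requested memory.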
Ghui,
It seems that things are working as they should.
You are allowing an account to become root inside the pod, and the pod is
considered a trusted environment by slurm (you are running munge inside it).
So as far as slurm is concerned, 'root' from a trusted environment is
submitting a job.
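If you want to confirm that from outside the pod, the owner slurm recorded for
the job can be checked with, e.g. (placeholder job ID):

  scontrol show job 12345 | grep UserId

which in this case should show root as the submitting user.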
Hi Hermann,
You're welcome; I'm looking forward to hearing some feedback from you.
Regarding the matrix integration, or any other for that matter, the gosl code
was written with extensibility in mind, meaning all the helper code required
to create a new connector is packaged and easily reusable.
If you
> I had config the right slurm and munge inside the container.
This is the reason.
Whoever has access to munge.key can effectively become root on the slurm cluster.
You should not disclose munge.key to containers.
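For reference, on the nodes the key should stay readable only by the munge
user, and it should never be baked into container images or bind-mounted into
pods you don't fully trust. A minimal sketch, assuming the default path:

  chown munge:munge /etc/munge/munge.key
  chmod 400 /etc/munge/munge.key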
cheers
josef
Hi,
On 18.05.22 08:25, Stephan Roth wrote:
> Personal note: I'm not sure what I'd choose as a successor to
> Singularity 3.8, yet. Thoughts are welcome.
I can recommend NVIDIA enroot/pyxis.
enroot does unprivileged sandboxes/containers; pyxis is the slurm SPANK
glue.
https://slurm.schedmd.com/
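In case an example helps: with pyxis loaded as a SPANK plugin, srun gains
container options, so something like the following (image name purely
illustrative) runs a command inside an unprivileged enroot container:

  srun --container-image=ubuntu:22.04 grep PRETTY_NAME /etc/os-release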