Hi Mark, Chris,
On Mon, Jul 15, 2019 at 01:23:20PM -0400, Mark Hahn wrote:
> > Could it be a RHEL7 specific issue?
>
> no - centos7 systems here, and pam_slurm_adopt works.
Can you show what your /etc/pam.d/sshd looks like?
Kind regards,
-- Andy
Our site has been going through the process of upgrading Slurm on our primary
cluster, which was delivered to us with Slurm 16.05 by Bright Computing.
We're currently at 17.02.13-2 and working to get to 17.11 and then 18.08.
We've run into an issue with 17.11 and switching effective GID on a
Is it possible to set a cluster level limit of GPUs per user? We'd like
to implement a limit of how many GPUs a user may use across multiple
partitions at one time.
I tried this, but it obviously isn't correct:
# sacctmgr modify cluster slurm_cluster set MaxTRESPerUser=gres/gpu=2
Unknown o
On 7/17/19 12:26 AM, Chris Samuel wrote:
On 16/7/19 11:43 am, Will Dennis wrote:
[2019-07-16T09:36:51.464] error: slurmdbd: agent queue is full (20140),
discarding DBD_STEP_START:1442 request
So it looks like your slurmdbd cannot keep up with the rate of these incoming
steps and is having to discard requests.
Unfortunately, I think you're stuck with setting it at the account level with
sacctmgr. You could also set that limit as part of a QOS and then attach
the QOS to the partition. But I think that's as granular as you can get for
limiting TRES.
HTH!
David
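David's QOS suggestion could be sketched roughly as below; the QOS name `gpu2` and partition name `gpu` are made-up examples, so check the exact flags against your sacctmgr version before running anything:

```shell
# Create a QOS that caps each user at 2 GPUs within jobs using that QOS
# ("gpu2" is an example name)
sacctmgr add qos gpu2 set MaxTRESPerUser=gres/gpu=2

# Attach the QOS to the relevant partition(s) in slurm.conf, e.g.:
#   PartitionName=gpu Nodes=gpu[01-04] QOS=gpu2 ...
# then reload the config:
scontrol reconfigure
```

As David notes, this is per-partition-attached QOS rather than a true cluster-wide per-user cap, which seems to be as granular as the limits go.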
On Wed, Jul 17, 2019 at 10:11 AM Mike Harvey wrote:
I don't think the server (which runs both the Slurm controller daemon and
the DB) is the issue... It's a Dell PowerEdge R430 platform, with dual Intel
Xeon E5-2640v3 CPUs and 256GB memory, and a RAID-1 array of 1TB SATA disks.
top - 09:29:26 up 101 days, 14:57, 3 users, load average: 0.06,
OK, as it turns out, it was a problem like this bug:
https://bugs.schedmd.com/show_bug.cgi?id=3819 (cf.
https://bugs.schedmd.com/show_bug.cgi?id=2741 as well)
Back in May, I posted the following thread:
https://lists.schedmd.com/pipermail/slurm-users/2019-May/003372.html - to which
I never got a reply.
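For anyone hitting the same "agent queue is full" symptom: the bugs linked above point at MariaDB/MySQL tuning for the slurmdbd database. The Slurm accounting documentation suggests InnoDB settings along these lines (the sizes here are examples; adjust to the DB host's memory):

```ini
# e.g. /etc/my.cnf.d/innodb.cnf (path varies by distro)
[mysqld]
innodb_buffer_pool_size=1024M
innodb_log_file_size=64M
innodb_lock_wait_timeout=900
```

Per those docs, after changing innodb_log_file_size you need to shut mysqld down cleanly and remove the old ib_logfile* files before restarting, or mysqld will refuse to start.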
Hi Andy,
We have RHEL7, and pam_slurm_adopt is working for us as well, with memory
constraints working.
pam.d/sshd:
#%PAM-1.0
auth required pam_sepermit.so
auth substack password-auth
auth include postlogin
# Used with polkit to reauthorize users in remote sessions
-
On 7/17/19 4:05 AM, Andy Georges wrote:
Can you show what your /etc/pam.d/sshd looks like?
For us it's actually here:
---
# cat /etc/pam.d/common-account
#%PAM-1.0
#
# This file is autogenerated by pam-config. All changes
# will be overwritten when pam-config is called.
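For reference, adopting pam_slurm_adopt typically means appending an account entry after the stacks shown above; a minimal sketch following the pam_slurm_adopt documentation (exact placement, file, and module options vary by distro and site policy):

```
# end of the account stack in /etc/pam.d/sshd (or common-account)
account    sufficient   pam_slurm_adopt.so
```

The module then adopts the incoming ssh process into the user's job cgroup, which is what makes the memory constraints mentioned earlier apply to ssh sessions.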