Re: [slurm-users] [External] Re: srun : Communication connection failure

2022-01-25 Thread Durai Arasan
Hello Mike, Doug: The issue was resolved somehow. My colleagues say the addresses in slurm.conf on the login nodes were incorrect. It could also have been a temporary network issue. Best, Durai Arasan MPI Tübingen On Fri, Jan 21, 2022 at 2:15 PM Doug Meyer wrote: > Hi, > Did you recently add n

[slurm-users] Fwd: useradd: group 'slurm' does not exist

2022-01-25 Thread Nousheen
Hello everyone, I am struggling with the installation of Slurm on CentOS 7. While following this tutorial https://www.slothparadise.com/how-to-install-slurm-on-centos-7-cluster/ , after the installation of MariaDB, I try to create users for slurm and munge, but following the same sequence of comman

Re: [slurm-users] Fwd: useradd: group 'slurm' does not exist

2022-01-25 Thread Jeffrey R. Lang
Looking at what you provided in your email, the groupadd commands are failing because the requested GIDs 991 and 992 are already assigned on the system you're installing on. Check the /etc/group file and find two GID numbers lower than 991 that are unused and use those instead. Keep them in the
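As an illustration of the check described above, here is a minimal Python sketch that scans the contents of a group file and returns unused GIDs below 991. The function name `unused_gids`, the sample group entries, and the lower bound of 201 (assumed here to be the start of the system GID range) are hypothetical and not taken from the thread.

```python
def unused_gids(group_text, ceiling=991, count=2, floor=201):
    """Return up to `count` unused GIDs below `ceiling`, highest first.

    `floor` is an assumed lower bound for system GIDs; adjust it to
    match SYS_GID_MIN in /etc/login.defs on your system.
    """
    used = set()
    for line in group_text.splitlines():
        # Group file format: name:password:GID:member_list
        fields = line.split(":")
        if len(fields) >= 3 and fields[2].isdigit():
            used.add(int(fields[2]))

    free = []
    for gid in range(ceiling - 1, floor - 1, -1):
        if gid not in used:
            free.append(gid)
            if len(free) == count:
                break
    return free


# Hypothetical group file contents, for illustration only.
sample = (
    "root:x:0:\n"
    "polkitd:x:992:\n"
    "chrony:x:991:\n"
    "ssh_keys:x:990:\n"
    "tss:x:989:\n"
)
candidates = unused_gids(sample)
print(candidates)
```

On a real system, the text passed in would be the contents of /etc/group, and the two resulting numbers could then be given to `groupadd -g`. The same GIDs should be used on every node of the cluster.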

Re: [slurm-users] how to allocate high priority to low cpu and memory jobs

2022-01-25 Thread Renfro, Michael
Since there are only 9 factors to assign priority weights to, one way around this might be to set up separate partitions for high memory and low memory jobs (with a max memory allowed for the low memory partition), and then use partition weights to separate those jobs out. From: slurm-users on b

Re: [slurm-users] [External] Re: srun : Communication connection failure

2022-01-25 Thread Doug Meyer
Always hate those odd problems. Glad you are up! Doug On Tue, Jan 25, 2022, 6:43 AM Durai Arasan wrote: > Hello Mike, Doug: > > The issue was resolved somehow. My colleagues say the addresses in > slurm.conf on the login nodes were incorrect. It could also have been a > temporary network issue.

Re: [slurm-users] srun : Communication connection failure

2022-01-25 Thread Ryan Novosielski
I’m coming to this question late, and this is not the answer to your problem (well, maybe tangentially), but it may help someone else: my recollection is that the compute node that gets assigned the job must be able to contact the node you’re starting the interactive job from (so bg-slurmb-login
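One way to test the kind of reachability described above is to attempt a plain TCP connection from the compute node back to the login node. The sketch below is a generic connectivity probe, not part of Slurm; the function name `can_connect`, the host name, and the port number in the commented example are placeholders. It assumes you know which port srun is listening on, for example because the site restricts it with `SrunPortRange` in slurm.conf.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Hypothetical usage, run on the compute node:
# print(can_connect("login-node.example", 60001))
```

A False result points to a firewall, routing, or name resolution problem between the two nodes, though it does not say which one.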