[slurm-users] GrpTRESMins

2020-06-02 Thread Dhumal, Dr. Nilesh
Hello, I set GrpTRESMins for a user "test" using the following command: sudo sacctmgr modify user test set GrpTRESMins=cpu=4000, qos=silver. The user submitted a job with the following sbatch script: #!/bin/bash #SBATCH -N 1 #SBATCH -n 4 #SBATCH -o pmi_phosphate_conf1.log #SBATCH --user=test #SBATCH --accoun
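As a rough sketch of the arithmetic behind this limit: GrpTRESMins=cpu=4000 is a budget of 4000 CPU-minutes, so a job with 4 tasks (-n 4), assuming one CPU per task, would use it up after about 1000 minutes of wall time. The helper name below is hypothetical, and the calculation ignores usage decay and any other jobs charged to the same association.

```shell
#!/bin/bash
# Hypothetical helper: wall-clock minutes a job can run before it
# exhausts a CPU-minute budget, assuming one CPU per task.
max_wall_minutes() {
    local budget_cpu_minutes=$1   # e.g. 4000 from GrpTRESMins=cpu=4000
    local cpus=$2                 # e.g. 4 from "#SBATCH -n 4"
    echo $(( budget_cpu_minutes / cpus ))
}

max_wall_minutes 4000 4
```

The division is integer division, so budgets that do not divide evenly are rounded down.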

Re: [slurm-users] Problem with permisions. CentOS 7.8

2020-06-02 Thread Brian Andrus
Going to chime in with some questions here... Do you know how your RPMs were built? Were they built on a system with the same packages and architecture as your nodes? That helps (a lot). If you know the command that was used to build them and what packages were included, that can help trouble

Re: [slurm-users] Problem with permisions. CentOS 7.8

2020-06-02 Thread Ferran Planas Padros
Hi Ole, I run the same version of slurm in all (master and computing) nodes (slurm-14.03.3-2). I agree that I should update the old nodes (which have CentOS 6.5 and 6.6) to CentOS 7. However, it is the installation of slurm on CentOS 7.8 that is giving me all these problems, so I am skeptical

Re: [slurm-users] [External] Re: Problem with permisions. CentOS 7.8

2020-06-02 Thread Michael Robbert
Those files in /run/systemd/generator.late/ look like they came from older SystemV init scripts. Can you check to make sure you don't have a slurm service script in /etc/init.d/? Also, note that there is a difference between the "slurm" service and the "slurmd" service. The former was the older n

Re: [slurm-users] Problem with permisions. CentOS 7.8

2020-06-02 Thread Ole Holm Nielsen
Hi Ferran, The Slurm RPMs built in the standard way will not cause any errors with Systemd daemons. You should not have any troubles on a correctly installed Slurm node. That is why I think you need to look at other problems in your setup. Which versions of Slurm do you run? Which nodes r

Re: [slurm-users] problems building slurm with MariaDB 10.4

2020-06-02 Thread Jeffrey McDonald
Thanks, one of your links to the MariaDB installs pointed out the issue, with version 10.4 you must install the MariaDB-shared rpm. My build completes successfully. Thanks, Jeff On Tue, Jun 2, 2020 at 11:20 AM Rodrigo Santibáñez wrote: > > Hello Jeffrey, > > I installed slurm 17.02.11 a week ago
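One way to check for the package mentioned here before starting a build is to filter the output of `rpm -qa`. This is a minimal sketch: the function name is hypothetical, and it only matches on the package name prefix as it appears in `rpm -qa` listings.

```shell
#!/bin/bash
# Hypothetical check: reads a package list (one name per line) on stdin
# and succeeds only if a MariaDB-shared package is present.
has_mariadb_shared() {
    grep -qi '^MariaDB-shared-'
}

# Typical use on a build host (not executed here):
#   rpm -qa | has_mariadb_shared || echo "MariaDB-shared is missing"
```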

Re: [slurm-users] Problem with permisions. CentOS 7.8

2020-06-02 Thread Jim Prewett
Hi Ferran, You're right that editing the files under /run/systemd will not persist after rebooting. I'm pretty sure the files that you're looking for are in /usr/lib/systemd/system. This page has a nice writeup on the locations of the systemd-related files: https://www.digitalocean.com/co

Re: [slurm-users] Problem with permisions. CentOS 7.8

2020-06-02 Thread Rodrigo Santibáñez
Mmm... What I did was install all the rpms on the calculation nodes (just as on the controller node), but run only slurmd on them. I think you're aware that munge should be running on the calculation nodes, the munge.key should be the same on all nodes, slurm configuration fi

Re: [slurm-users] Problem with permisions. CentOS 7.8

2020-06-02 Thread Ferran Planas Padros
Hi, Thanks for your answer. However, I am setting up a calculating node, not the master node, and thus I have not installed slurmctld on it. After some digging, I have found that all these files: /run/systemd/generator.late/slurm.service /run/systemd/generator.late/runlevel5.target.wants/s

Re: [slurm-users] Problem with permisions. CentOS 7.8

2020-06-02 Thread Rodrigo Santibáñez
Yes, you have both daemons, installed with the slurm rpm. The slurmd (all nodes) communicates with slurmctld (runs on the main master node and, optionally, on a backup node). You do not need to run slurmd as the slurm user. Use `systemctl enable slurmctld` (and slurmd) followed by `systemctl start
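The enable-then-start sequence described here, with the command spelled `systemctl`, can be sketched as a small helper that prints the commands for a given node role rather than running them. The role labels "controller" and "compute" and the function name are assumptions made for this illustration.

```shell
#!/bin/bash
# Print (not run) the systemctl commands for a node role.
# "controller" and "compute" are labels chosen for this sketch.
service_commands() {
    local role=$1 unit
    case "$role" in
        controller) unit=slurmctld ;;   # master (and optional backup) node
        compute)    unit=slurmd ;;      # calculation nodes
        *) echo "unknown role: $role" >&2; return 1 ;;
    esac
    echo "systemctl enable $unit"
    echo "systemctl start $unit"
}

service_commands compute
```

Printing the commands first makes it easy to review them before running them as root on the node.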

Re: [slurm-users] problems building slurm with MariaDB 10.4

2020-06-02 Thread Rodrigo Santibáñez
Hello Jeffrey, I installed slurm 17.02.11 a week ago for centOS7 and I followed the instructions here https://wiki.fysik.dtu.dk/niflheim/Slurm_installation You could install MariaDB (I installed v10.4) with their repo and gpgkey with instructions here https://mariadb.com/kb/en/yum/ Then, you cou

[slurm-users] problems building slurm with MariaDB 10.4

2020-06-02 Thread Jeffrey McDonald
Hi, I'm trying to build the slurm rpms on a Centos 7.8 system with the mariadb 10.4 RPMs, # rpm -qa | grep -i mariadb: MariaDB-common-10.4.13-1.el7.centos.x86_64 MariaDB-server-10.4.13-1.el7.centos.x86_64 MariaDB-compat-10.4.13-1.el7.centos.x86_64 MariaDB-client-10.4.13-1.el7.centos.x86_64 MariaD

Re: [slurm-users] Problem with permisions. CentOS 7.8

2020-06-02 Thread Ferran Planas Padros
Hi Ole, Thanks for your answer and your time. I'd appreciate it if you, or someone else, could take a final look at my case. After your suggestions and comments, I have redone the whole installation for Munge and Slurm. I uninstalled and removed all previous rpms and restarted from scratch. Mu

Re: [slurm-users] sacct

2020-06-02 Thread Ole Holm Nielsen
On 6/2/20 12:16 PM, Sidhu, Khushwant wrote: Do these parameters need to be enabled, plugins added/enabled? Could it be a database problem? (I've come to a previously installed slurm installation & have very little slurm experience) Yes, the database records may be purged after some time, s

Re: [slurm-users] sacct

2020-06-02 Thread Sidhu, Khushwant
Do these parameters need to be enabled, plugins added/enabled? Could it be a database problem? (I've come to a previously installed slurm installation & have very little slurm experience) Cheers Khush -Original Message- From: slurm-users On Behalf Of Ole Holm Nielsen Sent: 02 June

Re: [slurm-users] Problem with permisions. CentOS 7.8

2020-06-02 Thread Ole Holm Nielsen
Hi Ferran, Please install Slurm software in the standard way, see https://wiki.fysik.dtu.dk/niflheim/Slurm_installation It seems that you have some unusual way to manage your Linux systems. In Stockholm and Sweden there are many Slurm experts at the HPC centers which might be able to help you

Re: [slurm-users] Problem with permisions. CentOS 7.8

2020-06-02 Thread Ferran Planas Padros
Hi! I did a fresh installation with the EPEL repo, and installing munge from it and it worked. To have the slurm user for munge was definitely a problem, but that is the set up we have on the CentOS 6. Now I've learnt my lesson for future installations, thanks to everyone! Now, I have a fol

Re: [slurm-users] sacct

2020-06-02 Thread Ole Holm Nielsen
On 6/2/20 10:16 AM, Sidhu, Khushwant wrote: When a job is running & I use the command: sacct --format "AveCPU,AveDiskRead,AveDiskWrite,user" -j 12345 I get values for all parameters. However, when a job is completed, the same command returns no values for all but 'user'. Is there a reason

[slurm-users] sacct

2020-06-02 Thread Sidhu, Khushwant
Hi, When a job is running & I use the command: sacct --format "AveCPU,AveDiskRead,AveDiskWrite,user" -j 12345 I get values for all parameters. However, when a job is completed, the same command returns no values for all but 'user'. Is there a reason for this? Thanks Khush
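For reference, the sacct invocation quoted here can be assembled with the long option `--format` and a field list with no spaces. The helper below is a hypothetical sketch that only prints the command line; it does not contact the accounting database.

```shell
#!/bin/bash
# Build (print, not run) a sacct command line from a job ID and a field list.
sacct_command() {
    local job_id=$1 fields=$2
    fields=${fields// /}          # drop spaces so the list is a single token
    echo "sacct --format=$fields -j $job_id"
}

sacct_command 12345 "AveCPU, AveDiskRead, AveDiskWrite,User"
```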

Re: [slurm-users] [EXTERNAL] problems with OpenMPI 4.0.3

2020-06-02 Thread Barbara Krašovec
Afaik, there were some problems with certain versions of UCX, where UCX expected OPAL memory hooks from OMPI, but they were disabled and the physical pages became out-of-sync. But I don't know if this is the case. Maybe you could run dynamic debug to see if there is something useful in dmesg: ech