Hello,
I set GrpTRESMins for a user "test" using the following command:
sudo sacctmgr modify user test set GrpTRESMins=cpu=4000, qos=silver.
The user submitted a job with the following sbatch script:
#!/bin/bash
#SBATCH -N 1
#SBATCH -n 4
#SBATCH -o pmi_phosphate_conf1.log
#SBATCH --user=test
#SBATCH --accoun
Going to chime in with some questions here...
Do you know how your RPMs were built? Were they built on a system with
the same packages and architecture as your nodes? That helps (a lot). If
you know the command that was used to build them and what packages were
included, that can help trouble
Hi Ole,
I run the same version of slurm in all (master and computing) nodes
(slurm-14.03.3-2). I agree that I should update the old nodes (which have
CentOS 6.5 and 6.6) to CentOS 7. However, it is the installation of slurm on
CentOS 7.8 that is giving me all these problems, so I am skeptical
Those files in /run/systemd/generator.late/ look like they came from older
SystemV init scripts. Can you check to make sure you don't have a slurm service
script in /etc/init.d/?
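
A minimal sketch of that check, assuming the leftover script's name starts with "slurm" (the function name `find_legacy_slurm_init` is hypothetical, not part of Slurm):

```bash
# List legacy SysV init scripts whose name starts with "slurm".
# The directory defaults to /etc/init.d; passing another one is handy for testing.
find_legacy_slurm_init() {
    local dir="${1:-/etc/init.d}"
    local f found=1
    for f in "$dir"/slurm*; do
        # The glob stays unexpanded when nothing matches, so test for existence.
        if [ -e "$f" ]; then
            echo "$f"
            found=0
        fi
    done
    # Exit status 0 if at least one script was found, 1 otherwise.
    return "$found"
}
```

Calling `find_legacy_slurm_init` with no argument inspects /etc/init.d. Any path it prints is a candidate source for the units that systemd generates under /run/systemd/generator.late/.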
Also, note that there is a difference between the "slurm" service and the
"slurmd" service. The former was the older n
Hi Ferran,
The Slurm RPMs built in the standard way will not cause any errors with
Systemd daemons. You should not have any troubles on a correctly
installed Slurm node. That is why I think you need to look at other
problems in your setup.
Which versions of Slurm do you run?
Which nodes r
Thanks, one of your links to the MariaDB installs pointed out the
issue, with version 10.4 you must install the MariaDB-shared rpm.
My build completes successfully.
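
For anyone hitting the same build failure, here is a rough sketch of how the missing package could be detected before building. It assumes package names in the form printed by `rpm -qa` (name, hyphen, version); the helper `has_package` is hypothetical:

```bash
# Read package names (one per line, as printed by `rpm -qa`) on stdin and
# succeed if one of them is the given package name followed by "-<digit>".
has_package() {
    grep -q -E "^$1-[0-9]"
}
```

A possible use is `rpm -qa | has_package MariaDB-shared || echo "MariaDB-shared is not installed"`.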
Thanks,
Jeff
On Tue, Jun 2, 2020 at 11:20 AM Rodrigo Santibáñez
wrote:
>
> Hello Jeffrey,
>
> I installed slurm 17.02.11 a week ago
Hi Ferran,
You're right that editing the files under /run/systemd will not persist
after rebooting. I'm pretty sure the files that you're looking for are in
/usr/lib/systemd/system
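
To see which copy of a unit file wins, something like the sketch below may help. The helper `first_unit_dir` is hypothetical, and the directory list in the usage note is only a simplified subset of systemd's search path:

```bash
# Print the first directory (in the order given) that contains the unit file.
# Usage: first_unit_dir UNIT DIR...
first_unit_dir() {
    local unit="$1" dir
    shift
    for dir in "$@"; do
        if [ -e "$dir/$unit" ]; then
            echo "$dir"
            return 0
        fi
    done
    # No directory contained the unit file.
    return 1
}
```

For example, `first_unit_dir slurmd.service /etc/systemd/system /run/systemd/system /usr/lib/systemd/system /run/systemd/generator.late` lists the directories from highest to lowest priority, so the directory printed is the one systemd would use among those four.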
This page has a nice writeup on the locations of the systemd-related
files:
https://www.digitalocean.com/co
Mmm... What I did was install all the rpms on the calculation nodes (just as
I installed all the rpms on the controller node), but run only slurmd on
them.
I think you're aware that munge should be running in the calculation nodes,
the munge.key should be the same in all nodes, slurm configuration fi
Hi,
Thanks for your answer,
However, I am setting up a calculating node, not the master node, and thus I
have not installed slurmctld on it.
After some digging, I have found that all these files:
/run/systemd/generator.late/slurm.service
/run/systemd/generator.late/runlevel5.target.wants/s
Yes, you have both daemons, installed with the slurm rpm. The slurmd (all
nodes) communicates with slurmctld (runs in the main master node and,
optionally, in a backup node).
You do not need to run slurmd as the slurm user. Use `systemctl enable
slurmctld` (and slurmd) followed by `systemctl start
Hello Jeffrey,
I installed slurm 17.02.11 a week ago for CentOS 7 and I followed the
instructions here https://wiki.fysik.dtu.dk/niflheim/Slurm_installation
You could install MariaDB (I installed v10.4) with their repo and gpgkey
with instructions here https://mariadb.com/kb/en/yum/
Then, you cou
Hi,
I'm trying to build the slurm rpms on a CentOS 7.8 system with the
mariadb 10.4 RPMs,
# rpm -qa | grep -i mariadb:
MariaDB-common-10.4.13-1.el7.centos.x86_64
MariaDB-server-10.4.13-1.el7.centos.x86_64
MariaDB-compat-10.4.13-1.el7.centos.x86_64
MariaDB-client-10.4.13-1.el7.centos.x86_64
MariaD
Hi Ole,
Thanks for your answer and your time. I'd appreciate it if you, or someone else,
could take a final look at my case.
After your suggestions and comments, I have re-done the whole installation for
Munge and Slurm. I uninstalled and removed all previous rpms and restarted from
scratch. Mu
On 6/2/20 12:16 PM, Sidhu, Khushwant wrote:
Do these parameters need to be enabled, plugins added/enabled ?
Could it be a database problem ?
(I've come to a previously installed slurm installation & have very little
slurm experience)
Yes, the database records may be purged after some time, s
Do these parameters need to be enabled, plugins added/enabled ?
Could it be a database problem ?
(I've come to a previously installed slurm installation & have very little
slurm experience)
Cheers
Khush
-Original Message-
From: slurm-users On Behalf Of Ole Holm
Nielsen
Sent: 02 June
Hi Ferran,
Please install Slurm software in the standard way, see
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation
It seems that you have some unusual way to manage your Linux systems. In
Stockholm and Sweden there are many Slurm experts at the HPC centers who
might be able to help you
Hi!
I did a fresh installation with the EPEL repo, installing munge from it, and
it worked. Using the slurm user for munge was definitely a problem, but that
is the setup we have on CentOS 6. Now I've learnt my lesson for future
installations, thanks to everyone!
Now, I have a fol
On 6/2/20 10:16 AM, Sidhu, Khushwant wrote:
When a job is running & I use the command:
sacct --format "AveCPU,AveDiskRead,AveDiskWrite,user" -j 12345
I get values for all parameters.
However, when a job is completed, the same command returns no values for
all but ‘user’.
Is there a reason
Hi,
When a job is running & I use the command:
sacct --format "AveCPU,AveDiskRead,AveDiskWrite,user" -j 12345
I get values for all parameters.
However, when a job is completed, the same command returns no values for all
but 'user'.
Is there a reason for this ?
Thanks
Khush
Afaik, there were some problems with certain versions of UCX, where UCX
expected OPAL memory hooks from OMPI, but they were disabled and the
physical pages became out-of-sync. But I don't know if this is the case.
Maybe you could run dynamic debug to see if there is something useful in
dmesg:
ech