Just noticed this. On the problem node the munged.log file has an entry
every 1:40:
2020-04-17 15:31:02 -0600 Info: Invalid credential
2020-04-17 15:32:42 -0600 Info: Invalid credential
2020-04-17 15:34:22 -0600 Info: Invalid credential
This happens on the failed node and two othe
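The 1:40 spacing can be confirmed directly from the timestamps. Below is a minimal Python sketch, assuming the log lines keep the "date time offset" prefix shown above. The function name `entry_intervals` is made up for illustration and is not part of munge.

```python
from datetime import datetime

def entry_intervals(lines):
    """Return the gaps, in seconds, between consecutive timestamped log entries."""
    stamps = []
    for line in lines:
        # The first three whitespace-separated fields are date, time, and UTC offset.
        fields = line.split()
        if len(fields) < 3:
            continue
        try:
            stamp = datetime.strptime(" ".join(fields[:3]), "%Y-%m-%d %H:%M:%S %z")
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        stamps.append(stamp)
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]

sample = [
    "2020-04-17 15:31:02 -0600 Info: Invalid credential",
    "2020-04-17 15:32:42 -0600 Info: Invalid credential",
    "2020-04-17 15:34:22 -0600 Info: Invalid credential",
]
print(entry_intervals(sample))  # [100.0, 100.0], i.e. 1 min 40 s apart
```

A constant gap like this points at something retrying on a fixed timer rather than at sporadic user activity, which may help narrow down which client is presenting the bad credential.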
Both work. The only discrepancy is that the slurm controller output had
these two lines:
UID: ??? (1000)
GID: ??? (1000)
It's as if the controller doesn't know the username for UID 1000.
But it returned success (0).
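The `???` would be consistent with UID 1000 having no passwd entry on the controller. Here is a rough Python sketch of that lookup, which could be run on each host to compare results. `describe_uid` is a hypothetical helper, and the "name (id)" layout only imitates the two lines quoted above.

```python
import pwd

def describe_uid(uid, lookup=pwd.getpwuid):
    """Format a UID as 'name (uid)', or '??? (uid)' if this host cannot resolve it."""
    try:
        name = lookup(uid).pw_name
    except KeyError:
        # pwd.getpwuid raises KeyError when no passwd entry exists for the UID.
        name = "???"
    return f"{name} ({uid})"

if __name__ == "__main__":
    # UID 1000 is the value from the output above; substitute the UID in question.
    print("UID:", describe_uid(1000))
```

If the compute node prints a username and the controller prints `???`, the account exists on one host but not the other. Whether that matters for the failure discussed here is a separate question, since the decode itself returned success.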
On Fri, Apr 17, 2020 at 2:00 PM Riebs, Andy wrote:
> A coup
A couple of quick checks to see if the problem is munge:
1. On the problem node, try
$ echo foo | munge | unmunge
2. If (1) works, try this from the node running slurmctld to the problem
node
slurm-node$ echo foo | ssh node munge | unmunge
From: slurm-users [mailto:slurm-users-boun.
Is there an alternative to munge when running slurm? Munge issues are a
common problem in slurm, and munge doesn't give any useful information when
a problem occurs. An alternative that at least gave some useful
information when a problem occurs would be a big improvement.
Thanks.
There is no ntp service running on any of my nodes, and all but this one are
working. I haven't heard that ntp is a requirement for slurm, just that
the time be synchronized across the cluster. And it is.
On Wed, Apr 15, 2020 at 12:17 PM Carlos Fenoy wrote:
> I’d check ntp as your encoding time
I went back and built the slurm-19.05.6 rpms using:
rpmbuild -ta slurm-19.05.6.tar.bz2
It still failed with:
Error: Package: slurm-19.05.6-1.el7.x86_64
Requires: libnvidia-ml.so.1()(64bit)
Now I remember why I went back to 18.08. It was because this post
https://lis
Can’t speak for everyone, but I went to Slurm 19.05 some months back, and
haven't had any problems with CUDA 10.0 or 10.1 (or 8.0, 9.0, or 9.1).
> On Apr 17, 2020, at 8:46 AM, Lisa Kay Weihl wrote:
>
> External Email Warning
>
> This email originated from outside the university. Please use cau
Wow. I did not catch that version issue. I saw that there were issues with the
newest Slurm and how CUDA 10+ installs so I avoided that even though we have
CUDA 8. I did have Slurm 19 downloaded so I'm thinking I ran into an issue with
that and went back to 18 but now that I have more experience
On 17-04-2020 11:47, Ole Holm Nielsen wrote:
On 17-04-2020 10:38, Christian Anthon wrote:
It would be neat to have these build requirements / install
requirements built into the spec file.
I agree with you, and it seems that the SchedMD pages no longer list the
build prerequisites (I think th
On 17-04-2020 10:38, Christian Anthon wrote:
It would be neat to have these build requirements / install requirements
built into the spec file.
I agree with you, and it seems that the SchedMD pages no longer list the
build prerequisites (I think there was some information in the past).
Try go
Hello
I did install mariadb-server and mariadb-devel and all worked fine
Thank you
Felix
On 4/17/2020 11:38 AM, Christian Anthon wrote:
It would be neat to have these build requirements / install
requirements built into the spec file.
Cheers, Christian.
On 17/04/2020 10.08, Ole Holm Niels
It would be neat to have these build requirements / install requirements
built into the spec file.
Cheers, Christian.
On 17/04/2020 10.08, Ole Holm Nielsen wrote:
Hi Felix,
Please make sure to install all prerequisite packages on the Slurm
build host. I have summarized this information in m
Hi Felix,
Please make sure to install all prerequisite packages on the Slurm build
host. I have summarized this information in my Slurm Wiki page:
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#build-slurm-rpms
/Ole
On 17-04-2020 09:11, Felix Farcas wrote:
I am trying to build a rpm
As such it is a mistake in the rpm spec file. But you just need
mariadb-devel, or possibly mysql-devel installed.
Cheers, Christian.
On 17/04/2020 09.11, Felix Farcas wrote:
Hello
I am trying to build a rpm for a new server and I get the following
error:
Requires(interp): /bin/sh /bin/sh /
Hello
I am trying to build a rpm for a new server and I get the following error:
Requires(interp): /bin/sh /bin/sh /bin/sh
Requires(rpmlib): rpmlib(FileDigests) <= 4.6.0-1
rpmlib(PayloadFilesHavePrefix) <= 4.0-1 rpmlib(CompressedFileNames) <=
3.0.4-1
Requires(post): /bin/sh
Requires(preun): /