A colleague found this, and it resolved the issue for me.
https://bugs.schedmd.com/show_bug.cgi?id=14134
The /etc/hosts on the compute nodes did not have this extra line, but
the file on the login/slurmctld node did have it.
I removed the line and now, e.g., srun --x11 -N 1 xclock works.
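For reference, the kind of duplicate mapping involved might look like the sketch below; the hostnames and addresses here are made up for illustration, so see the bug report above for the actual details:

```
# /etc/hosts on the login/slurmctld node (hypothetical entries)
127.0.0.1   localhost
10.0.0.10   node01.cluster.local node01
10.0.0.10   node01               # <- an extra line like this was removed
```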
Allan
> rated the magic cookies - it couldn't cope with the node hostname
> being its FQDN, it needed it to be the short hostname. I 'fixed' it by
> changing all the nodes' hostnames to short format :) (CentOS uses long
> by default, but it's not as if I re
Davide DelVento writes:
> Perhaps just a very trivial question, but it doesn't look like you
> mentioned it: does your X-forwarding work from the login node? Maybe
> the X server on your client is the problem, and trying xclock on the
> login node would clarify that.
Sorry, yes, running xterm, xclock, e
Hi everyone,
I'm trying to get X11 forwarding working on my cluster. I've read some
of the threads and web posts on X11 forwarding and most of the common
issues I'm finding seem to pertain to older versions of Slurm.
I log in from my workstation to the login node with ssh -X. I have X11
apps inst
Well, looking at the current slurm.conf it appears that the name was
changed, and "Shared" is now called "OverSubscribe" in more modern Slurm
versions. So you might look deeper at what config options are in
conflict, since with the EXCLUSIVE mode I get one node per job here.
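For anyone following along, the old and new spellings of that option would look roughly like this in slurm.conf (partition and node names here are made up):

```
# Older Slurm (pre-rename naming):
PartitionName=batch Nodes=node[01-04] Shared=EXCLUSIVE

# Newer Slurm (same behavior, renamed option):
PartitionName=batch Nodes=node[01-04] OverSubscribe=EXCLUSIVE
```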
I'm running an even older slurm than you (it does what I need, I am a
team of one and I have many things to take care of other than chasing
the latest version of every piece of software).
Anyway, did you try Shared=EXCLUSIVE in the partition configuration?
From the (v14.11.7) slurm.conf man page: