It only happens for versions in the 22.05 series prior to the latest
release (22.05.5). So the 21 version isn't impacted, and you should be
fine to upgrade from 21 to 22.05.5 without seeing the hash_k12 issue. If
you upgrade to any earlier 22.05 release, though, you will hit this issue.
-Paul Edmon-
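The rule described above can be written down as a small check. This is only a sketch of the version rule as stated in this reply, not anything from Slurm itself: the function names `parse_version` and `lands_in_affected_range` are hypothetical, and the code assumes plain `X.Y.Z` version strings.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Split a plain 'X.Y.Z' Slurm version string into three integers."""
    major, minor, patch = (int(part) for part in version.split(".")[:3])
    return major, minor, patch


def lands_in_affected_range(target: str) -> bool:
    """Return True if the upgrade target is a 22.05 release before 22.05.5.

    Encodes only the rule stated in the reply: 22.05.x releases earlier
    than 22.05.5 are affected; other series are treated as unaffected.
    """
    major, minor, patch = parse_version(target)
    return (major, minor) == (22, 5) and patch < 5


if __name__ == "__main__":
    for candidate in ("21.08.8", "22.05.3", "22.05.5"):
        print(candidate, lands_in_affected_range(candidate))
```

Under the stated rule, only the upgrade target matters: going from 21 straight to 22.05.5 is reported as unaffected, while targeting an earlier 22.05 release is flagged.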
Hi All,
Regarding
https://lists.schedmd.com/pipermail/slurm-users/2022-September/009222.html
.
Question for all of you who might have done this upgrade recently: does
this also happen during a major version upgrade (21->22 in my case)?
All of the discussion I found online about it only ment
I'm really pleased to find the test suite included with Slurm, and after some
initial difficulty, I am now able to run the unit tests and expect tests.
The expect tests seem to generally be failing whenever the test involves tasks.
Anything asking for more than 1 task per node is failing.
[202
FWIW, I have used NFS/Gluster/Lustre for a StateSaveLocation at various
times on various clusters.
I have never had an issue with any of them and have run clusters of up
to 1000+ nodes. I have even used the same share to symlink all the
nodes' slurm.conf with no issue.
Of course, YMMV, bu
HA for slurmctld is not multidatacenter HA but rather a traditional HA
setup where you have two server heads off of one storage brick
(connected by SAS cables or other fast interconnect). Multidatacenter
HA has issues with keeping things in sync due to latency and IOPs (as
noted below).
So t
Hello,
it seems that in a cluster configured for power saving, salloc does not wait
until the nodes assigned to the job recover from the power down state and go
back to normal operation.
Although the job is in the state CONFIGURING and the nodes are still in
IDLE+NOT_RESPONDING+POWERING_UP,
th
On 10/24/22 09:57, Diego Zuccato wrote:
On 24/10/2022 09:32, Ole Holm Nielsen wrote:
> It is definitely a BAD idea to store Slurm StateSaveLocation on a slow
> NFS directory! SchedMD recommends to use local NVME or SSD disks
> because there will be many IOPS to this file system!
IIUC i
On 24/10/2022 09:32, Ole Holm Nielsen wrote:
On 10/24/22 06:12, Richard Chang wrote:
I have a two-node slurmctld setup and both will mount an NFS-exported directory
as the state save location.
It is definitely a BAD idea to store Slurm StateSaveLocation on a slow NFS
directory! SchedMD reco
On 24/10/2022 09:32, Ole Holm Nielsen wrote:
> It is definitely a BAD idea to store Slurm StateSaveLocation on a slow
> NFS directory! SchedMD recommends to use local NVME or SSD disks
> because there will be many IOPS to this file system!
IIUC it does have to be shared between controllers
On 10/24/22 06:12, Richard Chang wrote:
Is there a rule of thumb for the size of the NFS-exported directory
to be used as StateSaveLocation?
I have a two-node slurmctld setup and both will mount an NFS-exported
directory as the state save location.
It is definitely a BAD idea to st