[slurm-users] Re: slumrestd 24.05.1: crashes when GET on /slurm/v0.0.41/nodes : unsorted double linked list corrupted

2024-07-24 Thread Daniel Letai via slurm-users
This is a known issue, resolved in 24.05.2 by the patches labeled "Always allocate pointers despite skipping parsing". For example: https://github.com/SchedMD/slurm/commit/5b07b6bda407431215606b93e57d0a9b7f4c9b53 The same patch also applies to 0.0.40 and 0.0
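
Since the fix is tied to a release number, a quick way to tell whether a given installation predates it is a version comparison. The following is only a minimal sketch: the helper name `version_at_least` is hypothetical, it assumes GNU `sort -V` is available, and the version string is assumed to have been obtained separately (for instance from the output of `scontrol -V`).

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when version $1 is >= version $2.
# Assumes GNU coreutils `sort -V` (version sort) is available.
version_at_least() {
    # The smaller of the two versions sorts first; if that is $2, then $1 >= $2.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example usage (the version string is supplied by hand here):
#   if version_at_least "24.05.1" "24.05.2"; then
#       echo "release includes the fix"
#   else
#       echo "release predates 24.05.2"
#   fi
```

The comparison only says whether the release number is at or past 24.05.2; a distribution package that backports the patch onto an older release would not be detected this way.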

[slurm-users] Re: slumrestd 24.05.1: crashes when GET on /slurm/v0.0.41/nodes : unsorted double linked list corrupted

2024-07-24 Thread Josef Dvořáček via slurm-users
OK, answering myself: it seems that the endpoint /slurm/v0.0.39/jobs works well. I'm not sure why, but I can live with that, so perhaps it will help someone else too. Cheers, Josef (this time via socket) WORKS OK: # curl -si --header X-SLURM-USER-NAME:root --header X-SLURM-USER-TOKEN:$SLURM_JWT --un

[slurm-users] slumrestd 24.05.1: crashes when GET on /slurm/v0.0.41/nodes : unsorted double linked list corrupted

2024-07-24 Thread Josef Dvořáček via slurm-users
Is this failure familiar to anyone? When I query the API endpoint "localhost:6820/slurm/v0.0.41/jobs", slurmrestd segfaults with "unsorted double linked list corrupted". Is anyone using this API endpoint without segfaulting? I do the GET using curl: curl --header X-SLURM-USER-NAME:root --header X-SLU
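
For comparing API versions against the same daemon, the request can be wrapped so that only the version segment of the URL changes. This is a rough sketch, not the poster's exact command: the function names `slurmrestd_url` and `slurmrestd_get` are hypothetical, the `http://` scheme is an assumption, and `SLURM_JWT` is assumed to already hold a valid token. The host, port, and header names are the ones quoted in the thread.

```shell
#!/bin/sh
# Hypothetical helper: build the request URL from an API version and an endpoint.
# The host defaults to localhost:6820 as in the thread; plain http is assumed.
slurmrestd_url() {
    host="${3:-localhost:6820}"
    printf 'http://%s/slurm/%s/%s\n' "$host" "$1" "$2"
}

# Hypothetical helper: issue a GET with the two auth headers from the thread.
# Assumes SLURM_JWT is set in the environment and the user is root.
slurmrestd_get() {
    curl -si \
        --header "X-SLURM-USER-NAME:root" \
        --header "X-SLURM-USER-TOKEN:${SLURM_JWT}" \
        "$(slurmrestd_url "$1" "$2")"
}

# Example usage, comparing two API versions on the same endpoint:
#   slurmrestd_get v0.0.41 jobs
#   slurmrestd_get v0.0.39 jobs
```

Keeping the URL construction separate from the request makes it easy to check which endpoint is being hit without contacting the daemon.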

[slurm-users] Re: slurmctld hourly: Unexpected missing socket error

2024-07-24 Thread Patryk Bełzak via slurm-users
Hi, we're on 389 Directory Server (aka 389ds), and it is a pretty large instance. One of the optimizations was to create proper ACIs on the server side, which significantly improved lookup times on the Slurm controller and worker nodes. The second was to move the sssd cache to tmpfs - instructions by Red Hat: ht