We are pleased to announce the availability of Slurm version 24.05.5.

This release fixes a few potential crashes, several stepmgr bugs, sstat and sattach compatibility with steps running on newer-version slurmd daemons, and some other minor bugs.

Downloads are available at https://www.schedmd.com/downloads.php .

--
Marshall Garey
Release Management, Support, and Development
SchedMD LLC - Commercial Slurm Development and Support

* Changes in Slurm 24.05.5
==========================
 -- Fix issue signaling cron jobs resulting in unintended requeues.
 -- Fix slurmctld memory leak in implementation of HealthCheckNodeState=CYCLE.
 -- job_container/tmpfs - Fix SLURM_CONF env variable not being properly set.
 -- sched/backfill - Fix job's time_limit being overwritten by time_min for job
    arrays in some situations.
 -- RoutePart - Fix segfault from incorrect memory allocation when node doesn't
    exist in any partition.
 -- slurmctld - Fix crash when a job is evaluated for a reservation after
    removal of a dynamic node.
 -- gpu/nvml - Attempt loading libnvidia-ml.so.1 as a fallback for failure in
    loading libnvidia-ml.so.
 -- slurmrestd - Fix non-required object fields being populated as '{}' in
    JSON/YAML instead of 'null', which caused compiled OpenAPI clients to
    reject the response to 'GET /slurm/v0.0.40/jobs' due to validation failure
    of '.jobs[].job_resources'.
 -- Fix sstat/sattach protocol errors for steps on higher-version slurmds
    (regressions since 20.11.0rc1 and 16.05.1rc1 respectively).
 -- slurmd - Avoid a crash when starting slurmd version 24.05 with
    SlurmdSpoolDir files that have been upgraded to a newer major version of
    Slurm. Log warnings instead.
 -- Fix race condition in stepmgr step completion handling.
 -- Fix slurmctld segfault with stepmgr and MpiParams when running a job array.
 -- Fix requeued jobs keeping their priority until the decay thread runs.
 -- slurmctld - Fix crash and possible split brain issue if the
    backup controller handles an scontrol reconfigure while in control
    before the primary resumes operation.
 -- Fix stepmgr not getting dynamic node addrs from the controller.
 -- stepmgr - Avoid "Unexpected missing socket" errors.
 -- Fix `scontrol show steps` with dynamic stepmgr.
 -- Support IPv6 in configless mode.
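
The gpu/nvml entry above describes a common library-loading fallback: try the unversioned name first, then the versioned soname. The following is only a minimal Python sketch of that pattern, not Slurm's code (Slurm is written in C). The function name load_first_available and its injectable loader parameter are hypothetical, introduced here purely for illustration.

```python
import ctypes

# Candidate names, in the order described in the gpu/nvml entry above.
NVML_CANDIDATES = ("libnvidia-ml.so", "libnvidia-ml.so.1")


def load_first_available(names=NVML_CANDIDATES, loader=ctypes.CDLL):
    """Return (name, handle) for the first library that loads.

    `loader` is a hypothetical injection point so the fallback order can
    be exercised on machines without the NVIDIA driver installed.
    """
    errors = []
    for name in names:
        try:
            return name, loader(name)
        except OSError as exc:
            # Remember why this candidate failed and try the next one.
            errors.append(f"{name}: {exc}")
    raise OSError("no candidate could be loaded (" + "; ".join(errors) + ")")
```

Called with no arguments, the sketch attempts the real libraries through ctypes.CDLL and raises OSError if neither name can be loaded.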
