We are pleased to announce the availability of Slurm versions 24.05.3 and 23.11.10.

Version 24.05.3 fixes a potential database problem when deleting a qos. This bug only existed in 24.05.

Both versions fix jobs that could become stuck when using cloud nodes while some nodes are powered down, a regression in 23.11.9 and 24.05.2 that caused sattach to crash, and several other minor issues.
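
For sites that want to script a check against the releases named above, the following is a minimal sketch, not an official tool. It assumes the version is available as a string such as "slurm 24.05.2" (for example, captured from a client command's version output); the names FIXED_RELEASES, parse_version, and needs_update are hypothetical and exist only for this illustration.

```python
import re

# Patch releases named in this announcement, keyed by (major, minor) series.
FIXED_RELEASES = {(24, 5): 3, (23, 11): 10}


def parse_version(text):
    """Extract (major, minor, patch) from a string such as 'slurm 24.05.2'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no Slurm version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def needs_update(text):
    """Return True if the version is in a covered series but older than its fixed release."""
    major, minor, patch = parse_version(text)
    fixed_patch = FIXED_RELEASES.get((major, minor))
    # Series not covered by this announcement are reported as False (unknown).
    return fixed_patch is not None and patch < fixed_patch
```

With these definitions, needs_update("slurm 24.05.2") and needs_update("23.11.9") return True, while needs_update("slurm 24.05.3") and needs_update("23.11.10") return False. A version from a series this announcement does not mention also returns False, which means "not covered here" rather than "up to date".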

Slurm can be downloaded from https://www.schedmd.com/downloads.php .

--
Marshall Garey
Release Management, Support, and Development
SchedMD LLC - Commercial Slurm Development and Support

* Changes in Slurm 24.05.3
==========================
 -- data_parser/v0.0.40 - Added field descriptions.
 -- slurmrestd - Avoid creating new slurmdbd connection per request to
    '* /slurm/slurmctld/*/*' endpoints.
 -- Fix compilation issue with switch/hpe_slingshot plugin.
 -- Fix gres per task allocation with threads-per-core.
 -- data_parser/v0.0.41 - Added field descriptions.
 -- slurmrestd - Change back generated OpenAPI schema for
    `DELETE /slurm/v0.0.40/jobs/` to RequestBody instead of using parameters
    for request. slurmrestd will continue to accept endpoint requests via
    RequestBody or HTTP query.
 -- topology/tree - Fix issues with switch distance optimization.
 -- Fix potential segfault of secondary slurmctld when falling back to the
    primary when running with a JobComp plugin.
 -- Enable --json/--yaml=v0.0.39 options on client commands to dump data using
    data_parser/v0.0.39 instead of outputting nothing.
 -- switch/hpe_slingshot - Fix issue that could result in a 0 length state file.
 -- Fix unnecessary message protocol downgrade for unregistered nodes.
 -- Fix unnecessarily packing alias addrs when terminating jobs with a mix of
    non-cloud/dynamic nodes and powered down cloud/dynamic nodes.
 -- accounting_storage/mysql - Fix issue when deleting a qos that could remove
    too many commas from the qos and/or delta_qos fields of the assoc table.
 -- slurmctld - Fix memory leak when using RestrictedCoresPerGPU.
 -- Fix allowing access to reservations without MaxStartDelay set.
 -- Fix regression introduced in 24.05.0rc1 breaking srun --send-libs parsing.
 -- Fix slurmd vsize memory leak when using job submission/allocation commands
    that implicitly or explicitly use --get-user-env.
 -- slurmd - Fix node going into invalid state when using CPUSpecList and
    setting CPUs to the # of cores on a multithreaded node.
 -- Fix reboot asap nodes being considered in backfill after a restart.
 -- Fix --clusters/-M queries for clusters outside of a federation when
    fed_display is configured.
 -- Fix scontrol allowing updating job with bad cpus-per-task value.
 -- sattach - Fix regression from 24.05.2 security fix leading to crash.
 -- mpi/pmix - Fix assertion when built under --enable-debug.
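
The accounting_storage/mysql entry above concerns removing a QOS from comma-delimited fields without stripping extra commas. The sketch below is not Slurm's implementation (which lives in the C plugin and its SQL); it only illustrates, in Python, one way to drop a single ID from such a list while leaving the remaining IDs and separators intact. The field layout (each ID preceded by one comma, e.g. ",1,2,3") and the function name remove_qos_id are assumptions made for this example.

```python
def remove_qos_id(field, qos_id):
    """Remove one ID from a comma-delimited list, keeping the other IDs intact."""
    target = str(qos_id)
    # Split on commas, dropping empty pieces (from leading or trailing commas)
    # and any piece that exactly equals the ID being removed.
    kept = [item for item in field.split(",") if item and item != target]
    # Rebuild with exactly one leading comma per ID (assumed layout: ",1,2,3").
    return "".join("," + item for item in kept)
```

Matching whole items rather than substrings means removing ID 2 from ",12,2,22" yields ",12,22", and removing the only ID from ",2" yields an empty string.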

* Changes in Slurm 23.11.10
===========================
 -- switch/hpe_slingshot - Fix issue that could result in a 0 length state file.
 -- Fix unnecessary message protocol downgrade for unregistered nodes.
 -- Fix unnecessarily packing alias addrs when terminating jobs with a mix of
    non-cloud/dynamic nodes and powered down cloud/dynamic nodes.
 -- Fix allowing access to reservations without MaxStartDelay set.
 -- Fix scontrol allowing updating job with bad cpus-per-task value.
 -- sattach - Fix regression from 23.11.9 security fix leading to crash.

--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
