We are pleased to announce the availability of Slurm versions 24.05.3
and 23.11.10.
Version 24.05.3 fixes a potential database problem when deleting a qos.
This bug only existed in 24.05.
Both versions have fixes for jobs potentially being stuck when using
cloud nodes when some nodes are powered down, a regression in 23.11.9
and 24.05.2 that caused sattach to crash, and some other minor issues.
Slurm can be downloaded from https://www.schedmd.com/downloads.php .
--
Marshall Garey
Release Management, Support, and Development
SchedMD LLC - Commercial Slurm Development and Support
* Changes in Slurm 24.05.3
==========================
-- data_parser/v0.0.40 - Added field descriptions.
-- slurmrestd - Avoid creating new slurmdbd connection per request to
'* /slurm/slurmctld/*/*' endpoints.
-- Fix compilation issue with switch/hpe_slingshot plugin.
-- Fix gres per task allocation with threads-per-core.
-- data_parser/v0.0.41 - Added field descriptions.
-- slurmrestd - Change back generated OpenAPI schema for
`DELETE /slurm/v0.0.40/jobs/` to RequestBody instead of using parameters
for request. slurmrestd will continue to accept endpoint requests via
RequestBody or HTTP query.
-- topology/tree - Fix issues with switch distance optimization.
-- Fix potential segfault of secondary slurmctld when falling back to the
primary when running with a JobComp plugin.
-- Enable --json/--yaml=v0.0.39 options on client commands to dump data using
data_parser/v0.0.39 instead of outputting nothing.
-- switch/hpe_slingshot - Fix issue that could result in a 0 length state file.
-- Fix unnecessary message protocol downgrade for unregistered nodes.
-- Fix unnecessarily packing alias addrs when terminating jobs with a mix of
non-cloud/dynamic nodes and powered down cloud/dynamic nodes.
-- accounting_storage/mysql - Fix issue when deleting a qos that could remove
too many commas from the qos and/or delta_qos fields of the assoc table.
-- slurmctld - Fix memory leak when using RestrictedCoresPerGPU.
-- Fix allowing access to reservations without MaxStartDelay set.
-- Fix regression introduced in 24.05.0rc1 breaking srun --send-libs parsing.
-- Fix slurmd vsize memory leak when using job submission/allocation commands
that implicitly or explicitly use --get-user-env.
-- slurmd - Fix node going into invalid state when using CPUSpecList and
setting CPUs to the # of cores on a multithreaded node.
-- Fix reboot asap nodes being considered in backfill after a restart.
-- Fix --clusters/-M queries for clusters outside of a federation when
fed_display is configured.
-- Fix scontrol allowing updating job with bad cpus-per-task value.
-- sattach - Fix regression from 24.05.2 security fix leading to crash.
-- mpi/pmix - Fix assertion when built under --enable-debug.
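
To illustrate the class of bug behind the accounting_storage/mysql qos fix
listed above, here is a minimal Python sketch. It is not the plugin's actual
query or code. It assumes only that the qos ids are stored as a
comma-delimited string, and the function names are hypothetical. It contrasts
a substring replacement that also eats the delimiters with a split-and-rejoin
approach that rebuilds them.

```python
def remove_id_naive(csv_ids: str, target: str) -> str:
    """Buggy illustration: replacing ',<id>,' removes both delimiters,
    so the two neighbouring ids are fused together."""
    return csv_ids.replace("," + target + ",", "")


def remove_id(csv_ids: str, target: str) -> str:
    """Remove every occurrence of `target` from a comma-delimited id list.

    The list is split on commas, empty fragments and the target id are
    dropped, and the survivors are rejoined, so exactly one comma remains
    between neighbouring ids. Leading and trailing commas are not kept.
    """
    kept = [item for item in csv_ids.split(",") if item and item != target]
    return ",".join(kept)


if __name__ == "__main__":
    print(remove_id_naive("1,2,3", "2"))  # prints "13": ids 1 and 3 fused
    print(remove_id("1,2,3", "2"))        # prints "1,3"
```

Matching on whole list items rather than substrings also keeps an id such as
2 from being confused with 12 or 22.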
* Changes in Slurm 23.11.10
===========================
-- switch/hpe_slingshot - Fix issue that could result in a 0 length state file.
-- Fix unnecessary message protocol downgrade for unregistered nodes.
-- Fix unnecessarily packing alias addrs when terminating jobs with a mix of
non-cloud/dynamic nodes and powered down cloud/dynamic nodes.
-- Fix allowing access to reservations without MaxStartDelay set.
-- Fix scontrol allowing updating job with bad cpus-per-task value.
-- sattach - Fix regression from 23.11.9 security fix leading to crash.