Slurm version 17.02.8 contains about 42 bug fixes developed over the past two months.
Slurm downloads are available from https://www.schedmd.com/downloads.php. Details about the changes are listed below.
* Changes in Slurm 17.02.8
==========================
 -- Add 'slurmdbd:' to the accounting plugin to notify message is from dbd instead of local.
 -- mpi/mvapich - Buffer being only partially cleared. No failures observed.
 -- Fix for job --switch option on dragonfly network.
 -- In salloc with --uid option, drop supplementary groups before changing UID.
 -- jobcomp/elasticsearch - strip any trailing slashes from JobCompLoc.
 -- jobcomp/elasticsearch - fix memory leak when transferring generated buffer.
 -- Prevent slurmstepd ABRT when parsing gres.conf CPUs.
 -- Fix sbatch --signal to signal all MPI ranks in a step instead of just those on node 0.
 -- Check multiple partition limits when scheduling a job that were previously only checked on submit.
 -- Cray: Avoid running application/step Node Health Check on the external job step.
 -- Optimization enhancements for partition based job preemption.
 -- Address some build warnings from GCC 7.1, and one possible memory leak if /proc is inaccessible.
 -- If creating/altering a core based reservation with scontrol/sview on a remote cluster correctly determine the select type.
 -- Fix autoconf test for libcurl when clang is used.
 -- Fix default location for cgroup_allowed_devices_file.conf to use correct default path.
 -- Document NewName option to sacctmgr.
 -- Reject a second PMI2_Init call within a single step to prevent slurmstepd from hanging.
 -- Handle old 32bit values stored in the database for requested memory correctly in sacct.
 -- Fix memory leaks in the task/cgroup plugin when constraining devices.
 -- Make extremely verbose info messages debug2 messages in the task/cgroup plugin when constraining devices.
 -- Fix issue that would deny the stepd access to /dev/null where GRES has a 'type' but no file defined.
 -- Fix issue where the slurmstepd would fatal on job launch if you have no gres listed in your slurm.conf but some in gres.conf.
 -- Fix validating time spec to correctly validate various time formats.
 -- Make scontrol work correctly with job update timelimit [+|-]=.
 -- Reduce the visibility of a number of warnings in _part_access_check.
 -- Prevent segfault in sacctmgr if no association name is specified for an update command.
 -- burst_buffer/cray plugin modified to work with changes in Cray UP06 software release.
 -- Fix job reasons for jobs that are violating assoc MaxTRESPerNode limits.
 -- Fix segfault when unpacking a 16.05 slurm_cred in a 17.02 daemon.
 -- Fix setting TRES limits with case insensitive TRES names.
 -- Add alias for xstrncmp() -- slurm_xstrncmp().
 -- Fix sorting of case insensitive strings when using xstrcasecmp().
 -- Gracefully handle race condition when reading /proc as process exits.
 -- Avoid error on Cray duplicate setup of core specialization.
 -- Skip over undefined (hidden in Slurm) nodes in pbsnodes.
 -- Add empty hashes in perl api's slurm_load_node() for hidden nodes.
 -- CRAY - Add rpath logic to work for the alpscomm libs.
 -- Fixes for administrator extended TimeLimit (job reason & time limit reset).
 -- Fix gres selection on systems running select/linear.
 -- sview: Added window decorator for maximize, minimize, close buttons for all systems.
 -- squeue: interpret negative length format specifiers as a request to delimit values with spaces.
 -- Fix the torque pbsnodes wrapper script to parse a gres field with a type set correctly.