We are pleased to announce the availability of Slurm version 20.11.8.
This includes a number of minor-to-moderate severity bug fixes.

Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support

* Changes in Slurm 20.11.8
==========================
 -- slurmctld - fix erroneous "StepId=CORRUPT" messages in error logs.
 -- Correct the error given when auth plugin fails to pack a credential.
 -- Fix unused-variable compiler warning on FreeBSD in fd_resolve_path().
 -- acct_gather_filesystem/lustre - only emit collection error once per step.
 -- srun - leave SLURM_DIST_UNKNOWN as default for --interactive.
 -- Add GRES environment variables (e.g., CUDA_VISIBLE_DEVICES) into the
    interactive step, the same as is done for the batch step.
 -- Fix various potential deadlocks when altering objects in the database
    dealing with every cluster in the database.
 -- slurmrestd - handle slurmdbd connection failures without segfaulting.
 -- slurmrestd - fix segfault for searches in slurmdb/v0.0.36/jobs.
 -- slurmrestd - remove (non-functioning) users query parameter for
    slurmdb/v0.0.36/jobs from openapi.json.
 -- slurmrestd - fix segfault in slurmrestd db/jobs with numeric queries.
 -- slurmrestd - add argv handling for job/submit endpoint.
 -- srun - fix broken node step allocation in a heterogeneous allocation.
 -- Fail step creation if -n is not multiple of --ntasks-per-gpu.
 -- job_container/tmpfs - Fix slowdown on teardown.
 -- Fix problem with SlurmctldProlog where requeued jobs would never launch.
 -- job_container/tmpfs - Fix issue when restarting slurmd where the namespace
    mount points could disappear.
 -- sacct - avoid truncating JobId at 34 characters.
 -- scancel - fix segfault when --wckey filtering option is used.
 -- select/cons_tres - Fix memory leak.
 -- Prevent file descriptor leak in job_container/tmpfs on slurmd restart.
 -- slurmrestd/dbv0.0.36 - Fix values dumped in job state/current and job
    step state.
 -- slurmrestd/dbv0.0.36 - Correct description for previous state property.
 -- perlapi/libslurmdb - expose tres_req_str to job hash.
 -- scrontab - close and reopen temporary crontab file to deal with editors
    that do not change the original file, but instead write out then rename
    a new file.
 -- sstat - fix linking so that it will work when --without-shared-libslurm
    was used to build Slurm.
 -- Clear allocated cpus for running steps in a job before handling requested
    nodes on new step.
 -- Don't reject a step if not enough nodes are available. Instead, defer the
    step until enough nodes are available to satisfy the request.
 -- Don't reject a step if it requests at least one specific node that is
    already allocated to another step. Instead, defer the step until the
    requested node(s) become available.
 -- slurmrestd - add description for slurmdb/job endpoint.
 -- Better handling of --mem=0.
 -- Ignore DefCpuPerGpu when --cpus-per-task given.
 -- sacct - fix segfault when printing StepId (or when using --long).
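
The scrontab entry above concerns editors that save by writing a new file and renaming it over the original. The sketch below is not Slurm's code. It is a minimal Python illustration, assuming a POSIX filesystem, of why a file handle opened before such an edit keeps showing the old contents while a fresh open by path sees the new ones. The function names simulate_rename_style_edit and demo, and the file name crontab.tmp, are hypothetical.

```python
import os
import tempfile


def simulate_rename_style_edit(path, new_text):
    """Mimic an editor that writes a new file, then renames it over `path`."""
    directory = os.path.dirname(path)
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        tmp.write(new_text)
    # After the rename, `path` refers to a different inode than before.
    os.replace(tmp_path, path)


def demo():
    """Return (text seen by the stale handle, text seen after reopening)."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "crontab.tmp")  # hypothetical name
        with open(path, "w") as f:
            f.write("old entry\n")

        stale = open(path)  # handle opened before the "editor" runs
        simulate_rename_style_edit(path, "new entry\n")

        stale_text = stale.read()  # still bound to the original inode (POSIX)
        stale.close()

        with open(path) as fresh:  # close and reopen by path
            fresh_text = fresh.read()
        return stale_text, fresh_text


if __name__ == "__main__":
    print(demo())
```

On a POSIX system the stale handle is expected to return the old text and the reopened file the new text, which is the situation the close-and-reopen approach addresses.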