We are pleased to announce the availability of Slurm version 20.02.2.
This includes a series of moderate and minor fixes since the last maintenance releases for both branches.
Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

-- 
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support

* Changes in Slurm 20.02.2
==========================
 -- Fix slurmctld segfault when checking no_consume GRES node allocation counts.
 -- Fix resetting of cloud_dns on a reconfigure.
 -- squeue - change output for dependency column to use "(null)" instead of ""
    for no dependencies as documented in the man page, and used by other
    columns.
 -- Clear node_cnt_wag after job update.
 -- Fix regression where AccountingStoreJobComment was not defaulting to 'yes'.
 -- Send registration message immediately after a node is resumed.
 -- Cray - Fix hetjobs when using only a single component in the step launch.
 -- Cray - Fix hetjobs launched without component 0.
 -- Cray - Quiet cookies missing message which is expected for hetjobs.
 -- Fix handling of -m/--distribution options for across socket/2nd level by
    task/affinity plugin.
 -- Fix grp_node_bitmap error when slurmctld started before slurmdbd.
 -- Fix scheduling issue when there are not enough nodes available to run a job
    resulting in possible job starvation.
 -- Make it so mpi/cray_shasta appears in srun --mpi=list.
 -- Don't requeue jobs that have been explicitly canceled.
 -- Fix error message for a regular user trying to update licenses on a running
    job.
 -- Fix backup slurmctld handling for logrotation via SIGUSR2.
 -- Fix reservation feature specification when looking for inactive features
    after active features fails.
 -- Prevent misleading error messages for reservation creation.
 -- Print message in scontrol when a request fails for not having enough nodes.
 -- Fix duplicate output in sacct with multiple resv events.
 -- auth/jwt - return correct gid for a given user. This was incorrectly
    assuming the user's primary group name matched their username.
 -- slurmrestd - permit non-SlurmUser/root job submission.
 -- Use host IP if hostname unknown for job submission for allocating node.
 -- Fix issue with primary_slurmdbd_resumed_operation trigger not happening on
    slurmctld restart.
 -- Fix race in acct_gather_interconnect/ofed on step termination.
 -- Fix typo of SlurmctldProlog -> PrologSlurmctld in error message.
 -- slurm.spec - add SuSE-specific dependencies for optional slurmrestd package.
 -- Fix FreeBSD build issues.
 -- Fixed sbatch not processing --ignore-pbs in batch script.
 -- Don't clear the qos_id of an invalid QOS.
 -- Allow a job that was once FAIL_[QOS|ACCOUNT] to be eligible again if the
    qos|account limitation is remedied.
 -- Fix core reservations using the FLEX flag to allow use of resources outside
    of the reservation allocation.
 -- Fix MPS without File with 1 GPU, and without GPUs.
 -- Add FreeBSD support to proctrack/pgid plugin.
 -- Fix remote dependency testing for meta job in job array.
 -- Fix preemption when dealing with a job array.
 -- Don't send remote non-pending singleton dependencies on federation update.
 -- slurmrestd - fix crash on empty query.
 -- Fix race condition which could lead to invalid references in backfill.
 -- Fix edge case in _remove_job_hash().
 -- Fix exit code when using --cluster/-M client options.
 -- Fix compilation issues in GCC10.
 -- Fix invalid references when federated job is revoked while in backfill loop.
 -- Fix distributing job steps across idle nodes within a job.
 -- Fix detected floating reservation overlapping.
 -- Break infinite loop in cons_tres dealing with incorrect tasks per tres
    request resulting in slurmctld hang.
 -- Send the current (not the previous) reason for a pending job to client
    commands like squeue/scontrol.
 -- Fix incorrect lock levels for select_g_reconfigure().
 -- Handle hidden nodes correctly in slurmrestd.
 -- Allow sacctmgr to use MaxSubmitP[U|A] as format options.
 -- Fix segfault when trying to delete a corrupted association.
 -- Fix setting ntasks-per-core when using --multithread.
 -- Only override job wait reason to priority if Reason=None or
    Reason=Resources.
 -- Perl API / seff - fix missing symbol issue with
    accounting_storage/slurmdbd.
 -- slurm.spec - add --with cray_shasta option.
 -- Downgrade "Node config differ.." error message if config_overrides enabled.
 -- Add client error when using --gpus-per-socket without --sockets-per-node.
 -- Fix nvml/rsmi debug statements making it to stderr.
 -- NodeSets - fix slurmctld segfault in newer glibc if any nodes have no
    defined features.
 -- ConfigLess - write out plugstack config to correct config file name in the
    config cache.
 -- priority/multifactor - gracefully handle NULL list of associations or array
    of siblings when calculating FairTree fairshare.
 -- Fix cons_tres --exclusive=user to allocate only requested number of CPUs.
 -- Add MySQL deadlock detection and automatic retry mechanism.
 -- Reject repeating floating reservations as they aren't supported.
 -- Fix testing of reservation flags that may be NO_VAL64.
 -- Fix _verify_node_state memory requested as --mem-per-gpu DefMemPerGPU.
 -- Fix DependencyNeverSatisfied not set as the job's state reason if
    kill_invalid_depend or --kill-on-invalid-dep are used.
 -- pam_slurm_adopt - explicitly call slurm_conf_init().
 -- configless - fix plugstack.conf handling for client commands.
 -- Set SLURM_JOB_USER and SLURM_JOB_UID in task_epilog correctly.
 -- slurmrestd - authenticate job submissions by SlurmUser properly.