In the gres.conf on one of my nodes I have just the line

    Autodetect=nvml

as in the last example in https://slurm.schedmd.com/gres.conf.html. In the slurm.conf on all nodes I have this line for the node that uses Autodetect=nvml:

    NodeName=slurmnode1 CPUs=16 Boards=1 SocketsPerBoard=1 CoresPerSocket=8 ThreadsPerCore=2 RealMemory=47671 Gres=gpu:gp100:4

I included Gres=gpu:gp100:4 because that node can have up to 4 GPUs dynamically assigned. Without it I can't run any job that requires a GPU, even if I dynamically assign GPUs on that node. Apparently Autodetect=nvml alone isn't enough to let the controller know that there are GPUs available on that node.

With this configuration I get this message every second in my slurmctld.log file:

    error: _slurm_rpc_node_registration node=slurmnode1: Invalid argument

I've restarted both slurmd and slurmctld and still get the error. The node also stays in the drain state no matter what I do with it. Apparently Slurm doesn't like this configuration.

What is the right way to configure a node with Autodetect=nvml?
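
For what it's worth, the NodeName line itself looks internally consistent to me. Below is a minimal Python sketch of that check. It assumes that CPUs should equal Boards x SocketsPerBoard x CoresPerSocket x ThreadsPerCore and that the Gres value has the three-part name:type:count form. The helper names are hypothetical, and the script does not query Slurm or NVML, so it says nothing about what slurmd actually detects on the node.

```python
import math

# The node definition exactly as it appears in my slurm.conf.
NODE_LINE = (
    "NodeName=slurmnode1 CPUs=16 Boards=1 SocketsPerBoard=1 "
    "CoresPerSocket=8 ThreadsPerCore=2 RealMemory=47671 Gres=gpu:gp100:4"
)


def parse_node_line(line):
    """Split a whitespace-separated Key=Value line into a dict (hypothetical helper)."""
    return dict(token.split("=", 1) for token in line.split())


def parse_gres(gres):
    """Parse a single 'name:type:count' entry; other Gres forms are not handled."""
    name, gres_type, count = gres.split(":")
    return {"name": name, "type": gres_type, "count": int(count)}


def expected_cpus(node):
    """Assumption: the CPU count is the product of the topology fields."""
    keys = ("Boards", "SocketsPerBoard", "CoresPerSocket", "ThreadsPerCore")
    return math.prod(int(node[k]) for k in keys)


node = parse_node_line(NODE_LINE)
gres = parse_gres(node["Gres"])

print("configured CPUs:", node["CPUs"], "topology product:", expected_cpus(node))
print("configured Gres:", gres)
```

By this check the configured CPUs value matches the topology product and the Gres entry parses as 4 GPUs of type gp100, so any mismatch would have to be between these configured values and what the node reports at registration.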