I'm trying to spoof a GPU on a CentOS 7.7 virtual machine that is a Slurm node. I just want Slurm to see that this node has a GPU. I'm not going to execute any code that uses a GPU.
I created a character device with:

    mknod nvidia0 c 1 1

Here's what it looks like:

    [root@liqidos-dean-node1 dev]# ls -l nvidia0
    crw-------. 1 root root 1, 1 Jan 22 15:43 nvidia0

Here's my gres.conf:

    Name=gpu Type=gp100 File=/dev/nvidia0 Cores=0,1

The relevant lines from my slurm.conf are (my full slurm.conf is below):

    ...
    GresTypes=gpu
    ...
    SelectType=select/cons_tres
    ....
    NodeName=liqidos-dean-node1 Gres=gpu:gp100:1 CPUs=2 RealMemory=3770 Sockets=2 CoresPerSocket=1 ThreadsPerCore=1 State=UNKNOWN

After restarting slurmd, Slurm doesn't recognize my spoofed GPU:

    [liqid@liqidos-dean-node1 ~]$ slurmd -C
    NodeName=liqidos-dean-node1 CPUs=2 Boards=1 SocketsPerBoard=2 CoresPerSocket=1 ThreadsPerCore=1 RealMemory=3770
    UpTime=0-06:47:11

    [liqid@liqidos-dean-node1 ~]$ scontrol show node
    NodeName=liqidos-dean-node1 Arch=x86_64 CoresPerSocket=1
       CPUAlloc=0 CPUTot=2 CPULoad=0.01
       AvailableFeatures=(null)
       ActiveFeatures=(null)
       Gres=(null)
       NodeAddr=liqidos-dean-node1 NodeHostName=liqidos-dean-node1 Version=19.05.4
       OS=Linux 3.10.0-1062.el7.x86_64 #1 SMP Wed Aug 7 18:08:02 UTC 2019
       RealMemory=3770 AllocMem=0 FreeMem=177 Sockets=2 Boards=1
       State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
       Partitions=debug
       BootTime=2020-01-22T09:12:57 SlurmdStartTime=2020-01-22T15:55:16
       CfgTRES=cpu=2,mem=3770M,billing=2
       AllocTRES=
       CapWatts=n/a
       CurrentWatts=0 AveWatts=0
       ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

Have I missed something, or is Slurm smart enough to recognize that I don't have a real GPU? Thanks.
Full slurm.conf:

    SlurmctldHost=slurmctld-dean
    GresTypes=gpu
    MpiDefault=none
    PluginDir=/usr/local/lib/slurm
    ProctrackType=proctrack/cgroup
    ReturnToService=1
    SlurmctldPidFile=/var/run/slurmctld.pid
    SlurmctldPort=6817
    SlurmdPidFile=/var/run/slurmd.pid
    SlurmdPort=6818
    SlurmdSpoolDir=/var/spool/slurmd
    SlurmUser=slurm
    StateSaveLocation=/var/spool/slurm/state
    SwitchType=switch/none
    TaskPlugin=task/affinity
    TaskPluginParam=Sched
    InactiveLimit=0
    KillWait=30
    MinJobAge=300
    SlurmctldTimeout=120
    SlurmdTimeout=300
    Waittime=0
    SchedulerType=sched/backfill
    SelectType=select/cons_tres
    SelectTypeParameters=CR_Core
    AccountingStorageType=accounting_storage/slurmdbd
    AccountingStoreJobComment=YES
    ClusterName=cluster
    JobCompType=jobcomp/none
    JobAcctGatherFrequency=30
    JobAcctGatherType=jobacct_gather/none
    SlurmctldDebug=info
    SlurmctldLogFile=/var/log/slurm/slurmctld.log
    SlurmdDebug=info
    SlurmdLogFile=/var/log/slurm/slurmd.log
    NodeName=liqidos-dean-node1 Gres=gpu:gp100:1 CPUs=2 RealMemory=3770 Sockets=2 CoresPerSocket=1 ThreadsPerCore=1 State=UNKNOWN
    PartitionName=debug Nodes=liqidos-dean-node1 Default=YES MaxTime=INFINITE State=UP
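
For reference, a quick sanity check of the device file named in the gres.conf line above might look like the sketch below. It only confirms that the File= path exists and is a character device; it says nothing about whether slurmd will accept it. The helper names is_char_device and gres_file are made up for illustration, and the parsing assumes a single File= token per line with no bracketed ranges.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given path is a character device.
is_char_device() {
    [ -c "$1" ]
}

# Hypothetical helper: print the File= value from one gres.conf-style line.
# Assumes space-separated Key=Value tokens and no ranges such as nvidia[0-3].
gres_file() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^File=//p'
}

# The gres.conf line from the post above.
line='Name=gpu Type=gp100 File=/dev/nvidia0 Cores=0,1'
dev=$(gres_file "$line")

if is_char_device "$dev"; then
    echo "$dev: character device"
else
    echo "$dev: missing or not a character device"
fi
```

Running this on the node as the user that slurmd runs as would at least rule out a wrong path or a missing device node before digging into the slurmd log.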