>
> > use yum install slurm20, here they show Slurm 19 but it's the same for 20
>
> In that case you'll need to open a bug with Bright to get them to
> rebuild Slurm with nvml support.

They told me they don't officially support MPS or Slurm, and to come here for support (or pay SchedMD). The vicious cycle continues.

All I want is MPS enabled, as described at https://slurm.schedmd.com/gres.html#MPS_config_example_2:

"CUDA Multi-Process Service (MPS) provides a mechanism where GPUs can be shared by multiple jobs, where each job is allocated some percentage of the GPU's resources. The total count of MPS resources available on a node should be configured in the slurm.conf file (e.g. 'NodeName=tux[1-16] Gres=gpu:2,mps:200'). Several options are available for configuring MPS in the gres.conf file as listed below with examples following that:

No MPS configuration: The count of gres/mps elements defined in the slurm.conf will be evenly distributed across all GPUs configured on the node. For the example, 'NodeName=tux[1-16] Gres=gpu:2,mps:200' will configure a count of 100 gres/mps resources on each of the two GPUs."

Do I even need to edit gres.conf? Can I just leave out AutoDetect=nvml?
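
For reference, this is roughly what I understand the "No MPS configuration" case to look like if AutoDetect is left out. It is an untested sketch: the node names and counts are copied from the documentation example, while the GresTypes line and the hand-written File= entry for the GPUs are my own assumptions about what a build without NVML support would need.

```conf
# slurm.conf (node names and counts taken from the documentation example)
GresTypes=gpu,mps                        # assumption: both GRES names must be listed
NodeName=tux[1-16] Gres=gpu:2,mps:200

# gres.conf (assumption: GPUs listed by hand, no AutoDetect=nvml line)
Name=gpu File=/dev/nvidia[0-1]
# No Name=mps line here: per the quoted docs, the 200 gres/mps from
# slurm.conf would be split evenly, i.e. 100 on each of the two GPUs.
```

If that reading is right, the only MPS-specific setting would be the mps:200 count in slurm.conf.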