Hi Benson, As you can perhaps see from our slurm.conf, we have task affinity and similar options switched off. Along the same lines, I also removed the core binding of the GPUs. That is why I am quite surprised that Slurm doesn’t allow new jobs in. I am aware of the PCIe bandwidth implications of a GPU running on the wrong socket, and of the interference between pinned and unpinned jobs. But at this point, these appear to be higher-order optimizations.
Best, Peter