Diego Zuccato <diego.zucc...@unibo.it> writes:

> On 13/06/20 17:47, navin srivastava wrote:
>
>> Yes we have separate partitions. Some are specific to gpu having 2 nodes
>> with 8 gpu and another partitions are mix of both, nodes with 2 gpu and
>> very few nodes are without any gpu.
>
> Maybe it's already known and obvious, but... Remember that a node can be
> allocated to only one partition.

Maybe I am misunderstanding you, but I think that this is not the case.
A node can be in multiple partitions.  We have nodes belonging to
individual research groups which are in both a separate partition just
for the group and in a 'scavenger' partition for everyone (but with
lower priority and a maximum run-time).
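For what it's worth, a minimal slurm.conf sketch of that kind of setup
(node and partition names invented for illustration) could look like
this, with the same node listed in both partitions and the scavenger
partition given a lower priority and a shorter time limit:

  # hypothetical GPU node shared by two partitions
  NodeName=gpu001 CPUs=32 RealMemory=192000 Gres=gpu:2 State=UNKNOWN

  # group partition: higher priority, longer run-time
  PartitionName=group_a   Nodes=gpu001 PriorityTier=10 MaxTime=14-00:00:00
  # scavenger partition: open to everyone, lower priority, shorter run-time
  PartitionName=scavenger Nodes=gpu001 PriorityTier=1  MaxTime=1-00:00:00 AllowGroups=ALL
  # (ExclusiveUser=YES on a partition would stop different users sharing
  # the node - the kind of exclusivity mentioned below)

Combined with something like PreemptType=preempt/partition_prio (and a
suitable PreemptMode), jobs in the group partition can then preempt
scavenger jobs on the shared node.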
> So, if you have the mixed nodes in both partitions and there's a GPU
> job running, a non-gpu job will find that node marked as busy because
> it's allocated to another partition.  That's why we're drastically
> reducing the number of partitions we have and will avoid shared nodes.

Again, I don't think this is the explanation.  If a job is running on a
GPU node but not using all the CPUs, then a CPU-only job should be able
to start on that node, unless some form of exclusivity has been set up,
such as ExclusiveUser=YES for the partition.

Without seeing the full slurm.conf, it is difficult to guess what the
problem might be.

Cheers,

Loris

-- 
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin         Email loris.benn...@fu-berlin.de