Diego Zuccato <diego.zucc...@unibo.it> writes:

> On 13/06/20 17:47, navin srivastava wrote:
>> Yes, we have separate partitions. Some are GPU-specific, with 2 nodes
>> with 8 GPUs, and other partitions are a mix: nodes with 2 GPUs and
>> a very few nodes without any GPU.
> Maybe it's already known and obvious, but... Remember that a node can be
> allocated to only one partition.

Maybe I am misunderstanding you, but I think that this is not the case.
A node can be in multiple partitions.  We have nodes belonging to
individual research groups which are in both a separate partition just
for the group and in a 'scavenger' partition for everyone (but with
lower priority and a lower maximum run-time).
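
As a rough sketch of such a setup, the relevant slurm.conf lines might
look something like the following.  The node names, partition names,
group name and limits are all made up for illustration and are not our
actual configuration:

```
# Hypothetical nodes owned by one research group
NodeName=grp[01-04] CPUs=32 State=UNKNOWN

# Partition reserved for the owning group (hypothetical Unix group "agsmith")
PartitionName=agsmith Nodes=grp[01-04] AllowGroups=agsmith PriorityTier=10 MaxTime=14-00:00:00

# The same nodes again, open to everyone, with lower priority and shorter run-time
PartitionName=scavenger Nodes=grp[01-04] AllowGroups=ALL PriorityTier=1 MaxTime=1-00:00:00
```

The point is only that the same node list appears in the Nodes= entry
of more than one PartitionName line.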

> So, if you have the mixed nodes in both
> partitions and there's a GPU job running, a non-gpu job will find that
> node marked as busy because it's allocated to another partition.
> That's why we're drastically reducing the number of partitions we have
> and will avoid shared nodes.

Again, I don't think this is the explanation.  If a job is running on a GPU node,
but not using all the CPUs, then a CPU-only job should be able to start
on that node, unless some form of exclusivity has been set up, such as
ExclusiveUser=YES for the partition.
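
For illustration, here is a minimal sketch of the settings I would
look at first.  The node and partition names and the hardware figures
are hypothetical, and this is an assumption about what a sharing-friendly
configuration could look like, not a diagnosis of the cluster in
question:

```
# Allocate individual cores and memory rather than whole nodes.
# With SelectType=select/linear, jobs always get entire nodes.
SelectType=select/cons_tres
SelectTypeParameters=CR_Core_Memory
GresTypes=gpu

# Hypothetical mixed node with 2 GPUs
NodeName=gpu01 CPUs=32 RealMemory=192000 Gres=gpu:2 State=UNKNOWN

# No exclusivity on the partition, so a CPU-only job can use idle cores
# next to a running GPU job.  OverSubscribe=EXCLUSIVE or
# ExclusiveUser=YES here would prevent that.
PartitionName=mixed Nodes=gpu01 OverSubscribe=NO ExclusiveUser=NO
```

Jobs submitted with --exclusive would of course still claim the whole
node regardless of these settings.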

Without seeing the full slurm.conf, it is difficult to guess what the
problem might be.



Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin         Email loris.benn...@fu-berlin.de
