So if a GPU job is submitted to a partition containing only GPU nodes, and a 
non-GPU job is submitted to a partition containing at least some nodes without 
GPUs, both jobs should be able to run. Priorities should be evaluated on a 
per-partition basis. I can 100% guarantee that in our HPC, pending GPU jobs 
don't block non-GPU jobs, and vice versa.
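If you want to verify that on your own cluster, one quick check (standard
squeue options; the format string is just an example) is to list pending jobs
with their partition and pending reason:

    # Show pending jobs: job ID, partition, user, state, and the reason
    # the scheduler gives for holding them (Priority, Resources, etc.).
    squeue --state=PENDING --format="%.10i %.12P %.8u %.8T %.20r"

A GPU job pending with reason Resources in one partition shouldn't show up as
the blocker for jobs pending in a different partition.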

I could see a problem if the GPU job was submitted to a partition containing 
both types of nodes: if that job was assigned the highest priority for whatever 
reason (fair share, age, etc.), other jobs in the same partition would have to 
wait until that job started.

A simple solution would be to make a GPU partition containing only GPU nodes, 
and a non-GPU partition containing only non-GPU nodes. Submit GPU jobs to the 
GPU partition, and non-GPU jobs to the non-GPU partition.
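As a rough sketch, the slurm.conf for that layout could look like the
following (node names, CPU counts, and partition names are hypothetical,
loosely mirroring the 2 nodes with 8 GPUs you mention below):

    # Hypothetical hardware: gpu[01-02] each have 8 GPUs, cpu[01-10] have none.
    NodeName=gpu[01-02] CPUs=32 Gres=gpu:8 State=UNKNOWN
    NodeName=cpu[01-10] CPUs=32 State=UNKNOWN

    # One partition per node type, so pending jobs in one can't block the other.
    PartitionName=gpu Nodes=gpu[01-02] Default=NO  MaxTime=INFINITE State=UP
    PartitionName=cpu Nodes=cpu[01-10] Default=YES MaxTime=INFINITE State=UP

Users would then submit GPU work with something like
"sbatch -p gpu --gres=gpu:1 job.sh" and CPU-only work with
"sbatch -p cpu job.sh".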

Once that works, you could make a partition that includes both types of nodes
to reduce idle resources, but jobs submitted to that partition would have to
(a) not require a GPU, and (b) use only a limited number of CPUs per node, so
that some CPUs remain available for GPU jobs on the nodes containing GPUs.
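In slurm.conf terms, the usual knob for (b) is MaxCPUsPerNode on the combined
partition. Continuing the hypothetical layout from above:

    # Combined partition spanning both node types. MaxCPUsPerNode limits the
    # CPUs that jobs from this partition may occupy on any single node, so on
    # gpu[01-02] the remaining 8 CPUs stay free for jobs in the gpu partition.
    PartitionName=batch Nodes=gpu[01-02],cpu[01-10] MaxCPUsPerNode=24 State=UP

The right cap depends on how many CPUs your GPU jobs typically need per node;
24 of 32 here is just an example.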

________________________________
From: slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of navin 
srivastava <navin.alt...@gmail.com>
Sent: Saturday, June 13, 2020 10:47 AM
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: Re: [slurm-users] ignore gpu resources to scheduled the cpu based jobs


Yes, we have separate partitions. Some are GPU-specific, with 2 nodes of 8
GPUs each; other partitions are a mix of both, with nodes that have 2 GPUs
and a few nodes without any GPU.

Regards
Navin


On Sat, Jun 13, 2020, 21:11 navin srivastava <navin.alt...@gmail.com> wrote:
Thanks Renfro.

Yes, we have both types of nodes, with and without GPUs. Some users' jobs
require a GPU, and some applications use only CPUs.

The issue happens when a high-priority job is waiting for GPU resources that
are not available, while a lower-priority job that needs only CPUs keeps
waiting even though enough CPUs are free.

When I hold the GPU jobs, the CPU jobs go through.

Regards
Navin

On Sat, Jun 13, 2020, 20:37 Renfro, Michael <ren...@tntech.edu> wrote:
Will probably need more information to find a solution.

To start, do you have separate partitions for GPU and non-GPU jobs? Do you have 
nodes without GPUs?

On Jun 13, 2020, at 12:28 AM, navin srivastava <navin.alt...@gmail.com> wrote:

Hi All,

In our environment we have GPUs. What I found is that when a high-priority
user's job is queued and waiting for GPU resources, which are almost full and
not available, another user's job that does not require GPUs also stays
queued, even though plenty of CPU resources are available.

Our scheduling mechanism is FIFO with Fair Tree enabled. Is there any way we
can make some changes so that the CPU-based jobs go through and the GPU-based
jobs wait until the GPU resources are free?

Regards
Navin.



