Hi,
we're testing possible Slurm configurations on a test system right now. 
Eventually, it is going to serve ~1000 users.

We're going to have some users who run lots of short jobs (a couple of 
minutes to ~4h) and some users whose jobs run for days or weeks. I want to 
avoid a situation in which a group of users basically saturates the whole 
cluster with jobs that run for a week or two, so that nobody can run short 
jobs anymore. I would also like to favor short jobs, because they make the 
whole cluster feel more dynamic and agile for everybody.

On the other hand, I would like to make the most of the resources, i.e. when 
nobody is submitting short jobs, long jobs could run on all the nodes.

My idea was to basically have three partitions:

1. PartitionName=short MaxTime=04:00:00 State=UP Nodes=node[01-99] PriorityTier=100
2. PartitionName=long_safe MaxTime=14-00:00:00 State=UP Nodes=node[01-50] PriorityTier=100
3. PartitionName=long_preempt MaxTime=14-00:00:00 State=UP Nodes=node[01-99] PriorityTier=40 PreemptMode=requeue

and then use the JobSubmitPlugin "all_partitions" so that all jobs get 
submitted to all partitions by default. This way, a short job ends up in the 
`short` partition and is able to use all nodes. A long job ends up in the 
`long_safe` partition until the first 50 nodes are full. These jobs are not 
going to be preempted. The remaining long jobs use the `long_preempt` 
partition, so they run on the remaining nodes as long as there are no 
higher-priority short (or long) jobs in the queue.

So, the cluster could be saturated with long running jobs but if short jobs are 
submitted and the user has a high enough fair share, some of the long jobs 
would get preempted and the short ones would run.

This scenario works fine... BUT the long jobs seem to be playing ping-pong on 
the `long_preempt` partition: as soon as they run, they stop accruing AGE 
priority, unlike jobs that are still queued. As soon as a queued job, even one 
from the same user, "overtakes" a running one, it preempts the running one, 
then stops accruing age itself, and so on...
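
To make the ping-pong concrete, here is a toy model in Python. It is only a 
sketch of the behaviour described above, not Slurm's actual scheduler: time 
advances in abstract ticks, there is a single preemptible node, a requeued job 
restarts from scratch, and a pending job preempts the running one as soon as 
its age priority is strictly higher. The names `Job`, `simulate` and the 
`age_while_running` switch are all hypothetical; the switch is not a Slurm 
option and exists only to show that freezing the age of running jobs is what 
drives the loop.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    runtime: int       # ticks of uninterrupted run time the job needs
    age: int = 0       # age priority; by default grows only while pending
    progress: int = 0  # ticks run since the last (re)start


def simulate(jobs, ticks, age_while_running=False):
    """Toy model of one preemptible node. Returns (preemptions, finished names)."""
    pending = list(jobs)
    running = None
    finished = []
    preemptions = 0
    for _ in range(ticks):
        if pending:
            top = max(pending, key=lambda j: j.age)
            if running is None:
                # Node is idle: start the highest-priority pending job.
                pending.remove(top)
                running = top
            elif top.age > running.age:
                # A queued job overtook the running one: requeue and swap.
                running.progress = 0  # requeue means starting over
                pending.append(running)
                pending.remove(top)
                running = top
                preemptions += 1
        # Advance time by one tick.
        for job in pending:
            job.age += 1
        if running is not None:
            if age_while_running:
                running.age += 1  # hypothetical: age keeps growing while running
            running.progress += 1
            if running.progress >= running.runtime:
                finished.append(running.name)
                running = None
    return preemptions, finished


if __name__ == "__main__":
    # Age frozen while running: the two jobs keep preempting each other.
    print(simulate([Job("A", 5), Job("B", 5)], ticks=20))
    # → (10, [])
    # Hypothetical comparison: age keeps growing while running.
    print(simulate([Job("A", 5), Job("B", 5)], ticks=20, age_while_running=True))
    # → (0, ['A', 'B'])
```

In this model, two identical 5-tick jobs never finish within 20 ticks when the 
running job's age is frozen, because each one is overtaken two ticks after it 
starts. With the hypothetical switch turned on, neither job is ever overtaken 
and both complete back to back.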

So, is there maybe a cleverer way to do this?

Thanks a lot!
Thomas

