Hi,

We have 40 identical nodes (AMD, 128 cores each) that were purchased by different groups at our lab, and each group would of course like immediate access to what it has paid for. The stakeholder groups are also fine with letting the general public use their hosts/cores, provided stakeholder jobs can preempt the general public's jobs. One way I can see to do this is to assign specific nodes to each stakeholder group, with each group defined as a partition, something like this:

PartitionName=shared Default=yes Priority=10 MaxTime=5-00:00:00 DefaultTime=30 PreemptMode=CANCEL State=UP Nodes=amd[0001-0040]
PartitionName=exp1 Default=no Priority=50 MaxTime=5-00:00:00 DefaultTime=1-00:00:00 PreemptMode=OFF State=UP Nodes=amd[0001-0003]
PartitionName=exp2 Default=no Priority=50 MaxTime=5-00:00:00 DefaultTime=1-00:00:00 PreemptMode=OFF State=UP Nodes=amd[0004-0019]
PartitionName=exp3 Default=no Priority=50 MaxTime=5-00:00:00 DefaultTime=1-00:00:00 PreemptMode=OFF State=UP Nodes=amd[0020-0040]

Is this the most efficient use of resources? In the above scenario, if scavenger jobs (jobs submitted to the shared partition) are running on a given experiment's hosts and the experiment needs to run jobs, the scavenger jobs get preempted even if there are idle hosts in the other stakeholder partitions.

Is there a way to guarantee, say, exp1 priority on 384 cores (3 nodes x 128 cores) without necessarily tying them to 3 specific hosts?

Thanks,
Renata