On 9 October 2017 at 22:06, cyberseawolf . <ebre...@gmail.com> wrote:
> Hello everybody,
> I'm a young system administrator who is moving from Torque/MAUI to Slurm.
> I set up a rather peculiar resource management scheme in the previous queue
> system and I would like to port it to the new one.
>
> - I have the following two partitions, which are totally independent of
>   each other (like having two separate queues):
>
>     Part A --> has 24 cores per node at higher speed, 16 nodes in total;
>
>     Part B --> has 4 cores per node at lower speed, 11 nodes in total.
>
> - There are two kinds of accounts (I hope that this is the right word...):
>
>     Acc A --> every user can request up to 24 cores/6 nodes (i.e. 144 total
>     CPUs) for all his/her jobs belonging to Part A, and up to 4 cores/11
>     nodes (i.e. 44 total CPUs) for all his/her jobs belonging to Part B;
>     all jobs have very low priority;
>
>     Acc B --> each user can request up to 12 cores/1 node (i.e. 12 total
>     CPUs) per job in Part A, and up to 4 cores/3 nodes (i.e. 12 total CPUs)
>     per job in Part B; all jobs have high priority; only 10 jobs across all
>     users can run at the same time in Part A; only 12 jobs can be queued
>     per user in Part A; no such limits in Part B.
>
> - There is no time limit for any job.
>
> - I did not use any database to track cluster usage in the past. If
>   needed, I would like to use a very simple one since I have no experience
>   with it.
>
> - The purpose of this setup is to give more resources to users of Acc A,
>   since they make heavy use of the cluster. That said, all jobs of Acc B
>   must be executed as soon as resources are available, since they are
>   much quicker.
>
> Could you please suggest which keywords I should use in the slurm.conf
> file? And what about the manual, are there any pages I have to check in
> order to make this setup work?
> I would like to use the latest version of Slurm to get rid of all bugs and
> take advantage of the new features that could help me.
>
> Thank you very much for your kindness,

Welcome to SLURM, Emanuele.

You will need to track cluster usage for the prioritisation. Luckily, it's
essentially built into Slurm. Make sure that the head or master node (any
node really, but I use master) has the slurm-slurmdbd package installed.
This will do your accounting for you, coupled with a db (MySQL works well).

https://slurm.schedmd.com/accounting.html

The Resource Limits page is handy for working out priorities, QoS and how
they intersect with Partitions, Accounts, etc.:

https://slurm.schedmd.com/resource_limits.html

Keywords:

SelectType=select/cons_res
SelectTypeParameters=CR_CPU
PriorityFlags=FAIR_TREE
PriorityType=priority/multifactor
PriorityDecayHalfLife=10
AccountingStorageEnforce=qos,limits,safe

It's relatively complex, as you would hope/imagine/fear. But I've found that
by tackling one thing at a time, you can get a working system relatively
quickly. As always, working out the syntax is the hardest part.

Cheers
L.

------
"The antidote to apocalypticism is *apocalyptic civics*. Apocalyptic civics
is the insistence that we cannot ignore the truth, nor should we panic about
it. It is a shared consciousness that our institutions have failed and our
ecosystem is collapsing, yet we are still here — and we are creative agents
who can shape our destinies. Apocalyptic civics is the conviction that the
only way out is through, and the only way through is together. "

*Greg Bloom* @greggish
https://twitter.com/greggish/status/873177525903609857
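
For readers following along, here is a rough sketch of how the keywords listed in the reply might sit together in slurm.conf for the two partitions described in the question. It is an illustration under assumptions, not a tested configuration. The node names (fast01..fast16, slow01..slow11) and partition names (partA, partB) are hypothetical. The AccountingStorageType line and the PriorityWeightQOS value do not appear in the reply; they are assumptions added because limits are normally enforced through slurmdbd, and a QoS priority only affects job ordering when its weight is non-zero.

```conf
# slurm.conf fragment (sketch only; names marked "hypothetical" are not from the thread)

# Accounting via slurmdbd (assumption: not listed in the reply)
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageEnforce=qos,limits,safe

# Schedule individual CPUs rather than whole nodes
SelectType=select/cons_res
SelectTypeParameters=CR_CPU

# Multifactor priority with fair-tree fairshare
PriorityType=priority/multifactor
PriorityFlags=FAIR_TREE
PriorityDecayHalfLife=10      # a bare number is read as minutes
PriorityWeightQOS=10000       # hypothetical weight so QoS priority counts

# Nodes and partitions (hypothetical names; core counts from the question)
NodeName=fast[01-16] CPUs=24 State=UNKNOWN
NodeName=slow[01-11] CPUs=4  State=UNKNOWN
PartitionName=partA Nodes=fast[01-16] MaxTime=INFINITE State=UP
PartitionName=partB Nodes=slow[01-11] MaxTime=INFINITE State=UP
```

The per-user and per-job limits from the question (CPU counts, running and queued job caps) are not set in slurm.conf. They are stored in the accounting database as QoS and association limits and managed with sacctmgr, which is what the Resource Limits page linked above describes.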