Hello Rich:

You could create partitions "bulk_a", "bulk_b", and "bulk_c" (the names are 
arbitrary) that map onto those three groups of nodes, with the intended 
resource limits set at the partition level.  Then have job_submit.lua cause 
all jobs submitted to "bulk" (or only the subset requesting a specific shared 
resource, or any subset you like that job_submit.lua can detect) to also be 
submitted to the intended one or more of bulk_[abc].  I can imagine this 
meeting your need, but I am not certain it does.
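
As a rough, untested sketch of that idea: the partition names, node ranges, and 
QOS names below are illustrative assumptions, not taken from a real 
configuration. It assumes the per-group job caps are set through a partition 
QOS (GrpJobs), which in turn assumes accounting with limit enforcement is 
enabled. It also replaces "bulk" rather than appending to it, so that the 
per-group caps cannot be bypassed; that choice is an assumption as well.

```lua
-- job_submit.lua (sketch). Assumed slurm.conf entries, names illustrative:
--   PartitionName=bulk_a Nodes=node[1-4] QOS=bulk_a
--   PartitionName=bulk_b Nodes=node5     QOS=bulk_b
--   PartitionName=bulk_c Nodes=node[6-9] QOS=bulk_c
-- with the job cap set on each partition QOS, for example:
--   sacctmgr modify qos bulk_a set GrpJobs=4

-- Slurm injects the global `slurm` table; this fallback only lets the
-- file load in a plain Lua interpreter for testing.
local slurm = slurm or { SUCCESS = 0, log_info = function(...) end }

local GENERIC = "bulk"                  -- partition users submit to
local TARGETS = "bulk_a,bulk_b,bulk_c"  -- hypothetical per-group partitions

function slurm_job_submit(job_desc, part_list, submit_uid)
    -- Only jobs that explicitly name the generic partition are rewritten;
    -- jobs with no partition set (nil) are left to the default partition.
    if job_desc.partition == GENERIC then
        job_desc.partition = TARGETS
        slurm.log_info("job_submit: uid %d: %s -> %s",
                       submit_uid, GENERIC, TARGETS)
    end
    return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    -- Modifications pass through unchanged in this sketch.
    return slurm.SUCCESS
end
```

With a comma-separated partition list, the job should start in whichever 
listed partition can run it first, so it would pend only while all three 
groups are at their caps.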

Node features requested by jobs might help too: job_submit.lua could key off 
of them, or add them to jobs itself.
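
A minimal sketch of keying off a requested feature, again untested: the 
feature names grp_a/grp_b/grp_c and their mapping to partitions are 
hypothetical, and only a single plain feature name is handled (no "&" or "|" 
expressions).

```lua
-- Slurm injects the global `slurm` table; this fallback only lets the
-- file load in a plain Lua interpreter for testing.
local slurm = slurm or { SUCCESS = 0 }

-- Hypothetical mapping from a requested node feature to a partition.
local FEATURE_TO_PARTITION = {
    grp_a = "bulk_a",
    grp_b = "bulk_b",
    grp_c = "bulk_c",
}

function slurm_job_submit(job_desc, part_list, submit_uid)
    -- job_desc.features holds the job's constraint string, or nil.
    local wanted = job_desc.features
    if wanted ~= nil and FEATURE_TO_PARTITION[wanted] ~= nil then
        job_desc.partition = FEATURE_TO_PARTITION[wanted]
    end
    return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    -- Modifications pass through unchanged in this sketch.
    return slurm.SUCCESS
end
```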

--
Paul Brunk, system administrator
Georgia Advanced Resource Computing Center
Enterprise IT Svcs, the University of Georgia


On 2/1/22, 5:45 AM, "slurm-users" <slurm-users-boun...@lists.schedmd.com> wrote:

Hi,


I am wondering if this is possible with Slurm. I have an application where I 
want to create groups of nodes (group size would be between 1 and n servers) 
which have exclusive access to a shared resource, and then allow a 
configurable number of jobs to run on each group of nodes.

For example I could have:

partition: bulk, containing:

group 1, max 4 jobs:
  - node1
  - node2
  - node3
  - node4

group 2, max 2 jobs:
  - node5

group 3, max 1 job:
  - node6
  - node7
  - node8
  - node9



Ideally, the user would submit a job to a generic queue, I would set a 
configurable gres/license for them in the background, and the job would be 
placed in a free group, or would pend if it requires the exclusive resource.

I've taken a look at:
1. Using the job submit lua plugin to look at the groups and if a group has 
available resources set a gres so the job is correctly placed.

2. Licenses, but I can't see how to limit a license to a group of hosts without 
creating clusters. Can you limit licenses to specific nodes?

3. On the scheduler, a script that builds the node configuration, updates the 
node gres, and issues an 'scontrol reconfigure'.




Option 3 works, but isn't great.


So I would really like to be able to use a plugin to look at the current 
allocation and set a gres/license/partition for the user in the background. 
Is it possible for the job_submit lua plugin to access an external resource or 
the license part of Slurm? I could use that.

Or am I missing something, or doing something very wrong?


Thanks in advance for any assistance; it's much appreciated.




Rich Cardwell
Snr IT Engineer
ri...@graphcore.ai

www.graphcore.ai <http://www.graphcore.ai>









