Ok, then regular license accounting won’t work.

Somewhat tested, but should work or at least be a starting point. Given a job 
number JOBID that’s already running with this license on one or more nodes:

  sbatch -w $(scontrol show job JOBID | grep ' NodeList=' | cut -d= -f2) -N 1

should start a one-node job on an available node being used by JOBID. Add other 
parameters as required for cpus-per-task, time limits, or whatever else is 
needed. If you start the larger jobs first, and let the later jobs fill in on 
idle CPUs on those nodes, it should work.
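
As a rough sketch of the same idea (only lightly thought through; the job ID,
script name, and resource limits below are placeholders):

  JOBID=12345
  # Nodes currently held by the licensed job, e.g. "node[1-4]"
  NODELIST=$(scontrol show job "$JOBID" | grep ' NodeList=' | cut -d= -f2)
  # Expand the compressed list and pick one of those nodes to target
  NODE=$(scontrol show hostnames "$NODELIST" | head -n 1)
  # Start a one-node job on that node; adjust cpus, time, and script as needed
  sbatch -w "$NODE" -N 1 --cpus-per-task=4 --time=02:00:00 job.sh

Targeting a single node keeps the -N 1 request consistent with the -w list; if
that node has no idle CPUs, the job simply stays pending until cores free up.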

> On May 6, 2020, at 9:46 AM, navin srivastava <navin.alt...@gmail.com> wrote:
> 
> To explain with more details.
> 
> Jobs are submitted based on cores and can land on any random nodes, but are 
> limited to 4 nodes only. (The license has some intelligence: it counts the 
> nodes, and once 4 nodes are in use it will not allow any more.) It does not 
> depend on the number of cores available on the nodes.
> 
> Case 1: 4 jobs are running with 4 cores each on 4 nodes [node1, node2, node3 
> and node4]. If a fifth job is then assigned by Slurm with 4 cores on any one 
> of node1, node2, node3 or node4, the license is allowed.
>  
> Case 2: 4 jobs are running with 4 cores each on 4 nodes [node1, node2, node3 
> and node4]. If a fifth job is then assigned by Slurm on node5 with 4 cores, 
> the license is not allowed [a "license not found" error comes up in this 
> case].
> 
> Regards
> Navin.
> 
> 
> On Wed, May 6, 2020 at 7:47 PM Renfro, Michael <ren...@tntech.edu> wrote:
> To make sure I’m reading this correctly, you have a software license that 
> lets you run jobs on up to 4 nodes at once, regardless of how many CPUs you 
> use? That is, you could run any one of the following sets of jobs:
> 
> - four 1-node jobs,
> - two 2-node jobs,
> - one 1-node and one 3-node job,
> - two 1-node and one 2-node jobs,
> - one 4-node job,
> 
> simultaneously? And the license isn’t node-locked to specific nodes by MAC 
> address or anything similar? But if you try to run jobs beyond what I’ve 
> listed above, you run out of licenses, and you want those later jobs to be 
> held until licenses are freed up?
> 
> If all of those questions have an answer of ‘yes’, I think you want the 
> remote license part of https://slurm.schedmd.com/licenses.html, something 
> like:
> 
>   sacctmgr add resource name=software_name count=4 percentallowed=100 
> server=flex_host servertype=flexlm type=license
> 
> and submit jobs with a '-L software_name:N' flag, where N is the number of 
> nodes you want to run on.
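> 
> For example (a sketch; this assumes the remote resource ends up visible to 
> the cluster under the plain name software_name, which you can confirm with 
> 'scontrol show licenses'):
> 
>   sbatch -L software_name:2 -N 2 job.sh
> 
> would request two of the four node-licenses for a two-node job.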
> 
> > On May 6, 2020, at 5:33 AM, navin srivastava <navin.alt...@gmail.com> wrote:
> > 
> > Thanks Micheal.
> > 
> > Actually, one application's license is based on nodes, and we have a 4-node 
> > license (not fixed nodes). We have several nodes, but once jobs land on any 
> > 4 random nodes, the application runs on those nodes only. After that it 
> > fails if a job goes to other nodes.
> > 
> > Can we define a custom variable, set it at the node level, and have users 
> > pass that variable at submission so that the job lands on those specific 
> > nodes?
> > I do not want to create a separate partition.
> > 
> > is there any way to achieve this by any other method?
> > 
> > Regards
> > Navin.
> > 
> > On Tue, May 5, 2020 at 7:46 PM Renfro, Michael <ren...@tntech.edu> wrote:
> > Haven’t done it yet myself, but it’s on my todo list.
> > 
> > But I’d assume that if you use the FlexLM or RLM parts of that 
> > documentation, Slurm would query the remote license server periodically and 
> > hold the job until the necessary licenses were available.
> > 
> > > On May 5, 2020, at 8:37 AM, navin srivastava <navin.alt...@gmail.com> 
> > > wrote:
> > > 
> > > Thanks Michael,
> > > 
> > > Yes, I have gone through it, but these are remote licenses and they will 
> > > be used outside the cluster as well, not only in Slurm.
> > > So basically I am interested to know how we can update the database 
> > > dynamically to get the exact value at that point in time,
> > > i.e. query the license server and update the database accordingly. Does 
> > > Slurm automatically update the value based on usage?
> > > 
> > > 
> > > Regards
> > > Navin.
> > > 
> > > 
> > > On Tue, May 5, 2020 at 7:00 PM Renfro, Michael <ren...@tntech.edu> wrote:
> > > Have you seen https://slurm.schedmd.com/licenses.html already? If the 
> > > software is just for use inside the cluster, one Licenses= line in 
> > > slurm.conf plus users submitting with the -L flag should suffice. You 
> > > should be able to set that license value to 4 if it’s licensed per node 
> > > and you can run up to 4 jobs simultaneously, or to 4*NCPUS if it’s 
> > > licensed per CPU, or to 1 if it’s a single license good for one run on 
> > > 1-4 nodes.
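> > > 
> > > As a concrete sketch for the cluster-local case (software_name and the 
> > > counts are placeholders):
> > > 
> > >   # slurm.conf
> > >   Licenses=software_name:4
> > > 
> > > and each job then requests one license per node it will use, e.g.:
> > > 
> > >   sbatch -N 1 -L software_name:1 job.sh
> > > 
> > > Slurm keeps later jobs pending once all 4 licenses are in use.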
> > > 
> > > There are also options to query a FlexLM or RLM server for license 
> > > management.
> > > 
> > > -- 
> > > Mike Renfro, PhD / HPC Systems Administrator, Information Technology 
> > > Services
> > > 931 372-3601     / Tennessee Tech University
> > > 
> > > > On May 5, 2020, at 7:54 AM, navin srivastava <navin.alt...@gmail.com> 
> > > > wrote:
> > > > 
> > > > Hi Team,
> > > > 
> > > > We have an application whose licenses are limited. It scales up to 4 
> > > > nodes (~80 cores).
> > > > So if 4 nodes are full, a job on a 5th node used to fail.
> > > > We want to put a restriction in place so that the application cannot 
> > > > execute beyond 4 nodes; instead of failing, the job should stay in the 
> > > > queue.
> > > > I do not want to keep a separate partition to achieve this. Is there a 
> > > > way to achieve this scenario using some dynamic resource which can 
> > > > check the license count on the fly and, if the limit is reached, keep 
> > > > the job in the queue?
> > > > 
> > > > Regards
> > > > Navin.
> > > > 
> > > > 
> > > > 
> > > 
> > 
> 
