Ok, then regular license accounting won't work. What follows is only somewhat tested, but it should work, or at least be a starting point. Given a job number JOBID that's already running with this license on one or more nodes:
    sbatch -w $(scontrol show job JOBID | grep ' NodeList=' | cut -d= -f2) -N 1

should start a one-node job on an available node already being used by JOBID. Add other parameters as required for cpus-per-task, time limits, or whatever else is needed. If you start the larger jobs first and let the later jobs fill in on idle CPUs on those nodes, it should work. (A rough wrapper script along these lines is sketched at the very bottom of this message, below the quoted thread.)

> On May 6, 2020, at 9:46 AM, navin srivastava <navin.alt...@gmail.com> wrote:
>
> To explain with more details:
>
> A job will be submitted based on cores at any time, but it can go to any random nodes, limited to 4 nodes only. (The license has some intelligence: it counts the nodes, and once it reaches 4 it will not allow any more nodes. It does not depend on the number of cores available on the nodes.)
>
> Case 1: 4 jobs are running with 4 cores each on 4 nodes [node1, node2, node3 and node4]. A fifth job assigned by Slurm with 4 cores on any one of node1, node2, node3 and node4 will be allowed by the license.
>
> Case 2: 4 jobs are running with 4 cores each on 4 nodes [node1, node2, node3 and node4]. A fifth job assigned by Slurm on node5 with 4 cores will not be allowed by the license [a "license not found" error comes up in this case].
>
> Regards
> Navin.
>
> On Wed, May 6, 2020 at 7:47 PM Renfro, Michael <ren...@tntech.edu> wrote:
> To make sure I'm reading this correctly, you have a software license that lets you run jobs on up to 4 nodes at once, regardless of how many CPUs you use? That is, you could run any one of the following sets of jobs:
>
> - four 1-node jobs,
> - two 2-node jobs,
> - one 1-node and one 3-node job,
> - two 1-node and one 2-node jobs,
> - one 4-node job,
>
> simultaneously? And the license isn't node-locked to specific nodes by MAC address or anything similar? But if you try to run jobs beyond what I've listed above, you run out of licenses, and you want those later jobs to be held until licenses are freed up?
>
> If all of those questions have an answer of 'yes', I think you want the remote license part of https://slurm.schedmd.com/licenses.html, something like:
>
> sacctmgr add resource name=software_name count=4 percentallowed=100 server=flex_host servertype=flexlm type=license
>
> and submit jobs with a '-L software_name:N' flag, where N is the number of nodes you want to run on.
>
> > On May 6, 2020, at 5:33 AM, navin srivastava <navin.alt...@gmail.com> wrote:
> >
> > Thanks Michael.
> >
> > Actually, one application's license is node-based and we have a 4-node license (not fixed to specific nodes). We have several nodes, but once jobs land on any 4 random nodes they run on those nodes only; after that, a job fails if it goes to other nodes.
> >
> > Can we define a custom variable, set it at the node level, and have users pass that variable at submission so that the job lands on those specific nodes? I do not want to create a separate partition.
> >
> > Is there any way to achieve this by some other method?
> >
> > Regards
> > Navin.
> >
> > On Tue, May 5, 2020 at 7:46 PM Renfro, Michael <ren...@tntech.edu> wrote:
> > Haven't done it yet myself, but it's on my to-do list.
> >
> > But I'd assume that if you use the FlexLM or RLM parts of that documentation, Slurm would query the remote license server periodically and hold the job until the necessary licenses were available.
> >
> > > On May 5, 2020, at 8:37 AM, navin srivastava <navin.alt...@gmail.com> wrote:
> > >
> > > Thanks Michael,
> > >
> > > Yes, I have gone through it, but the licenses are remote licenses and they will also be used outside, not only in Slurm. So basically I am interested in how we can update the database dynamically to get the exact value at that point in time, i.e. query the license server and update the database accordingly. Does Slurm automatically update the value based on usage?
> > >
> > > Regards
> > > Navin.
> > >
> > > On Tue, May 5, 2020 at 7:00 PM Renfro, Michael <ren...@tntech.edu> wrote:
> > > Have you seen https://slurm.schedmd.com/licenses.html already? If the software is just for use inside the cluster, one Licenses= line in slurm.conf plus users submitting with the -L flag should suffice. You should be able to set that license value to 4 if it's licensed per node and you can run up to 4 jobs simultaneously, or 4*NCPUS if it's licensed per CPU, or 1 if it's a single license good for one run on 1-4 nodes.
> > >
> > > There are also options to query a FlexLM or RLM server for license management.
> > >
> > > --
> > > Mike Renfro, PhD / HPC Systems Administrator, Information Technology Services
> > > 931 372-3601 / Tennessee Tech University
> > >
> > > > On May 5, 2020, at 7:54 AM, navin srivastava <navin.alt...@gmail.com> wrote:
> > > >
> > > > Hi Team,
> > > >
> > > > We have an application whose licenses are limited; it scales up to 4 nodes (~80 cores). So if 4 nodes are full, a job on a 5th node fails. We want to put a restriction in place so that the application cannot execute beyond the 4 nodes and fail; instead the job should stay in the queue. I do not want to keep a separate partition to achieve this. Is there a way to achieve this scenario using some dynamic resource that can check the license value on the fly and, if the limit is reached, keep the job in the queue?
> > > >
> > > > Regards
> > > > Navin.
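
For completeness, here is a rough sketch of how the node-reuse submission at the top of this message could be wrapped in a small script. This is only a sketch: the script name, the --cpus-per-task value, and the error handling are placeholders of mine, not anything Slurm provides, and it has not been tested beyond the one-liner above.

#!/bin/bash
# submit_on_licensed_nodes.sh -- hypothetical wrapper around the
# sbatch/scontrol one-liner near the top of this message.
# Usage: ./submit_on_licensed_nodes.sh JOBID job_script.sbatch
set -euo pipefail

jobid="$1"
script="$2"

# Pull the NodeList= value (e.g. node[1-4]) out of 'scontrol show job'.
nodelist=$(scontrol show job "$jobid" | grep ' NodeList=' | cut -d= -f2)

if [ -z "$nodelist" ] || [ "$nodelist" = "(null)" ]; then
    echo "Job $jobid has no allocated nodes yet" >&2
    exit 1
fi

# Restrict the new one-node job to nodes already holding the license.
# Caveat: if the list holds several nodes, -w combined with -N 1 may be
# rejected; expanding it and picking one node, e.g.
#   node=$(scontrol show hostnames "$nodelist" | head -n 1)
# and using -w "$node" instead, would be one possible workaround.
sbatch -w "$nodelist" -N 1 --cpus-per-task=4 "$script"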
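
And for anyone reading this in the archives: the plain license-counting approaches discussed in the quoted thread would look roughly like the lines below. This is a sketch only; software_name, flex_host, and the counts come from the examples above, and the exact -L syntax for the remote case (in particular whether the server must be included, as in software_name@flex_host) can vary with Slurm version.

# Local license counting (software only used from inside the cluster):
# add to slurm.conf and restart/reconfigure slurmctld:
#   Licenses=software_name:4
# then a 2-node run of the application requests 2 of the 4 licenses:
sbatch -L software_name:2 -N 2 job_script.sbatch

# Remote license counting (license server shared with hosts outside the
# cluster), as in the sacctmgr example quoted above:
sacctmgr add resource name=software_name count=4 percentallowed=100 \
    server=flex_host servertype=flexlm type=license
sbatch -L software_name:2 -N 2 job_script.sbatch

# Either way, 'scontrol show licenses' shows the current counts in use.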