Resolved, thanks to Adam Hough on the sighpcsyspros Slack. Before, I had set MaxSubmitJobsPerUser=8, when what I really wanted was MaxJobsPerUser=8.
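In other words, the fix was just moving the limit from the submit-time knob to the running-job knob on the gpu QOS. A minimal sketch of the change (assuming the QOS is named gpu as below, and that setting a value of -1 clears the previously set limit):

$ sacctmgr modify qos gpu set MaxSubmitJobsPerUser=-1   # clear the old submit-time cap (-1 = unset)
$ sacctmgr modify qos gpu set MaxJobsPerUser=8          # limit concurrently running jobs per user instead

With that in place, the ninth job pends instead of being rejected, as shown below. For reference, the difference between the two settings: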
- MaxJobsPerUser= The maximum number of jobs a user can have running at a given time.
- MaxSubmitJobsPerUser= The maximum number of jobs a user can have running and pending at a given time.

Now:

$ sacctmgr list qos normal,gpu format=name,priority,gracetime,preemptmode,usagefactor,grptresrunmin,MaxSubmitJobsPerUser,maxjobsperuser,flags
      Name   Priority  GraceTime PreemptMode UsageFactor GrpTRESRunMin MaxSubmitPU MaxJobsPU                Flags
---------- ---------- ---------- ----------- ----------- ------------- ----------- --------- --------------------
       gpu          0   00:00:00     cluster    1.000000                                   8
    normal          0   00:00:00     cluster    1.000000

$ for n in $(seq 9); do sbatch --time=00:10:00 --partition=gpu omp_hw.sh; done
Submitted batch job 150670
Submitted batch job 150671
Submitted batch job 150672
Submitted batch job 150673
Submitted batch job 150674
Submitted batch job 150675
Submitted batch job 150676
Submitted batch job 150677
Submitted batch job 150678

[renfro@login hw]$ squeue -u $USER -p gpu
  JOBID PARTI      NAME     USER ST   TIME CPUS NODES MIN_MEMORY NODELIST(REASON)   GRES
 150678   gpu omp_hw.sh   renfro PD   0:00    1     1      4000M (QOSMaxJobsPerUs   (null)
 150670   gpu omp_hw.sh   renfro  R   0:06    1     1      4000M gpunode001         (null)
 150671   gpu omp_hw.sh   renfro  R   0:06    1     1      4000M gpunode001         (null)
 150672   gpu omp_hw.sh   renfro  R   0:06    1     1      4000M gpunode001         (null)
 150673   gpu omp_hw.sh   renfro  R   0:06    1     1      4000M gpunode001         (null)
 150674   gpu omp_hw.sh   renfro  R   0:06    1     1      4000M gpunode001         (null)
 150675   gpu omp_hw.sh   renfro  R   0:06    1     1      4000M gpunode001         (null)
 150676   gpu omp_hw.sh   renfro  R   0:06    1     1      4000M gpunode001         (null)
 150677   gpu omp_hw.sh   renfro  R   0:06    1     1      4000M gpunode001         (null)

$ scancel -u $USER -p gpu

> On Jan 25, 2019, at 10:35 AM, Renfro, Michael <ren...@tntech.edu> wrote:
>
> Hey, folks. Running 17.02.10 with Bright Cluster Manager 8.0.
>
> I wanted to limit queue-stuffing on my GPU nodes, similar to what
> AssocGrpCPURunMinutesLimit does. The current goal is to restrict a user to
> having 8 active or queued jobs in the production GPU partition, and block
> (not reject) other jobs to allow other users fair access to the queue. I'm
> good with a time limit instead of a job number limit, too.
>
> I'd assumed a partition QOS was the way to go, as the sacctmgr man page
> reads in part:
>
>     Flags  Used by the slurmctld to override or enforce certain characteristics.
>            Valid options are
>
>            DenyOnLimit
>                If set, jobs using this QOS will be rejected at submission
>                time if they do not conform to the QOS 'Max' limits. Group
>                limits will also be treated like 'Max' limits as well and
>                will be denied if they go over. By default jobs that go over
>                these limits will pend until they conform. This currently
>                only applies to QOS and Association limits.
>
> So avoid setting the DenyOnLimit flag, and extra jobs will pend until they
> conform, right?
> My QOS settings for 8 active or pending GPU jobs per user are as follows:
>
> $ sacctmgr list qos normal,gpu format=name,priority,gracetime,preemptmode,usagefactor,grptresrunmin,MaxSubmitJobsPerUser,flags
>       Name   Priority  GraceTime PreemptMode UsageFactor GrpTRESRunMin MaxSubmitPU                Flags
> ---------- ---------- ---------- ----------- ----------- ------------- ----------- --------------------
>     normal          0   00:00:00     cluster    1.000000
>        gpu          0   00:00:00     cluster    1.000000                         8
>
> Partition settings, where the gpu QOS is applied to jobs in the gpu partition:
>
> $ egrep 'PartitionName=(batch|gpu) ' /etc/slurm/slurm.conf
> PartitionName=batch Default=YES MinNodes=1 MaxNodes=40 DefaultTime=1-00:00:00 MaxTime=30-00:00:00 AllowGroups=ALL PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO DefMemPerCPU=4000 AllowAccounts=ALL AllowQos=ALL LLN=NO ExclusiveUser=NO OverSubscribe=NO OverTimeLimit=0 State=UP Nodes=node[001-040]
> PartitionName=gpu Default=NO MinNodes=1 DefaultTime=1-00:00:00 MaxTime=30-00:00:00 AllowGroups=ALL PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO DefMemPerCPU=4000 AllowAccounts=ALL AllowQos=ALL LLN=NO MaxCPUsPerNode=16 QoS=gpu ExclusiveUser=NO OverSubscribe=NO OverTimeLimit=0 State=UP Nodes=gpunode[001-004]
>
> Original submission specifying CPUs, time, GRES, QOS, and partition, which
> accepts jobs 1-8, and rejects job 9 even though I haven't set the
> DenyOnLimit flag:
>
> $ for n in $(seq 9); do sbatch --nodes=1 --cpus-per-task=1 --time=00:10:00 --gres=gpu --qos=gpu --partition=gpu omp_hw.sh; done
> Submitted batch job 150548
> Submitted batch job 150549
> Submitted batch job 150550
> Submitted batch job 150551
> Submitted batch job 150552
> Submitted batch job 150553
> Submitted batch job 150554
> Submitted batch job 150555
> sbatch: error: Batch job submission failed: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)
> $ scancel -u $USER -p gpu
>
> Minimized down to just the specification for CPUs, time, and partition, same
> results, since the gpu QOS is automatically applied to jobs in the gpu
> partition:
>
> $ for n in $(seq 9); do sbatch --nodes=1 --cpus-per-task=1 --time=00:10:00 --partition=gpu omp_hw.sh; done
> Submitted batch job 150556
> Submitted batch job 150557
> Submitted batch job 150558
> Submitted batch job 150559
> Submitted batch job 150560
> Submitted batch job 150561
> Submitted batch job 150562
> Submitted batch job 150563
> sbatch: error: Batch job submission failed: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)
> $ scancel -u $USER -p gpu
>
> Running in the batch partition with the normal QOS, all 9 jobs are accepted:
>
> $ for n in $(seq 9); do sbatch --nodes=1 --cpus-per-task=1 --time=00:10:00 --partition=batch omp_hw.sh; done
> Submitted batch job 150564
> Submitted batch job 150565
> Submitted batch job 150566
> Submitted batch job 150567
> Submitted batch job 150568
> Submitted batch job 150569
> Submitted batch job 150570
> Submitted batch job 150571
> Submitted batch job 150572
> $ scancel -u $USER -p batch
>
> Running in the batch partition with the gpu QOS explicitly specified, accepts
> jobs 1-8, and rejects job 9:
>
> $ for n in $(seq 9); do sbatch --nodes=1 --cpus-per-task=1 --time=00:10:00 --partition=batch --qos=gpu omp_hw.sh; done
> Submitted batch job 150573
> Submitted batch job 150574
> Submitted batch job 150575
> Submitted batch job 150576
> Submitted batch job 150577
> Submitted batch job 150578
> Submitted batch job 150579
> Submitted batch job 150580
> sbatch: error: Batch job submission failed: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)
> $ scancel -u $USER -p batch
>
> So the behavior appears to be triggered by the gpu QOS. What might I have
> missed?
>
> --
> Mike Renfro, PhD / HPC Systems Administrator, Information Technology Services
> 931 372-3601 / Tennessee Tech University
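For the record, if rejecting the ninth job at submit time had actually been the goal, the DenyOnLimit passage quoted above suggests adding that flag to the same QOS alongside MaxJobsPerUser; roughly (a sketch only, same gpu QOS name assumed):

$ sacctmgr modify qos gpu set Flags=DenyOnLimit   # reject, rather than pend, jobs over the QOS 'Max' limits

In my case the opposite behavior was wanted, so the flag stays unset and MaxJobsPerUser does the blocking.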