So here is a default partition:

PartitionName=BDW
   AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
   AllocNodes=ALL Default=YES QoS=N/A
   DefaultTime=01:00:00 DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
   MaxNodes=UNLIMITED MaxTime=1-00:00:00 MinNodes=1 LLN=NO MaxCPUsPerNode=UNLIMITED
   Nodes=nid00[016-063]
   PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=EXCLUSIVE
   OverTimeLimit=NONE PreemptMode=OFF
   State=UP TotalCPUs=3456 TotalNodes=48 SelectTypeParameters=NONE
   DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED
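(An aside on the original goal of letting a user exceed a partition limit, per the messages quoted below: the partition above has QoS=N/A, so there is no partition QOS for a job QOS to override yet. A sketch of the partition-QOS-plus-OverPartQOS setup Michael points at, with hypothetical QOS names and untested on this system:

    # hold the day-long limit in a partition QOS instead of the
    # partition's own MaxTime
    sacctmgr add qos part_bdw
    sacctmgr modify qos part_bdw set MaxWall=1-00:00:00

    # slurm.conf: attach it to the partition, e.g.
    #   PartitionName=BDW Nodes=nid00[016-063] QOS=part_bdw MaxTime=UNLIMITED ...

    # a QOS selected users can request to run past that limit; the
    # OverPartQOS flag lets this QOS's limits override the partition QOS
    sacctmgr add qos longrun
    sacctmgr modify qos longrun set MaxWall=7-00:00:00 Flags=OverPartQOS
    sacctmgr modify user where name=j0497482 set QOS+=longrun

Such a user would then submit with "sbatch --qos=longrun" rather than needing a one-off partition.)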
If we just flip on AccountingStorageEnforce=limits,qos (tried adding safe as well), no jobs can run.

Here is a running job, which shows the default "normal" QOS that was created when Slurm was installed:

JobId=244667 JobName=em25d_SEAM
   UserId=j0497482(10214) GroupId=rt3(501) MCS_label=N/A
   Priority=1 Nice=0 Account=(null) QOS=normal
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:00:06 TimeLimit=1-00:00:00 TimeMin=N/A
   SubmitTime=2019-03-05T11:24:41 EligibleTime=2019-03-05T11:24:41
   StartTime=2019-03-05T11:24:41 EndTime=2019-03-06T11:24:41 Deadline=N/A
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   Partition=KNL AllocNode:Sid=hickory-1:4991
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=nid00605
   BatchHost=nid00605
   NumNodes=1 NumCPUs=256 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:1
   TRES=cpu=256,mem=96763M,node=1,gres/craynetwork=1
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryNode=96763M MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   Gres=craynetwork:1 Reservation=(null)
   OverSubscribe=NO Contiguous=0 Licenses=(null) Network=(null)
   Command=./.prg29913/tmp/DIVAcmdEXEC29913.py None /home/j0497482/bin/em25d_SEAM mode=forward model=mod1.h5 model_H=none i_bwc=0 flist=0.15,0.25,0.5,1.0 verbose=5 sabs=1 rabs=1 acqui_file=acq1 nky=20 ofile=out_forward.nc minOff=2000.0,2000.0,2000.0,2000.0 maxOff=10000.0,10000.0,10000.0,10000.0 NoiseEx=1.0e-14,1.0e-14,1.0e-14,1.0e-14 bedThreshold=2
   WorkDir=/data/gpfs/Users/j0497482/data/EM_data/Model46
   StdErr=/data/gpfs/Users/j0497482/data/EM_data/Model46/./logs/Job_Standalone_244667.slurm_err
   StdIn=/dev/null
   StdOut=/data/gpfs/Users/j0497482/data/EM_data/Model46/./logs/Job_Standalone_244667.slurm_log
   Power=

sacctmgr show qos normal
   normal   0   00:00:00   cluster   1.000000

On 3/5/19, 10:47 AM, "slurm-users on behalf of Michael Gutteridge" <slurm-users-boun...@lists.schedmd.com on behalf of michael.gutteri...@gmail.com> wrote:

Hi

It might be useful to see the configuration of the partition and how the QOS is set up... but at first blush I suspect you may need to set OverPartQOS (https://slurm.schedmd.com/resource_limits.html) to get the QOS limit to take precedence over the limit in the partition. However, the "reason" should be different if that were the case.

Have a look at that, and maybe send the QOS and partition config.

- Michael

On Tue, Mar 5, 2019 at 7:40 AM Matthew BETTINGER <matthew.bettin...@external.total.com> wrote:

Hey slurm gurus.

We have been trying to enable Slurm QOS on a Cray system here, off and on, for quite a while, but we can never get it working. Every time we try to enable QOS we disrupt the cluster and users and have to fall back. I'm not sure what we are doing wrong.

We run a pretty open system here since we are a research group, but there are times when we need to let a user run a job that exceeds a partition limit. In lieu of using QOS, the only other way we have figured out to do this is to create a new partition and push out the modified slurm.conf. It's a hassle.

I'm not sure exactly what information is needed to troubleshoot this, but I understand that to enable QOS we need to enable this line in slurm.conf:

AccountingStorageEnforce=limits,qos

Every time we attempt this, no one can submit a job; Slurm says "waiting on resources", I believe. We have accounting enabled, and everyone is a member of the default QOS "normal".
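For reference, the enforcement line in question looks like this in slurm.conf ('safe' is the optional extra that was also tried; it keeps jobs from starting unless they can finish within their group limits):

    # /etc/opt/slurm/slurm.conf
    # NOTE: per the Slurm docs, 'limits' (and 'qos') implicitly turn on
    # 'associations' enforcement as well, so every submitting user needs a
    # user/account association in slurmdbd.
    AccountingStorageEnforce=limits,qos,safe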
Configuration data as of 2019-03-05T09:36:19
AccountingStorageBackupHost = (null)
AccountingStorageEnforce = none
AccountingStorageHost = hickory-1
AccountingStorageLoc = N/A
AccountingStoragePort = 6819
AccountingStorageTRES = cpu,mem,energy,node,bb/cray,gres/craynetwork,gres/gpu
AccountingStorageType = accounting_storage/slurmdbd
AccountingStorageUser = N/A
AccountingStoreJobComment = Yes
AcctGatherEnergyType = acct_gather_energy/rapl
AcctGatherFilesystemType = acct_gather_filesystem/none
AcctGatherInfinibandType = acct_gather_infiniband/none
AcctGatherNodeFreq = 30 sec
AcctGatherProfileType = acct_gather_profile/none
AllowSpecResourcesUsage = 1
AuthInfo = (null)
AuthType = auth/munge
BackupAddr = hickory-2
BackupController = hickory-2
BatchStartTimeout = 10 sec
BOOT_TIME = 2019-03-04T16:11:55
BurstBufferType = burst_buffer/cray
CacheGroups = 0
CheckpointType = checkpoint/none
ChosLoc = (null)
ClusterName = hickory
CompleteWait = 0 sec
ControlAddr = hickory-1
ControlMachine = hickory-1
CoreSpecPlugin = cray
CpuFreqDef = Performance
CpuFreqGovernors = Performance,OnDemand
CryptoType = crypto/munge
DebugFlags = (null)
DefMemPerNode = UNLIMITED
DisableRootJobs = No
EioTimeout = 60
EnforcePartLimits = NO
Epilog = (null)
EpilogMsgTime = 2000 usec
EpilogSlurmctld = (null)
ExtSensorsType = ext_sensors/none
ExtSensorsFreq = 0 sec
FairShareDampeningFactor = 1
FastSchedule = 0
FirstJobId = 1
GetEnvTimeout = 2 sec
GresTypes = gpu,craynetwork
GroupUpdateForce = 1
GroupUpdateTime = 600 sec
HASH_VAL = Match
HealthCheckInterval = 0 sec
HealthCheckNodeState = ANY
HealthCheckProgram = (null)
InactiveLimit = 0 sec
JobAcctGatherFrequency = 30
JobAcctGatherType = jobacct_gather/linux
JobAcctGatherParams = (null)
JobCheckpointDir = /var/slurm/checkpoint
JobCompHost = localhost
JobCompLoc = /var/log/slurm_jobcomp.log
JobCompPort = 0
JobCompType = jobcomp/none
JobCompUser = root
JobContainerType = job_container/cncu
JobCredentialPrivateKey = (null)
JobCredentialPublicCertificate = (null)
JobFileAppend = 0
JobRequeue = 1
JobSubmitPlugins = cray
KeepAliveTime = SYSTEM_DEFAULT
KillOnBadExit = 1
KillWait = 30 sec
LaunchParameters = (null)
LaunchType = launch/slurm
Layouts =
Licenses = (null)
LicensesUsed = (null)
MailDomain = (null)
MailProg = /bin/mail
MaxArraySize = 1001
MaxJobCount = 10000
MaxJobId = 67043328
MaxMemPerCPU = 128450
MaxStepCount = 40000
MaxTasksPerNode = 512
MCSPlugin = mcs/none
MCSParameters = (null)
MemLimitEnforce = Yes
MessageTimeout = 10 sec
MinJobAge = 300 sec
MpiDefault = none
MpiParams = ports=20000-32767
MsgAggregationParams = (null)
NEXT_JOB_ID = 244342
NodeFeaturesPlugins = (null)
OverTimeLimit = 0 min
PluginDir = /opt/slurm/17.02.6/lib64/slurm
PlugStackConfig = /etc/opt/slurm/plugstack.conf
PowerParameters = (null)
PowerPlugin =
PreemptMode = OFF
PreemptType = preempt/none
PriorityParameters = (null)
PriorityDecayHalfLife = 7-00:00:00
PriorityCalcPeriod = 00:05:00
PriorityFavorSmall = No
PriorityFlags =
PriorityMaxAge = 7-00:00:00
PriorityUsageResetPeriod = NONE
PriorityType = priority/multifactor
PriorityWeightAge = 0
PriorityWeightFairShare = 0
PriorityWeightJobSize = 0
PriorityWeightPartition = 0
PriorityWeightQOS = 0
PriorityWeightTRES = (null)
PrivateData = none
ProctrackType = proctrack/cray
Prolog = (null)
PrologEpilogTimeout = 65534
PrologSlurmctld = (null)
PrologFlags = (null)
PropagatePrioProcess = 0
PropagateResourceLimits = (null)
PropagateResourceLimitsExcept = AS
RebootProgram = (null)
ReconfigFlags = (null)
RequeueExit = (null)
RequeueExitHold = (null)
ResumeProgram = (null)
ResumeRate = 300 nodes/min
ResumeTimeout = 60 sec
ResvEpilog = (null)
ResvOverRun = 0 min
ResvProlog = (null)
ReturnToService = 2
RoutePlugin = route/default
SallocDefaultCommand = (null)
SbcastParameters = (null)
SchedulerParameters = (null)
SchedulerTimeSlice = 30 sec
SchedulerType = sched/backfill
SelectType = select/cray
SelectTypeParameters = CR_CORE_MEMORY,OTHER_CONS_RES,NHC_ABSOLUTELY_NO
SlurmUser = root(0)
SlurmctldDebug = info
SlurmctldLogFile = /var/spool/slurm/slurmctld.log
SlurmctldPort = 6817
SlurmctldTimeout = 120 sec
SlurmdDebug = info
SlurmdLogFile = /var/spool/slurmd/%h.log
SlurmdPidFile = /var/spool/slurmd/slurmd.pid
SlurmdPlugstack = (null)
SlurmdPort = 6818
SlurmdSpoolDir = /var/spool/slurmd
SlurmdTimeout = 300 sec
SlurmdUser = root(0)
SlurmSchedLogFile = (null)
SlurmSchedLogLevel = 0
SlurmctldPidFile = /var/spool/slurm/slurmctld.pid
SlurmctldPlugstack = (null)
SLURM_CONF = /etc/opt/slurm/slurm.conf
SLURM_VERSION = 17.02.6
SrunEpilog = (null)
SrunPortRange = 0-0
SrunProlog = (null)
StateSaveLocation = /apps/cluster/hickory/slurm/
SuspendExcNodes = (null)
SuspendExcParts = (null)
SuspendProgram = (null)
SuspendRate = 60 nodes/min
SuspendTime = NONE
SuspendTimeout = 30 sec
SwitchType = switch/cray
TaskEpilog = (null)
TaskPlugin = task/cray,task/affinity,task/cgroup
TaskPluginParam = (null type)
TaskProlog = (null)
TCPTimeout = 2 sec
TmpFS = /tmp
TopologyParam = (null)
TopologyPlugin = topology/none
TrackWCKey = No
TreeWidth = 50
UsePam = 0
UnkillableStepProgram = (null)
UnkillableStepTimeout = 60 sec
VSizeFactor = 0 percent
WaitTime = 0 sec

Slurmctld(primary/backup) at hickory-1/hickory-2 are UP/UP
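One detail worth checking before re-enabling enforcement: the running job above shows Account=(null), and as noted earlier, 'limits' also enforces 'associations', so a user without a user/account association in slurmdbd cannot start jobs at all once it is on. That would look exactly like every job pending on resources. A sketch for inspecting and, if needed, creating associations (the account name 'research' is hypothetical):

    # list the user/account associations slurmdbd currently knows about
    sacctmgr show associations cluster=hickory format=Cluster,Account,User,QOS,DefaultQOS

    # create an account and add a user to it, with a default account so
    # jobs stop showing Account=(null)
    sacctmgr add account research Description="research group" Organization=research
    sacctmgr add user j0497482 Account=research DefaultAccount=research

With associations in place for every user, AccountingStorageEnforce=limits,qos may stop blocking everyone.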