What are the contents of your /etc/slurm/job_submit.lua file?
Did you reconfigure slurmctld?
Check the slurmctld log (the path set by SlurmctldLogFile in your slurm.conf) with: grep job_submit /var/log/slurmctld.log
What is your Slurm version?
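
A small sketch for the log check above: the slurmctld log path is whatever SlurmctldLogFile is set to in slurm.conf, so it can be looked up instead of guessed. The helper name slurmctld_log_path and the /etc/slurm/slurm.conf location are assumptions for illustration, not part of Slurm.

```shell
#!/bin/bash
# Hypothetical helper: print the SlurmctldLogFile value from a slurm.conf file.
# $1: path to slurm.conf. Parameter names in slurm.conf are case-insensitive,
# so the key is compared in lower case. The last matching line wins.
slurmctld_log_path() {
    awk -F= 'tolower($1) ~ /^[ \t]*slurmctldlogfile[ \t]*$/ {
        sub(/[ \t]*#.*/, "", $2)   # drop a trailing comment, if any
        sub(/[ \t]+$/, "", $2)     # drop trailing whitespace
        print $2
    }' "$1" | tail -n 1
}

# Example use (assumed config location):
#   grep job_submit "$(slurmctld_log_path /etc/slurm/slurm.conf)"
```

On a running controller, "scontrol show config" lists the settings slurmctld has actually loaded (including JobSubmitPlugins and EnforcePartLimits), "scontrol reconfigure" makes it re-read slurm.conf, and "scontrol --version" reports the Slurm version.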

You can read about job_submit plugins in this Wiki page:
https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_configuration/#job-submit-plugins

I hope this helps,
Ole


On 3/20/24 09:49, Gestió Servidors via slurm-users wrote:
After adding “EnforcePartLimits=ALL” to slurm.conf and restarting the slurmctld daemon, the job is still accepted, so I don’t understand what I’m doing wrong.

My slurm.conf is this:

ControlMachine=my_server

MailProg=/bin/mail

MpiDefault=none

ProctrackType=proctrack/linuxproc

ReturnToService=2

SlurmctldPidFile=/var/run/slurmctld.pid

SlurmctldPort=6817

SlurmdPidFile=/var/run/slurmd.pid

SlurmdPort=6818

SlurmdSpoolDir=/var/spool/slurmd

SlurmUser=slurm

SlurmdUser=root

AuthType=auth/munge

StateSaveLocation=/var/log/slurm

SwitchType=switch/none

TaskPlugin=task/none,task/affinity,task/cgroup

TaskPluginParam=none

DebugFlags=NO_CONF_HASH,Backfill,BackfillMap,SelectType,Steps,TraceJobs

*JobSubmitPlugins=lua*

SchedulerType=sched/backfill

SelectType=select/cons_tres

SelectTypeParameters=CR_Core

SchedulerParameters=max_script_size=20971520

*EnforcePartLimits=ALL*

CoreSpecPlugin=core_spec/none

AccountingStorageType=accounting_storage/slurmdbd

AccountingStoreFlags=job_comment

JobCompType=jobcomp/filetxt

JobCompLoc=/var/log/slurm/job_completions

ClusterName=my_cluster

JobAcctGatherType=jobacct_gather/linux

SlurmctldDebug=5

SlurmctldLogFile=/var/log/slurmctld.log

SlurmdDebug=5

SlurmdLogFile=/var/log/slurmd.log

AccountingStorageEnforce=limits

AccountingStorageHost=my_server

NodeName=clus[01-06] CPUs=12 SocketsPerBoard=2 CoresPerSocket=6 ThreadsPerCore=1 RealMemory=128387 TmpDisk=81880 Feature=big-mem

NodeName=clus[07-12] CPUs=12 SocketsPerBoard=2 CoresPerSocket=6 ThreadsPerCore=1 RealMemory=15491 TmpDisk=81880 Feature=small-mem

NodeName=clus-login CPUs=4 SocketsPerBoard=2 CoresperSocket=2 ThreadsperCore=1 RealMemory=15886 TmpDisk=30705

*PartitionName=nodo.q Nodes=clus[01-12] Default=YES MaxTime=04:00:00 State=UP AllocNodes=clus-login,clus05 MaxCPUsPerNode=12*

KillOnBadExit=1

OverTimeLimit=30 # if the job runs more than 30 minutes past the maximum time (2 hours), it is cancelled

TCPTimeout=5

PriorityType=priority/multifactor

PriorityDecayHalfLife=7-0

PriorityCalcPeriod=5

PriorityUsageResetPeriod=QUARTERLY

PriorityFavorSmall=NO

PriorityMaxAge=7-0

PriorityWeightAge=10000

PriorityWeightFairshare=1000000

PriorityWeightJobSize=1000

PriorityWeightPartition=1000

PriorityWeightQOS=0

PropagateResourceLimitsExcept=MEMLOCK

And the test script is this:

#!/bin/bash

*#SBATCH --time=5-00:00:00*

srun /bin/hostname

date

sleep 50

date

Why is my job being submitted to the queue instead of being rejected BEFORE it is queued?
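
To make the mismatch concrete, here is a rough bash sketch that compares the requested --time with the partition's MaxTime. The function name to_minutes is hypothetical, and it only handles the [D-]HH:MM:SS form used in this message, not every time format Slurm accepts.

```shell
#!/bin/bash
# Hypothetical helper: convert [D-]HH:MM:SS to whole minutes.
# Seconds are ignored in this sketch.
to_minutes() {
    local t=$1 days=0 h m s
    case $t in
        *-*) days=${t%%-*}; t=${t#*-} ;;   # split off the optional day count
    esac
    IFS=: read -r h m s <<< "$t"
    # 10# forces base 10 so that fields such as "08" are not read as octal
    echo $(( 10#$days * 1440 + 10#$h * 60 + 10#$m ))
}

requested=$(to_minutes "5-00:00:00")   # --time from the test script
max_time=$(to_minutes "04:00:00")      # MaxTime of partition nodo.q

if (( requested > max_time )); then
    echo "requested ${requested} min exceeds MaxTime ${max_time} min"
fi
```

With the values above, the request is 7200 minutes against a partition limit of 240 minutes.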

--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
