Oh, ok. I guess I was expecting that the GPU job would be suspended by copying its GPU memory out to host RAM.
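For context, the GPU test jobs discussed below each request 2 of the node's 4 GPUs. A minimal sketch of such a job script (the actual test-gpu.sh is not shown in this thread, so the script name and the workload line are placeholders; only the resource requests match what is described):

#!/bin/bash
#SBATCH --job-name=gpu
#SBATCH --gres=gpu:2        # each test job takes 2 of the node's 4 GPUs
#SBATCH --cpus-per-task=2   # in line with DefCpuPerGPU=2 on the partition
srun ./gpu-workload         # placeholder for a long-running GPU program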
I also tried REQUEUE,GANG and CANCEL,GANG. None of these options seems to be able to preempt GPU jobs.
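For reference, these are the preemption lines from the slurm.conf quoted at the bottom of this thread; between tests only the PreemptMode line changed (a sketch, restarting slurmctld after each edit):

PreemptType=preempt/partition_prio
PreemptMode=SUSPEND,GANG     # also tested REQUEUE,GANG and CANCEL,GANG
SchedulerTimeSlice=30        # in seconds, default 30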
On Fri, 13 Jan 2023 at 12:30, Kevin Broch <kbr...@rivosinc.com> wrote:

> My guess is that this isn't possible with GANG,SUSPEND. GPU memory isn't
> managed in Slurm, so the idea of suspending a job's GPU memory so that
> another job can use it simply isn't possible.
>
> On Fri, Jan 13, 2023 at 4:08 AM Helder Daniel <hdan...@ualg.pt> wrote:
>
>> Hi Kevin,
>>
>> I did a "scontrol show partition". OverSubscribe was not enabled.
>> I enabled it in slurm.conf with:
>>
>> (...)
>> GresTypes=gpu
>> NodeName=asimov Gres=gpu:4 Sockets=1 CoresPerSocket=32 ThreadsPerCore=2 State=UNKNOWN
>> PartitionName=asimov01 *OverSubscribe=FORCE* Nodes=asimov Default=YES MaxTime=INFINITE MaxNodes=1 DefCpuPerGPU=2 State=UP
>>
>> But now it is working only for CPU jobs; it does not preempt GPU jobs.
>> Launching 3 CPU-only jobs, each requiring 32 of the 64 cores, they are
>> preempted after the timeslice as expected:
>>
>> sbatch --cpus-per-task=32 test-cpu.sh
>>
>> JOBID PARTITION     NAME     USER ST  TIME NODES NODELIST(REASON)
>>   352  asimov01 cpu-only  hdaniel  R  0:58     1 asimov
>>   353  asimov01 cpu-only  hdaniel  R  0:25     1 asimov
>>   351  asimov01 cpu-only  hdaniel  S  0:36     1 asimov
>>
>> But launching 3 GPU jobs, each requiring 2 of the 4 GPUs, the first 2
>> that start running are never preempted. The 3rd job is left pending on
>> resources:
>>
>> JOBID PARTITION     NAME     USER ST  TIME NODES NODELIST(REASON)
>>   356  asimov01      gpu  hdaniel PD  0:00     1 (Resources)
>>   354  asimov01      gpu  hdaniel  R  3:05     1 asimov
>>   355  asimov01      gpu  hdaniel  R  3:02     1 asimov
>>
>> Do I need to change anything else in the configuration to also support
>> GPU gang scheduling?
>> Thanks
>>
>> ============================================================================
>> scontrol show partition asimov01
>> PartitionName=asimov01
>>    AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
>>    AllocNodes=ALL Default=YES QoS=N/A
>>    DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
>>    MaxNodes=1 MaxTime=UNLIMITED MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED
>>    Nodes=asimov
>>    PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
>>    OverTimeLimit=NONE PreemptMode=GANG,SUSPEND
>>    State=UP TotalCPUs=64 TotalNodes=1 SelectTypeParameters=NONE
>>    JobDefaults=DefCpuPerGPU=2
>>    DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED
>>
>> On Fri, 13 Jan 2023 at 11:16, Kevin Broch <kbr...@rivosinc.com> wrote:
>>
>>> The problem might be that OverSubscribe is not enabled? Without it, I
>>> don't believe the time-slicing can be GANG scheduled.
>>>
>>> Can you do a "scontrol show partition" to verify that it is?
>>>
>>> On Thu, Jan 12, 2023 at 6:24 PM Helder Daniel <hdan...@ualg.pt> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am trying to enable gang scheduling on a server with a 32-core CPU
>>>> and 4 GPUs.
>>>>
>>>> However, with gang scheduling, CPU jobs (or GPU jobs) are not being
>>>> preempted after the time slice, which is set to 30 seconds.
>>>>
>>>> Below is a snapshot of squeue. There are 3 jobs, each needing 32 cores.
>>>> The first 2 jobs launched are never preempted. The 3rd job starves
>>>> (at least until one of the other 2 ends):
>>>>
>>>> JOBID PARTITION     NAME     USER ST  TIME NODES NODELIST(REASON)
>>>>   313  asimov01 cpu-only  hdaniel PD  0:00     1 (Resources)
>>>>   311  asimov01 cpu-only  hdaniel  R  1:52     1 asimov
>>>>   312  asimov01 cpu-only  hdaniel  R  1:49     1 asimov
>>>>
>>>> The same happens with GPU jobs. If I launch 5 jobs requiring one GPU
>>>> each, the 5th job never runs. Preemption is not happening at the
>>>> specified timeslice.
>>>>
>>>> I tried several combinations:
>>>>
>>>> SchedulerType=sched/builtin and backfill
>>>> SelectType=select/cons_tres and linear
>>>>
>>>> I'll appreciate any help and suggestions.
>>>> The slurm.conf is below.
>>>> Thanks
>>>>
>>>> ClusterName=asimov
>>>> SlurmctldHost=localhost
>>>> MpiDefault=none
>>>> ProctrackType=proctrack/linuxproc  # proctrack/cgroup
>>>> ReturnToService=2
>>>> SlurmctldPidFile=/var/run/slurmctld.pid
>>>> SlurmctldPort=6817
>>>> SlurmdPidFile=/var/run/slurmd.pid
>>>> SlurmdPort=6818
>>>> SlurmdSpoolDir=/var/lib/slurm/slurmd
>>>> SlurmUser=slurm
>>>> StateSaveLocation=/var/lib/slurm/slurmctld
>>>> SwitchType=switch/none
>>>> TaskPlugin=task/none  # task/cgroup
>>>> #
>>>> # TIMERS
>>>> InactiveLimit=0
>>>> KillWait=30
>>>> MinJobAge=300
>>>> SlurmctldTimeout=120
>>>> SlurmdTimeout=300
>>>> Waittime=0
>>>> #
>>>> # SCHEDULING
>>>> #FastSchedule=1  # obsolete
>>>> SchedulerType=sched/builtin  # backfill
>>>> SelectType=select/cons_tres
>>>> SelectTypeParameters=CR_Core  # CR_Core_Memory lets only one job run at a time
>>>> PreemptType=preempt/partition_prio
>>>> PreemptMode=SUSPEND,GANG
>>>> SchedulerTimeSlice=30  # in seconds, default 30
>>>> #
>>>> # LOGGING AND ACCOUNTING
>>>> #AccountingStoragePort=
>>>> AccountingStorageType=accounting_storage/none
>>>> #AccountingStorageEnforce=associations
>>>> #ClusterName=bip-cluster
>>>> JobAcctGatherFrequency=30
>>>> JobAcctGatherType=jobacct_gather/linux
>>>> SlurmctldDebug=info
>>>> SlurmctldLogFile=/var/log/slurm/slurmctld.log
>>>> SlurmdDebug=info
>>>> SlurmdLogFile=/var/log/slurm/slurmd.log
>>>> #
>>>> #
>>>> # COMPUTE NODES
>>>> #NodeName=asimov CPUs=64 RealMemory=500 State=UNKNOWN
>>>> #PartitionName=LocalQ Nodes=ALL Default=YES MaxTime=INFINITE State=UP
>>>>
>>>> # Partitions
>>>> GresTypes=gpu
>>>> NodeName=asimov Gres=gpu:4 Sockets=1 CoresPerSocket=32 ThreadsPerCore=2 State=UNKNOWN
>>>> PartitionName=asimov01 Nodes=asimov Default=YES MaxTime=INFINITE MaxNodes=1 DefCpuPerGPU=2 State=UP

--
best regards,

Helder Daniel
Universidade do Algarve
Faculdade de Ciências e Tecnologia
Departamento de Engenharia Electrónica e Informática
https://www.ualg.pt/pt/users/hdaniel