On this point, I don’t think that’s always the case. On a node with 384 GB of RAM (with 2 GB reserved for the OS), we’ve got several jobs running at the same time, each with mem=32000:
=====
$ grep 'NodeName=gpunode\[00' /etc/slurm/slurm.conf
NodeName=gpunode[001-003] CoresPerSocket=14 RealMemory=382000 Sockets=2 ThreadsPerCore=1 Weight=10011 Gres=gpu:2
$ squeue -t R | grep gpunode001
555699 bigme lstm_rel_w namartinda R 16:17 *:*: 1 32000M gpunode001 2020-01-28T08:41:31 2020-01-28T08:41:31 2020-01-28T14:41:31 N/A
555700 bigme lstm_rel_w namartinda R 16:17 *:*: 1 32000M gpunode001 2020-01-28T08:41:31 2020-01-28T08:41:31 2020-01-28T14:41:31 N/A
…
555709 bigme lstm_rel_w namartinda R 16:17 *:*: 1 32000M gpunode001 2020-01-28T08:41:31 2020-01-28T08:41:31 2020-01-28T14:41:32 N/A
555688 bigme lstm_rel_w namartinda R 36:37 *:*: 1 32000M gpunode001 2020-01-28T08:21:10 2020-01-28T08:21:11 2020-01-28T14:21:11 N/A
$
=====

This is with SelectType=select/cons_res, SelectTypeParameters=CR_Core_Memory, and cgroups enabled.

> On Jan 27, 2020, at 10:45 PM, Mahmood Naderan <mahmood...@gmail.com> wrote:
>
> 1) --mem belongs to the physical memory which is requested by job and is
> later reserved for the job by slurm.
> So, on a 64GB node, if a user requests --mem=50GB, actually no one else can
> run a job with 10GB memory need.
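
As a rough sanity check on the numbers above, here is a minimal sketch of the memory-only arithmetic. It considers nothing but memory (no cores, GPUs, or partition limits), and the 64 GB figures are assumptions for illustration: RealMemory=64000 is a hypothetical value for the node in the quoted message, and 1 GB is treated as 1024 MB.

```shell
#!/bin/sh
# Memory-only packing estimate; cores, GPUs and other limits are ignored.
real_memory_mb=382000   # RealMemory from the slurm.conf line above
job_mem_mb=32000        # per-job memory shown by squeue (32000M)

# Integer division: how many such jobs fit in RealMemory at once.
max_jobs=$((real_memory_mb / job_mem_mb))
leftover_mb=$((real_memory_mb - max_jobs * job_mem_mb))
echo "jobs that fit by memory: $max_jobs"
echo "memory left over (MB): $leftover_mb"

# Hypothetical 64 GB node from the quoted message.
node_mb=64000           # assumed RealMemory, not taken from any real config
first_job_mb=51200      # a 50 GB request, taken as 50 * 1024 MB
second_job_mb=10240     # a 10 GB request, taken as 10 * 1024 MB
remaining_mb=$((node_mb - first_job_mb))
if [ "$remaining_mb" -ge "$second_job_mb" ]; then
    echo "second job fits by memory"
else
    echo "second job does not fit by memory"
fi
```

Under those assumptions the memory left after the first job is still larger than the second request, so memory alone would not keep the 10 GB job off the node. Whether it actually starts still depends on free cores and the rest of the scheduler configuration.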