Hi,

With the following memory stats on two nodes

[root@hpc slurm]# scontrol show node compute-0-0 | grep Memory
   RealMemory=64259 AllocMem=0 FreeMem=63429 Sockets=32 Boards=1
[root@hpc slurm]# scontrol show node compute-0-1 | grep Memory
   RealMemory=120705 AllocMem=1024 FreeMem=103051 Sockets=32 Boards=1
the following script

#!/bin/bash
#SBATCH --job-name=qe
#SBATCH --output=my_fb.log
#SBATCH --partition=SEA2
#SBATCH --account=fish2
#SBATCH --mem=18GB
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=8
mpirun -np $SLURM_NTASKS /share/apps/q-e-qe-6.5/bin/pw.x -in 5.in

fails with an AssocGrpMemLimit error. The limit for user mn is 32G according to the following command

[root@hpc slurm]# sacctmgr list association format=partition,account,user,grptres%30 | grep mn
       sea       fish         mn                 cpu=16,mem=32G
      sea2      fish2         mn                 cpu=16,mem=32G
                local         mn

According to squeue, there is another running job, as shown below

[root@hpc slurm]# squeue
  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
    492      SEA2       qe       mn PD       0:00      2 (AssocGrpMemLimit)
    481       SEA U8Phi0.6      abb  R 2-18:57:06      3 compute-0-[1-2],hpc

The memory limit for the second user is 12G, as shown below

[root@hpc slurm]# sacctmgr list association format=partition,account,user,grptres%30 | grep abbas
       sea       fish        abb                 cpu=15,mem=12G
                local        abb

May I know what exactly is limiting the memory request for the user mn?

Regards,
Mahmood
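
P.S. For reference, the totals involved can be checked with a little arithmetic. This is only a sketch under two assumptions: that --mem is counted per node, and that the GrpTRES mem value is compared against the sum over all nodes of the job. The variable names are illustrative and the numbers are copied from the script and the sacctmgr output in this message.

```shell
#!/bin/bash
# Values copied from the job script and the sacctmgr output in this message.
mem_per_node_gb=18   # --mem=18GB (assumed to be a per-node request)
nodes=2              # --nodes=2
grp_mem_limit_gb=32  # GrpTRES mem=32G for user mn, account fish2

# Total memory the job would count against the association limit,
# assuming the limit applies to the sum over all allocated nodes.
total_gb=$(( mem_per_node_gb * nodes ))

if (( total_gb > grp_mem_limit_gb )); then
    exceeds="yes"
else
    exceeds="no"
fi

echo "requested ${total_gb}G, limit ${grp_mem_limit_gb}G, exceeds: ${exceeds}"
```

If that reading is right, the job counts as 36G against the 32G limit regardless of the other running job, which would explain the pending reason.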