Please see the latest update:

# for i in {0..2}; do scontrol show node compute-0-$i | grep RealMemory; done && scontrol show node hpc | grep RealMemory
   RealMemory=64259 AllocMem=1024 FreeMem=57163 Sockets=32 Boards=1
   RealMemory=120705 AllocMem=1024 FreeMem=97287 Sockets=32 Boards=1
   RealMemory=64259 AllocMem=1024 FreeMem=40045 Sockets=32 Boards=1
   RealMemory=64259 AllocMem=1024 FreeMem=24154 Sockets=10 Boards=1
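(As a cross-check, the same per-node CPU and memory picture can be pulled with a single sinfo call; the partition name comes from the job output below, and the output format string is just one possible choice:)

$ sinfo -N -p SEA -o "%N %c %C %m %e"   # CPUs per node, CPUs as allocated/idle/other/total, configured memory, free memory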
$ sbatch slurm_qe.sh
Submitted batch job 125

$ squeue
  JOBID PARTITION     NAME     USER ST  TIME NODES NODELIST(REASON)
    125       SEA    qe-fb  mahmood PD  0:00     4 (Resources)
    124       SEA   U1phi1   abspou  R  3:52     4 compute-0-[0-2],hpc

$ scontrol show -d job 125
JobId=125 JobName=qe-fb
   UserId=mahmood(1000) GroupId=mahmood(1000) MCS_label=N/A
   Priority=1751 Nice=0 Account=fish QOS=normal WCKey=*default
   JobState=PENDING Reason=Resources Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   DerivedExitCode=0:0
   RunTime=00:00:00 TimeLimit=30-00:00:00 TimeMin=N/A
   SubmitTime=2019-12-17T12:29:08 EligibleTime=2019-12-17T12:29:08
   AccrueTime=2019-12-17T12:29:08
   StartTime=Unknown EndTime=Unknown Deadline=N/A
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2019-12-17T12:29:09
   Partition=SEA AllocNode:Sid=hpc.scu.ac.ir:22742
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=(null)
   NumNodes=4-4 NumCPUs=20 NumTasks=20 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=20,mem=40G,node=4,billing=20
   Socks/Node=* NtasksPerN:B:S:C=5:0:*:* CoreSpec=*
   MinCPUsNode=5 MinMemoryNode=10G MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/home/mahmood/qe/f_borophene/slurm_qe.sh
   WorkDir=/home/mahmood/qe/f_borophene
   StdErr=/home/mahmood/qe/f_borophene/my_fb.log
   StdIn=/dev/null
   StdOut=/home/mahmood/qe/f_borophene/my_fb.log
   Power=

$ cat slurm_qe.sh
#!/bin/bash
#SBATCH --job-name=qe-fb
#SBATCH --output=my_fb.log
#SBATCH --partition=SEA
#SBATCH --account=fish
#SBATCH --mem=10GB
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=5
mpirun -np $SLURM_NTASKS /share/apps/q-e-qe-6.5/bin/pw.x -in f_borophene_scf.in

You can also see the details of job 124:

$ scontrol show -d job 124
JobId=124 JobName=U1phi1
   UserId=abspou(1002) GroupId=abspou(1002) MCS_label=N/A
   Priority=958 Nice=0 Account=fish QOS=normal WCKey=*default
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   DerivedExitCode=0:0
   RunTime=00:06:17 TimeLimit=30-00:00:00 TimeMin=N/A
   SubmitTime=2019-12-17T12:25:17 EligibleTime=2019-12-17T12:25:17
   AccrueTime=2019-12-17T12:25:17
   StartTime=2019-12-17T12:25:17 EndTime=2020-01-16T12:25:17 Deadline=N/A
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2019-12-17T12:25:17
   Partition=SEA AllocNode:Sid=hpc.scu.ac.ir:20085
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=compute-0-[0-2],hpc
   BatchHost=compute-0-0
   NumNodes=4 NumCPUs=24 NumTasks=24 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=24,mem=4G,node=4,billing=24
   Socks/Node=* NtasksPerN:B:S:C=6:0:*:* CoreSpec=*
     Nodes=compute-0-[0-2],hpc CPU_IDs=0-5 Mem=1024 GRES=
   MinCPUsNode=6 MinMemoryNode=1G MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/home/abspou/OpenFOAM/abbaspour-6/run/laminarSMOKEPhi1U1/slurm_script.sh
   WorkDir=/home/abspou/OpenFOAM/abbaspour-6/run/laminarSMOKEPhi1U1
   StdErr=/home/abspou/OpenFOAM/abbaspour-6/run/laminarSMOKEPhi1U1/alpha3.45U1phi1lamSmoke.log
   StdIn=/dev/null
   StdOut=/home/abspou/OpenFOAM/abbaspour-6/run/laminarSMOKEPhi1U1/alpha3.45U1phi1lamSmoke.log
   Power=

I cannot figure out what the root of the problem is.

Regards,
Mahmood


On Tue, Dec 17, 2019 at 11:18 AM Marcus Wagner <wag...@itc.rwth-aachen.de> wrote:
> Dear Mahmood,
>
> could you please show the output of
>
> scontrol show -d job 119
>
> Best
> Marcus
>