I tested with

  env strace srun --pack-group=0 --ntasks=2 : --pack-group=1 --ntasks=4 pw.x -i mos2.rlx.in
in the slurm script, and everything is fine now!! This is going to be a
nasty bug to find...

Regards,
Mahmood


On Thu, Mar 28, 2019 at 9:18 PM Mahmood Naderan <mahmood...@gmail.com> wrote:
> Yes, that works.
>
> $ grep "Parallel version" big-mem
>      Parallel version (MPI), running on     1 processors
>      Parallel version (MPI), running on     1 processors
>      Parallel version (MPI), running on     1 processors
>      Parallel version (MPI), running on     1 processors
> $ squeue
>   JOBID PARTITION  NAME    USER ST  TIME NODES NODELIST(REASON)
>     776    QUARTZ  myQE  ghatee  R  1:08     2 compute-0-2,rocks7
> $ grep "pseudo file is empty or wrong" big-mem
> $ squeue
>   JOBID PARTITION  NAME    USER ST  TIME NODES NODELIST(REASON)
>     776    QUARTZ  myQE  ghatee  R  1:47     2 compute-0-2,rocks7
> $ cat slurm_script.sh
> #!/bin/bash
> #SBATCH --job-name=myQE
> #SBATCH --output=big-mem
> #SBATCH --ntasks-per-node=2
> #SBATCH --nodes=2
> #SBATCH --mem-per-cpu=16G
> #SBATCH --partition=QUARTZ
> #SBATCH --account=z5
> srun pw.x -i mos2.rlx.in
>
> I will try to dig more.
>
> Regards,
> Mahmood
>
>
> On Thu, Mar 28, 2019 at 9:04 PM Frava <fravad...@gmail.com> wrote:
>> Well, does it also crash when you run it with two nodes in a normal way
>> (not using heterogeneous jobs)?
>>
>> #!/bin/bash
>> #SBATCH --job-name=myQE_2Nx2MPI
>> #SBATCH --output=big-mem
>> #SBATCH --nodes=2
>> #SBATCH --ntasks-per-node=2
>> #SBATCH --mem-per-cpu=16g
>> #SBATCH --partition=QUARTZ
>> #SBATCH --account=z5
>> #
>> srun pw.x -i mos2.rlx.in
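
For comparison, the heterogeneous-job form of the same submission declares
each pack group's resources in its own #SBATCH block. The script below is a
minimal sketch, assuming the pre-20.02 Slurm "pack" syntax shown in this
thread (the "#SBATCH packjob" separator and "srun --pack-group", which later
Slurm releases renamed to "hetjob" and "--het-group"); the job name and the
2+4 task split are illustrative, taken from the srun line above:

#!/bin/bash
#SBATCH --job-name=myQE_het
#SBATCH --output=big-mem
# --- pack group 0 ---
#SBATCH --partition=QUARTZ --account=z5
#SBATCH --ntasks=2 --mem-per-cpu=16G
#SBATCH packjob
# --- pack group 1 ---
#SBATCH --partition=QUARTZ --account=z5
#SBATCH --ntasks=4 --mem-per-cpu=16G

# Launch pw.x across both pack groups (2 + 4 MPI tasks), as in the
# command that worked above.
srun --pack-group=0 --ntasks=2 : --pack-group=1 --ntasks=4 pw.x -i mos2.rlx.in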