BTW, when I manually run on a node, e.g. compute-0-2, I get this output:
]$ mpirun -np 4 pw.x -i mos2.rlx.in

     Program PWSCF v.6.2 starts on 28Mar2019 at 11:40:36

     This program is part of the open-source Quantum ESPRESSO suite
     for quantum simulation of materials; please cite
         "P. Giannozzi et al., J. Phys.:Condens. Matter 21 395502 (2009);
         "P. Giannozzi et al., J. Phys.:Condens. Matter 29 465901 (2017);
          URL http://www.quantum-espresso.org",
     in publications or presentations arising from this work. More details at
     http://www.quantum-espresso.org/quote

     Parallel version (MPI), running on 4 processors

     MPI processes distributed on 1 nodes
     R & G space division:  proc/nbgrp/npool/nimage = 4
     Reading input from mos2.rlx.in
Warning: card &CELL ignored
Warning: card CELL_DYNAMICS = "BFGS" ignored
Warning: card PRESS_CONV_THR = 5.00000E-01 ignored
Warning: card / ignored

     Current dimensions of program PWSCF are:
     Max number of different atomic species (ntypx) = 10
     Max number of k-points (npk) = 40000
     Max angular momentum in pseudopotentials (lmaxx) = 3

     file Mo.revpbe-spn-rrkjus_psl.0.3.0.UPF: wavefunction(s)  4S renormalized

     Subspace diagonalization in iterative solution of the eigenvalue problem:
     a serial algorithm will be used

     Found symmetry operation: I + (  0.0000  0.1667  0.0000)
...
...
...

Regards,
Mahmood


On Thu, Mar 28, 2019 at 8:23 PM Mahmood Naderan <mahmood...@gmail.com> wrote:

> The run is not consistent. I have manually tested "mpirun -np 4 pw.x -i
> mos2.rlx.in" on the compute-0-2 and rocks7 nodes and it is fine.
> However, with the script "srun --pack-group=0 --ntasks=2 : --pack-group=1
> --ntasks=4 pw.x -i mos2.rlx.in" I see some errors in the output file,
> which result in the job aborting after 60 seconds.
>
> The errors are about not finding some files. Although the input file uses
> absolute paths for the intermediate files, and the files do exist, the
> errors sound bizarre.
>
> compute-0-2
>   PID  USER    PR  NI    VIRT    RES  SHR S  %CPU %MEM    TIME+ COMMAND
>  3387  ghatee  20   0 1930488 129684 8336 R 100.0  0.2  0:09.71 pw.x
>  3388  ghatee  20   0 1930476 129700 8336 R  99.7  0.2  0:09.68 pw.x
>
> rocks7
>   PID  USER    PR  NI    VIRT    RES  SHR S  %CPU %MEM    TIME+ COMMAND
>  5592  ghatee  20   0 1930568 127764 8336 R 100.0  0.2  0:17.29 pw.x
>   549  ghatee  20   0  116844   3652 1804 S   0.0  0.0  0:00.14 bash
>
> As you can see, the 2 tasks are fine on compute-0-2, but there should be
> 4 tasks on rocks7.
> The input file contains
>     outdir = "/home/ghatee/job/2h-unitcell" ,
>     pseudo_dir = "/home/ghatee/q-e-qe-5.4/pseudo/" ,
>
> The output file says
>
>      Program PWSCF v.6.2 starts on 28Mar2019 at 11:43:58
>
>      This program is part of the open-source Quantum ESPRESSO suite
>      for quantum simulation of materials; please cite
>          "P. Giannozzi et al., J. Phys.:Condens. Matter 21 395502 (2009);
>          "P. Giannozzi et al., J. Phys.:Condens. Matter 29 465901 (2017);
>           URL http://www.quantum-espresso.org",
>      in publications or presentations arising from this work. More details at
>      http://www.quantum-espresso.org/quote
>
>      Parallel version (MPI), running on 1 processors
>
>      MPI processes distributed on 1 nodes
>      Reading input from mos2.rlx.in
> Warning: card &CELL ignored
> Warning: card CELL_DYNAMICS = "BFGS" ignored
> Warning: card PRESS_CONV_THR = 5.00000E-01 ignored
> Warning: card / ignored
>
>      Program PWSCF v.6.2 starts on 28Mar2019 at 11:43:58
>      ...
>      Parallel version (MPI), running on 1 processors
>
>      MPI processes distributed on 1 nodes
>      Reading input from mos2.rlx.in
> Warning: card &CELL ignored
> Warning: card CELL_DYNAMICS = "BFGS" ignored
> Warning: card PRESS_CONV_THR = 5.00000E-01 ignored
> Warning: card / ignored
>
>      Current dimensions of program PWSCF are:
>      Max number of different atomic species (ntypx) = 10
>      Max number of k-points (npk) = 40000
>      Max angular momentum in pseudopotentials (lmaxx) = 3
>
>      Current dimensions of program PWSCF are:
>      Max number of different atomic species (ntypx) = 10
>      Max number of k-points (npk) = 40000
>      Max angular momentum in pseudopotentials (lmaxx) = 3
>
>      Program PWSCF v.6.2 starts on 28Mar2019 at 20:13:58
>      ...
>      Parallel version (MPI), running on 1 processors
>
>      MPI processes distributed on 1 nodes
>      Reading input from mos2.rlx.in
> Warning: card &CELL ignored
> Warning: card CELL_DYNAMICS = "BFGS" ignored
> Warning: card PRESS_CONV_THR = 5.00000E-01 ignored
> Warning: card / ignored
>
>      Current dimensions of program PWSCF are:
>      Max number of different atomic species (ntypx) = 10
>      Max number of k-points (npk) = 40000
>      Max angular momentum in pseudopotentials (lmaxx) = 3
>
>      Program PWSCF v.6.2 starts on 28Mar2019 at 20:13:58
>      ...
>      Parallel version (MPI), running on 1 processors
>
>      MPI processes distributed on 1 nodes
>      Reading input from mos2.rlx.in
> Warning: card &CELL ignored
> Warning: card CELL_DYNAMICS = "BFGS" ignored
> Warning: card PRESS_CONV_THR = 5.00000E-01 ignored
> Warning: card / ignored
>
>      Current dimensions of program PWSCF are:
>      Max number of different atomic species (ntypx) = 10
>      Max number of k-points (npk) = 40000
>      Max angular momentum in pseudopotentials (lmaxx) = 3
>
>      Program PWSCF v.6.2 starts on 28Mar2019 at 20:13:58
>      ...
>
>      Program PWSCF v.6.2 starts on 28Mar2019 at 20:13:58
>      ...
>      Parallel version (MPI), running on 1 processors
>
>      MPI processes distributed on 1 nodes
>
>      Parallel version (MPI), running on 1 processors
>
>      MPI processes distributed on 1 nodes
>      Reading input from mos2.rlx.in
>      Reading input from mos2.rlx.in
> Warning: card &CELL ignored
> Warning: card CELL_DYNAMICS = "BFGS" ignored
> Warning: card &CELL ignored
> Warning: card CELL_DYNAMICS = "BFGS" ignored
> Warning: card PRESS_CONV_THR = 5.00000E-01 ignored
> Warning: card / ignored
> Warning: card PRESS_CONV_THR = 5.00000E-01 ignored
> Warning: card / ignored
>
>      Current dimensions of program PWSCF are:
>      Max number of different atomic species (ntypx) = 10
>      Max number of k-points (npk) = 40000
>      Max angular momentum in pseudopotentials (lmaxx) = 3
>
>      Current dimensions of program PWSCF are:
>      Max number of different atomic species (ntypx) = 10
>      Max number of k-points (npk) = 40000
>      Max angular momentum in pseudopotentials (lmaxx) = 3
>
>      file Mo.revpbe-spn-rrkjus_psl.0.3.0.UPF: wavefunction(s)  4S renormalized
>      file Mo.revpbe-spn-rrkjus_psl.0.3.0.UPF: wavefunction(s)  4S renormalized
>      file Mo.revpbe-spn-rrkjus_psl.0.3.0.UPF: wavefunction(s)  4S renormalized
>      file Mo.revpbe-spn-rrkjus_psl.0.3.0.UPF: wavefunction(s)  4S renormalized
>      file Mo.revpbe-spn-rrkjus_psl.0.3.0.UPF: wavefunction(s)  4S renormalized
>      file Mo.revpbe-spn-rrkjus_psl.0.3.0.UPF: wavefunction(s)  4S renormalized
> ERROR(FoX)
> Cannot open file
> ERROR(FoX)
> Cannot open file
>
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>      Error in routine read_ncpp (2):
>      pseudo file is empty or wrong
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
>      stopping ...
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
> with errorcode 1.
> ...
> ...
> ...
>
> Verifying that there are 6 "Parallel version (MPI), running on 1
> processors" lines, it seems that it starts as many tasks as I specified
> in the Slurm script. However, I suspect that the program is NOT running
> as a multicore MPI job: it is 6 instances of a serial run, and there may
> be some races between them during the run.
> Any thoughts?
>
> Regards,
> Mahmood
>
>
> On Thu, Mar 28, 2019 at 3:59 PM Frava <fravad...@gmail.com> wrote:
>
>> I didn't receive the last mail from Mahmood, but Marcus is right:
>> Mahmood's heterogeneous job submission seems to be working now.
>> Well, separating each pack in the srun command and asking for the correct
>> number of tasks to be launched for each pack is the way I figured
>> heterogeneous jobs worked with SLURM v18.08.0 (I didn't test it with more
>> recent SLURM versions).
>>
>>
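
For reference, the submission style discussed in this thread can be sketched as a heterogeneous batch script along these lines. This is only a sketch, assuming SLURM 18.08-era `packjob` syntax; the node names match the thread, but the task counts per pack and the `--mpi=pmi2` choice are illustrative assumptions, not something confirmed on this cluster:

```shell
#!/bin/bash
# Heterogeneous job: two resource "packs", separated by the 'packjob'
# directive. Pack 0 gets 2 tasks, pack 1 gets 4 tasks.
#SBATCH --job-name=qe-hetjob
#SBATCH --ntasks=2 --nodelist=compute-0-2
#SBATCH packjob
#SBATCH --ntasks=4 --nodelist=rocks7

# A single srun with one component per pack group, joined by ':', is
# intended to launch all 6 tasks as one parallel job. Whether they end
# up in one MPI_COMM_WORLD (rather than 6 serial instances, as seen in
# the output above) depends on srun providing PMI that the MPI library
# behind pw.x understands -- hence the explicit --mpi=pmi2 here.
srun --mpi=pmi2 --pack-group=0 --ntasks=2 : --pack-group=1 --ntasks=4 \
     pw.x -i mos2.rlx.in
```

Two quick checks may help narrow down the "6 serial instances" symptom: `srun --mpi=list` shows which PMI plugins this Slurm build supports, and `ldd $(which pw.x) | grep -i mpi` shows which MPI library pw.x was actually linked against (it should be the same MPI stack that Slurm was built to talk to).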