-------- Forwarded Message --------
From: Nikos Papadimitriou <nik...@ipta.demokritos.gr>
To: gmx-users@gromacs.org
Subject: [gmx-users] Gromacs 4.5.4 on multi-node cluster
Date: Wed, 7 Dec 2011 16:26:46 +0200

Dear All,

I had been running Gromacs 4.0.7 on a 12-node cluster (Intel i7-920, 4 cores per node) running Rocks 5.4.2. Recently I upgraded the cluster OS to Rocks 5.4.3 and installed Gromacs 4.5.4 from the Bio Roll repository. When running in parallel on a single node, everything works fine. However, when I try to run on more than one node, the run stalls immediately with only the following output:

[gromacs@tornado Test]$ /home/gromacs/.Installed/openmpi/bin/mpirun -np 2 -machinefile machines /home/gromacs/.Installed/gromacs/bin/mdrun_mpi -s md_run.tpr -o md_traj.trr -c md_confs.gro -e md.edr -g md.log -v
NNODES=2, MYRANK=0, HOSTNAME=compute-1-1.local
NNODES=2, MYRANK=1, HOSTNAME=compute-1-2.local
NODEID=0 argc=12
NODEID=1 argc=12

The mdrun_mpi process seems to start on both nodes, but the run does not proceed and no files are produced. It looks as if the nodes are waiting for some kind of communication from each other. The problem occurs even for the simplest case (e.g. an NVT simulation of 1000 argon atoms without Coulombic interactions). OpenMPI and the networking between the nodes seem to work fine, since other MPI software runs without any problems.

In an attempt to find a solution, I manually compiled and installed Gromacs 4.5.5 (with --enable-mpi) after installing the latest versions of OpenMPI and FFTW3, and no errors occurred during the installation. However, when trying to run on two different nodes, exactly the same problem appears.

Do you have any idea what might be causing this?
Thank you in advance!

-------- Forwarded Message --------
From: Mark Abraham <mark.abra...@anu.edu.au>
Reply-to: "Discussion list for GROMACS users" <gmx-users@gromacs.org>
To: Discussion list for GROMACS users <gmx-users@gromacs.org>
Subject: [gmx-users] Gromacs 4.5.4 on multi-node cluster
Date: Wed, 7 Dec 2011 16:53:49 +0200

On 8/12/2011 1:26 AM, Nikos Papadimitriou wrote:
> [original problem description quoted in full; snipped]

Can you run a 2-processor MPI test program with that machine file?

Mark
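A minimal two-rank test of the kind Mark asks about could be put together as in the sketch below. It reuses the OpenMPI prefix and the "machines" file from the original post; the mpicc path and the file name mpi_ping.c are illustrative assumptions, not something taken from the thread. If this small exchange also hangs across two nodes, the problem lies in the MPI transport rather than in GROMACS.

# Rank 0 sends one integer to rank 1, and rank 1 prints it.
cat > mpi_ping.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, value = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank %d received %d\n", rank, value);
    }
    MPI_Finalize();
    return 0;
}
EOF

# Build and run it with the same launcher and machine file used for mdrun_mpi.
/home/gromacs/.Installed/openmpi/bin/mpicc -o mpi_ping mpi_ping.c
/home/gromacs/.Installed/openmpi/bin/mpirun -np 2 -machinefile machines ./mpi_ping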
-------- Forwarded Message --------
From: Nikos Papadimitriou <nik...@ipta.demokritos.gr>
To: gmx-users@gromacs.org
Subject: [gmx-users] Re: Gromacs 4.5.4 on multi-node cluster
Date: Thu, 8 Dec 2011 11:44:36 +0200

> Can you run a 2-processor MPI test program with that machine file?

"Unfortunately", other MPI programs run fine on 2 or more nodes. There seems to be no problem with MPI itself.

-------- Forwarded Message --------
From: Dimitris Dellis <nte...@gmail.com>
Reply-to: "Discussion list for GROMACS users" <gmx-users@gromacs.org>
To: Nikos Papadimitriou <nik...@ipta.demokritos.gr>, Discussion list for GROMACS users <gmx-users@gromacs.org>
Subject: [gmx-users] Re: Gromacs 4.5.4 on multi-node cluster
Date: Thu, 8 Dec 2011 12:06:10 +0200

Hi.

This is OpenMPI-related. You probably have the virbr0 interface with IP 192.168.122.1 active on the nodes. Stop and disable the libvirtd (and probably libvirt-guests) service if you do not need it.

Alternatively,
1. add --mca btl_tcp_if_exclude lo,virbr0 to the mpirun flags, or
2. add the following line to /home/gromacs/.Installed/openmpi/etc/openmpi-mca-params.conf:
   btl_tcp_if_exclude = lo,virbr0

Either way, virbr0 is excluded from the list of interfaces that OpenMPI can use for communication. (If virbr1 etc. are present, add them to the exclude list as well.)
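For reference, the remedies Dimitris lists could be applied roughly as in the sketch below. The mpirun flag and the openmpi-mca-params.conf line come from his message; the ifconfig check and the service/chkconfig commands are assumptions about a typical Rocks/CentOS 5 compute node (they need root and must be run on every node).

# See whether the libvirt NAT bridge is present on a node.
/sbin/ifconfig virbr0

# (a) If virtualization is not needed, stop and disable libvirtd.
service libvirtd stop
chkconfig libvirtd off

# (b) Or exclude the bridge for a single run.
/home/gromacs/.Installed/openmpi/bin/mpirun -np 2 -machinefile machines \
    --mca btl_tcp_if_exclude lo,virbr0 \
    /home/gromacs/.Installed/gromacs/bin/mdrun_mpi \
    -s md_run.tpr -o md_traj.trr -c md_confs.gro -e md.edr -g md.log -v

# (c) Or exclude it permanently for this OpenMPI installation.
echo 'btl_tcp_if_exclude = lo,virbr0' >> \
    /home/gromacs/.Installed/openmpi/etc/openmpi-mca-params.conf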
Thank you very much! It works.