Hi.
This is OpenMPI-related.
You probably have the virbr0 interface active with IP 192.168.122.1 on
the nodes.
Stop and disable the libvirtd (and probably libvirt-guests) service if
you don't need it.
Alternatively,
1. add --mca btl_tcp_if_exclude lo,virbr0 to the mpirun flags,
or
2. add the following line to
/home/gromacs/.Installed/openmpi/etc/openmpi-mca-params.conf:
btl_tcp_if_exclude = lo,virbr0
to exclude virbr0 from the list of interfaces that OpenMPI can use for
communication.
(If virbr1 etc. are present, add them to the exclude list as well.)
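A minimal sketch of both options as shell commands, reusing the installation paths from the original mpirun command (whether your MCA parameter file lives at that path is an assumption; adjust to your setup):

```shell
# Check which interfaces each node exposes; a libvirt NAT bridge
# typically appears as virbr0 with address 192.168.122.1.
ip addr show

# Option 1: exclude the loopback and the bridge on the command line.
mpirun -np 2 -machinefile machines \
    --mca btl_tcp_if_exclude lo,virbr0 \
    /home/gromacs/.Installed/gromacs/bin/mdrun_mpi -s md_run.tpr -v

# Option 2: make the exclusion permanent in the MCA parameter file
# (assumed location; path taken from the original command).
echo 'btl_tcp_if_exclude = lo,virbr0' \
    >> /home/gromacs/.Installed/openmpi/etc/openmpi-mca-params.conf
```

Note that btl_tcp_if_exclude and btl_tcp_if_include are mutually exclusive: set one or the other, not both.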
On 12/08/2011 11:44 AM, Nikos Papadimitriou wrote:
-------- Forwarded Message --------
*From*: Nikos Papadimitriou <nik...@ipta.demokritos.gr>
*To*: gmx-users@gromacs.org
*Subject*: [gmx-users] Gromacs 4.5.4 on multi-node cluster
*Date*: Wed, 7 Dec 2011 16:26:46 +0200
Dear All,
I had been running Gromacs 4.0.7 on a 12-node cluster (4-core Intel
i7-920) under Rocks 5.4.2. Recently, I upgraded the cluster OS to
Rocks 5.4.3 and installed Gromacs 4.5.4 from the Bio Roll repository.
When running in parallel on a single node, everything works fine.
However, when I try to run on more than one node, the run stalls
immediately with the following message:
[gromacs@tornado Test]$ /home/gromacs/.Installed/openmpi/bin/mpirun
-np 2 -machinefile machines
/home/gromacs/.Installed/gromacs/bin/mdrun_mpi -s md_run.tpr -o
md_traj.trr -c md_confs.gro -e md.edr -g md.log -v
NNODES=2, MYRANK=0, HOSTNAME=compute-1-1.local
NNODES=2, MYRANK=1, HOSTNAME=compute-1-2.local
NODEID=0 argc=12
NODEID=1 argc=12
The mdrun_mpi process seems to start on both nodes, but the run does
not proceed and no files are produced. It seems that the nodes are
waiting for some kind of communication between them. The problem
occurs even in the simplest case (i.e. an NVT simulation of 1000 argon
atoms without Coulombic interactions). OpenMPI and the networking
between the nodes seem to work fine, since there are no problems with
other software that runs with MPI.
In an attempt to find a solution, I manually compiled and installed
Gromacs 4.5.5 (with --enable-mpi) after installing the latest versions
of OpenMPI and FFTW3, and no errors occurred during the installation.
However, when trying to run on two different nodes, exactly the same
problem appears.
Do you have any idea what might be causing this?
Thank you in advance!
-------- Forwarded Message --------
*From*: Mark Abraham <mark.abra...@anu.edu.au>
*Reply-to*: "Discussion list for GROMACS users" <gmx-users@gromacs.org>
*To*: Discussion list for GROMACS users <gmx-users@gromacs.org>
*Subject*: [gmx-users] Gromacs 4.5.4 on multi-node cluster
*Date*: Wed, 7 Dec 2011 16:53:49 +0200
On 8/12/2011 1:26 AM, Nikos Papadimitriou wrote:
Dear All,
I had been running Gromacs 4.0.7 on a 12-node cluster (4-core Intel
i7-920) under Rocks 5.4.2. Recently, I upgraded the cluster OS to
Rocks 5.4.3 and installed Gromacs 4.5.4 from the Bio Roll repository.
When running in parallel on a single node, everything works fine.
However, when I try to run on more than one node, the run stalls
immediately with the following message:
[gromacs@tornado Test]$ /home/gromacs/.Installed/openmpi/bin/mpirun
-np 2 -machinefile machines
/home/gromacs/.Installed/gromacs/bin/mdrun_mpi -s md_run.tpr -o
md_traj.trr -c md_confs.gro -e md.edr -g md.log -v
NNODES=2, MYRANK=0, HOSTNAME=compute-1-1.local
NNODES=2, MYRANK=1, HOSTNAME=compute-1-2.local
NODEID=0 argc=12
NODEID=1 argc=12
The mdrun_mpi process seems to start on both nodes, but the run does
not proceed and no files are produced. It seems that the nodes are
waiting for some kind of communication between them. The problem
occurs even in the simplest case (i.e. an NVT simulation of 1000 argon
atoms without Coulombic interactions). OpenMPI and the networking
between the nodes seem to work fine, since there are no problems with
other software that runs with MPI.
Can you run a 2-processor MPI test program with that machine file?
Mark
"Unfortunately", other MPI programs run fine on 2 or more nodes. There
seems to be no problem with MPI.
In an attempt to find a solution, I manually compiled and installed
Gromacs 4.5.5 (with --enable-mpi) after installing the latest versions
of OpenMPI and FFTW3, and no errors occurred during the installation.
However, when trying to run on two different nodes, exactly the same
problem appears.
Do you have any idea what might be causing this?
Thank you in advance!
--
gmx-users mailing list gmx-users@gromacs.org
http://lists.gromacs.org/mailman/listinfo/gmx-users
Please search the archive at
http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
Please don't post (un)subscribe requests to the list. Use the
www interface or send it to gmx-users-requ...@gromacs.org.
Can't post? Read http://www.gromacs.org/Support/Mailing_Lists