Dear All,

I had been running Gromacs 4.0.7 on a 12-node cluster (one quad-core
Intel i7-920 per node) under Rocks 5.4.2. I recently upgraded the
cluster OS to Rocks 5.4.3 and installed Gromacs 4.5.4 from the Bio Roll
repository. Parallel runs on a single node work fine. However, when I
try to run on more than one node, the run stalls immediately after
printing the following output:

[gromacs@tornado Test]$ /home/gromacs/.Installed/openmpi/bin/mpirun -np
2 -machinefile machines /home/gromacs/.Installed/gromacs/bin/mdrun_mpi
-s md_run.tpr -o md_traj.trr -c md_confs.gro -e md.edr -g md.log -v
NNODES=2, MYRANK=0, HOSTNAME=compute-1-1.local
NNODES=2, MYRANK=1, HOSTNAME=compute-1-2.local
NODEID=0 argc=12
NODEID=1 argc=12

The mdrun_mpi processes seem to start on both nodes, but the run does
not proceed and no output files are produced. It looks as if the nodes
are waiting for some kind of communication between them. The problem
occurs even for the simplest case (i.e. an NVT simulation of 1000 argon
atoms without Coulombic interactions). Open MPI and the networking
between the nodes seem to work fine, since other software that runs
over MPI has no problems.
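
One way to double-check the networking assumption is to confirm that
every host in the machinefile resolves to a non-loopback address. The
script below is only a minimal sketch and was not part of the original
tests. It assumes the machinefile is named "machines" and uses the
usual Open MPI hostfile layout (one host per line, optional "slots=N",
"#" comments). The function names are made up for illustration.

```python
import socket


def parse_machinefile(text):
    """Return the host names listed in an Open MPI style machinefile."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if line:
            hosts.append(line.split()[0])     # ignore trailing "slots=N" etc.
    return hosts


def resolve(host):
    """Return the sorted IPv4 addresses for a host name, or [] on failure."""
    try:
        infos = socket.getaddrinfo(host, None, socket.AF_INET)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def check_hosts(hosts):
    """Map each host to (addresses, ok).

    ok is False when the name does not resolve or resolves only to
    loopback (127.x.x.x) addresses.
    """
    report = {}
    for host in hosts:
        addrs = resolve(host)
        ok = any(not addr.startswith("127.") for addr in addrs)
        report[host] = (addrs, ok)
    return report


if __name__ == "__main__":
    # "machines" is the assumed machinefile name, as in the mpirun command.
    with open("machines") as fh:
        hosts = parse_machinefile(fh.read())
    for host, (addrs, ok) in check_hosts(hosts).items():
        print(host, addrs, "OK" if ok else "PROBLEM")
```

Running it on each compute node in turn would show whether the nodes
agree on each other's addresses. It does not test MPI traffic itself.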

In an attempt to find a solution, I manually compiled and installed
Gromacs 4.5.5 (with --enable-mpi) after installing the latest versions
of Open MPI and fftw3; no errors occurred during the installation.
However, exactly the same problem appears when I try to run on two
different nodes.

Do you have any idea what might cause this?
Thank you in advance!
-- 
gmx-users mailing list    gmx-users@gromacs.org
http://lists.gromacs.org/mailman/listinfo/gmx-users
Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
Please don't post (un)subscribe requests to the list. Use the 
www interface or send it to gmx-users-requ...@gromacs.org.
Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
