On Thu, 03 May 2012, Rolf vandeVaart wrote:
> I tried your program on a single node and it worked fine.

It works fine on a single node, but deadlocks when it communicates
between nodes. Single-node communication doesn't use TCP by default.

> Yes, TCP message passing in Open MPI has been working well for some
> time.

Ok. Which version(s) of Open MPI are you using successfully? [I'm
assuming that this is in an environment which doesn't use IB.]

> 1. Can you run something like hostname successfully (mpirun -np 10
> -hostfile yourhostfile hostname)

Yes, but this only shows that processes start and output is returned,
which doesn't utilize the in-band message passing at all.

> 2. If that works, then you can also run with a debug switch to see
> what connections are being made by MPI.

You can see the connections being made in the attached log:

[archimedes:29820] btl: tcp: attempting to connect() to [[60576,1],2] address 138.23.141.162 on port 2001

> I would suggest reading through here for some ideas and for the
> debug switch.

Thanks. I checked the FAQ, and unfortunately didn't see anything that
shed any light.

Don Armstrong

--
Fate and Temperament are two words for one and the same concept.
 -- Novalis [Hermann Hesse _Demian_]

http://www.donarmstrong.com
http://rzlab.ucr.edu