On Thu, 03 May 2012, Rolf vandeVaart wrote:
> I tried your program on a single node and it worked fine. 

It works fine on a single node, but deadlocks when it communicates
between nodes. Single-node communication doesn't use TCP by default.

> Yes, TCP message passing in Open MPI has been working well for some
> time.

Ok. Which version(s) of Open MPI are you using successfully? [I'm
assuming that this is in an environment which doesn't use IB.]

> 1. Can you run something like hostname successfully (mpirun -np 10
> -hostfile yourhostfile hostname)

Yes, but this only shows that processes start and output is returned,
which doesn't utilize the in-band message passing at all.

> 2. If that works, then you can also run with a debug switch to see
> what connections are being made by MPI.

You can see the connections being made in the attached log:

[archimedes:29820] btl: tcp: attempting to connect() to [[60576,1],2] address 138.23.141.162 on port 2001
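
As a rough sketch (not part of the attached log or of Open MPI itself), messages like the one above could be pulled apart with a small Python script to list which peer address and port each connect() attempt targets. The function name and the regular expression are assumptions for illustration, based only on the single log line quoted here; other Open MPI versions may word the message differently.

```python
import re

# Pattern derived only from the one log line quoted above (an assumption);
# it is not an official description of Open MPI's log format.
CONNECT_RE = re.compile(
    r"\[(?P<host>[^:\]]+):(?P<pid>\d+)\]\s+btl:\s*tcp:\s+"
    r"attempting to connect\(\)\s+to\s+"
    r"(?P<peer>\[\[\d+,\d+\],\d+\])\s+address\s+(?P<addr>\S+)\s+"
    r"on port\s+(?P<port>\d+)"
)


def parse_connect_attempts(log_text):
    """Return one dict per 'attempting to connect()' message in log_text."""
    # Mail clients may wrap long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [
        {
            "host": m.group("host"),       # node that logged the message
            "pid": int(m.group("pid")),    # PID of the logging process
            "peer": m.group("peer"),       # peer process name as printed
            "addr": m.group("addr"),       # address passed to connect()
            "port": int(m.group("port")),  # port passed to connect()
        }
        for m in CONNECT_RE.finditer(flat)
    ]


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python parse_btl_log.py < mpirun-debug.log
    for attempt in parse_connect_attempts(sys.stdin.read()):
        print(attempt["host"], "->", attempt["addr"], attempt["port"])
```

Listing the attempts this way makes it easier to spot a connect() aimed at an address or port that isn't reachable from the other node.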

> I would suggest reading through here for some ideas and for the
> debug switch.

Thanks. I checked the FAQ, and didn't see anything that shed any
light, unfortunately.


Don Armstrong

-- 
Fate and Temperament are two words for one and the same concept.
 -- Novalis [Hermann Hesse _Demian_]

http://www.donarmstrong.com              http://rzlab.ucr.edu
