Mark,
I think the problem is likely due to the networking differences
between the nodes. Check out these two FAQ entries:
http://www.open-mpi.org/faq/?category=tcp#tcp-multi-network
http://www.open-mpi.org/faq/?category=tcp#tcp-selection
Specifically, I think you should try using a pair of these
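
As a minimal sketch of what the tcp-selection FAQ entry describes: the
btl_tcp_if_include MCA parameter restricts Open MPI's TCP traffic to a named
interface. The interface name (eth0), process count, and program name below
are hypothetical placeholders, not details from this thread; substitute an
interface that is actually common to LT, SGT, and PFC.

```shell
# Hypothetical values: adjust to the real cluster.
IFACE="eth0"          # assumed: an interface present on all three nodes
NP=6                  # assumed process count
APP="./my_mpi_app"    # hypothetical MPI program

# Restrict the TCP BTL (MPI point-to-point traffic) to one interface.
# The complementary parameter is btl_tcp_if_exclude; use one or the other,
# and keep the loopback device in the list if you exclude.
CMD="mpirun --mca btl_tcp_if_include $IFACE --host LT,SGT,PFC -np $NP $APP"

# Print the command for inspection instead of running it blindly.
echo "$CMD"
```

If a command of this form runs cleanly when started from LT, that would
point to interface selection rather than the application itself.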
Are all three machines running the same OS and version, perchance? If
the machines are heterogeneous in terms of OS, glibc version, etc.,
weird things like these hangs can occur.
Additionally, are you running a firewall on any of these machines?
Ensure that iptables isn't running. It doe
On Jan 15, 2008 7:54 PM, Mark Kosmowski wrote:
> Dear Open-MPI Community:
>
> I have a 3 node cluster, each a dual opteron workstation running
> OpenSUSE 10.1 64-bit. The node names are LT, SGT and PFC. When I
> start an mpirun job from either SGT or PFC, things work as they are
> supposed to.
On Tue, Jan 15, 2008 at 07:54:33PM -0500, Mark Kosmowski wrote:
Dear Open-MPI Community:
I have a 3 node cluster, each a dual opteron workstation running
OpenSUSE 10.1 64-bit. The node names are LT, SGT and PFC. When I
start an mpirun job from either SGT or PFC, things work as they are
supposed to. However, if I start the same job from LT, the job hangs
at