Re: [OMPI users] odd network behavior

2008-01-25 Thread Tim Mattox
Mark, I think the problem is likely due to the networking differences between the nodes. Check out these two FAQ entries:
http://www.open-mpi.org/faq/?category=tcp#tcp-multi-network
http://www.open-mpi.org/faq/?category=tcp#tcp-selection
Specifically, I think you should try using a pair of these

Re: [OMPI users] odd network behavior

2008-01-18 Thread Jeff Squyres
Are all three machines running the same OS and version, perchance? If the machines are heterogeneous in terms of OS, glibc version, etc., weird things like these hangs can occur. Additionally, are you running a firewall on any of these machines? Ensure that iptables isn't running. It doe

Re: [OMPI users] odd network behavior

2008-01-17 Thread Mark Kosmowski
On Jan 15, 2008 7:54 PM, Mark Kosmowski wrote:
> Dear Open-MPI Community:
>
> I have a 3 node cluster, each a dual opteron workstation running
> OpenSUSE 10.1 64-bit. The node names are LT, SGT and PFC. When I
> start an mpirun job from either SGT or PFC, things work as they are
> supposed to.

Re: [OMPI users] odd network behavior

2008-01-16 Thread Barry Rountree
On Tue, Jan 15, 2008 at 07:54:33PM -0500, Mark Kosmowski wrote:
> Dear Open-MPI Community:
>
> I have a 3 node cluster, each a dual opteron workstation running
> OpenSUSE 10.1 64-bit. The node names are LT, SGT and PFC. When I
> start an mpirun job from either SGT or PFC, things work as they are

[OMPI users] odd network behavior

2008-01-15 Thread Mark Kosmowski
Dear Open-MPI Community:

I have a 3 node cluster, each a dual opteron workstation running OpenSUSE 10.1 64-bit. The node names are LT, SGT and PFC. When I start an mpirun job from either SGT or PFC, things work as they are supposed to. However, if I start the same job from LT, the job hangs at