[OMPI users] example program "ring" hangs when running across multiple hardware nodes

2013-07-04 Thread Jed O. Kaplan
Dear openmpi gurus,

I am running openmpi 1.7.2 on a homogeneous cluster of Apple Xserves
running OS X 10.6.8. My hardware nodes are connected through four
gigabit ethernet connections; I have no infiniband or other high-speed
interconnect. The problem I describe below is the same if I use openmpi
1.6.5. My openmpi installation is compiled with Intel icc and ifort. See
the attached result of ompi_info --all for more details on my
installation and runtime parameters, and further diagnostic information
below.

My problem is that inter-node communication hangs in one of my own
programs. I first thought this was the fault of my own bad programming,
so I tried some of the example programs distributed with the openmpi
source code. The "ring_*" example shows the same faulty behavior no
matter which API I use (c, cxx, fortran, etc.): if I run the program on
a single hardware node (with multiple processes) it works fine, but as
soon as I run it across hardware nodes, it hangs. Below you will find
an example of the program output and other diagnostic information.

This problem has really frustrated me. Unfortunately I am not
experienced enough with openmpi to take the debugging much further.

Thank you in advance for any help you can give me!

Jed Kaplan

--- DETAILS OF MY PROBLEM ---

-- this run works because it is only on one hardware node --

jkaplan@grkapsrv2:~/openmpi_examples >  mpirun --prefix /usr/local
--hostfile arvehosts.txt -np 3 ring_c
Process 0 sending 10 to 1, tag 201 (3 processes in ring)
Process 0 sent to 1
Process 0 decremented value: 9
Process 0 decremented value: 8
Process 0 decremented value: 7
Process 0 decremented value: 6
Process 0 decremented value: 5
Process 0 decremented value: 4
Process 0 decremented value: 3
Process 0 decremented value: 2
Process 0 decremented value: 1
Process 0 decremented value: 0
Process 0 exiting
Process 1 exiting
Process 2 exiting

-- this run hangs when running over two hardware nodes --

jkaplan@grkapsrv2:~/openmpi_examples >  mpirun --prefix /usr/local
--hostfile arvehosts.txt -np 4 ring_c
Process 0 sending 10 to 1, tag 201 (4 processes in ring)
Process 0 sent to 1
Process 0 decremented value: 9
Process 0 decremented value: 8
... hangs forever ...
^CKilled by signal 2.
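
A possible next step for debugging (a sketch I have not run here; the
verbosity level is just an example) is to raise the TCP BTL's verbosity
so openmpi reports which interfaces and addresses it tries to use:

mpirun --prefix /usr/local --hostfile arvehosts.txt \
  --mca btl_base_verbose 30 -np 4 ring_c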

-- here is what my hostfile looks like --

jkaplan@grkapsrv2:~/openmpi_examples > cat arvehosts.txt 
#host file for ARVE group mac servers

10.0.0.21 slots=3
10.0.0.31 slots=8
10.0.0.41 slots=8
10.0.0.51 slots=8
10.0.0.61 slots=8 
10.0.0.71 slots=8

-- results of ifconfig - this looks pretty much the same on all of my
servers, with different IP addresses of course --

jkaplan@grkapsrv2:~/openmpi_examples > ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
	inet6 ::1 prefixlen 128
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
	inet 127.0.0.1 netmask 0xff000000
gif0: flags=8010<POINTOPOINT,MULTICAST> mtu 1280
stf0: flags=0<> mtu 1280
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
	ether 00:24:36:f3:dc:fc
	inet6 fe80::224:36ff:fef3:dcfc%en0 prefixlen 64 scopeid 0x4
	inet 128.178.107.85 netmask 0xffffff00 broadcast 128.178.107.255
	media: autoselect (1000baseT)
	status: active
en1: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
	ether 00:24:36:f3:dc:fa
	inet6 fe80::224:36ff:fef3:dcfa%en1 prefixlen 64 scopeid 0x5
	inet 10.0.0.2 netmask 0xff000000 broadcast 10.255.255.255
	media: autoselect (1000baseT)
	status: active
en2: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
	ether 00:24:36:f5:ba:4e
	inet6 fe80::224:36ff:fef5:ba4e%en2 prefixlen 64 scopeid 0x6
	inet 10.0.0.21 netmask 0xff000000 broadcast 10.255.255.255
	media: autoselect (1000baseT)
	status: active
en3: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
	ether 00:24:36:f5:ba:4f
	inet6 fe80::224:36ff:fef5:ba4f%en3 prefixlen 64 scopeid 0x7
	inet 10.0.0.22 netmask 0xff000000 broadcast 10.255.255.255
	media: autoselect (1000baseT)
	status: active
fw0: flags=8822<BROADCAST,SMART,SIMPLEX,MULTICAST> mtu 4078
	lladdr 04:1e:64:ff:fe:f8:aa:d2
	media: autoselect
	status: inactive

-- excerpt of ompi_info --all output --

jkaplan@grkapsrv2:~/science_projects/mpitest/openmpi_examples > ompi_info --all
                 Prefix: /usr/local
            Exec_prefix: /usr/local
                 Bindir: /usr/local/bin
                Sbindir: /usr/local/sbin
                 Libdir: /usr/local/lib
                 Incdir: /usr/local/include
                 Mandir: /usr/local/share/man
              Pkglibdir: /usr/local/lib/openmpi
             Libexecdir: /usr/local/libexec
            Datarootdir: /usr/local/share
                Datadir: /usr/local/share
             Sysconfdir: /usr/local/etc
         Sharedstatedir: /usr/local/com
          Localstatedir: /usr/local/var
                Infodir: /usr/local/share/info
             Pkgdatadir: /usr/local/share/openmpi
              Pkglibdir: /usr/local/lib/openmpi
          Pkgincludedir: /usr/local/include/openmpi
Configured architecture: x86_64-apple-darw

Re: [OMPI users] example program "ring" hangs when running across multiple hardware nodes (SOLVED)

2013-07-05 Thread Jed O. Kaplan
Dear Gus,

Thanks for your help - your clue solved my problem!

The ultimate solution was to limit mpi communications to the local,
unrouted subnet. I made this the default behavior for all users of my
cluster by adding the following line to the bottom of my
$prefix/etc/openmpi-mca-params.conf file:

btl_tcp_if_include = 10.0.0.0/8
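
The same parameter can also be set for a single run on the mpirun
command line, or via an environment variable, without touching the
config file (a sketch using the same subnet value):

mpirun --prefix /usr/local --mca btl_tcp_if_include 10.0.0.0/8 \
  --hostfile arvehosts.txt -np 4 ring_c

export OMPI_MCA_btl_tcp_if_include=10.0.0.0/8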

Thanks again - what a relief!

Jed

On Fri, Jul 5, 2013, at 01:25 AM, Gustavo Correa wrote:
> Hi Jed 
> 
> You could try to select only the ethernet interface that matches your
> nodes' IP addresses in the hostfile, which seems to be en2.
> 
> The en0 interface seems to be an external IP.
> Not sure about en3, but it is odd that it has a different IP from
> en2 yet sits in the same subnet.
> I wonder if this may be the reason for the program hanging.
> 
> You may need to check ifconfig on all nodes for a consistent set of
> interfaces/IP addresses, and tailor your mpiexec command line and
> your hostfile accordingly.
> 
> Say, something like this:
> 
> mpiexec -mca btl_tcp_if_include en2 -hostfile your_hostfile -np 43
> ./ring_c
> 
> See this FAQ (actually, all of them are very informative):
> http://www.open-mpi.org/faq/?category=tcp#tcp-selection
> 
> I hope this helps,
> Gus Correa
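
A closing note on interface selection: btl_tcp_if_include accepts
either interface names (en2) or CIDR subnets (10.0.0.0/8), and Open MPI
also provides a complementary btl_tcp_if_exclude parameter; the two are
mutually exclusive, and an explicit exclude list should normally keep
the loopback interface. A sketch using the interface names from the
ifconfig output above (untested on this cluster):

mpiexec --mca btl_tcp_if_exclude lo0,en0,en1,en3 \
  -hostfile your_hostfile -np 43 ./ring_c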