On May 22, 2007, at 7:52 PM, Tom Clune wrote:

For example, if it is ppp0, try:

   mpirun -np 1 -mca oob_tcp_exclude ppp0 uptime

This seems to at least produce a bit of output before hanging:

LM000953070:~ tlclune$ mpirun -np 1 -mca oob_tcp_exclude ppp0 uptime
[153.sub-70-211-6.myvzw.com:07562] [0,0,0] mca_oob_tcp_init: invalid address '' returned for selected oob interfaces.
[153.sub-70-211-6.myvzw.com:07562] [0,0,0] ORTE_ERROR_LOG: Error in file oob_tcp.c at line 1216

Tom -

I managed to track this down a bit. Open MPI tries to use the ppp0 interface (the cell phone device) for network connectivity, as it's the only non-localhost address up at the time. Unfortunately, messages can't actually be routed over that address, so Open MPI hangs.

The problem is made worse by a bug in Open MPI that I'm still trying to track down. When you tell Open MPI not to use a device (like ppp0), it should just use whatever other devices are available. In your case, that would be localhost, which is what you're using when you don't have any network connectivity at all. Instead, excluding the device appears to cause Open MPI to segfault / hang. I'm looking into exactly why this is happening and should have a fix in the next day or so.
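
To make the intended behavior concrete, here is a rough sketch in Python. It is not the actual Open MPI code: the function name and the interface names (lo0 for loopback, ppp0 for the cell phone device) are assumptions made only for illustration. Excluding a device should simply leave the remaining interfaces in play, rather than leaving nothing usable.

```python
def select_interfaces(available, excluded):
    """Return the interfaces that remain after dropping excluded names.

    Hypothetical helper for illustration only; not part of Open MPI.
    """
    remaining = [name for name in available if name not in excluded]
    if not remaining:
        # Nothing left to communicate over: report it instead of hanging.
        raise RuntimeError("no usable interfaces left after exclusion")
    return remaining


# Assumed setup: only loopback and the cell phone device are up.
print(select_interfaces(["lo0", "ppp0"], {"ppp0"}))  # prints ['lo0']
```

With ppp0 excluded, the only interface left is loopback, which is the same situation as running with no network connectivity at all.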

Brian

--
  Brian W. Barrett
  Open MPI Team, CCS-1
  Los Alamos National Laboratory

