Hello,

I'm having trouble running my software since our administrators changed the cluster configuration. It worked perfectly before, but now I get these errors:

$ mpirun --hostfile ./../hostfile -np 10 ./src/smallTest
--------------------------------------------------------------------------
Process 0.1.1 is unable to reach 0.1.4 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

 PML add procs failed
 --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)

I assume this could be because of:

$ /sbin/route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.125.17.0    *               255.255.255.0   U     0      0        0 eth1
192.168.12.0    *               255.255.255.0   U     0      0        0 eth1
161.254.0.0     *               255.255.0.0     U     0      0        0 eth1
default         192.125.17.1    0.0.0.0         UG    0      0        0 eth1

So these look like the "narrowly scoped netmasks" which, according to the FAQ, are not supported in Open MPI. I asked for a workaround on this newsgroup some time ago, but have had no answer so far. So my question is: which alternative should I choose that will work with such a configuration? Does anyone have experience with other MPI implementations, for example LAM/MPI?
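
To make the concern concrete, here is a minimal Python sketch that checks whether two node addresses fall into the same subnet of the routing table above. The subnets are copied from the /sbin/route output; the helper names and the host addresses in the usage note are hypothetical, and the same-subnet test is only my reading of the FAQ, not Open MPI's actual reachability logic.

```python
import ipaddress

# Subnets copied from the routing table above (all on eth1).
SUBNETS = [
    ipaddress.ip_network("192.125.17.0/255.255.255.0"),
    ipaddress.ip_network("192.168.12.0/255.255.255.0"),
    ipaddress.ip_network("161.254.0.0/255.255.0.0"),
]


def subnet_of(addr):
    """Return the listed subnet containing addr, or None if there is none."""
    ip = ipaddress.ip_address(addr)
    for net in SUBNETS:
        if ip in net:
            return net
    return None


def same_subnet(a, b):
    """True only if both addresses belong to the same listed subnet."""
    net_a = subnet_of(a)
    return net_a is not None and net_a == subnet_of(b)
```

For example, with made-up addresses, same_subnet("192.168.12.10", "192.168.12.20") is true, while same_subnet("192.168.12.10", "192.125.17.20") is false, even though both subnets are attached to eth1 and the hosts can presumably reach each other.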

Thank you for your support.

regards, Marcin
