Hello,
I'm having trouble running my software after our administrators changed
the cluster configuration. It was working perfectly before, but now
I get these errors:
$ mpirun --hostfile ./../hostfile -np 10 ./src/smallTest
--------------------------------------------------------------------------
Process 0.1.1 is unable to reach 0.1.4 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
PML add procs failed
--> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
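If it helps with the diagnosis, I can rerun the job with more verbose BTL
output; I'm assuming the btl_base_verbose MCA parameter is the right knob
for that:

$ mpirun --mca btl_base_verbose 30 --hostfile ./../hostfile -np 10 ./src/smallTest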
I assume this could be because of:
$ /sbin/route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.125.17.0    *               255.255.255.0   U     0      0        0 eth1
192.168.12.0    *               255.255.255.0   U     0      0        0 eth1
161.254.0.0     *               255.255.0.0     U     0      0        0 eth1
default         192.125.17.1    0.0.0.0         UG    0      0        0 eth1
So "narrowly scoped netmasks" which (as it's written in the FAQ) are not
supported in the OpenMPI. I asked for a workaround on this newsgroup
some time ago - but no answer uptill now. So my question is: what
alternative should I choose that will work in such configuration? Do you
have some experience in other MPI implementations, for example LamMPI?
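One workaround I have been considering, in case switching implementations is
not necessary, is restricting the TCP BTL to a single interface with the
btl_tcp_if_include MCA parameter (I'm assuming it applies to this netmask
situation; the interface name eth1 below is just taken from the route output
above):

$ mpirun --mca btl tcp,self --mca btl_tcp_if_include eth1 --hostfile ./../hostfile -np 10 ./src/smallTest

Would that be expected to work here, or does the reachability check still
fail with such netmasks?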
Thank you for your support.
regards, Marcin