Hello. I'm trying to use Open MPI 1.2.3 on a cluster of dual-processor AMD64 nodes. The nodes are all connected via gigabit ethernet on a private, self-contained IP network. The OS is GNU/Linux, gcc 4.1.2, kernel 2.6.21. Open MPI was configured with --prefix=/usr/local and installed via make install; compilation and installation both completed successfully.

I have verified that non-interactive logins have /usr/local/bin in the PATH, and that ld.so.conf has an entry for the Open MPI lib directory (and ld.so.cache is up-to-date). This is a more-or-less "vanilla" installation, without any external schedulers or resource managers.
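For concreteness, the verification above was done with checks along these lines (node1.x86-64 is one of the compute nodes; the orted line is an extra check I'd expect to pass given the PATH result, since mpirun starts that daemon on each remote node via ssh):

headnode $ ssh node1.x86-64 'echo $PATH'              # non-interactive PATH on the remote node
headnode $ ssh node1.x86-64 which orted               # Open MPI's launch daemon must be on that PATH
headnode $ ssh node1.x86-64 ldconfig -p | grep libmpi # remote ld.so.cache knows the Open MPI libs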
I am simply trying to test Open MPI for the first time (we previously used LAM), and trying to do so via trivial system executables like "env". The problem is this: if I invoke mpirun such that it needs to launch on nodes other than the one I'm invoking it on, it seems to launch and then hang. Ctrl+C yields an "mpirun: killing job..." message, but the job never dies; I have to suspend it and use kill -9. If I invoke mpirun on the host I'm logged into (any node in the hostfile), without any host specification or hostfile, it works fine, i.e. the job runs on the local machine.

My mpirun hostfile contains entries like:

node1.x86-64 slots=2 max_slots=2

So, for example, if I do:

headnode $ mpirun -hostfile runnodes.txt -np 1 env

where runnodes.txt does not contain any entry for the headnode, then mpirun hangs as described above. I have verified that the following works fine:

headnode $ ssh node1.x86-64 env

Even using mpirun -v, I can't seem to find a command-line option that would give me the diagnostic information to figure out where mpirun gets stuck, what it has done up to that point, etc. How can I figure out what's going wrong? Is there a way to verbosely report the actions taken so far? (The closest things I've found so far are sketched at the end of this message.)

These machines have multiple onboard ethernet interfaces, with only one configured to communicate with the cluster, but even adding the "--mca btl_tcp_if_include eth1" argument to mpirun makes no difference.

The only potential culprit I could come up with is our name resolution setup. All name resolution is done via /etc/hosts; no DNS server is present. However, the cluster actually contains machines of two different architectures, and we wanted nodes to be named node<#>.<arch>, where # goes from 1 to N and example archs would be x86-64 or alpha. To make this work, the init scripts on the machines set the hostname to the fully-qualified node name, e.g. node1.x86-64.cluster, rather than the typical practice of using just the name preceding the first dot. In /etc/resolv.conf, the "domain" keyword is set to <arch>.<TLD>, e.g. x86-64.cluster. The /etc/hosts entries do contain the node names in the node<#>.<arch> format as well as the fully-qualified versions. So, other than setting the hostname to the fully-qualified value, this is a fairly typical GNU/Linux setup. We used the same practice with LAM and it never posed a problem, but I thought I'd mention it just in case.
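For reference, below are the diagnostic invocations I've been experimenting with. The -d (--debug-devel) and --debug-daemons flags come from mpirun --help; the ompi_info lines are how I've been hunting for relevant parameters; and oob_tcp_include is a guess on my part, by analogy with btl_tcp_if_include, on the theory that mpirun's own daemon wire-up also has to pick one of the several interfaces:

# run with developer debugging and keep the remote daemons' output attached
headnode $ mpirun -d --debug-daemons -hostfile runnodes.txt -np 1 env

# list the parameters of the rsh/ssh launcher and of the TCP out-of-band channel
headnode $ ompi_info --param pls rsh
headnode $ ompi_info --param oob tcp

# restrict both MPI traffic and the out-of-band channel to the cluster interface
headnode $ mpirun --mca btl_tcp_if_include eth1 --mca oob_tcp_include eth1 \
      -hostfile runnodes.txt -np 1 env

If any of these is the wrong tool for the job, pointers would be appreciated.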