Dear List,
I'm having a really weird issue with Open MPI version 1.3.3 (version
1.2.8 doesn't seem to exhibit this behavior). Essentially, when I start
jobs from the cluster front-end node using mpirun, mpirun sits idle
for up to a minute and a half (for 30 nodes) before running the
command I've given it. Running the same command on any other node in
the cluster returns in a fraction of a second. Upon further research
it appears to be an issue with the way the orted processes on the
compute nodes attempt to talk back to the front-end node. When I
launch mpirun from the front-end node, this is the process it spawns
on each compute node (public IP scrambled for security purposes):
  orted --daemonize -mca ess env -mca orte_ess_jobid 1816657920 \
    -mca orte_ess_vpid 1 -mca orte_ess_num_procs 3 \
    --hnp-uri "1816657920.0;tcp://130.X.X.X:56866;tcp://172.40.10.1:56866;tcp://172.20.10.1:56866"
Adding some firewall debugging rules showed that the compute nodes
were trying to talk back to mpirun on the front-end node via the
front-end node's public IP. Based on this, and on the arguments passed
above, it looks as though the public IP of the front-end node was
being tried before any of its private IPs, and the delay I was seeing
was orted waiting for the connection to the front-end node's public IP
to time out before trying the cluster-facing IP, at which point the
connection succeeded.
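
For anyone who wants to reproduce the check, a LOG rule along these
lines on the front-end node is enough to see the connection attempts
hitting the public interface (the chain and address here are
illustrative, not the exact rules I used):

  # log new inbound TCP connection attempts addressed to the public IP
  iptables -I INPUT -p tcp -d 130.X.X.X -m state --state NEW \
    -j LOG --log-prefix "ompi-oob: "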
I was able to work around this by specifying "--mca oob_tcp_if_include
bond0,eth0" to mpirun (the front-end node has two bonded NICs as its
cluster-facing interface). With that argument the previously
experienced delay disappeared; the invocation and the equivalent
config-file entry are shown below. I could easily put that setting
into openmpi-mca-params.conf and be done with the problem, but I would
like to know why Open MPI chose to use the public IP of the node
before its internal IPs, and whether this is expected behavior. I
suspect that it may not be.
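
For reference, the workaround looks roughly like this (bond0/eth0 are
the interface names on my cluster and the process count and
application name are just placeholders; substitute your own):

  mpirun --mca oob_tcp_if_include bond0,eth0 -np 30 ./a.out

or, to make it permanent, a line like this in openmpi-mca-params.conf:

  oob_tcp_if_include = bond0,eth0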
-Aaron