On Mon, 2007-12-17 at 17:19 -0500, Jeff Squyres wrote:
> On Dec 17, 2007, at 8:35 AM, Marco Sbrighi wrote:
>
> > I'm using Open MPI 1.2.2 over OFED 1.2 on a 256-node, dual Opteron,
> > dual core, Linux cluster. Of course, with Infiniband 4x interconnect.
> >
> > Each cluster node is equipped with 4 (or more) Ethernet interfaces,
> > namely 2 gigabit ones plus 2 IPoIB. The two gig are named eth0,eth1,
> > while the two IPoIB are named ib0,ib1.
> >
> > It happens that eth0 is a management network, with poor
> > performance, and furthermore we don't want to use the ib* interfaces
> > to carry MPI's traffic (neither OOB nor TCP), so we would like eth1
> > to be used for Open MPI OOB and TCP.
> >
> > In order to drive the OOB over only eth1 I've tried various
> > combinations of oob_tcp_[ex|in]clude MCA statements, starting from
> > the obvious
> >
> > oob_tcp_exclude = lo,eth0,ib0,ib1
> >
> > then trying the other obvious:
> >
> > oob_tcp_include = eth1
>
> This one statement (_include) should be sufficient.
I agree with your interpretation, but what I'm experiencing here is that
"it should", but in fact it doesn't.

>
> Assumedly this(these) statement(s) are in a config file that is being
> read by Open MPI, such as $HOME/.openmpi/mca-params.conf?

I've tried many combinations: only in $HOME/.openmpi/mca-params.conf,
only on the command line, and both; but none seems to work correctly.
In any case, I would expect that if a parameter is set in
$HOME/.openmpi/mca-params.conf and set differently on the command line,
the command-line value is the one that takes effect.

>
> > and both at the same time.
> >
> > Next I've tried the following:
> >
> > oob_tcp_exclude = eth0
> >
> > but after the job starts, I still have a lot of tcp connections
> > established using eth0 or ib0 or ib1.
> > Furthermore, the following error occurs:
> >
> > [node191:03976] [0,1,14]-[0,1,12] mca_oob_tcp_peer_complete_connect:
> > connection failed: Connection timed out (110) - retrying
>
> This is quite odd. :-(
>
> > I've found only one way to have tcp connections bound only to
> > the eth1 interface, using both the following MCA directives in the
> > command line:
> >
> > mpirun .... --mca oob_tcp_include eth1 --mca oob_tcp_include
> > lo,eth0,ib0,ib1 .....
> >
> > This sounds like a bug to me.
>
> Yes, it does. Specifying the same MCA param twice on the command line
> results in undefined behavior -- it will only take one of them, and I
> assume it'll take the first (but I'd have to check the code to be sure).

OK, I can obtain the same behaviour using only one statement:

--mca oob_tcp_include eth1,lo,eth0,ib0,ib1

Note that with --mca mpi_show_mca_params, what I see in the report
is the same in both cases (parameter given twice, or given once):

.....
[node255:30188] oob_tcp_debug=0
[node255:30188] oob_tcp_include=eth1,lo,eth0,ib0,ib1
[node255:30188] oob_tcp_exclude=
.......

>
> > Is there someone able to reproduce this behaviour?
> > If this is a bug, are there fixes?
>
> I'm unfortunately unable to reproduce this behavior. I have a test
> cluster with 2 IP interfaces: ib0, eth0. I have tried several
> combinations of MCA params with 1.2.2:
>
> --mca oob_tcp_include ib0
> --mca oob_tcp_include ib0,bogus
> --mca oob_tcp_include eth0
> --mca oob_tcp_include eth0,bogus
> --mca oob_tcp_exclude ib0
> --mca oob_tcp_exclude ib0,bogus
> --mca oob_tcp_exclude eth0
> --mca oob_tcp_exclude eth0,bogus
>
> All do as they are supposed to -- including or excluding ib0 or eth0.
>
> I do note, however, that the handling of these parameters changed in
> 1.2.3 -- as well as their names. The names changed to
> "oob_tcp_if_include" and "oob_tcp_if_exclude" to match other MCA
> parameter name conventions from other components.
>
> Could you try with 1.2.3 or 1.2.4 (1.2.4 is the most recent; 1.2.5 is
> due out "soon" -- it *may* get out before the holiday break, but no
> promises...)?

We have 1.2.3 on another cluster and it shows the same behaviour as
1.2.2 (BTW, the other cluster has the same eth interfaces).

>
> If you can't upgrade, let me know and I can provide a debugging patch
> that will give us a little more insight into what is happening on your
> machines. Thanks.

It is quite difficult for us to upgrade Open MPI right now. We have the
official CISCO packages installed, and as far as I know 1.2.2-1 is the
only official CISCO Open MPI distribution today.
In any case I would like to try your debug patch.

Thanks

Marco

> --

-----------------------------------------------------------------
Marco Sbrighi m.sbri...@cineca.it

HPC Group
CINECA Interuniversity Computing Centre
via Magnanelli, 6/3
40033 Casalecchio di Reno (Bo) ITALY
tel. 051 6171516
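
P.S. For checking which interface the established TCP connections actually use (the symptom discussed in this thread), a rough sketch like the one below could tally them per interface. It assumes the usual column layout of `netstat -tn` output. The function name, the IP addresses, and the sample text are hypothetical placeholders for illustration, not values from the cluster discussed here, and this is not part of Open MPI.

```python
from collections import Counter


def count_by_interface(netstat_output, iface_addrs):
    """Tally ESTABLISHED TCP connections per local interface.

    netstat_output: text in the column layout of `netstat -tn` (assumed):
        Proto Recv-Q Send-Q Local-Address Foreign-Address State
    iface_addrs: mapping of interface name -> local IPv4 address
        (assumed to be known for the node being inspected).
    """
    addr_to_iface = {addr: name for name, addr in iface_addrs.items()}
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a TCP connection line.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != "ESTABLISHED":
            continue
        # Local address is "ip:port"; keep only the ip part.
        local_ip = fields[3].rsplit(":", 1)[0]
        counts[addr_to_iface.get(local_ip, "unknown")] += 1
    return counts


# Hypothetical sample data, for illustration only.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.1.0.191:40012        10.1.0.12:51234         ESTABLISHED
tcp        0      0 10.0.0.191:40013        10.0.0.14:51300         ESTABLISHED
tcp        0      0 10.1.0.191:40014        10.1.0.14:51301         TIME_WAIT
"""
IFACES = {
    "eth0": "10.0.0.191",
    "eth1": "10.1.0.191",
    "ib0": "10.2.0.191",
    "ib1": "10.3.0.191",
}

if __name__ == "__main__":
    for iface, n in sorted(count_by_interface(SAMPLE, IFACES).items()):
        print(iface, n)
```

If the include/exclude setting were honoured, one would expect only eth1 (and possibly lo) to show a non-zero count for the connections opened by the job; any count on eth0, ib0 or ib1 would point to the behaviour described above.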