Hi all,

I want to set up a 3-node ring network in which each node connects to the
other 2 directly, using 2 Ethernet ports and no switch/router.

The interface configuration is shown in the following picture.

https://www.dropbox.com/s/g75i51rrjs51b21/mpi-graph%20-%20New%20Page.png?dl=0

I've used ifconfig on each node to configure each port, and made sure I
can ssh from each node to the other 2 nodes.

But a simple ring_c
<https://github.com/open-mpi/ompi/blob/master/examples/ring_c.c> example
doesn't work. So I turned on --mca btl_base_verbose 30, and I could see that
node1 was trying to use 23.0.0.2 (the link between node2 and node3) to get to
node2, even though there is a direct link to node2.

The output log looks like this:

[node1:01828] btl: tcp: attempting to connect() to [[19529,1],1] address
23.0.0.2 on port 1024
[[19529,1],0][btl_tcp_endpoint.c:606:mca_btl_tcp_endpoint_start_connect]
from node1 to: node2 Unable to connect to the peer 23.0.0.2  on port 4:
Network is unreachable
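
As a rough illustration of the behavior I expected, here is a minimal Python
sketch of a "same subnet" check. It is not Open MPI's actual selection logic.
The 12.0.0.x and 13.0.0.x addresses and the /24 masks are assumptions made up
for the example (the real values are in the picture); only 23.0.0.2 comes from
the log above.

```python
import ipaddress

# Hypothetical addressing for illustration only. Only 23.0.0.2 is taken from
# the log; every other address and all /24 masks are assumed.
NODE1_INTERFACES = [
    ipaddress.ip_interface("12.0.0.1/24"),  # assumed: node1 port facing node2
    ipaddress.ip_interface("13.0.0.1/24"),  # assumed: node1 port facing node3
]
NODE2_ADDRESSES = [
    ipaddress.ip_address("12.0.0.2"),  # assumed: node2 port facing node1
    ipaddress.ip_address("23.0.0.2"),  # from the log: node2 port facing node3
]


def directly_reachable(local_interfaces, peer_addresses):
    """Return the peer addresses that lie inside a locally attached subnet."""
    return [
        addr
        for addr in peer_addresses
        if any(addr in iface.network for iface in local_interfaces)
    ]


reachable = directly_reachable(NODE1_INTERFACES, NODE2_ADDRESSES)
print([str(addr) for addr in reachable])  # prints ['12.0.0.2']
```

Under these assumed addresses, node1 has no interface on the 23.0.0.0/24
subnet, so 23.0.0.2 is not directly reachable from node1, while the address on
the shared link is.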


I've read the following posts and FAQs but still can't understand this
behavior.

http://www.open-mpi.org/faq/?category=tcp#tcp-routability-1.3
http://www.open-mpi.org/faq/?category=tcp#tcp-selection
http://www.open-mpi.org/community/lists/users/2014/11/25810.php


Any pointers would be appreciated! Thanks in advance!

My open-mpi info:

 Package: Open MPI gtbldadm@ubuntu-12 Distribution
                Open MPI: 1.0.0.22
  Open MPI repo revision: git714842d
   Open MPI release date: May 27, 2015
                Open RTE: 1.0.0.22
  Open RTE repo revision: git714842d
   Open RTE release date: May 27, 2015
                    OPAL: 1.0.0.22
      OPAL repo revision: git714842d
       OPAL release date: May 27, 2015
                 MPI API: 2.1


Best,
Shawn
