Hi Jeff,

Thanks for your suggestion. (And also thanks to Gilles!) I'll play around
with your suggestions and let you know if I make any progress.

About the version of my Open MPI: it's a Texas Instruments
implementation, so the version number 1.0.0.22 is their own version. I
looked at their documentation and it says it's based on Open MPI 1.7.1. So
I guess it's not that old lol.

Thanks again,
Shang

On Fri, Sep 18, 2015 at 1:38 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com
> wrote:

> Whoa; wait -- are you really using Open MPI v1.0?
>
> That's over 10 years old...
>
> Can you update to Open MPI v1.10?
>
>
> > On Sep 18, 2015, at 1:37 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com>
> wrote:
> >
> > Open MPI uses different heuristics depending on whether IP addresses are
> public or private.
> >
> > All your IP addresses are technically "public" -- they're not in
> 10.x.x.x or 192.168.x.x, for example.
> >
> > So Open MPI assumes that they are all routable to each other.
> >
> > You might want to change your 3 networks to be 10.1.x.x/16, 10.2.x.x/16,
> and 10.3.x.x/16.  See if that makes it work.
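
[Editor's sketch] The public/private distinction described above can be checked with Python's standard `ipaddress` module. This is only an illustration of the classification, not Open MPI's actual reachability code. The helper names `is_rfc1918` and `same_subnet` are hypothetical, and the 10.x addresses in the usage note are made-up examples of the suggested renumbering; only 23.0.0.2 comes from the log in this thread.

```python
import ipaddress

# RFC 1918 private IPv4 ranges (10/8, 172.16/12, 192.168/16).
RFC1918_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(addr: str) -> bool:
    """Return True if addr falls inside one of the RFC 1918 private ranges."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918_NETWORKS)


def same_subnet(a: str, b: str, prefix_len: int = 16) -> bool:
    """Return True if a and b share the same network for the given prefix length."""
    # strict=False lets us pass a host address and get its enclosing network.
    net = ipaddress.ip_network(f"{a}/{prefix_len}", strict=False)
    return ipaddress.ip_address(b) in net


if __name__ == "__main__":
    # 23.0.0.2 is the address from the log; the 10.x addresses are hypothetical.
    for addr in ("23.0.0.2", "10.1.0.1", "192.168.1.5"):
        kind = "private" if is_rfc1918(addr) else "public"
        print(addr, "is", kind)
    print(same_subnet("10.1.0.1", "10.1.0.2"))  # same /16
    print(same_subnet("10.1.0.1", "10.2.0.1"))  # different /16
```

With the three point-to-point links renumbered into separate private /16 networks, two endpoints on the same link share a subnet while endpoints on different links do not, which is the property the suggestion relies on.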
> >
> >
> >> On Sep 17, 2015, at 12:31 PM, Shang Li <shawn.li.x...@gmail.com> wrote:
> >>
> >> Hi all,
> >>
> >> I wanted to set up a 3-node ring network, where each node connects to the other 2
> using 2 Ethernet ports directly, without a switch/router.
> >>
> >> The interface configurations can be found in the following picture.
> >>
> >>
> https://www.dropbox.com/s/g75i51rrjs51b21/mpi-graph%20-%20New%20Page.png?dl=0
> >>
> >> I've used ifconfig on each node to configure each port, and made sure I
> can ssh from each node to the other 2 nodes.
> >>
> >> But a simple ring_c example doesn't work... So I turned on --mca
> btl_base_verbose 30, and I could see that node1 was trying to use 23.0.0.2
> (the link between node2 and node3) to get to node2, even though there is a
> direct link to node2.
> >>
> >> The output log is like:
> >>
> >> [node1:01828] btl: tcp: attempting to connect() to [[19529,1],1]
> address 23.0.0.2 on port 1024
> >>
> [[19529,1],0][btl_tcp_endpoint.c:606:mca_btl_tcp_endpoint_start_connect]
> from node1 to: node2 Unable to connect to the peer 23.0.0.2  on port 4:
> Network is unreachable
> >>
> >> I've read the following posts and FAQs but still couldn't understand
> this kind of behavior.
> >>
> >> http://www.open-mpi.org/faq/?category=tcp#tcp-routability-1.3
> >> http://www.open-mpi.org/faq/?category=tcp#tcp-selection
> >> http://www.open-mpi.org/community/lists/users/2014/11/25810.php
> >>
> >>
> >> Any pointers would be appreciated! Thanks in advance!
> >>
> >> My open-mpi info:
> >>
> >> Package: Open MPI gtbldadm@ubuntu-12 Distribution
> >>                Open MPI: 1.0.0.22
> >>  Open MPI repo revision: git714842d
> >>   Open MPI release date: May 27, 2015
> >>                Open RTE: 1.0.0.22
> >>  Open RTE repo revision: git714842d
> >>   Open RTE release date: May 27, 2015
> >>                    OPAL: 1.0.0.22
> >>      OPAL repo revision: git714842d
> >>       OPAL release date: May 27, 2015
> >>                 MPI API: 2.1
> >>
> >>
> >> Best,
> >> Shawn
> >>
> >> _______________________________________________
> >> users mailing list
> >> us...@open-mpi.org
> >> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> >> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/09/27612.php
> >
> >
> > --
> > Jeff Squyres
> > jsquy...@cisco.com
> > For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> >
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/09/27627.php
>
