Whatever the original choice(s) of the BTL are, an interface should disqualify
itself after a few missed connections (based on the retry MCA parameter).
However, in order to get anything sensible in this configuration you should
change the default timeout to a reasonable value (30 seconds?).
Whil
On Sep 18, 2015, at 7:26 PM, Gilles Gouaillardet
wrote:
>
> I built a similar environment with master and private ip and that does not
> work.
> my understanding is each task has two tcp btl (one per interface),
> and there is currently no mechanism to tell that a node is unreachable
> via a g
Jeff,
I built a similar environment with master and private ip and that does not
work.
my understanding is each task has two tcp btl (one per interface),
and there is currently no mechanism to tell that a node is unreachable
via a given btl.
(a btl picks the "best" interface for each node, but it
Hi Jeff,
Thanks for your suggestion. (And also thanks to Gilles!) I'll play around
with your suggestions and let you know if I make any progress.
About the version of my Open MPI, it's a Texas Instruments
implementation, so the version number 1.0.0.22 is their own. I
looked at their
Whoa; wait -- are you really using Open MPI v1.0?
That's over 10 years old...
Can you update to Open MPI v1.10?
> On Sep 18, 2015, at 1:37 PM, Jeff Squyres (jsquyres)
> wrote:
>
> Open MPI uses different heuristics depending on whether IP addresses are
> public or private.
>
> All your IP
Open MPI uses different heuristics depending on whether IP addresses are public
or private.
All your IP addresses are technically "public" -- they're not in 10.x.x.x or
192.168.x.x, for example.
So Open MPI assumes that they are all routable to each other.
You might want to change your 3 netwo
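
To make the public/private distinction above concrete, here is a minimal shell sketch that classifies an IPv4 address against the RFC 1918 private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16). The function name is_rfc1918 and the sample addresses are hypothetical, and this is not Open MPI's actual code; its reachability logic likely considers more than this one test.

```shell
# Hypothetical helper (not part of Open MPI): print "private" if the
# IPv4 address falls in an RFC 1918 range, otherwise print "public".
is_rfc1918() {
    ip=$1
    first=${ip%%.*}        # first octet
    rest=${ip#*.}
    second=${rest%%.*}     # second octet
    if [ "$first" -eq 10 ]; then
        echo private       # 10.0.0.0/8
    elif [ "$first" -eq 172 ] && [ "$second" -ge 16 ] && [ "$second" -le 31 ]; then
        echo private       # 172.16.0.0/12
    elif [ "$first" -eq 192 ] && [ "$second" -eq 168 ]; then
        echo private       # 192.168.0.0/16
    else
        echo public
    fi
}

# Illustrative addresses only; substitute the ones configured on your nodes.
for addr in 10.1.2.3 192.168.0.5 172.16.4.1 172.32.0.1; do
    printf '%s %s\n' "$addr" "$(is_rfc1918 "$addr")"
done
```

Running a check like this on each interface address shows quickly whether all of them land in the "public" bucket, which is the case described above where every address is assumed routable to every other.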
Shang,
can you please run
mpirun --version
I cannot find the ompi version you are running based on the git hash you
reported.
As a temporary workaround, you can do minimal TCP routing:
on the three nodes
1) run
sysctl -w net.ipv4.ip_forward=1
2) route the other nodes interface not on the same
Hi all,
I wanted to set up a 3-node ring network in which each node connects directly
to the other 2 through its 2 Ethernet ports, without a switch/router.
The interface configurations can be found in the following picture.
https://www.dropbox.com/s/g75i51rrjs51b21/mpi-graph%20-%20New%20Page.png?dl=0
I've used *i