Currently, coll/tuned is not topology aware. This is an interesting area to work on, and everyone is invited to contribute. coll/ml is topology aware, but it is more or less unmaintained now.
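As a user-level workaround until then, you can build some of the topology awareness yourself: split MPI_COMM_WORLD per node with MPI_Comm_split_type and run a collective in two stages, so most of the traffic stays inside each node. A minimal sketch (plain MPI-3 calls, not Open MPI internals; the two-stage MPI_SUM allreduce is only an illustration of the idea):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    /* communicator of the ranks sharing my node */
    MPI_Comm node_comm;
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &node_comm);

    int node_rank;
    MPI_Comm_rank(node_comm, &node_rank);

    /* communicator of the node leaders (rank 0 of each node_comm) */
    MPI_Comm leader_comm;
    MPI_Comm_split(MPI_COMM_WORLD, node_rank == 0 ? 0 : MPI_UNDEFINED,
                   world_rank, &leader_comm);

    /* two-stage allreduce: reduce inside the node, allreduce across
       the node leaders, then broadcast the result inside the node */
    int in = world_rank, node_sum = 0, total = 0;
    MPI_Reduce(&in, &node_sum, 1, MPI_INT, MPI_SUM, 0, node_comm);
    if (leader_comm != MPI_COMM_NULL) {
        MPI_Allreduce(&node_sum, &total, 1, MPI_INT, MPI_SUM, leader_comm);
        MPI_Comm_free(&leader_comm);
    }
    MPI_Bcast(&total, 1, MPI_INT, 0, node_comm);

    printf("rank %d: total = %d\n", world_rank, total);

    MPI_Comm_free(&node_comm);
    MPI_Finalize();
    return 0;
}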
Send/recv involves two abstraction layers: the PML, and then the interconnect transport. Typically, pml/ob1 is used, and it uses a BTL (btl/tcp, btl/vader, btl/openib, ...). An important exception is InfiniPath, which uses pml/cm and then mtl/psm (and libfabric, but I do not know the details...).
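If you want to check or influence which of these components is picked on a given cluster, the usual way (assuming the standard MCA parameter names; please double-check against ompi_info on your install) is something like:

  ompi_info | grep -E 'pml|mtl|btl'
  mpirun --mca pml ob1 --mca btl tcp,vader,self ./a.out
  mpirun --mca pml cm --mca mtl psm ./a.out

The first command lists the pml/mtl/btl components your build knows about, and the mpirun lines force a given PML/BTL or PML/MTL combination instead of letting the selection logic decide.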
Cheers,

Gilles

On Thursday, June 30, 2016, Saliya Ekanayake <esal...@gmail.com> wrote:
> OK, I am beginning to see how it works now. One question I still have is,
> in the case of a multi-node communicator it seems coll/tuned (or something
> not coll/sm) will be the one used, so do they do any optimizations to
> reduce communication within a node?
>
> Also, where can I find the p2p send/recv modules?
>
> Thank you
>
> the Bcast in coll/sm
>
> coll modules have priority (see ompi_info --all)
>
> for a given function (e.g. bcast), the module which implements it and has
> the highest priority is used.
> note a module can disqualify itself on a given communicator (e.g. coll/sm
> on an inter-node communicator).
> by default, coll/tuned is very likely used. this module is a bit special
> since it selects a given algorithm based on communicator and message size.
>
> if you give a high priority to coll/sm, then it will be used for
> single-node intra communicators, assuming coll/sm implements all
> collective primitives.
>
> Cheers,
>
> Gilles
>
> On Thursday, June 30, 2016, Saliya Ekanayake <esal...@gmail.com> wrote:
>
>> Thank you, Gilles.
>>
>> What is the bcast I should look for? In general, how do I know which
>> module was used for which communication - can I print this info?
>>
>> On Jun 30, 2016 3:19 AM, "Gilles Gouaillardet" <gil...@rist.or.jp> wrote:
>>
>>> 1) is correct. coll/sm is disqualified if the communicator is an inter
>>> communicator or the communicator spans several nodes.
>>>
>>> you can have a look at the source code, and you will note that bcast
>>> does not use send/recv. instead, it uses shared memory, so hopefully,
>>> it is faster than other modules.
>>>
>>> Cheers,
>>>
>>> Gilles
>>>
>>> On 6/30/2016 3:04 PM, Saliya Ekanayake wrote:
>>>
>>> Hi,
>>>
>>> Looking at *ompi/mca/coll/sm/coll_sm_module.c* it seems this module
>>> will be used only if the calling communicator solely groups processes
>>> within a node. I've got two questions here.
>>>
>>> 1. So is my understanding correct that for something like
>>> MPI_COMM_WORLD, where world is multiple processes within a node across
>>> many nodes, this module will not be used?
>>>
>>> 2. If 1 is correct, then are there any shared memory optimizations that
>>> happen when a collective like bcast or allreduce is called, so that
>>> communicating within a node is done efficiently through memory?
>>>
>>> Thank you,
>>> Saliya
>>>
>>> --
>>> Saliya Ekanayake
>>> Ph.D. Candidate | Research Assistant
>>> School of Informatics and Computing | Digital Science Center
>>> Indiana University, Bloomington
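PS: regarding the question above about telling which coll module handles a given communicator: assuming the usual MCA parameter naming (please verify the exact names with ompi_info --all on your build), something like

  ompi_info --all | grep coll_sm_priority
  mpirun --mca coll_sm_priority 90 --mca coll_base_verbose 10 ./a.out

shows the current priority of coll/sm, raises it above coll/tuned for the run, and asks the coll framework to print what it is doing during component selection, so you can see which modules are kept or disqualified on each communicator.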