OK, that's good. I'll try that.

So, is *ml* no longer being developed? Is there any documentation on this
component?

Thank you,
Saliya

On Thu, Jun 30, 2016 at 11:01 AM, Gilles Gouaillardet <
gilles.gouaillar...@gmail.com> wrote:

> you might want to give coll/ml a try
> mpirun --mca coll_ml_priority 100 ...
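>
> you can first check that coll/ml was built:
>
> ompi_info | grep " coll"
>
> and, if I remember correctly, coll_base_verbose makes the coll framework
> report which component gets selected at runtime:
>
> mpirun --mca coll_ml_priority 100 --mca coll_base_verbose 10 ...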
>
> Cheers,
>
> Gilles
>
> On Thursday, June 30, 2016, Saliya Ekanayake <esal...@gmail.com> wrote:
>
>> Thank you, Gilles. The reason for digging into intra-node optimizations
>> is that we've implemented several machine learning applications on top of
>> Open MPI (via its Java bindings), but found collective communication to be
>> a bottleneck, especially when the number of procs per node is high. I've
>> implemented a shared memory layer within Java (
>> https://www.researchgate.net/publication/291695433_SPIDAL_Java_High_Performance_Data_Analytics_with_Java_and_MPI_on_Large_Multicore_HPC_Clusters),
>> which solved this, but it would be nice to have it built into Open MPI.
>>
>> I'll look at the send/recv implementations as well.
>>
>> Regards,
>> Saliya
>>
>> On Thu, Jun 30, 2016 at 10:02 AM, Gilles Gouaillardet <
>> gilles.gouaillar...@gmail.com> wrote:
>>
>>> currently, coll/tuned is not topology aware.
>>> this would be an interesting feature, and everyone is invited to contribute.
>>> coll/ml is topology aware, but it is kind of unmaintained now.
>>>
>>> send/recv involves two abstraction layers: the pml, and then the
>>> interconnect transport.
>>> typically, pml/ob1 is used, and it drives a btl (btl/tcp, btl/vader,
>>> btl/openib, ...)
>>> an important exception is InfiniPath, which uses pml/cm and then mtl/psm
>>> (and libfabric, but I do not know the details...)
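>>>
>>> for example, you can force a given path explicitly:
>>>
>>> mpirun --mca pml ob1 --mca btl tcp,vader,self ...
>>> mpirun --mca pml cm --mca mtl psm ...
>>>
>>> (the second one is for an InfiniPath cluster)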
>>>
>>> Cheers,
>>>
>>> Gilles
>>>
>>> On Thursday, June 30, 2016, Saliya Ekanayake <esal...@gmail.com> wrote:
>>>
>>>> OK, I am beginning to see how it works now. One question I still have
>>>> is: in the case of a multi-node communicator, it seems coll/tuned (or
>>>> some module other than coll/sm) will be the one used, so does it do any
>>>> optimization to reduce communication within a node?
>>>>
>>>> Also, where can I find the p2p send/recv modules?
>>>>
>>>> Thank you
>>>>
>>>> the Bcast to look at is the one in coll/sm.
>>>>
>>>> coll modules have priority
>>>> (see ompi_info --all)
>>>>
>>>> for a given function (e.g. bcast), the module which implements it and
>>>> has the highest priority is used.
>>>> note a module can disqualify itself on a given communicator (e.g.
>>>> coll/sm on an inter-node communicator).
>>>> by default, coll/tuned is very likely used. this module is a bit
>>>> special since it selects a given algorithm based on the communicator and
>>>> the message size.
>>>>
>>>> if you give a high priority to coll/sm, then it will be used for
>>>> single-node intra-communicators, assuming coll/sm implements all the
>>>> collective primitives.
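>>>>
>>>> for example:
>>>>
>>>> mpirun --mca coll_sm_priority 100 ...
>>>>
>>>> and you can check the default priorities with:
>>>>
>>>> ompi_info --all | grep priority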
>>>>
>>>> Cheers,
>>>>
>>>> Gilles
>>>>
>>>> On Thursday, June 30, 2016, Saliya Ekanayake <esal...@gmail.com> wrote:
>>>>
>>>>> Thank you, Gilles.
>>>>>
>>>>> What is the bcast I should look for? In general, how do I know which
>>>>> module was used for which communication - can I print this info?
>>>>> On Jun 30, 2016 3:19 AM, "Gilles Gouaillardet" <gil...@rist.or.jp>
>>>>> wrote:
>>>>>
>>>>>> 1) is correct. coll/sm is disqualified if the communicator is an
>>>>>> inter-communicator or spans several nodes.
>>>>>>
>>>>>> you can have a look at the source code, and you will note that bcast
>>>>>> does not use send/recv. instead, it uses shared memory, so hopefully
>>>>>> it is faster than other modules.
>>>>>>
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>>
>>>>>> Gilles
>>>>>> On 6/30/2016 3:04 PM, Saliya Ekanayake wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Looking at *ompi/mca/coll/sm/coll_sm_module.c*, it seems this
>>>>>> module will be used only if the calling communicator solely groups
>>>>>> processes within a node. I've got two questions here.
>>>>>>
>>>>>> 1. Is my understanding correct that for something like
>>>>>> MPI_COMM_WORLD, where the world spans multiple processes per node
>>>>>> across many nodes, this module will not be used?
>>>>>>
>>>>>> 2. If 1 is correct, are there any shared memory optimizations that
>>>>>> happen when a collective like bcast or allreduce is called, so that
>>>>>> communication within a node is done efficiently through memory? A
>>>>>> rough sketch of what I mean follows.
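>>>>>>
>>>>>> Something like this (untested, and the helper name is mine, not
>>>>>> anything in Open MPI): reduce within each node first, allreduce only
>>>>>> across the node leaders, then broadcast the result back within each
>>>>>> node:
>>>>>>
>>>>>> #include <mpi.h>
>>>>>>
>>>>>> /* hypothetical two-level allreduce: intra-node traffic can go
>>>>>>  * through shared memory, only the leaders talk across nodes */
>>>>>> int two_level_allreduce_sum(double *buf, int count, MPI_Comm comm)
>>>>>> {
>>>>>>     MPI_Comm node_comm, leader_comm;
>>>>>>     int node_rank;
>>>>>>
>>>>>>     /* ranks sharing a node */
>>>>>>     MPI_Comm_split_type(comm, MPI_COMM_TYPE_SHARED, 0,
>>>>>>                         MPI_INFO_NULL, &node_comm);
>>>>>>     MPI_Comm_rank(node_comm, &node_rank);
>>>>>>
>>>>>>     /* one leader per node; the others get MPI_COMM_NULL */
>>>>>>     MPI_Comm_split(comm, node_rank == 0 ? 0 : MPI_UNDEFINED,
>>>>>>                    0, &leader_comm);
>>>>>>
>>>>>>     /* step 1: node-local reduce to the leader */
>>>>>>     if (node_rank == 0)
>>>>>>         MPI_Reduce(MPI_IN_PLACE, buf, count, MPI_DOUBLE, MPI_SUM,
>>>>>>                    0, node_comm);
>>>>>>     else
>>>>>>         MPI_Reduce(buf, NULL, count, MPI_DOUBLE, MPI_SUM,
>>>>>>                    0, node_comm);
>>>>>>
>>>>>>     /* step 2: allreduce across node leaders only */
>>>>>>     if (leader_comm != MPI_COMM_NULL) {
>>>>>>         MPI_Allreduce(MPI_IN_PLACE, buf, count, MPI_DOUBLE,
>>>>>>                       MPI_SUM, leader_comm);
>>>>>>         MPI_Comm_free(&leader_comm);
>>>>>>     }
>>>>>>
>>>>>>     /* step 3: broadcast the result back within each node */
>>>>>>     MPI_Bcast(buf, count, MPI_DOUBLE, 0, node_comm);
>>>>>>
>>>>>>     MPI_Comm_free(&node_comm);
>>>>>>     return MPI_SUCCESS;
>>>>>> }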
>>>>>>
>>>>>> Thank you,
>>>>>> Saliya
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>>
>>
>>
>



-- 
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
