Re: [OMPI users] MPI_Bcast implementations in OpenMPI

George Bosilca Mon, 25 Apr 2016 12:20:13 -0400 (EDT)

> On Apr 25, 2016, at 11:33 , Dave Love <d.l...@liverpool.ac.uk> wrote:
> 
> George Bosilca <bosi...@icl.utk.edu> writes:
> 
>> Dave,
>> 
>> You are absolutely right, the parameters are now 6-7 years old,
>> gathered on interconnects long gone. Moreover, several discussions in
>> this mailing list indicated that they do not match current network
>> capabilities.
>> 
>> I have recently reshuffled the tuned module to move all the algorithms
>> in the base and therefore make them available to other collective
>> modules (the code is available in master and 1.10 and the future
>> 2.0). This move has the potential for allowing different decision
>> schemes to coexist, and be dynamically selected at runtime based on
>> network properties, network topology, or even application needs. I
>> continue to have hopes that network vendors will eventually get
>> interested in tailoring the collective selection to match their
>> network capabilities, and provide their users with a performance boost
>> by allowing for network specific algorithm selection.
> 
> That sounds useful, assuming the speed is generally dominated by the
> basic fabric.  What's involved in making the relevant measurements and
> plugging them in?  I did look at using OTPO(?) to check this sort of
> thing once.  I couldn't make it work in the time I had, but Periscope
> might be a good alternative now.


It is a multidimensional optimization problem. The critical point is
identifying the switching points between different algorithms based on their
performance (taking into account, at least, physical topology, number of
processes and amount of data). The paper I sent in one of my previous emails
discusses how we built the decision functions in the current implementation.
There are certainly better ways, but the one we took at least did not involve
any extra software, and was done using simple scripts.
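
To make the idea of switching points concrete, here is a minimal sketch of
what such a decision function can look like. It is an illustration only, not
the code in the tuned module: the table layout, the function name, the
algorithm labels and every threshold are hypothetical placeholders, and the
physical topology dimension is left out entirely.

```python
from bisect import bisect_right

# Hypothetical decision table. Each entry is
#   (largest communicator size covered, [(first message size in bytes, label), ...])
# The labels and all numbers are placeholders, not measured values.
DECISION_TABLE = [
    (8, [(0, "binomial"), (65536, "pipeline")]),
    (64, [(0, "binomial"), (8192, "split_binary"), (524288, "pipeline")]),
    (float("inf"), [(0, "binomial"), (2048, "split_binary"), (131072, "pipeline")]),
]


def choose_bcast_algorithm(comm_size, message_bytes, table=DECISION_TABLE):
    """Return the algorithm label for a communicator size and message size."""
    if comm_size < 1 or message_bytes < 0:
        raise ValueError("comm_size must be >= 1 and message_bytes >= 0")
    for max_comm_size, rules in table:
        if comm_size <= max_comm_size:
            # The rules are sorted by their switching point, so the last
            # switching point that is <= message_bytes decides.
            switch_points = [first_bytes for first_bytes, _ in rules]
            index = bisect_right(switch_points, message_bytes) - 1
            return rules[index][1]
    raise ValueError("no rule covers this communicator size")
```

With this placeholder table, choose_bcast_algorithm(32, 10000) falls in the
second row and returns "split_binary". The tuning work is all in choosing the
numbers in the table; the lookup itself is trivial.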

> If it's fairly mechanical -- maybe even if not -- it seems like
> something that should just be done regardless of vendors.  I'm sure
> plenty of people could measure QDR fat tree, for a start (at least where
> measurement isn’t frowned upon).

Based on feedback from the user mailing list, several users have done such
optimizations for their specific applications. This makes the optimization
problem much simpler, as some of the parameters (message size, for example)
then take only a few discrete values. If we assume a symmetric network and a
small number of message sizes of interest, it is enough to run a few
benchmarks (SKaMPI, or the IMB test for the collective of interest); manually
finding the switch point is then a relatively simple process.
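
As a rough sketch of that last step, assume the benchmark results for one
communicator size have already been collected into a dictionary of timings
per algorithm and message size. The data layout and the function name are
hypothetical; parsing the SKaMPI or IMB output is not shown.

```python
def find_switch_points(timings):
    """Locate the message sizes at which the fastest algorithm changes.

    timings maps an algorithm label to {message size in bytes: time in seconds},
    all measured on the same communicator size. Returns a list of
    (message size, fastest algorithm from that size on), by increasing size.
    """
    if not timings:
        return []
    # Only compare message sizes that were measured for every algorithm.
    common_sizes = sorted(set.intersection(*(set(t) for t in timings.values())))
    switch_points = []
    previous_best = None
    for size in common_sizes:
        best = min(timings, key=lambda algorithm: timings[algorithm][size])
        if best != previous_best:
            switch_points.append((size, best))
            previous_best = best
    return switch_points
```

Each entry in the result marks the first measured size at which a different
algorithm wins, so the true crossover lies somewhere between that size and the
previous measured one. A choice found this way can then be tried through the
tuned component's MCA parameters (coll_tuned_use_dynamic_rules together with
coll_tuned_bcast_algorithm); the algorithm numbering differs between
releases, so check the ompi_info output for your version.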

  George.



