I've looked in more detail at the current two MPI_Alltoallv algorithms
and wanted to raise a couple of ideas.
Firstly, the new default "pairwise" algorithm.
* There is no optimisation for sparse/empty messages, compared to the old
basic "linear" algorithm.
* The attached "pairwise-nop" patch add
> program launch by supplying appropriate MCA parameters to orterun (a.k.a.
>>>> mpirun and mpiexec).
>>>>
>>>> There is also a largely undocumented feature of the "tuned" collective
>>>> component where a dynamic rules file can be supplied
n-mpi.org]
On Behalf Of Number Cruncher
Sent: Wednesday, December 19, 2012 5:31 PM
To: Open MPI Users
Subject: Re: [OMPI users] MPI_Alltoallv performance regression 1.6.0 to
1.6.1
On 19/12/12 11:08, Paul Kapinos wrote:
Did you *really* want to dig into code just in order to switch a
default communic
ce Computing
>> RWTH Aachen University, Center for Computing and Communication
>> Rechen- und Kommunikationszentrum der RWTH Aachen
>> Seffenter Weg 23, D 52074 Aachen (Germany)
>>
>>> -Original Message-
>>> From: users-boun...@open-mpi.org [mailto:u
hen- und Kommunikationszentrum der RWTH Aachen
Seffenter Weg 23, D 52074 Aachen (Germany)
-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org]
On Behalf Of Number Cruncher
Sent: Wednesday, December 19, 2012 5:31 PM
To: Open MPI Users
Subject: Re: [OMPI users]
On 19/12/12 11:08, Paul Kapinos wrote:
Did you *really* want to dig into code just in order to switch a
default communication algorithm?
No, I didn't want to, but with a huge change in performance, I'm forced
to do something! And having looked at the different algorithms, I think
there's a p
Did you *really* want to dig into code just in order to switch a default
communication algorithm?
Note there are several ways to set the parameters; --mca on the command line is
just one of them (suitable for quick online tests).
http://www.open-mpi.org/faq/?category=tuning#setting-mca-params
W
Having run some more benchmarks, the new default is *really* bad for our
application (2-10x slower), so I've been looking at the source to try
and figure out why.
It seems that the biggest difference will occur when the all_to_all is
actually sparse (e.g. our application); if most N-M process
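To make the sparse case concrete, here is a rough Python model. It is not
Open MPI code: count_messages and sendcounts are hypothetical names, and the
model simply assumes that a pairwise exchange visits every other rank whether
or not there is data to send, while an algorithm that skips empty messages
only posts the non-empty ones.

```python
def count_messages(sendcounts):
    """Compare message counts for a square send-count matrix.

    sendcounts[src][dst] is the number of elements rank src sends to rank dst.
    Returns (nonempty, pairwise_slots):
      nonempty       - off-diagonal entries with a positive count
      pairwise_slots - exchanges performed if every rank talks to every
                       other rank regardless of message size (assumption)
    """
    n = len(sendcounts)
    nonempty = sum(
        1
        for src in range(n)
        for dst in range(n)
        if src != dst and sendcounts[src][dst] > 0
    )
    pairwise_slots = n * (n - 1)
    return nonempty, pairwise_slots


# Example: 4 ranks, each sending only to its right-hand neighbour.
n = 4
sendcounts = [[0] * n for _ in range(n)]
for rank in range(n):
    sendcounts[rank][(rank + 1) % n] = 100

nonempty, pairwise_slots = count_messages(sendcounts)
print(nonempty, pairwise_slots)
```

In this toy pattern only a third of the exchange slots carry any data; the
gap grows with the number of ranks when each rank has a fixed number of
neighbours.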
Hi, Simon,
The pairwise algorithm passes messages in a synchronised ring-like fashion
with increasing stride, so it works best when independent communication
paths can be established between several ports of the network
switch/router. Some 1 Gbps Ethernet equipment is not capable of doing so,
so
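The ring-like schedule with increasing stride can be sketched in a few lines
of Python. This is only an illustration under an assumption that is not taken
from the message above: at step k, rank r sends to (r + k) mod P and receives
from (r - k) mod P. The function name pairwise_schedule is hypothetical.

```python
def pairwise_schedule(size):
    """Return the assumed pairwise exchange schedule for `size` ranks.

    The result is a list of steps; each step is a list of
    (sender, receiver) pairs, one per rank. The stride grows by one
    at every step, so all ranks communicate simultaneously and each
    ordered pair of distinct ranks appears exactly once overall.
    """
    return [
        [(rank, (rank + step) % size) for rank in range(size)]
        for step in range(1, size)
    ]


# Example: print the schedule for 4 ranks.
for step, pairs in enumerate(pairwise_schedule(4), start=1):
    print("step", step, pairs)
```

Every step keeps all ranks busy at once, which is why the pattern relies on
the network being able to carry several independent transfers in parallel.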