On Tuesday 19 May 2009, Peter Kjellstrom wrote:
> On Tuesday 19 May 2009, Roman Martonak wrote:
> > On Tue, May 19, 2009 at 3:29 PM, Peter Kjellstrom wrote:
> > > On Tuesday 19 May 2009, Roman Martonak wrote:
> > > ...
> > >
> > >> openmpi-1.3.2 time per one MD step is 3. ...
Default algorithm thresholds in MVAPICH are different from those in Open MPI.
Using the tuned collectives in Open MPI, you may configure the Open MPI
Alltoall thresholds to match the MVAPICH defaults.
The following MCA parameters configure Open MPI to use custom rules that
are defined in a configuration (text) file:
"--mca use_dynam ..."
Many thanks for the highly helpful analysis. Indeed, what Peter says
seems to be precisely the case here. I tried to run the 32 waters test
on 48 cores now, with the original cutoff of 100 Ry and with a slightly
increased one of 110 Ry. Normally, with a larger cutoff it should
obviously take more time ...
The correct MCA parameters are the following:
-mca coll_tuned_use_dynamic_rules 1
-mca coll_tuned_dynamic_rules_filename ./dyn_rules
You can also run the following command:
ompi_info -mca coll_tuned_use_dynamic_rules 1 -param coll tuned
This will give some insight into all the various algorithms ...
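For example, to apply these on the command line (the process count and
application name here are placeholders, not from the original mails):

mpirun -np 48 -mca coll_tuned_use_dynamic_rules 1 \
    -mca coll_tuned_dynamic_rules_filename ./dyn_rules ./your_app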
> The correct MCA parameters are the following:
> -mca coll_tuned_use_dynamic_rules 1
> -mca coll_tuned_dynamic_rules_filename ./dyn_rules

Ohh.. it was my mistake.

> You can also run the following command:
> ompi_info -mca coll_tuned_use_dynamic_rules 1 -param coll tuned
> This will give some insight into all the various algorithms ...
On Wednesday 20 May 2009, Rolf Vandevaart wrote:
...
> If I am understanding what is happening, it looks like the original
> MPI_Alltoall made use of three algorithms. (You can look in
> coll_tuned_decision_fixed.c)
>
> If message size < 200 and communicator size > 12
>    bruck
> else if message size < 3000
>    basic linear
> else
>    pairwise
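In C, that fixed decision has roughly the following shape (a paraphrase of
the logic quoted above, not the verbatim coll_tuned_decision_fixed.c source;
the thresholds and the algorithm numbering reflect my reading of the 1.3-era
coll_tuned component and should be verified against your release):

#include <stddef.h>

/* Paraphrase of the fixed Alltoall decision described above.  The
 * thresholds (200 and 3000 bytes) are the ones quoted in this thread. */
enum alltoall_alg { BASIC_LINEAR = 1, PAIRWISE = 2, BRUCK = 3 };

static enum alltoall_alg alltoall_decision(size_t block_dsize, int comm_size)
{
    if (block_dsize < 200 && comm_size > 12)
        return BRUCK;          /* small messages on larger communicators */
    if (block_dsize < 3000)
        return BASIC_LINEAR;   /* the branch Pasha's file disables */
    return PAIRWISE;           /* large messages */
}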
The attached program hangs after printing "Iteration 65524".
It does not appear to me that it should. Removing the barrier call, or
changing the barrier call to use MPI_COMM_WORLD, does get rid of the
hang, so I believe this program is a minimal representation of a bug.
I have attached ...
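For readers without the attachment, a minimal reconstruction of the kind of
loop being described (this is a sketch based on the symptoms above, not the
original attachment) might look like:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int i, rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    for (i = 0; i < 100000; i++) {
        MPI_Comm dup;
        MPI_Comm_dup(MPI_COMM_WORLD, &dup);
        MPI_Barrier(dup);   /* barrier on the duplicate; per the report,
                               using MPI_COMM_WORLD here avoids the hang */
        MPI_Comm_free(&dup);
        if (rank == 0)
            printf("Iteration %d\n", i);
    }
    MPI_Finalize();
    return 0;
}

That the hang appears just short of 65536 = 2^16 iterations suggests a
communicator-ID allocation limit, which would fit Edgar's reply below.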
I am 99.99% sure that this bug has been fixed in the current trunk and
will be available in the upcoming 1.3.3 release...
Thanks
Edgar
Lippert, Ross wrote:
> The attached program hangs after printing "Iteration 65524".
> It does not appear to me that it should. Removal of the barrier ...
OK. I'll check back again when 1.3.3 comes out. Thanks.
-r
-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of Edgar Gabriel
Sent: Wednesday, May 20, 2009 11:16 AM
To: Open MPI Users
Subject: Re: [OMPI users] FW: hanging after many comm
On Wednesday 20 May 2009, Pavel Shamis (Pasha) wrote:
> > With the file Pavel has provided things have changed to the following.
> > (maybe someone can confirm)
> >
> > If message size < 8192
> > bruck
> > else
> > pairwise
> > end
>
> You are right here. The target of my conf file is to disable basic_linear ...
Disabling basic_linear seems like a good idea, but your config file sets the
cut-off at 128 bytes for 64 ranks (the field you set to 8192 seems to result
in a message size of that value divided by the number of ranks).
In my testing bruck seems to win clearly (at least for 64 ranks on my IB) u...
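For reference, a dynamic-rules file of the kind being discussed looks roughly
like this (my reading of the 1.3-era coll_tuned format; the collective ID and
the algorithm numbering, 3 = bruck and 2 = pairwise for Alltoall, should be
checked against your source tree):

1          # number of collectives described in this file
3          # collective ID (Alltoall)
1          # number of communicator-size rules
64         # this rule applies to communicators of size 64
2          # number of message-size rules
0 3 0 0    # from size 0: algorithm 3 (bruck), topo 0, segsize 0
8192 2 0 0 # from size 8192: algorithm 2 (pairwise), topo 0, segsize 0

With 64 ranks, the switch at 8192 is what gives the 8192/64 = 128 bytes per
rank that Peter mentions above.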
On Wednesday 20 May 2009, Pavel Shamis (Pasha) wrote:
> > Disabling basic_linear seems like a good idea, but your config file sets
> > the cut-off at 128 bytes for 64 ranks (the field you set to 8192 seems to
> > result in a message size of that value divided by the number of ranks).
> >
> > In my testing bruck seems to win clearly (at least for 64 ranks on my IB) u...
Tomorrow I will add some printfs to the collective code and check what really
happens there...
Pasha
Peter Kjellstrom wrote:
> On Wednesday 20 May 2009, Pavel Shamis (Pasha) wrote:
> Disabling basic_linear seems like a good idea, but your config file sets
> the cut-off at 128 bytes for 64 ranks (the field you set to 8192 seems to
> result in a message size of that value divided by the number of ranks). ...
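The kind of instrumentation Pasha describes might look like this (a sketch;
the function and variable names are placeholders, not actual coll_tuned
symbols):

#include <stdio.h>
#include <stddef.h>

/* Hypothetical debug helper one could call from the Alltoall decision
 * routine to log which algorithm was picked for which message size. */
static void log_alltoall_choice(int rank, size_t block_dsize, const char *alg)
{
    fprintf(stderr, "[rank %d] alltoall block_dsize=%zu -> %s\n",
            rank, block_dsize, alg);
}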
I tried to run with the first dynamic rules file that Pavel proposed
and it works, the time per one MD step on 48 cores decreased from 2.8
s to 1.8 s as expected. It was clearly the basic linear algorithm that
was causing the problem. I will check the performance of bruck and
pairwise on my HW. It ...
On Wednesday 20 May 2009, Roman Martonak wrote:
> I tried to run with the first dynamic rules file that Pavel proposed
> and it works, the time per one MD step on 48 cores decreased from 2.8
> s to 1.8 s as expected. It was clearly the basic linear algorithm that
> was causing the problem. I will check the performance of bruck and
> pairwise on my HW. ...