Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Alex A. Granovsky
Eugene, you did not take into account the dispersion/dephasing between different processes. As the cluster size and the number of instances of the parallel process increase, the dispersion increases as well, making the different instances somewhat out of sync - not really out of sync, but just because

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Ashley Pittman
On 9 Sep 2010, at 21:40, Richard Treumann wrote: > > Ashley > > Can you provide an example of a situation in which these semantically > redundant barriers help? I'm not making the case for semantically redundant barriers; I'm making a case for implicit synchronisation in every iteration of

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Richard Treumann
Ashley, can you provide an example of a situation in which these semantically redundant barriers help? I may be missing something, but my statement for the textbook would be "If adding a barrier to your MPI program makes it run faster, there is almost certainly a flaw in it that is better solv

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Ashley Pittman
On 9 Sep 2010, at 21:10, jody wrote: > Hi > @Ashley: > What is the exact semantics of an asynchronous barrier, I'm not sure of the exact semantics, but once you've got your head around the concept it's fairly simple to understand how to use it: you call MPI_IBarrier() and it gives you a handle
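Ashley's handle-based pattern can be sketched outside MPI. The toy Python below (all names hypothetical, not the MPI API) mimics a non-blocking barrier: each participant "arrives", gets a handle back immediately, and can poll it while doing useful work - the same overlap of work and synchronisation that MPI_IBarrier() is meant to enable.

```python
import threading

class ToyIbarrier:
    """Toy analogue of a non-blocking barrier: arrive() returns a handle
    that becomes complete once every participant has arrived."""
    def __init__(self, nprocs):
        self.nprocs = nprocs
        self.arrived = 0
        self.cond = threading.Condition()

    def arrive(self):
        with self.cond:
            self.arrived += 1
            if self.arrived == self.nprocs:
                self.cond.notify_all()
        return self  # acts as the 'request' handle

    def test(self):
        # non-blocking: report whether everyone has arrived yet
        with self.cond:
            return self.arrived == self.nprocs

    def wait(self):
        # blocking completion, like waiting on the request
        with self.cond:
            self.cond.wait_for(lambda: self.arrived == self.nprocs)

def worker(bar, results, rank):
    handle = bar.arrive()        # returns immediately, like MPI_Ibarrier
    useful_work = 0
    while not handle.test():     # overlap computation with the barrier
        useful_work += 1
    handle.wait()
    results[rank] = useful_work

barrier = ToyIbarrier(4)
results = [None] * 4
threads = [threading.Thread(target=worker, args=(barrier, results, r))
           for r in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(all(isinstance(v, int) for v in results))  # True
```

The point of the pattern is that a late process delays nobody until the moment the others actually need the barrier to complete.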

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread jody
Hi @Ashley: What is the exact semantics of an asynchronous barrier, and is it part of the MPI specs? Thanks Jody On Thu, Sep 9, 2010 at 9:34 PM, Ashley Pittman wrote: > > On 9 Sep 2010, at 17:00, Gus Correa wrote: > >> Hello All >> >> Gabrielle's question, Ashley's recipe, and Dick Treumann's

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Ashley Pittman
On 9 Sep 2010, at 17:00, Gus Correa wrote: > Hello All > > Gabrielle's question, Ashley's recipe, and Dick Treumann's cautionary words, > may be part of a larger context of load balance, or not? > > Would Ashley's recipe of sporadic barriers be a silver bullet to > improve load imbalance prob

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Eugene Loh
Alex A. Granovsky wrote: Isn't it evident from the theory of random processes and probability theory that, in the limit of an infinitely large cluster and parallel process, the probability of deadlocks with the current implementation is unfortunately quite a finite quantity and in

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Alex A. Granovsky
Isn't it evident from the theory of random processes and probability theory that, in the limit of an infinitely large cluster and parallel process, the probability of deadlocks with the current implementation is unfortunately quite a finite quantity and in the limit approaches unity regardless of any p

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Richard Treumann
I was pointing out that most programs have some degree of elastic synchronization built in. Tasks (or groups or components in a coupled model) seldom only produce data; they also consume what other tasks produce, and that limits the potential skew. If step n for a task (or group or coupled compo
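Treumann's point - that consuming a neighbour's output bounds how far ahead a task can run - can be illustrated with a deterministic toy model (pure Python, all numbers invented): step n of task i cannot start before its producer has finished step n-1, so the skew stays bounded no matter how many iterations run.

```python
def finish_times(work_per_step, nsteps):
    """Toy model of a ring of tasks where step n of task i consumes data
    produced at step n-1 by its left neighbour, so no task can run more
    than one step ahead of the task feeding it."""
    n = len(work_per_step)
    t = [0.0] * n  # finish time of each task's latest completed step
    for step in range(nsteps):
        prev = t[:]
        for i in range(n):
            producer = (i - 1) % n
            # can't start before my own previous step is done, nor before
            # my producer has finished the step whose output I consume
            start = max(prev[i], prev[producer])
            t[i] = start + work_per_step[i]
    return t

# one slow task (2.0 per step) among fast ones (1.0 per step)
times = finish_times([1.0, 1.0, 2.0, 1.0], nsteps=10)
print(max(times) - min(times))  # 3.0 - the spread stays bounded
```

The fast tasks end up throttled to the slow task's rate through the dependency chain, which is exactly the "elastic" synchronization described above: the skew never grows beyond a fixed number of steps.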

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Eugene Loh
Gus Correa wrote: More often than not some components lag behind (regardless of how much you tune the number of processors assigned to each component), slowing down the whole scheme. The coupler must sit and wait for that late component, the other components must sit and wait for the coupler, an

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Gus Correa
Hello All Gabrielle's question, Ashley's recipe, and Dick Treumann's cautionary words, may be part of a larger context of load balance, or not? Would Ashley's recipe of sporadic barriers be a silver bullet to improve load imbalance problems, regardless of which collectives or even point-to-po

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Richard Treumann
Ashley's observation may apply to an application that iterates on many to one communication patterns. If the only collective used is MPI_Reduce, some non-root tasks can get ahead and keep pushing iteration results at tasks that are nearer the root. This could overload them and cause some extra
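Treumann's flooding scenario can be sketched with a toy producer/consumer model (hypothetical numbers): when the only synchronisation is the reduce itself, a fast sender's backlog of unconsumed messages grows with the iteration count, while a periodic barrier caps it at a small constant.

```python
def max_backlog(producer_dt, consumer_dt, niter, barrier_every=None):
    """Toy model: a fast producer pushes one message per iteration at a
    slower consumer. Backlog = messages sent but not yet consumed."""
    sent = consumed = 0
    t_prod = t_cons = 0.0
    peak = 0
    for i in range(1, niter + 1):
        t_prod += producer_dt      # producer finishes iteration i, sends
        sent += 1
        # consumer drains whatever it had time for so far
        while consumed < sent and t_cons + consumer_dt <= t_prod:
            t_cons += consumer_dt
            consumed += 1
        peak = max(peak, sent - consumed)
        if barrier_every and i % barrier_every == 0:
            # a barrier makes the producer wait for a full drain
            t_cons += (sent - consumed) * consumer_dt
            consumed = sent
            t_prod = max(t_prod, t_cons)
    return peak

print(max_backlog(1.0, 2.0, 100))                    # 50: grows with niter
print(max_backlog(1.0, 2.0, 100, barrier_every=10))  # 5: capped by the barrier
```

This is only an accounting sketch, but it shows why unexpected-message queues at tasks near the root can grow without some form of flow control.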

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Ralph Castain
On Sep 9, 2010, at 1:46 AM, Ashley Pittman wrote: > > On 9 Sep 2010, at 08:31, Terry Frankcombe wrote: > >> On Thu, 2010-09-09 at 01:24 -0600, Ralph Castain wrote: >>> As people have said, these time values are to be expected. All they >>> reflect is the time difference spent in reduce waiting

Re: [OMPI users] users Digest, Vol 1674, Issue 1

2010-09-09 Thread Jeff Squyres
I don't think it's really a fault -- it's just how we designed and implemented it. On Sep 6, 2010, at 7:40 AM, lyb wrote: > Thanks for your answer, but I tested with MPICH2, and it doesn't have this fault. > Why? >> Message: 9 >> Date: Wed, 1 Sep 2010 20:14:44 -0600 >> From: Ralph Castain >> Subject

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Ashley Pittman
On 9 Sep 2010, at 08:31, Terry Frankcombe wrote: > On Thu, 2010-09-09 at 01:24 -0600, Ralph Castain wrote: >> As people have said, these time values are to be expected. All they >> reflect is the time difference spent in reduce waiting for the slowest >> process to catch up to everyone else. The

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Gabriele Fatigati
Yes Terry, that's right. 2010/9/9 Terry Frankcombe > On Thu, 2010-09-09 at 01:24 -0600, Ralph Castain wrote: > > As people have said, these time values are to be expected. All they > > reflect is the time difference spent in reduce waiting for the slowest > > process to catch up to everyone els

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Gabriele Fatigati
Mm, I don't understand. The experiments on my application show that intensive use of Barrier+Reduce is faster than a single Reduce. 2010/9/9 Ralph Castain > As people have said, these time values are to be expected. All they reflect > is the time difference spent in reduce waiting for

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Terry Frankcombe
On Thu, 2010-09-09 at 01:24 -0600, Ralph Castain wrote: > As people have said, these time values are to be expected. All they > reflect is the time difference spent in reduce waiting for the slowest > process to catch up to everyone else. The barrier removes that factor > by forcing all processes t

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Ralph Castain
As people have said, these time values are to be expected. All they reflect is the time difference spent in reduce waiting for the slowest process to catch up to everyone else. The barrier removes that factor by forcing all processes to start from the same place. No mystery here - just a reflec
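Ralph's accounting can be made concrete with a small sketch (invented work times): the barrier only moves the waiting out of the reduce and into the barrier; the iteration still ends when the slowest rank finishes, so the total time is unchanged.

```python
def waits(work, use_barrier):
    """Toy accounting for one iteration: every rank computes for work[r],
    optionally hits a barrier, then enters a reduce that completes only
    when the slowest participant has arrived."""
    finish = list(work)
    barrier_wait = [0.0] * len(work)
    if use_barrier:
        sync = max(finish)
        barrier_wait = [sync - f for f in finish]  # the wait moves here
        finish = [sync] * len(work)
    done = max(finish)
    reduce_wait = [done - f for f in finish]       # time "in" the reduce
    return barrier_wait, reduce_wait, done

work = [1.0, 2.0, 3.0, 7.0]  # invented per-rank compute times
b0, r0, t0 = waits(work, use_barrier=False)
b1, r1, t1 = waits(work, use_barrier=True)
print(max(r0), max(r1), t0 == t1)  # 6.0 0.0 True
```

The measured MPI_Reduce time shrinks with the barrier, but only because the same waiting is now charged to MPI_Barrier instead - no mystery, as the post says.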

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Gabriele Fatigati
More in depth, total execution time without Barrier is about 1 sec. Total execution time with Barrier+Reduce is 9453, with 128 procs. 2010/9/9 Terry Frankcombe > Gabriele, > > Can you clarify... those timings are what is reported for the reduction > call specifically, not the total executi

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Gabriele Fatigati
Hi Terry, this time is spent in MPI_Reduce; it isn't the total execution time. 2010/9/9 Terry Frankcombe > Gabriele, > > Can you clarify... those timings are what is reported for the reduction > call specifically, not the total execution time? > > If so, then the difference is, to a first approxima