Thanks for all your inputs.
It is good to know that this initial latency is expected behavior, and that
the workaround is to run one dummy iteration before timing is started.
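For anyone else who finds this thread in the archive, here is a minimal
sketch of that dummy-iteration workaround (the buffer size, message count,
and root rank below are just placeholders, not anything from my code):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    double buf[1024] = {0.0};   /* placeholder payload */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Dummy (warm-up) call: the first collective forces Open MPI to
     * set up the connections it creates lazily, so that cost is not
     * charged to the timed section. */
    MPI_Bcast(buf, 1024, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    MPI_Barrier(MPI_COMM_WORLD);

    /* Timed call: measures the steady-state broadcast cost. */
    double t0 = MPI_Wtime();
    MPI_Bcast(buf, 1024, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    double t1 = MPI_Wtime();

    if (rank == 0)
        printf("MPI_Bcast after warm-up: %g seconds\n", t1 - t0);

    MPI_Finalize();
    return 0;
}

The MPI_Barrier is only there so that all ranks finish the warm-up before
the timing starts.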
I did not notice this before because my older parallel CFD code runs a large
number of time steps, so the initial latency was amortized over the run.
But recently I have been teaching MPI using small parallel codes and noticed
this behavior.
This relieves my concern about our system performance.
Thanks again.
> Date: Thu, 12 Nov 2009 11:18:24 -0500
> From: Gus Correa
> Subject: Re: [OMPI users] mpi functions are slow when first called and
> become normal afterwards
> To: Open MPI Users
> Message-ID: <4afc3550.10...@ldeo.columbia.edu>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Eugene Loh wrote:
> > RightCFD wrote:
> >>
> >> Date: Thu, 29 Oct 2009 15:45:06 -0400
> >> From: Brock Palen <bro...@umich.edu>
> >> Subject: Re: [OMPI users] mpi functions are slow when first called
> >> and become normal afterwards
> >> To: Open MPI Users <us...@open-mpi.org>
> >> Message-ID: <890cc430-68b0-4307-8260-24a6fadae...@umich.edu>
> >> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
> >>
> >> > When MPI_Bcast and MPI_Reduce are called for the first time, they
> >> > are very slow. But after that, they run at normal and stable speed.
> >> > Is there anybody out there who has encountered such a problem? If you
> >> > need any other information, please let me know and I'll provide it.
> >> > Thanks in advance.
> >>
> >> This is expected, and I think you can dig through the message archive
> >> to find the answer. OMPI does not wire up all the communication at
> >> startup, so the first time you communicate with a host the
> >> connection is made, but after that it is fast because it is already
> >> open. This behavior is expected, and is needed for very large systems
> >> where you could run out of sockets for some types of communication
> >> with so many hosts.
> >>
> >> Brock Palen
> >> www.umich.edu/~brockp
> >> Center for Advanced Computing
> >> bro...@umich.edu
> >> (734)936-1985
> >>
> >> Thanks for your reply. I am surprised to learn this is an expected
> >> behavior of OMPI. I searched the archives but did not find many
> >> relevant messages. I am wondering why other users of OMPI do not
> >> complain about this. Is there a way to avoid this when timing an MPI
> >> program?
> >>
> > An example of this is the NAS Parallel Benchmarks, which have been
> > around nearly 20 years. They:
> >
> > *) turn timers on after MPI_Init and off before MPI_Finalize
> > *) execute at least one iteration before starting timers
> >
> > Even so, with at least one of the NPB tests and with at least one MPI
> > implementation, I've seen more than one iteration needed to warm things
> > up. That is, if you timed each iteration, you could see that multiple
> > iterations were needed to warm everything up. In performance analysis,
> > it is reasonably common to expect to have to run multiple iterations and
> > use the correct data set size to get representative behavior.
> >
> >
>
> And I would guess that in Open MPI, and maybe in other implementations too,
> the time you spend warming up, probing the best way to do things,
> is largely compensated for during steady-state execution,
> as long as the number of iterations is not very small.
> This seems to be required to accommodate the large variety
> of hardware and software platforms, and to be efficient on all of them.
> Right?
>
> AFAIK, other high-quality software (e.g. FFTW)
> follows a similar rationale.
>
> Gus Correa
>