Gilles' concern is correct: we should never return timer values that go 
backwards.
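
To illustrate what "never go backwards" would mean in practice, here is a minimal sketch of a clamping wrapper. It is an assumption for discussion only, not what Open MPI does and not what the PR proposes; the names clamp_monotonic() and wtime_nondecreasing() are hypothetical. Note that clamping only hides a backward jump; it does not make the underlying interval measurement correct.

```c
#include <stddef.h>
#include <sys/time.h>

/* Last value handed out to the caller. Not thread-safe: this sketch
 * assumes a single calling thread. */
static double last_returned = 0.0;

/* Clamp a raw reading so that successive results never decrease. */
double clamp_monotonic(double raw)
{
    if (raw < last_returned) {
        return last_returned;   /* source went backwards: repeat old value */
    }
    last_returned = raw;
    return raw;
}

/* Hypothetical wall-clock wrapper: gettimeofday() in seconds, clamped. */
double wtime_nondecreasing(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return clamp_monotonic((double)tv.tv_sec + (double)tv.tv_usec * 1e-6);
}
```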

Perhaps the TSC-based WTIME should only be used in a process that is bound to a 
single core...?
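
A rough sketch of how such a binding check might look on Linux, using sched_getaffinity(). The function name bound_to_single_cpu() is hypothetical, and this is not necessarily how Open MPI would detect binding (it has its own binding information). Also note that a Linux "CPU" here is a hardware thread, so a mask with one CPU is at least as narrow as one core.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* needed for cpu_set_t and CPU_COUNT */
#endif
#include <sched.h>

/* Returns 1 if the calling thread may run on exactly one CPU,
 * 0 if it may run on more than one, and -1 if the affinity mask
 * cannot be read. */
int bound_to_single_cpu(void)
{
    cpu_set_t mask;

    CPU_ZERO(&mask);
    /* pid 0 means "the calling thread" */
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0) {
        return -1;
    }
    return CPU_COUNT(&mask) == 1 ? 1 : 0;
}
```

Running a program that calls this under "taskset -c 1" should report 1; running it unbound on a multi-core machine should report 0.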

An MCA param can be used to force the switch between gettimeofday() and TSC, if 
someone really wants to take their chances with TSC when not bound to a core (or 
bound to something wider than a core).
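
The selection logic could be sketched roughly as follows. This is only an assumption about the shape of it: force_tsc stands in for whatever the MCA param would end up being called, and select_wtime(), wtime_fn, and wtime_gettimeofday() are hypothetical names. The TSC-based timer itself is passed in rather than shown.

```c
#include <stddef.h>
#include <sys/time.h>

/* A timer returns wall-clock time in seconds. */
typedef double (*wtime_fn)(void);

/* Safe default: gettimeofday() converted to seconds. */
double wtime_gettimeofday(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec * 1e-6;
}

/* Pick the TSC timer only if one is available AND either the user
 * forced it or the process is bound to a single core. Otherwise
 * fall back to gettimeofday(). */
wtime_fn select_wtime(int force_tsc, int bound_to_one_core,
                      wtime_fn tsc_timer)
{
    if (tsc_timer != NULL && (force_tsc || bound_to_one_core)) {
        return tsc_timer;
    }
    return wtime_gettimeofday;
}
```

The point of the design is that TSC is opt-in when unbound: the default path can never hand back a core-specific counter unless the user explicitly asked for it.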



On Nov 27, 2014, at 5:41 AM, Alex A. Granovsky <g...@classic.chem.msu.su> wrote:

> AFAIK, Linux synchronizes all CPU timers on boot. The skew is normally no 
> more than 50-100 CPU cycles.
> 
> The reasons why you can observe larger differences are:
> 
> 1) Main. The CPUs do not have the "constant TSC" feature. Without this 
> feature, the timer frequency changes across different power states of the 
> CPU or core.
> 2) Secondary. Some motherboards can overclock CPUs depending on load using 
> the FSB clock generator. This results in CPU timers ticking faster or slower 
> than expected, even with the "constant TSC" feature (so the TSC is no longer 
> constant).
> 
> Kind regards,
> Alex Granovsky
> 
> 
> 
> -----Original Message----- From: Gilles Gouaillardet
> Sent: Thursday, November 27, 2014 1:13 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] mpi_wtime implementation
> 
> Folks,
> 
> one drawback of retrieving time with rdtsc is that this value is
> core-specific:
> if a task is not bound to a core, then the value returned by MPI_Wtime()
> might go backward.
> 
> if i run the following program with
> taskset -c 1 ./time
> 
> and then move it across cores
> (taskset -cp 0 <pid> ; taskset -cp 2 <pid>; ...)
> then the program can abort. in my environment, i can measure up to 150ms
> difference.
> 
> /* some mtt tests will abort if this condition is met */
> 
> 
> i was unable to observe this behavior with gettimeofday()
> 
> /* though it could occur when ntpd synchronizes the clock */
> 
> is there any plan to make the timer function selectable via an mca param ?
> or to automatically fall back to gettimeofday if a task is not bound to a
> core ?
> 
> Cheers,
> 
> Gilles
> 
> $ cat time.c
> #include <stdio.h>
> #include <mpi.h>
> 
> int main (int argc, char *argv[]) {
>   int i;
>   double t = 0;
>   MPI_Init(&argc, &argv);
>   for (;;) {
>       double _t = MPI_Wtime();
>       if (_t < t) {
>           fprintf(stderr, "going back in time %lf < %lf\n", _t, t);
>           MPI_Abort(MPI_COMM_WORLD, 1);
>       }
>       t = _t;
>   }
>   MPI_Finalize();
>   return 0;
> }
> 
> On 2014/11/25 1:59, Dave Goodell (dgoodell) wrote:
>> On Nov 24, 2014, at 12:06 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>> 
>>> https://github.com/open-mpi/ompi/pull/285 is a potential answer. I would 
>>> like to hear Dave Goodell comment on this before pushing it upstream.
>>> 
>>>  George.
>> I'll take a look at it today.  My notification settings were messed up when 
>> you originally CCed me on the PR, so I didn't see this until now.
>> 
>> -Dave
>> 
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/users/2014/11/25863.php
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
