AFAIK, Linux synchronizes all CPU timers on boot. The skew is normally no more than 50-100 CPU cycles.

The reasons why you can observe larger differences are:

1) Main. The CPUs do not have the "constant TSC" feature. Without it, the timer frequency changes across the different power states of the CPU or core.

2) Secondary. Some motherboards can overclock CPUs depending on load, using the FSB clock generator. This makes the CPU timers tick faster or slower than expected, even with the "constant TSC" feature (the TSC is then no longer constant).
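On Linux, the "constant TSC" feature mentioned above is reported as the `constant_tsc` flag in /proc/cpuinfo, and `nonstop_tsc` indicates that the TSC also keeps running in deep idle states. Below is a minimal sketch of how one might check for these flags. The helper names `has_flag` and `cpu_has_flag` are made up for illustration and are not part of Open MPI, and the code assumes an x86 Linux system where /proc/cpuinfo contains a "flags" line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return 1 if `flag` appears as a whole word in `line`. */
static int has_flag(const char *line, const char *flag)
{
    assert(line != NULL && flag != NULL);
    size_t n = strlen(flag);
    if (n == 0)
        return 0;

    const char *p = line;
    while ((p = strstr(p, flag)) != NULL) {
        /* Accept the match only if it is delimited on both sides, so that
         * "tsc" does not match inside "constant_tsc". */
        int start_ok = (p == line) || p[-1] == ' ' || p[-1] == '\t' || p[-1] == ':';
        int end_ok = p[n] == '\0' || p[n] == ' ' || p[n] == '\t' || p[n] == '\n';
        if (start_ok && end_ok)
            return 1;
        p += n;
    }
    return 0;
}

/* Hypothetical helper: return 1 if the first "flags" line of /proc/cpuinfo
 * lists `flag`, 0 if it does not (or no "flags" line exists), and -1 if
 * the file cannot be opened. */
static int cpu_has_flag(const char *flag)
{
    FILE *f = fopen("/proc/cpuinfo", "r");
    if (f == NULL)
        return -1;

    char line[8192];
    int result = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "flags", 5) == 0) {
            result = has_flag(line, flag);
            break;
        }
    }
    fclose(f);
    return result;
}
```

Calling `cpu_has_flag("constant_tsc")` and `cpu_has_flag("nonstop_tsc")` covers the first reason only. Neither flag can reveal the second one, where the board changes the base clock under load.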

Kind regards,
Alex Granovsky



-----Original Message-----
From: Gilles Gouaillardet
Sent: Thursday, November 27, 2014 1:13 PM
To: Open MPI Users
Subject: Re: [OMPI users] mpi_wtime implementation

Folks,

one drawback of retrieving time with rdtsc is that this value is core
specific :
if a task is not bound to a core, then the value returned by MPI_Wtime()
might go backward.

if i run the following program with
taskset -c 1 ./time

and then move it across cores
(taskset -cp 0 <pid> ; taskset -cp 2 <pid>; ...)
then the program can abort. in my environment, i can measure up to 150ms
difference.

/* some mtt tests will abort if this condition is met */


i was unable to observe this behavior with gettimeofday()

/* though it could occur when ntpd synchronizes the clock */

is there any plan to make the timer function selectable via a mca param ?
or to automatically fallback to gettimeofday if a task is not bound on a
core ?

Cheers,

Gilles

$ cat time.c
#include <stdio.h>
#include <mpi.h>

int main (int argc, char *argv[]) {
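   double t = 0;   /* last timestamp seen */
   MPI_Init(&argc, &argv);
   for (;;) {
       double _t = MPI_Wtime();
       /* MPI_Wtime() should never decrease; abort as soon as it does */
       if (_t < t) {
           fprintf(stderr, "going back in time %lf < %lf\n", _t, t);
           MPI_Abort(MPI_COMM_WORLD, 1);
       }
       t = _t;
   }
   /* not reached: the loop only exits through MPI_Abort() */
   MPI_Finalize();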
   return 0;
}
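The fallback asked about above (use the per-core counter only when the task is bound to a single core) could be sketched roughly as follows. This is not Open MPI's implementation: `allowed_cpu_count`, `monotonic_seconds`, `select_timer` and the `timer_fn` type are hypothetical names, the per-core timer is left as a caller-supplied function pointer, and the code assumes Linux with glibc (`sched_getaffinity` and `CPU_COUNT` require `_GNU_SOURCE`). The fallback here uses `clock_gettime(CLOCK_MONOTONIC)` instead of gettimeofday(), because a monotonic clock is not stepped when ntpd adjusts the wall clock.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: number of CPUs the calling thread may run on,
 * or -1 if the affinity mask cannot be read. */
static int allowed_cpu_count(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof set, &set) != 0)
        return -1;
    return CPU_COUNT(&set);
}

/* Fallback timer: seconds from a clock that is not stepped backward. */
static double monotonic_seconds(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc; /* silence the unused warning when NDEBUG is defined */
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

typedef double (*timer_fn)(void);

/* Hypothetical selection logic: use the caller-supplied per-core timer
 * (for example an rdtsc-based one) only when the thread is pinned to
 * exactly one CPU; otherwise fall back to the monotonic clock. */
static timer_fn select_timer(timer_fn per_core_timer)
{
    if (per_core_timer != NULL && allowed_cpu_count() == 1)
        return per_core_timer;
    return monotonic_seconds;
}
```

The affinity check is only a snapshot. In the scenario above the process is rebound with taskset while it runs, so a real implementation would have to re-check the mask or otherwise cope with later migration.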

On 2014/11/25 1:59, Dave Goodell (dgoodell) wrote:
On Nov 24, 2014, at 12:06 AM, George Bosilca <bosi...@icl.utk.edu> wrote:

https://github.com/open-mpi/ompi/pull/285 is a potential answer. I would like to hear Dave Goodell comment on this before pushing it upstream.

  George.

I'll take a look at it today. My notification settings were messed up when you originally CCed me on the PR, so I didn't see this until now.

-Dave

_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: http://www.open-mpi.org/community/lists/users/2014/11/25863.php

Link to this post: http://www.open-mpi.org/community/lists/users/2014/11/25875.php
