Folks,

One drawback of retrieving the time with rdtsc is that the value is core
specific:
if a task is not bound to a core, the value returned by MPI_Wtime()
might go backward.

If I run the following program with
taskset -c 1 ./time

and then move it between cores
(taskset -cp 0 <pid> ; taskset -cp 2 <pid>; ...)
the program can abort. In my environment, I measured differences of up to
150 ms.

/* some mtt tests will abort if this condition is met */


I was unable to observe this behavior with gettimeofday()

/* though it could occur when ntpd synchronizes the clock */
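
For comparison, here is a minimal sketch of a timer that is not subject to
wall-clock steps. It is not what Open MPI does, only an illustration:
wtime_monotonic() is a hypothetical name, and it assumes a POSIX system that
provides clock_gettime() with CLOCK_MONOTONIC. That clock is not affected by
discontinuous jumps of the system time, though it can still be slewed
gradually.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose clock_gettime() under strict -std= modes */
#endif
#include <time.h>

/* hypothetical helper: seconds since an unspecified starting point,
 * read from the POSIX monotonic clock; returns -1.0 on failure */
static double wtime_monotonic(void) {
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0) {
        return -1.0;
    }
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}
```

Only differences between two readings are meaningful, since the starting
point of the clock is unspecified.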

Is there any plan to make the timer function selectable via an MCA param?
Or to automatically fall back to gettimeofday() if a task is not bound to a
core?
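
To make the second option concrete, here is a rough sketch of what such a
fallback could look like. Everything in it is an assumption for illustration:
bound_to_single_cpu(), wtime_gettimeofday() and choose_wtime() are
hypothetical names, not Open MPI functions, and the binding check relies on
sched_getaffinity(), which is Linux/glibc specific. Note that the affinity
mask lists logical CPUs, so this treats "bound" as "allowed on exactly one
logical CPU".

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* sched_getaffinity() and the CPU_* macros are glibc extensions */
#endif
#include <sched.h>
#include <stddef.h>
#include <sys/time.h>

typedef double (*wtime_fn)(void);

/* hypothetical helper: returns 1 if the calling thread is allowed to run
 * on exactly one logical CPU, 0 otherwise (or if the mask cannot be read) */
static int bound_to_single_cpu(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0) {
        return 0;   /* cannot tell, so assume the task is not bound */
    }
    return CPU_COUNT(&set) == 1;
}

/* wall-clock time in seconds, read with gettimeofday() */
static double wtime_gettimeofday(void) {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec * 1e-6;
}

/* hypothetical selection: keep the cycle-counter based timer passed in by
 * the caller only when the task is bound, else use gettimeofday() */
static wtime_fn choose_wtime(wtime_fn cycle_timer) {
    return bound_to_single_cpu() ? cycle_timer : wtime_gettimeofday;
}
```

A check done once at startup would not be enough for the scenario above,
since taskset -cp changes the binding while the program is running; the
selection would have to be re-evaluated, or the binding enforced.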

Cheers,

Gilles

$ cat time.c
#include <stdio.h>
#include <mpi.h>

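/* abort as soon as MPI_Wtime() returns a value smaller than the previous one */
int main (int argc, char *argv[]) {
    double t = 0;   /* previous timestamp */
    MPI_Init(&argc, &argv);
    for (;;) {
        double _t = MPI_Wtime();
        if (_t < t) {
            fprintf(stderr, "going back in time %f < %f\n", _t, t);
            MPI_Abort(MPI_COMM_WORLD, 1);
        }
        t = _t;
    }
    /* not reached: the loop only exits through MPI_Abort() */
    MPI_Finalize();
    return 0;
}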

On 2014/11/25 1:59, Dave Goodell (dgoodell) wrote:
> On Nov 24, 2014, at 12:06 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>
>> https://github.com/open-mpi/ompi/pull/285 is a potential answer. I would 
>> like to hear Dave Goodell comment on this before pushing it upstream.
>>
>>   George.
> I'll take a look at it today.  My notification settings were messed up when 
> you originally CCed me on the PR, so I didn't see this until now.
>
> -Dave
