Re: [OMPI users] resolution of MPI_Wtime

2016-04-07 Thread Dave Love
Gilles Gouaillardet  writes:

> clock_gettime() has a higher precision than gettimeofday(), though it
> likely has a lower precision and higher overhead than
> opal_sys_timer_get_cycles()

Do you mean OMPI actually uses cycles for timing somewhere?  The
precision of that would be irrelevant, when it's typically only accurate
to within a factor of two or three.

> my point was, and iirc, opal_sys_timer_get_cycles() on linux is local
> to a given core, and hence not suitable for MPI, since a task might
> migrate from one core to another, or it might be multithreaded with
> threads running on different cores.

Is the documentation wrong to say wtime isn't (meant to be) global,
then?  Timing is obviously another reason to bind processes to cores,
given the usual distributed-system problems even at the smallest
relevant scale.

Anyhow, the lesson seems to be that you shouldn't use mpi_wtime if you
need decent precision or realistic mpi_wtick across implementations.
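
For anyone who wants to check what their own installation reports, here
is a minimal sketch (not from the thread) that prints MPI_Wtick() and
the smallest observable increment of MPI_Wtime():

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        /* Advertised timer resolution, in seconds. */
        printf("MPI_Wtick: %e s\n", MPI_Wtick());

        /* Spin until MPI_Wtime() returns a new value and report the step;
           a rough estimate of the actual granularity. */
        double t0 = MPI_Wtime(), t1 = t0;
        while (t1 == t0)
            t1 = MPI_Wtime();
        printf("smallest observed MPI_Wtime step: %e s\n", t1 - t0);

        MPI_Finalize();
        return 0;
    }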

> Cheers,
>
> Gilles
>
> On 4/6/2016 11:15 PM, Dave Love wrote:
>> Gilles Gouaillardet  writes:
>>
>>> Dave,
>>>
>>> fwiw, on v1.10, we likely use the number of cycles / cpu freq.
>> That would be a horribly broken means of timing.  gettimeofday is
>> actually called under mpi_wtime, as ompi_info claims.
>>
>>> see opal_sys_timer_get_cycles in
>>> https://github.com/open-mpi/ompi-release/blob/v1.10/opal/include/opal/sys/amd64/timer.h
>>>
>>> I cannot remember whether this is a monotonic timer.
>>> (e.g. MPI_Wtime() invoked on a given cpu is always less than or equal
>>> to MPI_Wtime() invoked later and on *any* cpu)
>> That's global, not monotonic.  MPI_Wtime(1) says it isn't necessarily
>> global in OMPI, but it has to be monotonic as I understand it.
>>
>>> that could be the reason why we moved to clock_gettime() in master.
>> The reason to use clock_gettime is higher precision.  (I looked into
>> this after numbers from a test didn't make much sense.)
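
As a point of comparison, here is a small sketch (again not from the
thread) showing why clock_gettime() is preferred: clock_getres()
typically reports nanosecond resolution, whereas gettimeofday() is
limited to microseconds by its interface:

    #include <stdio.h>
    #include <time.h>
    #include <sys/time.h>

    int main(void)
    {
        /* Resolution of the POSIX monotonic clock, typically 1 ns on Linux. */
        struct timespec res;
        clock_getres(CLOCK_MONOTONIC, &res);
        printf("clock_getres(CLOCK_MONOTONIC): %ld s %ld ns\n",
               (long)res.tv_sec, res.tv_nsec);

        /* gettimeofday() cannot resolve anything finer than 1 us. */
        struct timeval a, b;
        gettimeofday(&a, NULL);
        do {
            gettimeofday(&b, NULL);
        } while (a.tv_sec == b.tv_sec && a.tv_usec == b.tv_usec);
        long step = (b.tv_sec - a.tv_sec) * 1000000L + (b.tv_usec - a.tv_usec);
        printf("smallest observed gettimeofday step: %ld us\n", step);

        return 0;
    }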


[OMPI users] Wrong values when reading file with MPI IO

2016-04-07 Thread david.froger.ml

Hello,

Here is a simple `C` program reading a file in parallel with `MPI IO`:

#include <stdio.h>
#include <stdlib.h>

#include "mpi.h"

#define N 10

int main( int argc, char **argv )
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );

    /* Each rank reads its contiguous slice [i0, i1) of the N doubles. */
    int i0 = N *  rank / size;
    int i1 = N * (rank+1) / size;
    printf("rank: %d, i0: %d, i1: %d\n", rank, i0, i1);

    int i;
    double* data = malloc( (i1-i0)*sizeof(double) );
    for (i = 0 ; i < i1-i0 ; i++)
        data[i] = 123.;   /* sentinel value, overwritten by the read below */

    MPI_File f;
    MPI_File_open(MPI_COMM_WORLD, "data.bin", MPI_MODE_RDONLY,
                  MPI_INFO_NULL, &f);

    MPI_File_set_view(f, i0, MPI_DOUBLE, MPI_DOUBLE, "native",
                      MPI_INFO_NULL);

    MPI_Status status;
    MPI_File_read(f, data, i1-i0, MPI_DOUBLE, &status);

    int count;
    MPI_Get_count(&status, MPI_DOUBLE, &count);
    printf("rank %d, %d value read\n", rank, count);

    for (i = 0 ; i < i1-i0 ; i++) {
        printf("rank: %d index: %d value: %.2f\n", rank, i,
               data[i]);
    }

    MPI_File_close(&f);

    MPI_Finalize();

    free(data);

    return 0;
}
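
(The input file is assumed to hold the ten doubles 0.0 through 9.0
stored contiguously in native byte order, which matches the output shown
below; a minimal writer for such a file, not part of the original post,
might look like this:)

    #include <stdio.h>
    #include <stdlib.h>

    #define N 10

    /* Write the doubles 0.0 .. N-1 in native binary format to data.bin. */
    int main(void)
    {
        FILE *f = fopen("data.bin", "wb");
        if (!f) {
            perror("data.bin");
            return EXIT_FAILURE;
        }
        for (int i = 0; i < N; i++) {
            double x = (double)i;
            fwrite(&x, sizeof x, 1, f);
        }
        fclose(f);
        return 0;
    }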

With one process:

./read_mpi_io

Values read are correct:

rank: 0, i0: 0, i1: 10
rank 0, 10 value read
rank: 0 index: 0 value: 0.00
rank: 0 index: 1 value: 1.00
rank: 0 index: 2 value: 2.00
rank: 0 index: 3 value: 3.00
rank: 0 index: 4 value: 4.00
rank: 0 index: 5 value: 5.00
rank: 0 index: 6 value: 6.00
rank: 0 index: 7 value: 7.00
rank: 0 index: 8 value: 8.00
rank: 0 index: 9 value: 9.00

But with two processes:

mpirun -n 2 ./read_mpi_io

I get wrong values (zeros):

rank: 0, i0: 0, i1: 5
rank: 1, i0: 5, i1: 10
rank 0, 5 value read
rank: 0 index: 0 value: 0.00
rank 1, 5 value read
rank: 1 index: 0 value: 0.00
rank: 0 index: 1 value: 1.00
rank: 0 index: 2 value: 2.00
rank: 1 index: 1 value: 0.00
rank: 1 index: 2 value: 0.00
rank: 1 index: 3 value: 0.00
rank: 1 index: 4 value: 0.00
rank: 0 index: 3 value: 3.00
rank: 0 index: 4 value: 4.00


What's wrong with my C code?

Thanks,
David


Re: [OMPI users] Wrong values when reading file with MPI IO

2016-04-07 Thread Edgar Gabriel

What version of Open MPI did you execute your test with?
Edgar

On 4/7/2016 1:54 PM, david.froger...@mailoo.org wrote:

[quoted original message snipped: same program and output as above]


Re: [OMPI users] Wrong values when reading file with MPI IO

2016-04-07 Thread david.froger.ml

> What version of Open MPI did you execute your test with?

mpirun (Open MPI) 1.8.6


Re: [OMPI users] resolution of MPI_Wtime

2016-04-07 Thread Jeff Hammond
On Thu, Apr 7, 2016 at 9:28 AM, Dave Love  wrote:

> ...
> Anyhow, the lesson seems to be that you shouldn't use mpi_wtime if you
> need decent precision or realistic mpi_wtick across implementations.

I certainly hope that this isn't the lesson anyone learns from this.  It
is extremely important to application developers that MPI_Wtime
represent a "best effort" implementation on every platform.

Other implementations of MPI have very accurate counters.

Jeff

-- 
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/


Re: [OMPI users] Wrong values when reading file with MPI IO

2016-04-07 Thread David Froger
You're right, thanks a lot Edgar!

Quoting Edgar Gabriel (2016-04-07 23:18:46)
> I found the bug in your code. The displacement of the file view has to 
> be given in absolute bytes, not in multiples of etypes.
> ---snip---
>   The disp displacement argument specifies the position (absolute offset
> in bytes from the beginning of the file) where the view begins.
> ---snip---
> 
> If you change your code to
> 
> MPI_File_set_view(f, i0*sizeof(double), MPI_DOUBLE, MPI_DOUBLE, "native",
>                   MPI_INFO_NULL);
> 
> you'll get the correct answer.
> 
> Edgar
> 
> On 4/7/2016 2:25 PM, david.froger...@mailoo.org wrote:
> >> What version of Open MPI did you execute your test with?
> > mpirun (Open MPI) 1.8.6
>
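
For reference, here is a compact sketch of the corrected reader
following Edgar's explanation; the only functional change from the
original program is that the displacement passed to MPI_File_set_view()
is given in bytes (i0*sizeof(double)). The MPI_Offset cast and the
comments are additions here, not something posted in the thread:

    #include <stdio.h>
    #include <stdlib.h>
    #include "mpi.h"

    #define N 10

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Each rank reads its contiguous slice [i0, i1) of the N doubles. */
        int i0 = N * rank / size;
        int i1 = N * (rank + 1) / size;
        double *data = malloc((i1 - i0) * sizeof(double));

        MPI_File f;
        MPI_Status status;
        MPI_File_open(MPI_COMM_WORLD, "data.bin", MPI_MODE_RDONLY,
                      MPI_INFO_NULL, &f);

        /* The disp argument is an absolute byte offset into the file. */
        MPI_File_set_view(f, (MPI_Offset)i0 * sizeof(double), MPI_DOUBLE,
                          MPI_DOUBLE, "native", MPI_INFO_NULL);
        MPI_File_read(f, data, i1 - i0, MPI_DOUBLE, &status);

        for (int i = 0; i < i1 - i0; i++)
            printf("rank: %d index: %d value: %.2f\n", rank, i, data[i]);

        MPI_File_close(&f);
        free(data);
        MPI_Finalize();
        return 0;
    }

An equivalent alternative, not discussed in the thread, is to skip the
file view entirely and pass the same byte offset to MPI_File_read_at().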