[OMPI users] Shared Windows and MPI_Accumulate

2017-03-01 Thread Joseph Schuchart

Hi all,

We are seeing issues in one of our applications in which processes in a 
shared-memory communicator allocate a shared MPI window and concurrently 
execute MPI_Accumulate on it to iteratively update each process's 
values. The test boils down to the sample code attached below. Sample 
output is as follows:


```
$ mpirun -n 4 ./mpi_shared_accumulate
[1] baseptr[0]: 1010 (expected 1010)
[1] baseptr[1]: 1011 (expected 1011)
[1] baseptr[2]: 1012 (expected 1012)
[1] baseptr[3]: 1013 (expected 1013)
[1] baseptr[4]: 1014 (expected 1014)
[2] baseptr[0]: 1005 (expected 1010) [!!!]
[2] baseptr[1]: 1006 (expected 1011) [!!!]
[2] baseptr[2]: 1007 (expected 1012) [!!!]
[2] baseptr[3]: 1008 (expected 1013) [!!!]
[2] baseptr[4]: 1009 (expected 1014) [!!!]
[3] baseptr[0]: 1010 (expected 1010)
[0] baseptr[0]: 1010 (expected 1010)
[0] baseptr[1]: 1011 (expected 1011)
[0] baseptr[2]: 1012 (expected 1012)
[0] baseptr[3]: 1013 (expected 1013)
[0] baseptr[4]: 1014 (expected 1014)
[3] baseptr[1]: 1011 (expected 1011)
[3] baseptr[2]: 1012 (expected 1012)
[3] baseptr[3]: 1013 (expected 1013)
[3] baseptr[4]: 1014 (expected 1014)
```

Each process should hold the same values, but sometimes (not on every 
execution) some processes diverge (marked with [!!!] above).
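
For reference, with 4 processes each rank contributes its rank + 1 to every 
element, so by the reproducer's formula the expected value of element i on 
every process is:

```
expected = 1000 + i + (1 + 2 + 3 + 4) = 1010 + i
```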


I made the following observations:

1) The issue occurs with both Open MPI 1.10.6 and 2.0.2 but not with 
MPICH 3.2.
2) The issue occurs only if the window is allocated through 
MPI_Win_allocate_shared; using MPI_Win_allocate instead works fine (see 
the sketch below).
3) The code assumes that MPI_Accumulate atomically updates individual 
elements (please correct me if that is not covered by the MPI standard).
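
For completeness, here is a minimal sketch of the working variant from 
observation 2); it assumes the same variable names (local_elem_count, 
sharedmem_comm, baseptr, win) as in the attached reproducer, and only the 
allocation call changes:

```
/* Sketch of the workaround from observation 2): allocate a regular
 * (non-shared) window. Variable names follow the attached reproducer. */
MPI_Win_allocate(
    local_elem_count * sizeof(int),  /* window size in bytes */
    sizeof(int),                     /* displacement unit */
    MPI_INFO_NULL,                   /* alloc_shared_noncontig applies only to shared windows */
    sharedmem_comm,
    &baseptr,
    &win);
```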


Both Open MPI and the example code were compiled using GCC 5.4.1 and run 
on a Linux system (single node). Open MPI was configured with 
--enable-mpi-thread-multiple and --with-threads, but the application is 
not multi-threaded. Please let me know if you need any other information.


Cheers
Joseph

--
Dipl.-Inf. Joseph Schuchart
High Performance Computing Center Stuttgart (HLRS)
Nobelstr. 19
D-70569 Stuttgart

Tel.: +49(0)711-68565890
Fax: +49(0)711-6856832
E-Mail: schuch...@hlrs.de

```
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int *baseptr;
    MPI_Win win;
    int local_elem_count = 5;
    int comm_size;
    int comm_rank;

    int *local_elem_buf = malloc(sizeof(int) * local_elem_count);

    MPI_Comm sharedmem_comm;
    MPI_Comm_split_type(
        MPI_COMM_WORLD,
        MPI_COMM_TYPE_SHARED,
        1,
        MPI_INFO_NULL,
        &sharedmem_comm);

    MPI_Comm_size(sharedmem_comm, &comm_size);
    MPI_Comm_rank(sharedmem_comm, &comm_rank);

    // each rank contributes (rank + 1) to every element
    for (int i = 0; i < local_elem_count; ++i) {
        local_elem_buf[i] = comm_rank + 1;
    }

    MPI_Info win_info;
    MPI_Info_create(&win_info);
    MPI_Info_set(win_info, "alloc_shared_noncontig", "true");

    // NOTE: Using MPI_Win_allocate here works as expected
    MPI_Win_allocate_shared(
        local_elem_count * sizeof(int),
        sizeof(int),
        win_info,
        sharedmem_comm,
        &baseptr,
        &win);
    MPI_Info_free(&win_info);

    MPI_Win_lock_all(0, win);

    // initialize local window memory
    int loffs = 0;
    for (int i = 0; i < local_elem_count; ++i) {
        baseptr[i] = 1000 + loffs++;
    }

    // accumulate into each process' local memory
    for (int i = 0; i < comm_size; i++) {
        MPI_Accumulate(
            local_elem_buf,
            local_elem_count,
            MPI_INT,
            i, 0,
            local_elem_count,
            MPI_INT,
            MPI_SUM,
            win);
        MPI_Win_flush(i, win);
    }

    // wait for completion of all processes
    MPI_Barrier(sharedmem_comm);

    // print local elements
    for (int i = 0; i < local_elem_count; ++i) {
        int expected = (1000 + i) +
                       ((comm_size * (comm_size + 1)) / 2);
        printf("[%i] baseptr[%i]: %i (expected %i)%s\n",
               comm_rank, i, baseptr[i], expected,
               (baseptr[i] != expected) ? " [!!!]" : "");
    }

    MPI_Win_unlock_all(win);

    MPI_Win_free(&win);

    free(local_elem_buf);

    MPI_Finalize();

    return 0;
}
```
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Shared Windows and MPI_Accumulate

2017-03-01 Thread Howard Pritchard
Hi Joseph,

I built this test with craypich (Cray MPICH) and it passed.  I also tried
with Open MPI master and the test passed.  I also tried with 2.0.2
and can't seem to reproduce the problem on my system.

Could you post the output of config.log?

Also, how intermittent is the problem?


Thanks,

Howard



