All,

We employ the following pattern to send signals between processes:

```
int com_rank, root = 0;
int res = 0, val = 1;
// allocate MPI window (a passive-target epoch is assumed to be open)
MPI_Win win = allocate_win();
// do some computation
...
// Process 0 waits for a signal
if (com_rank == root) {
  do {
    MPI_Fetch_and_op(NULL, &res,
      MPI_INT, com_rank, 0, MPI_NO_OP, win);
    MPI_Win_flush_local(com_rank, win);
  } while (res == 0);
} else {
  // all other processes signal the root
  MPI_Accumulate(&val, 1, MPI_INT,
    root, 0, 1, MPI_INT, MPI_SUM, win);
  MPI_Win_flush(root, win);
}
[...]
```

We use MPI_Fetch_and_op with MPI_NO_OP to atomically read the signal flag from the root's local window memory, and MPI_Accumulate with MPI_SUM to send the signal (I have omitted the reset and other details for simplicity).
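
For completeness, here is a minimal self-contained sketch of the pattern (assuming allocate_win() boils down to MPI_Win_allocate of a single zero-initialized int plus MPI_Win_lock_all to open the passive-target epoch; the reset is still omitted):

```
#include <mpi.h>

int main(int argc, char **argv)
{
  int com_rank, root = 0;
  int *baseptr;
  MPI_Win win;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &com_rank);

  // one-int window; the signal flag lives on the root
  MPI_Win_allocate(sizeof(int), sizeof(int), MPI_INFO_NULL,
                   MPI_COMM_WORLD, &baseptr, &win);
  *baseptr = 0;                 // clear the flag before anyone signals
  MPI_Win_lock_all(0, win);     // open a passive-target epoch on all ranks
  MPI_Barrier(MPI_COMM_WORLD);  // ensure the cleared flag is visible

  if (com_rank == root) {
    int res = 0;
    do {                        // poll the local flag atomically
      MPI_Fetch_and_op(NULL, &res, MPI_INT, com_rank, 0, MPI_NO_OP, win);
      MPI_Win_flush_local(com_rank, win);
    } while (res == 0);
  } else {
    int val = 1;                // raise the flag on the root
    MPI_Accumulate(&val, 1, MPI_INT, root, 0, 1, MPI_INT, MPI_SUM, win);
    MPI_Win_flush(root, win);
  }

  MPI_Win_unlock_all(win);
  MPI_Win_free(&win);
  MPI_Finalize();
  return 0;
}
```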

When running on a single node (my laptop), this code snippet reproducibly hangs: the root process spins indefinitely in the do-while loop while all other processes are stuck in MPI_Win_flush.

An interesting observation is that if I replace the MPI_Win_flush_local with MPI_Win_flush, the application does not hang. However, my understanding is that a local flush should be sufficient for MPI_Fetch_and_op with MPI_NO_OP, since remote completion is not required for a read.
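
That is, the polling loop on the root becomes:

```
do {
  MPI_Fetch_and_op(NULL, &res,
    MPI_INT, com_rank, 0, MPI_NO_OP, win);
  MPI_Win_flush(com_rank, win); // full flush instead of flush_local
} while (res == 0);
```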

I do not observe this hang with MPICH 3.2, and I am aware that the progress semantics of MPI are rather vague. Still, I am curious: is this difference in behavior intended, and should repeatedly calling into MPI communication functions (that do not block) provide progress for incoming RMA operations?
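
For illustration, a hypothetical variant of the polling loop that additionally calls the non-blocking MPI_Iprobe on each iteration, purely to drive the progress engine (not something our code currently does):

```
int flag;
do {
  // non-blocking call into MPI, intended only to trigger progress
  MPI_Iprobe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD,
             &flag, MPI_STATUS_IGNORE);
  MPI_Fetch_and_op(NULL, &res, MPI_INT, com_rank, 0, MPI_NO_OP, win);
  MPI_Win_flush_local(com_rank, win);
} while (res == 0);
```

My question is whether such calls (or the repeated MPI_Fetch_and_op itself) are supposed to progress incoming RMA operations.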

Any input is much appreciated.

Cheers
Joseph