I have run into a freeze / potential bug when using MPI_Comm_accept in a simple client / server implementation. I have attached the two simplest programs I could produce that reproduce it:

1. mpi-receiver.c opens a port using MPI_Open_port and saves the port name to a file

2. mpi-receiver enters an infinite loop and waits for connections using MPI_Comm_accept

3. mpi-sender.c connects to that port using MPI_Comm_connect, sends one MPI_UNSIGNED_LONG, calls MPI_Barrier and disconnects using MPI_Comm_disconnect

4. mpi-receiver reads the MPI_UNSIGNED_LONG, prints it, calls MPI_Barrier, disconnects using MPI_Comm_disconnect and goes back to step 2 (the infinite loop)

Everything works fine, but only exactly 5 times. After that, the receiver returns from MPI_Comm_accept and then hangs in MPI_Recv. This is 100% repeatable. I have tried the same programs with Intel MPI and the problem does not occur.

I execute the programs using OpenMPI 1.10 as follows:

mpirun -np 1 --mca mpi_leave_pinned 0 ./mpi-receiver


Do you have any clue what the reason could be? Am I doing something wrong, or is it a problem with the internal state of OpenMPI?

Thanks a lot!

Marcin

mpi-receiver.c:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
  MPI_Info info;
  char port_name[MPI_MAX_PORT_NAME];
  MPI_Comm intercomm;

  MPI_Init(&argc, &argv);
  MPI_Info_create(&info);
  MPI_Open_port(info, port_name);
  printf("port name: %s\n", port_name);

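  /* write port name to file */
  {
    FILE *fd;
    fd = fopen("port.txt", "w+");
    if (fd == NULL) {
      /* without the port file no sender can connect, so give up */
      perror("port.txt");
      MPI_Abort(MPI_COMM_WORLD, 1);
    }
    fprintf(fd, "%s", port_name);
    fclose(fd);
  }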

  /* accept connections */
  while(1){
    unsigned long data;

    /* accept connection */
    MPI_Comm_accept(port_name, info, 0, MPI_COMM_WORLD, &intercomm);

    /* receive one unsigned long from rank 0 of the sender */
    MPI_Recv(&data, 1, MPI_UNSIGNED_LONG, 0, 1, intercomm, MPI_STATUS_IGNORE);
    printf("received data: %lx\n", data);

    MPI_Barrier(intercomm);
    MPI_Comm_disconnect(&intercomm);
    printf("client disconnected\n");   
  }
}

mpi-sender.c:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[])
{        
  char port_name[MPI_MAX_PORT_NAME+1];
  MPI_Info info;
  MPI_Comm intercomm;
  unsigned long data = 0x12345678;

  /* initialize MPI */
  MPI_Init(&argc, &argv);
  MPI_Info_create(&info);

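  /* the port name is passed as the first command-line argument */
  if (argc < 2) {
    fprintf(stderr, "usage: %s <port_name>\n", argv[0]);
    MPI_Abort(MPI_COMM_WORLD, 1);
  }
  /* bounded copy: port_name holds MPI_MAX_PORT_NAME+1 bytes */
  strncpy(port_name, argv[1], MPI_MAX_PORT_NAME);
  port_name[MPI_MAX_PORT_NAME] = '\0';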

  /* connect to server - intercomm is the remote communicator */
  MPI_Comm_connect(port_name, info, 0, MPI_COMM_WORLD, &intercomm);
  printf("** connected\n");

  /* send data */
  MPI_Send(&data, 1, MPI_UNSIGNED_LONG, 0, 1, intercomm);
  MPI_Barrier(intercomm);

  /* disconnect */
  MPI_Comm_disconnect(&intercomm);
  MPI_Finalize();
  printf("** disconnected\n");

  return 0;
}
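
The receiver saves the port name to port.txt, while the sender above takes it from argv[1]. Below is a minimal sketch of reading the name back from that file instead. The helper read_port_name and the buffer size PORT_NAME_MAX are hypothetical and not part of the attached programs; in a real sender the buffer would be sized with MPI_MAX_PORT_NAME.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for MPI_MAX_PORT_NAME so the sketch builds without MPI. */
#define PORT_NAME_MAX 1024

/* Read the first line of `path` into `buf` (at most len - 1 characters).
 * Returns 0 on success, -1 if the file cannot be opened or holds no name. */
int read_port_name(const char *path, char *buf, size_t len)
{
  FILE *fp = fopen(path, "r");
  if (fp == NULL)
    return -1;

  if (fgets(buf, (int)len, fp) == NULL) {
    fclose(fp);
    return -1;
  }
  fclose(fp);

  /* strip a trailing newline, if the writer added one */
  buf[strcspn(buf, "\r\n")] = '\0';
  return buf[0] == '\0' ? -1 : 0;
}
```

In the sender, a call such as read_port_name("port.txt", port_name, sizeof port_name) could then replace the copy from argv[1], provided both programs run in the same working directory.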
