What version of OMPI are you using?

> On Nov 3, 2017, at 7:48 AM, Florian Lindner <mailingli...@xgm.de> wrote:
> 
> Hello,
> 
> I'm working on a sample program to connect two MPI communicators launched 
> with mpirun using Ports.
> 
> Firstly, I use MPI_Open_port to obtain a name and write that to a file:
> 
>  if (options.participant == A) { // A publishes the port
>    if (options.commType == single and rank == 0)
>      openPublishPort(options);
> 
>    if (options.commType == many)
>      openPublishPort(options);
>  }
>  MPI_Barrier(MPI_COMM_WORLD);
> 
> participant is a command line argument and defines the role of A as server. B 
> is the client.
> 
> void openPublishPort(Options options)
> {
>  using namespace boost::filesystem;
>  int rank;
>  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> 
>  char p[MPI_MAX_PORT_NAME];
>  MPI_Open_port(MPI_INFO_NULL, p);
>  std::string portName(p);
> 
>  create_directory(options.publishDirectory);
>  std::string filename;
>  if (options.commType == many)
>    filename = "A-" + std::to_string(rank) + ".address";
>  if (options.commType == single)
>    filename = "intercomm.address";
> 
>  auto path = options.publishDirectory / filename;
>  DEBUG << "Writing address " << portName << " to " << path;
>  std::ofstream ofs(path.string(), std::ofstream::out);
>  ofs << portName;
> }
> 
> This works fine as far as I see. Next, I try to connect:
> 
>  MPI_Comm icomm;
>  std::string portName;
>  if (options.participant == A) { // receives connections
>    if (options.commType == single) {
>      if (rank == 0)
>        portName = readPort(options);
>      INFO << "Accepting connection on " << portName;
>      MPI_Comm_accept(portName.c_str(), MPI_INFO_NULL, 0, MPI_COMM_WORLD, &icomm);
>      INFO << "Received connection";
>    }
>  }
> 
>  if (options.participant == B) { // connects to the intercomms
>    if (options.commType == single) {
>      if (rank == 0)
>        portName = readPort(options);
>      INFO << "Trying to connect to " << portName;
>      MPI_Comm_connect(portName.c_str(), MPI_INFO_NULL, 0, MPI_COMM_WORLD, &icomm);
>      INFO << "Connected";
>    }
>  }
> 
> 
> options.commType == single means that I want to use a single communicator 
> that contains all ranks of both participants, A and B.
> readPort reads the port name from the file that was written before.
> 
> Now, when I first launch A and, in another terminal, B, nothing happens until 
> a timeout occurs.
> 
> % mpirun -n 1 ./mpiports --commType="single" --participant="A"
> [2017-11-03 15:29:55.469891] [debug]   Writing address 3048013825.0:1069313090 to "./publish/intercomm.address"
> [2017-11-03 15:29:55.470169] [debug]   Read address 3048013825.0:1069313090 from "./publish/intercomm.address"
> [2017-11-03 15:29:55.470185] [info]    Accepting connection on 3048013825.0:1069313090
> [asaru:16199] OPAL ERROR: Timeout in file base/pmix_base_fns.c at line 195
> [...]
> 
> and on the other side:
> 
> % mpirun -n 1 ./mpiports --commType="single" --participant="B"
> [2017-11-03 15:29:59.698921] [debug]   Read address 3048013825.0:1069313090 from "./publish/intercomm.address"
> [2017-11-03 15:29:59.698947] [info]    Trying to connect to 3048013825.0:1069313090
> [asaru:16238] OPAL ERROR: Timeout in file base/pmix_base_fns.c at line 195
> [...]
> 
> The complete code, including the CMake build script, can be downloaded at:
> 
> https://www.dropbox.com/s/azo5ti4kjg12zjy/MPI_Ports.tar.gz?dl=0
> 
> Why is the connection not working?
> 
> Thanks a lot,
> Florian
> 
> 
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users
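
For reference, the quoted message says that readPort reads the port name back from the file written by openPublishPort, but readPort itself is not shown. A minimal sketch of that write/read round trip might look like the following. The names writePortFile and readPortFile are hypothetical and not taken from the attached code, and the sketch assumes the port name is stored as a single line of plain text:

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the original code): store the port name
// obtained from MPI_Open_port in a file, replacing any previous content.
void writePortFile(const std::string& path, const std::string& portName)
{
  std::ofstream ofs(path, std::ofstream::out | std::ofstream::trunc);
  if (!ofs)
    throw std::runtime_error("cannot open " + path + " for writing");
  ofs << portName;
}

// Hypothetical helper (not from the original code): read the port name
// back, assuming it occupies the first line of the file.
std::string readPortFile(const std::string& path)
{
  std::ifstream ifs(path);
  if (!ifs)
    throw std::runtime_error("cannot open " + path + " for reading");
  std::string portName;
  std::getline(ifs, portName);
  return portName;
}
```

Throwing on a missing file makes it obvious when B is started before A has published its address. The logs above show the same address on both sides, so this step appears to work in the reported run.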
