Re: [OMPI users] problems with establishing an intercommunicator

Jeff Squyres Wed, 9 Mar 2011 10:44:24 -0500

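The MPI_Comm_connect and MPI_Comm_accept calls are collective over the entire 
communicator that you pass to them.

So if you pass MPI_COMM_WORLD into MPI_Comm_connect/accept, then *all* 
processes in those respective MPI_COMM_WORLDs need to call 
MPI_Comm_connect/accept, not just rank 0.  In your code only rank 0 makes the 
call, so with more than one process on either side it blocks waiting for the 
other ranks.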

For your 2nd question: once you get this to work, any process in one group can 
send directly to any process in the other -- Open MPI doesn't currently have 
any "routing" capabilities (e.g., sending through some other process to get to 
a 3rd process).

On Mar 8, 2011, at 9:40 PM, Waclaw Kusnierczyk wrote:

> Hello,
> 
> I'm trying to connect two independent MPI process groups with an 
> intercommunicator, using ports, as described in sec. 10.4 of the MPI 
> standard.  One group runs a server, the other a client.  The server opens a 
> port, publishes the port's name, and waits for a connection.  The client 
> obtains the port's name, and connects to it.  The problem is that the code 
> works only if the server and the client each run as a one-process MPI group.  
> If either MPI group has more than one process, the program hangs.
> 
> The following are two fragments of a minimal code example reproducing the 
> problem on my machine.  The server:
> 
>    if (rank == 0) {
>        MPI_Open_port(MPI_INFO_NULL, port);
>        int fifo = open(argv[1], O_WRONLY);
>        write(fifo, port, MPI_MAX_PORT_NAME);
>        close(fifo);
>        printf("[server] listening on port '%s'\n", port);
>        MPI_Comm_accept(port, MPI_INFO_NULL, 0, this, &that);
>        printf("[server] connected\n");
>        MPI_Close_port(port); }
>    MPI_Barrier(this);
> 
> and the client:
> 
>    if (rank == 0) {
>        int fifo = open(buffer, O_RDONLY);
>        read(fifo, port, MPI_MAX_PORT_NAME);
>        close(fifo);
>        printf("[client] connecting to port '%s'\n", port);
>        MPI_Comm_connect(port, MPI_INFO_NULL, 0, this, &that);
>        printf("[client] connected\n"); }
>    MPI_Barrier(this);
> 
> where 'this' is the local MPI_COMM_WORLD, and the port name is transmitted 
> via a named pipe.  (Complete code together with a makefile is attached for 
> reference.)
> 
> When the compiled codes are run on one MPI process each:
> 
>    mkfifo port
>    mpirun -np 1 ./server port &
>    mpirun -np 1 ./client port
> 
> the connection is established as expected.  With more than one process on 
> either side, however, the execution blocks at the connect-accept step (i.e., 
> after the 'listening' and 'connecting' messages are printed, but before the 
> 'connected' messages are); using the attached code,
> 
>    make NS=2 run
> 
> or
> 
>    make NC=2 run
> 
> should reproduce the problem.
> 
> I'm using Open MPI on two different machines: 1.4 on a 2-core laptop, and 
> 1.3.3 on a large supercomputer, with the same problem on both.  Where am I 
> going wrong?
> 
> One more, related question:  once I manage to establish an intercommunicator 
> for two multi-process MPI groups, can any process in one group send a message 
> to any process in the other, directly, or does the communication have to go 
> through the root nodes?
> 
> Regards,
> Wacek
> 
> <rendezvous.tgz>


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
