The MPI_Comm_connect and MPI_Comm_accept calls are collective over their entire communicators.
So if you pass MPI_COMM_WORLD into MPI_Comm_connect/accept, then *all* processes in those respective MPI_COMM_WORLD's need to call MPI_Comm_connect/accept.

For your 2nd question, when you get this to work, then all processes can send directly to each other -- Open MPI doesn't currently have any "routing" capabilities (e.g., sending through some other process to get to a 3rd process).

On Mar 8, 2011, at 9:40 PM, Waclaw Kusnierczyk wrote:

> Hello,
>
> I'm trying to connect two independent MPI process groups with an
> intercommunicator, using ports, as described in sec. 10.4 of the MPI
> standard. One group runs a server, the other a client. The server opens a
> port, publishes the port's name, and waits for a connection. The client
> obtains the port's name, and connects to it. The problem is, the code works
> if both the server and the client are run in a one-process MPI group each.
> If any of the MPI groups has more than one process, the program hangs.
>
> The following are two fragments of a minimal code example reproducing the
> problem on my machine. The server:
>
>   if (rank == 0) {
>       MPI_Open_port(MPI_INFO_NULL, port);
>       int fifo = open(argv[1], O_WRONLY);
>       write(fifo, port, MPI_MAX_PORT_NAME);
>       close(fifo);
>       printf("[server] listening on port '%s'\n", port);
>       MPI_Comm_accept(port, MPI_INFO_NULL, 0, this, &that);
>       printf("[server] connected\n");
>       MPI_Close_port(port);
>   }
>   MPI_Barrier(this);
>
> and the client:
>
>   if (rank == 0) {
>       int fifo = open(buffer, O_RDONLY);
>       read(fifo, port, MPI_MAX_PORT_NAME);
>       close(fifo);
>       printf("[client] connecting to port '%s'\n", port);
>       MPI_Comm_connect(port, MPI_INFO_NULL, 0, this, &that);
>       printf("[client] connected\n");
>   }
>   MPI_Barrier(this);
>
> where 'this' is the local MPI_COMM_WORLD, and the port name is transmitted
> via a named pipe. (Complete code together with a makefile is attached for
> reference.)
>
> When the compiled codes are run on one MPI process each:
>
>   mkfifo port
>   mpirun -np 1 ./server port &
>   mpirun -np 1 ./client port
>
> the connection is established as expected. With more than one process on
> either side, however, the execution blocks at the connect-accept step (i.e.,
> after the 'listening' and 'connecting' messages are printed, but before the
> 'connected' messages are); using the attached code,
>
>   make NS=2 run
>
> or
>
>   make NC=2 run
>
> should reproduce the problem.
>
> I'm using OpenMPI on two different machines: 1.4 on a 2-core laptop, and
> 1.3.3 on a large supercomputer, having the same problem on both. Where do I
> go wrong?
>
> One more, related question: once I manage to establish an intercommunicator
> for two multi-process MPI groups, can any process in one group send a message
> to any process in the other, directly, or does the communication have to go
> through the root nodes?
>
> Regards,
> Wacek
>
> <rendezvous.tgz>

-- 
Jeff Squyres
jsquy...@cisco.com