Try putting an "MPI_Barrier()" call before your MPI_Finalize() [*]. I suspect that one of the programs (the sending side) is calling Finalize before the receiving side has processed the messages.
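[Editor's note: a minimal sketch, not code from the thread, combining the two suggestions made below: complete both nonblocking requests with a single MPI_Waitall() instead of two ordered MPI_Wait() calls, and synchronize with MPI_Barrier() before MPI_Finalize(). It assumes a 2-rank run and a hypothetical one-integer exchange; run with e.g. `mpirun -np 2 ./a.out`.]

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int myrank, buf = 0, mesg;
    MPI_Request reqs[2];
    MPI_Status  stats[2];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    int peer = 1 - myrank;   /* assumes exactly 2 ranks */
    mesg = myrank + 1;

    /* Post the receive and the send, then wait on BOTH requests at
     * once so MPI is free to progress them in either order. */
    MPI_Irecv(&buf,  1, MPI_INT, peer, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(&mesg, 1, MPI_INT, peer, 0, MPI_COMM_WORLD, &reqs[1]);
    MPI_Waitall(2, reqs, stats);

    printf("myrank=%d, buf=%d\n", myrank, buf);

    /* Keep any rank from entering MPI_Finalize() until every rank
     * has finished its communication. */
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();
    return 0;
}
```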
-bill

[*] A pet peeve of mine: this should almost always be standard practice.

> -----Original Message-----
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
> Behalf Of Xianglong Kong
> Sent: Tuesday, February 22, 2011 10:27 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] Beginner's question: why multiple sends or
> receives don't work?
>
> Hi,
>
> Thank you for the reply.
>
> However, using MPI_Waitall instead of MPI_Wait didn't solve the
> problem; the code still hangs at the MPI_Waitall. Also, I don't quite
> understand why the code is inherently unsafe. Can non-blocking sends
> or receives cause a deadlock?
>
> Thanks!
>
> Kong
>
> On Mon, Feb 21, 2011 at 2:32 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
> > It's because you're waiting on the receive request to complete before
> > the send request. This likely works locally because the message
> > transfer is through shared memory and is fast, but it's still an
> > inherently unsafe way to block waiting for completion (i.e., the
> > receive might not complete if the send does not complete).
> >
> > What you probably want to do is build an array of 2 requests and then
> > issue a single MPI_Waitall() on both of them. This will allow MPI to
> > progress both requests simultaneously.
> >
> > On Feb 18, 2011, at 11:58 AM, Xianglong Kong wrote:
> >
> >> Hi, all,
> >>
> >> I'm an MPI newbie. I'm trying to connect two desktops in my office
> >> to each other with a crossover cable and run a parallel code on them
> >> using MPI.
> >>
> >> Now the two nodes can ssh to each other without a password, and can
> >> successfully run the MPI "Hello world" code. However, when I tried to
> >> use multiple MPI non-blocking sends or receives, the job would hang.
> >> The problem only shows up if the two processes are launched on
> >> different nodes; the code runs successfully if both processes are
> >> launched on the same node.
> >> Also, the code runs successfully if there is only one send and/or
> >> one receive in each process.
> >>
> >> Here is the code that runs successfully:
> >>
> >> #include <stdlib.h>
> >> #include <stdio.h>
> >> #include <string.h>
> >> #include <mpi.h>
> >>
> >> int main(int argc, char** argv) {
> >>
> >>     int myrank, nprocs;
> >>
> >>     MPI_Init(&argc, &argv);
> >>     MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
> >>     MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
> >>
> >>     printf("Hello from processor %d of %d\n", myrank, nprocs);
> >>
> >>     MPI_Request reqs1, reqs2;
> >>     MPI_Status stats1, stats2;
> >>
> >>     int tag1 = 10;
> >>     int tag2 = 11;
> >>
> >>     int buf;
> >>     int mesg;
> >>     int source = 1 - myrank;
> >>     int dest = 1 - myrank;
> >>
> >>     if (myrank == 0)
> >>     {
> >>         mesg = 1;
> >>
> >>         MPI_Irecv(&buf, 1, MPI_INT, source, tag1, MPI_COMM_WORLD, &reqs1);
> >>         MPI_Isend(&mesg, 1, MPI_INT, dest, tag2, MPI_COMM_WORLD, &reqs2);
> >>     }
> >>
> >>     if (myrank == 1)
> >>     {
> >>         mesg = 2;
> >>
> >>         MPI_Irecv(&buf, 1, MPI_INT, source, tag2, MPI_COMM_WORLD, &reqs1);
> >>         MPI_Isend(&mesg, 1, MPI_INT, dest, tag1, MPI_COMM_WORLD, &reqs2);
> >>     }
> >>
> >>     MPI_Wait(&reqs1, &stats1);
> >>     printf("myrank=%d, received the message\n", myrank);
> >>
> >>     MPI_Wait(&reqs2, &stats2);
> >>     printf("myrank=%d, sent the message\n", myrank);
> >>
> >>     printf("myrank=%d, buf=%d\n", myrank, buf);
> >>
> >>     MPI_Finalize();
> >>     return 0;
> >> }
> >>
> >> And here is the code that hangs:
> >>
> >> #include <stdlib.h>
> >> #include <stdio.h>
> >> #include <string.h>
> >> #include <mpi.h>
> >>
> >> int main(int argc, char** argv) {
> >>
> >>     int myrank, nprocs;
> >>
> >>     MPI_Init(&argc, &argv);
> >>     MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
> >>     MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
> >>
> >>     printf("Hello from processor %d of %d\n", myrank, nprocs);
> >>
> >>     MPI_Request reqs1, reqs2;
> >>     MPI_Status stats1, stats2;
> >>
> >>     int tag1 = 10;
> >>     int tag2 = 11;
> >>
> >>     int source = 1 - myrank;
> >>     int dest = 1 - myrank;
> >>
> >>     if (myrank == 0)
> >>     {
> >>         int buf1, buf2;
> >>
> >>         MPI_Irecv(&buf1, 1, MPI_INT, source, tag1, MPI_COMM_WORLD, &reqs1);
> >>         MPI_Irecv(&buf2, 1, MPI_INT, source, tag2, MPI_COMM_WORLD, &reqs2);
> >>
> >>         MPI_Wait(&reqs1, &stats1);
> >>         printf("received one message\n");
> >>
> >>         MPI_Wait(&reqs2, &stats2);
> >>         printf("received two messages\n");
> >>
> >>         printf("myrank=%d, buf1=%d, buf2=%d\n", myrank, buf1, buf2);
> >>     }
> >>
> >>     if (myrank == 1)
> >>     {
> >>         int mesg1 = 1;
> >>         int mesg2 = 2;
> >>
> >>         MPI_Isend(&mesg1, 1, MPI_INT, dest, tag1, MPI_COMM_WORLD, &reqs1);
> >>         MPI_Isend(&mesg2, 1, MPI_INT, dest, tag2, MPI_COMM_WORLD, &reqs2);
> >>
> >>         MPI_Wait(&reqs1, &stats1);
> >>         printf("sent one message\n");
> >>
> >>         MPI_Wait(&reqs2, &stats2);
> >>         printf("sent two messages\n");
> >>     }
> >>
> >>     MPI_Finalize();
> >>     return 0;
> >> }
> >>
> >> And here is the output of the second (failing) code:
> >> ***********************************************
> >> Hello from processor 0 of 2
> >>
> >> received one message
> >>
> >> Hello from processor 1 of 2
> >>
> >> sent one message
> >> *******************************************************
> >>
> >> Can anyone help point out why the second code didn't work?
> >>
> >> Thanks!
> >>
> >> Kong
> >>
> >> _______________________________________________
> >> users mailing list
> >> us...@open-mpi.org
> >> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> > --
> > Jeff Squyres
> > jsquy...@cisco.com
> > For corporate legal information go to:
> > http://www.cisco.com/web/about/doing_business/legal/cri/
>
> --
> Xianglong Kong
> Department of Mechanical Engineering
> University of Rochester
> Phone: (585)520-4412
> MSN: dinosaur8...@hotmail.com