Dear Gilles-san and all,

I thought MPI_Isend kept the sent data queued up somewhere until the corresponding MPI_Irecv was posted. The MPI-related flow of my code is:

1. send ALL tagged messages to the other node (MPI_Isend) in the master thread, then launch the worker threads,
2. receive the corresponding tagged messages from the other node (MPI_Irecv) in the worker threads, and
3. run the simulation.
Doesn't that work? How silly I was. I coded several sample programs but I couldn't find the problem.

So, should I understand that both MPI_Send/Recv and MPI_Isend/Irecv must be matched sequentially, just like PUSH/POP on a stack? With my simulation algorithm the order of sends and receives cannot be sequential by default. In that case, how should I structure the MPI messaging? Should the messages be sent to the destination node in a fixed order first?

Thank you in advance for your suggestions.

Sincerely,
Hiroshi ABE

On 2015/11/04 18:10, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:

> Abe-san,
>
> MPI_Isend followed by MPI_Wait is equivalent to MPI_Send.
>
> Depending on message size and in-flight messages, that can deadlock if two
> tasks send to each other and no recv has been posted.
>
> Cheers,
>
> Gilles
>
> ABE Hiroshi <hab...@gmail.com> wrote:
>> Dear All,
>>
>> I installed openmpi 1.10.0 and gcc-5.2 using Fink (http://www.finkproject.org),
>> but nothing changed with my code.
>>
>> Regarding the MPI_Finalize error in my previous mail, it was my fault. I had
>> manually removed all MPI files from /usr/local/, and after openmpi-1.10.0 was
>> installed the error message no longer appeared. Probably some files from an
>> old openmpi version were still left there.
>>
>> Anyway, I found the cause of my problem. The code is:
>>
>> void
>> Block::MPISendEqualInterChangeData( DIRECTION dir, int rank, int id )
>> {
>>     GetEqualInterChangeData( dir, cf[0] );
>>
>>     int N  = GetNumGrid();
>>     int nb = 6*N*N*1;
>>     nb = 1010;
>>     // float *buf = new float[ nb ];
>>     float *buf = (float *)malloc( sizeof(float)*nb );
>>     for( int i = 0; i < nb; i++ ) buf[i] = 0.0;
>>
>>     MPI_Request req;
>>     MPI_Status  status;
>>
>>     int tag = 100 * id + (int)dir;
>>
>>     MPI_Isend( buf, nb, MPI_REAL4, rank, tag, MPI_COMM_WORLD, &req );
>>     MPI_Wait( &req, &status );
>>
>>     // delete [] buf;
>>     free( buf );
>> }
>>
>> This works.
>> If the "nb" value is increased beyond 1010, MPI_Wait stalls. This suggests
>> an upper limit for MPI_Isend of 4 x 1010 = 4040 bytes.
>>
>> If this is true, is there any way to increase it? I suspect this should not
>> be so, and that something is wrong with my system.
>>
>> Any ideas and suggestions are really appreciated.
>>
>> Thank you.
>>
>> On 2015/11/03 8:05, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
>>
>>> On Oct 29, 2015, at 10:24 PM, ABE Hiroshi <hab...@gmail.com> wrote:
>>>>
>>>> Regarding the code I mentioned in my original mail, the behaviour is very
>>>> weird: when MPI_Isend is called from a differently named function, it works.
>>>> I also wrote a sample program to try to reproduce my problem, but it works
>>>> fine, except for the MPI_Finalize problem.
>>>>
>>>> So I decided to build gcc-5.2 and build openmpi with it, which seems to be
>>>> the recommendation of the Fink project.
>>>
>>> OK. Per the prior mail, if you can make a small reproducer, that would be
>>> most helpful in tracking down the issue.
>>>
>>> Thanks!
>>
>> ABE Hiroshi from Tokorozawa, JAPAN
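On the 4040-byte threshold: in Open MPI 1.x the shared-memory transport switches from eager delivery to a rendezvous protocol at the `btl_sm_eager_limit` MCA parameter, whose default of 4096 bytes is consistent with the observed 1010 floats plus message header. Below the limit a send can complete before the receive is posted; above it, MPI_Wait blocks until the receiver posts a matching MPI_Irecv. The parameter can be inspected and raised as below, though this is a tuning fragment, not a fix, since correctness should never rely on eager buffering; posting the receives first is the real solution. The commands assume the `sm` BTL is the transport in use:

```shell
# Show the current eager limit of the shared-memory BTL
ompi_info --param btl sm --level 9 | grep eager_limit

# Run with a larger eager limit (in bytes) -- a workaround only;
# a correct program must not depend on eager buffering
mpirun -np 2 --mca btl_sm_eager_limit 65536 ./a.out
```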