Re: [OMPI users] MPI_Wait stalls

2015-11-06 Thread Gilles Gouaillardet
Abe-san, I am glad you were able to move forward. btw, George has a Ph.D., but Sheldon Cooper would say of me that I am only an engineer. Cheers, Gilles On Saturday, November 7, 2015, ABE Hiroshi wrote: > Dear Dr. Bosilca and All, > > Regarding my problem, MPI_Wait stall after MPI_Isend with lar

Re: [OMPI users] MPI_Wait stalls

2015-11-06 Thread ABE Hiroshi
Dear Dr. Bosilca and All, Regarding my problem, the MPI_Wait stall after MPI_Isend with large (over 4 kbytes) messages has been resolved by Dr. Gouaillardet’s suggestion: 1. MPI_Isend in the master thread, 2. launch worker threads to receive the messages by MPI_Recv, 3. MPI_Waitall in the master thread.
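A minimal sketch of those three steps in C, assuming exactly two tasks and MPI_THREAD_MULTIPLE support (the worker threads call MPI_Recv); the message count, size, and all names are illustrative, not taken from Abe-san's code:

    /* Hedged sketch of the three-step pattern above. Assumes two tasks.
     * Build with the wrapper compiler, e.g. "mpicc -pthread". */
    #include <mpi.h>
    #include <pthread.h>
    #include <stdlib.h>

    #define NMSG       4
    #define MSG_BYTES  8192          /* deliberately above a typical eager limit */

    static int peer;                 /* rank of the other task */

    static void *worker(void *arg)
    {
        int tag = *(int *)arg;
        char *buf = malloc(MSG_BYTES);
        /* step 2: each worker thread blocks in MPI_Recv on its own tag */
        MPI_Recv(buf, MSG_BYTES, MPI_CHAR, peer, tag,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        /* ... process buf ... */
        free(buf);
        return NULL;
    }

    int main(int argc, char **argv)
    {
        int provided, rank, i;
        char *sendbuf = malloc(NMSG * MSG_BYTES);
        int tags[NMSG];
        MPI_Request req[NMSG];
        pthread_t th[NMSG];

        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        peer = (rank == 0) ? 1 : 0;

        /* step 1: the master thread posts all non-blocking sends to the peer */
        for (i = 0; i < NMSG; i++)
            MPI_Isend(sendbuf + i * MSG_BYTES, MSG_BYTES, MPI_CHAR, peer, i,
                      MPI_COMM_WORLD, &req[i]);

        /* step 2: launch the worker threads that receive the peer's messages */
        for (i = 0; i < NMSG; i++) {
            tags[i] = i;
            pthread_create(&th[i], NULL, worker, &tags[i]);
        }

        /* step 3: MPI_Waitall in the master thread, after the receivers exist */
        MPI_Waitall(NMSG, req, MPI_STATUSES_IGNORE);

        for (i = 0; i < NMSG; i++)
            pthread_join(th[i], NULL);

        free(sendbuf);
        MPI_Finalize();
        return 0;
    }

The key design point is that the master waits for its sends only after the receiving threads have been launched, so a rendezvous send always has a matching receive in flight.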

Re: [OMPI users] MPI_Wait stalls

2015-11-04 Thread George Bosilca
Dear Abe, Open MPI provides a simple way to validate your code against the eager problem, by forcing the library to use a zero-size eager limit (basically all messages are then matched). First, identify the networks used by your application and then set both btl_<network>_eager_limit and btl_<network>_rndv_eager_limit to 0
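For example, assuming the application uses the TCP BTL (substitute the BTL your job actually uses; the process count and program name are placeholders), the test described above could be run as:

    mpirun -np 2 -mca btl_tcp_eager_limit 0 -mca btl_tcp_rndv_eager_limit 0 ./your_program

With the eager limits forced to 0, a send that only appeared to work because it fit in the eager path will now stall until the matching receive is posted, exposing the problem.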

Re: [OMPI users] MPI_Wait stalls

2015-11-04 Thread ABE Hiroshi
Dear Dr. Bosilca and Dr. Gouaillardet, Thank you for your kind mail. I believe I could figure out the problem. As described in Dr. Bosilca’s mail, this should be the eager problem. To avoid it, we should take one of the methods suggested in Dr. Gouaillardet’s mail. Also I

Re: [OMPI users] MPI_Wait stalls

2015-11-04 Thread George Bosilca
A reproducer without the receiver part has limited usability. 1) Have you checked that your code doesn't suffer from the eager problem? It might happen that if your message size is under the eager limit, your perception is that the code works when in fact your message is just on the unexpected queue o
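A sketch of the effect described above; the sizes are made up and the real eager limit depends on the BTL in use, so treat this only as an illustration of the "it works below the limit" perception:

    #include <mpi.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        static char small_msg[128], large_msg[1 << 20];   /* 128 B vs 1 MB */
        int rank;
        MPI_Request r;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            /* Below the eager limit the payload is shipped immediately and
             * parked in rank 1's unexpected queue, so this wait returns even
             * though rank 1 has not posted its receive yet. */
            MPI_Isend(small_msg, sizeof(small_msg), MPI_CHAR, 1, 0,
                      MPI_COMM_WORLD, &r);
            MPI_Wait(&r, MPI_STATUS_IGNORE);

            /* Above the eager limit the send uses a rendezvous, so this wait
             * stalls until rank 1 finally posts the matching receive. */
            MPI_Isend(large_msg, sizeof(large_msg), MPI_CHAR, 1, 1,
                      MPI_COMM_WORLD, &r);
            MPI_Wait(&r, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            sleep(5);   /* receives posted late on purpose */
            MPI_Recv(small_msg, sizeof(small_msg), MPI_CHAR, 0, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Recv(large_msg, sizeof(large_msg), MPI_CHAR, 0, 1,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }

        MPI_Finalize();
        return 0;
    }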

Re: [OMPI users] MPI_Wait stalls

2015-11-04 Thread Gilles Gouaillardet
Abe-san, you can be blocking on one side, and non-blocking on the other side. For example, one task can do MPI_Send, and the other MPI_Irecv and MPI_Wait. In order to avoid deadlock, your program should do: 1. master MPI_Isend and start the workers, 2. workers receive and process messages (in there
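A minimal sketch of the first point, blocking MPI_Send on one task matching MPI_Irecv plus MPI_Wait on the other; the buffer name and size are made up and two tasks are assumed:

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, buf[1024] = {0};
        MPI_Request req;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            MPI_Send(buf, 1024, MPI_INT, 1, 0, MPI_COMM_WORLD);         /* blocking side */
        } else if (rank == 1) {
            MPI_Irecv(buf, 1024, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);  /* non-blocking side */
            /* ... overlap other work here ... */
            MPI_Wait(&req, MPI_STATUS_IGNORE);
        }

        MPI_Finalize();
        return 0;
    }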

Re: [OMPI users] MPI_Wait stalls

2015-11-04 Thread ABE Hiroshi
Dear Gilles-san and all, I thought MPI_Isend kept the sent data stacked up somewhere, waiting for the corresponding MPI_Irecv. The outline of my code regarding MPI: 1. send ALL tagged messages to the other node (MPI_Isend) in the master thread, then launch worker threads, and 2. receive the corresponding

Re: [OMPI users] MPI_Wait stalls

2015-11-04 Thread Gilles Gouaillardet
Abe-san, MPI_Isend followed by MPI_Wait is equivalent to MPI_Send. Depending on message size and in-flight messages, that can deadlock if two tasks send to each other and no recv has been posted. Cheers, Gilles ABE Hiroshi wrote: Dear All, Installed openmpi 1.10.0 and gcc-5.2 using Fink (h
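A deadlock-prone sketch of the situation described above: both tasks MPI_Isend a large message to each other and wait before either posts a receive. The size is made up; above the eager limit both tasks can stall forever in MPI_Wait. Two tasks are assumed:

    #include <mpi.h>
    #include <stdlib.h>

    #define N (1 << 20)   /* 1 MB, comfortably above a typical eager limit */

    int main(int argc, char **argv)
    {
        int rank, peer;
        char *sendbuf = malloc(N), *recvbuf = malloc(N);
        MPI_Request req;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        peer = (rank == 0) ? 1 : 0;

        MPI_Isend(sendbuf, N, MPI_CHAR, peer, 0, MPI_COMM_WORLD, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);   /* Isend + Wait == MPI_Send: may never return */

        /* Neither task reaches its receive, so neither send can complete. */
        MPI_Recv(recvbuf, N, MPI_CHAR, peer, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        free(sendbuf);
        free(recvbuf);
        MPI_Finalize();
        return 0;
    }

Posting the receive (for example MPI_Irecv) before waiting on the send breaks the cycle.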

Re: [OMPI users] MPI_Wait stalls

2015-11-04 Thread ABE Hiroshi
Dear All, I installed openmpi 1.10.0 and gcc-5.2 using Fink (http://www.finkproject.org) but nothing has changed with my code. Regarding the MPI_Finalize error in my previous mail, it was probably my fault. I had removed all mpi stuff in /usr/local/ manually and the openmpi-1.10.0 had been installed

Re: [OMPI users] MPI_Wait stalls

2015-11-02 Thread Jeff Squyres (jsquyres)
On Oct 29, 2015, at 10:24 PM, ABE Hiroshi wrote: Regarding my code I mentioned in my original mail, the behaviour is very weird. If MPI_Isend is called from a differently named function, it works. And I wrote a sample program to try to reproduce my problem but it works fine, except the

Re: [OMPI users] MPI_Wait stalls

2015-10-29 Thread ABE Hiroshi
Dear All, I tried "mpic++" to build the wxWidgets library and it doesn’t change anything. I found that openmpi-1.10.0 on my Mac (OS X 10.9.5 with Apple Clang 6.0) always fails in MPI_Finalize even with a very simple program (bottom of this mail). [Venus:60708] [ 4] Assertion failed: (OPAL_OBJ_MAGIC_ID
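The test program at the bottom of the original mail is not reproduced in this digest; a "very simple program" of the kind described need do little more than initialize and finalize, e.g. (illustrative only, not the original code):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        printf("hello from rank %d\n", rank);
        MPI_Finalize();
        return 0;
    }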

Re: [OMPI users] MPI_Wait stalls

2015-10-27 Thread ABE Hiroshi
Dear Nathan and all, Thank you for your information. I tried it this morning and it seems to give the same result. I will try another option. Thank you for the hint to dig in. And I found a statement in the FAQ regarding PETSc which says you should use the Open MPI wrapper compiler. I use the wxWidgets libr
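For reference, using the wrapper compiler simply means compiling and linking with mpic++ instead of calling the C++ compiler directly, both for the application and for any MPI-using library build; the file names below are placeholders:

    mpic++ -o my_app my_app.cpp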

Re: [OMPI users] MPI_Wait stalls

2015-10-27 Thread Nathan Hjelm
I have seen hangs when the tcp component is in use. If you are running on a single machine, try running with mpirun -mca btl self,vader. -Nathan On Mon, Oct 26, 2015 at 09:17:20PM -0600, ABE Hiroshi wrote: Dear All, I have a multithreaded program and as a next step I am reconstructing
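Spelled out as a full command line (the process count and program name are placeholders), Nathan's suggestion is:

    mpirun -np 2 -mca btl self,vader ./your_program

This restricts Open MPI to the self and vader (shared-memory) BTLs, taking the tcp component out of the picture on a single machine.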

Re: [OMPI users] MPI_Wait stalls

2015-10-27 Thread Gilles Gouaillardet
Abe-san, Please make sure you use the same message size in your application and your test case. Using small messages can hide an application-level deadlock. Cheers, Gilles ABE Hiroshi wrote: Dear Gilles-san, Thank you for your prompt reply. The code is a licensed one so I will try

Re: [OMPI users] MPI_Wait stalls

2015-10-27 Thread ABE Hiroshi
Dear Gilles-san, Thank you for your prompt reply. The code is a licensed one so I will try to make a sample code from scratch to reproduce the behavior. But I’m afraid the simple one might work without any problems, because I have a feeling this problem is caused by a conflict with the othe

Re: [OMPI users] MPI_Wait stalls

2015-10-27 Thread Gilles Gouaillardet
Abe-san, could you please post an (ideally trimmed) version of your program so we can try to reproduce and investigate the issue? Thanks, Gilles On 10/27/2015 12:17 PM, ABE Hiroshi wrote: Dear All, I have a multithreaded program and as a next step I am reconstructing it into an MPI program. The

[OMPI users] MPI_Wait stalls

2015-10-26 Thread ABE Hiroshi
Dear All, I have a multithreaded program and as a next step I am reconstructing it into an MPI program. The code is to be an MPI / multithread hybrid one. The code proceeds through its MPI routines as: 1. Send data by MPI_Isend with exclusive tag numbers to the other node. This is done in ONE master thread. 2. Rece
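Since threads other than the master make MPI calls in a hybrid design like this, the library has to be initialized with full thread support; a sketch of the required check (not taken from the original code):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided;
        /* Request full thread support: worker threads will call MPI_Recv. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        if (provided < MPI_THREAD_MULTIPLE) {
            fprintf(stderr, "MPI_THREAD_MULTIPLE not available\n");
            MPI_Abort(MPI_COMM_WORLD, 1);
        }
        /* ... master thread MPI_Isend, worker threads MPI_Recv, MPI_Waitall ... */
        MPI_Finalize();
        return 0;
    }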