Have you attempted using 2 cores per process? I have noticed that MPI_Comm_accept sometimes behaves strangely in single-core configurations.

I have a program that makes use of MPI_Comm_accept/MPI_Comm_connect and I also call MPI_Intercomm_merge, so you may want to look into that call as well.

-Andrew Burns
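For illustration, a minimal sketch of what Andrew describes on the accepting side: merging the intercommunicator returned by MPI_Comm_accept into a single intracommunicator. The variable names are assumptions, not code from his program.

    MPI_Comm intercomm, intracomm;

    /* Accept a connection on a previously opened port (accepting side) */
    MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm);

    /* Merge the intercommunicator into one intracommunicator; pass
       high = 0 here and high = 1 on the connecting side so the
       connecting processes are ordered after the accepting ones. */
    MPI_Intercomm_merge(intercomm, 0, &intracomm);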
-----Original Message-----
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Jalel Chergui
Sent: Wednesday, September 16, 2015 11:49 AM
To: us...@open-mpi.org
Subject: Re: [OMPI users] bug in MPI_Comm_accept?

With openmpi-1.7.5, the sender segfaults. Sorry, I cannot see the problem in the codes. Perhaps people out there may help.
Jalel

Le 16/09/2015 16:40, marcin.krotkiewski wrote:

I have removed the MPI_Barrier, to no avail. The same thing happens. Adding verbosity, before the receiver hangs I get the following message:

  [node2:03928] mca: bml: Using openib btl to [[12620,1],0] on node node3

So it seems to be somewhere in the openib BTL module.
Marcin

On 09/16/2015 04:34 PM, Jalel Chergui wrote:

Right; in any case, MPI_Finalize is necessary at the end of the receiver. The other issue is the Barrier, which is probably invoked after the sender has exited, hence changing the size of the intercommunicator. Can you comment out that line in both files?
Jalel

Le 16/09/2015 16:22, Marcin Krotkiewski wrote:

But where would I put it? If I put it inside the while(1) loop, then MPI_Comm_accept cannot be called a second time. If I put it outside the loop, it will never be called.

On 09/16/2015 04:18 PM, Jalel Chergui wrote:

Can you check with an MPI_Finalize in the receiver?
Jalel

Le 16/09/2015 16:06, marcin.krotkiewski wrote:

I have run into a freeze / potential bug when using MPI_Comm_accept in a simple client/server implementation. I have attached the two simplest programs I could produce:

1. mpi-receiver.c opens a port using MPI_Open_port and saves the port name to a file.
2. mpi-receiver enters an infinite loop and waits for connections using MPI_Comm_accept.
3. mpi-sender.c connects to that port using MPI_Comm_connect, sends one MPI_UNSIGNED_LONG, calls a barrier, and disconnects using MPI_Comm_disconnect.
4. mpi-receiver reads the MPI_UNSIGNED_LONG, prints it, calls a barrier, disconnects using MPI_Comm_disconnect, and goes back to step 2 (infinite loop).

All works fine, but only exactly 5 times. After that the receiver hangs in MPI_Recv, after exiting MPI_Comm_accept. This is 100% repeatable. I have tried with Intel MPI - no such problem.

I execute the programs using OpenMPI 1.10 as follows:

  mpirun -np 1 --mca mpi_leave_pinned 0 ./mpi-receiver

Do you have any clues what could be the reason? Am I doing something wrong, or is it some problem with the internal state of OpenMPI?

Thanks a lot!
Marcin
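For readers without the attachments, here is a minimal sketch of the two programs as described above. The port-file name and the payload value are assumptions, not Marcin's actual code.

    /* mpi-receiver.c (sketch) */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        char port_name[MPI_MAX_PORT_NAME];
        unsigned long value;
        MPI_Comm intercomm;

        MPI_Init(&argc, &argv);

        /* Open a port and publish its name through a file */
        MPI_Open_port(MPI_INFO_NULL, port_name);
        FILE *f = fopen("port_name.txt", "w");   /* file name is an assumption */
        fprintf(f, "%s\n", port_name);
        fclose(f);

        /* Accept connections forever; the reported hang happens after
           several iterations of this loop, inside MPI_Recv. */
        while (1) {
            MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm);
            MPI_Recv(&value, 1, MPI_UNSIGNED_LONG, 0, 0, intercomm, MPI_STATUS_IGNORE);
            printf("received %lu\n", value);
            MPI_Barrier(intercomm);
            MPI_Comm_disconnect(&intercomm);
        }

        /* Not reached while the loop above is infinite */
        MPI_Close_port(port_name);
        MPI_Finalize();
        return 0;
    }

    /* mpi-sender.c (sketch) */
    #include <mpi.h>
    #include <stdio.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        char port_name[MPI_MAX_PORT_NAME];
        unsigned long value = 42;                /* payload value is an assumption */
        MPI_Comm intercomm;

        MPI_Init(&argc, &argv);

        /* Read the port name published by the receiver */
        FILE *f = fopen("port_name.txt", "r");
        fgets(port_name, MPI_MAX_PORT_NAME, f);
        port_name[strcspn(port_name, "\n")] = '\0';
        fclose(f);

        MPI_Comm_connect(port_name, MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm);
        MPI_Send(&value, 1, MPI_UNSIGNED_LONG, 0, 0, intercomm);
        MPI_Barrier(intercomm);
        MPI_Comm_disconnect(&intercomm);

        MPI_Finalize();
        return 0;
    }

The receiver would be started with the mpirun command shown above; the sender presumably with a plain "mpirun -np 1 ./mpi-sender".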