Edgar, I forgot to answer your previous question. I used Open MPI 1.5.4 and the C++ API.
Thatyene Ramos

On Mon, Apr 9, 2012 at 8:00 PM, Thatyene Louise Alves de Souza Ramos <thaty...@gmail.com> wrote:

> Hi Edgar, sorry about the late response. I've been travelling without
> Internet access.
>
> Well, I took the code Rodrigo provided and modified the client to make the
> dup after the creation of the new inter communicator, without 1 process.
> That is, I just replaced the lines 54-55 in the removeRank method with
> my if-else block.
>
> I tried this because calling a new Create after the first Create did not
> work and I thought the problem might be the communicator. So, I tried to
> duplicate the inter communicator to see if it worked.
>
> Thanks.
>
> Thatyene Ramos.
>
> On Thu, Apr 5, 2012 at 5:10 PM, Edgar Gabriel <gabr...@cs.uh.edu> wrote:
>
>> so just to confirm, I ran our test suite for inter-communicator
>> collective operations and communicator duplication, and everything still
>> works. Specifically comm_dup on an intercommunicator is not
>> fundamentally broken, but worked for my tests.
>>
>> Having your code to see what your code precisely does would help me to
>> hunt the problem down, since I am otherwise not able to reproduce the
>> problem.
>>
>> Also, which version of Open MPI did you use?
>>
>> Thanks
>> Edgar
>>
>> On 4/4/2012 3:09 PM, Thatyene Louise Alves de Souza Ramos wrote:
>>> Hi Edgar, thank you for the response.
>>>
>>> Unfortunately, I've tried with and without this option. In both the
>>> result was the same... =(
>>>
>>> On Wed, Apr 4, 2012 at 5:04 PM, Edgar Gabriel <gabr...@cs.uh.edu> wrote:
>>>
>>>> did you try to start the program with the --mca coll ^inter switch that
>>>> I mentioned? Collective dup for intercommunicators should work, it's
>>>> probably again the bcast over a communicator of size 1 that is causing
>>>> the hang, and you could avoid it with the flag that I mentioned above.
>>>>
>>>> Also, if you could attach your test code, that would help in hunting
>>>> things down.
>>>>
>>>> Thanks
>>>> Edgar
>>>>
>>>> On 4/4/2012 2:18 PM, Thatyene Louise Alves de Souza Ramos wrote:
>>>>> Hi there.
>>>>>
>>>>> I've made some tests related to the problem reported by Rodrigo. And I
>>>>> think, I'd rather be wrong, that collective calls like Create and Dup
>>>>> do not work with Inter communicators. I've tried this in the client
>>>>> group:
>>>>>
>>>>>     MPI::Intercomm tmp_inter_comm;
>>>>>
>>>>>     tmp_inter_comm = server_comm.Create(server_comm.Get_group().Excl(1, &rank));
>>>>>
>>>>>     if (server_comm.Get_rank() != rank)
>>>>>         server_comm = tmp_inter_comm.Dup();
>>>>>     else
>>>>>         server_comm = MPI::COMM_NULL;
>>>>>
>>>>> The server_comm is the original inter communicator with the server
>>>>> group.
>>>>>
>>>>> I've noticed that the program hangs in the Dup call. It seems that the
>>>>> tmp_inter_comm created without one process still has this process,
>>>>> because the other processes are waiting for it to call the Dup too.
>>>>>
>>>>> What do you think?
>>>>>
>>>>> On Wed, Mar 28, 2012 at 6:03 PM, Edgar Gabriel <gabr...@cs.uh.edu> wrote:
>>>>>
>>>>>> it just uses a different algorithm which avoids the bcast on a
>>>>>> communicator of 1 (which is causing the problem here).
>>>>>>
>>>>>> Thanks
>>>>>> Edgar
>>>>>>
>>>>>> On 3/28/2012 12:08 PM, Rodrigo Oliveira wrote:
>>>>>>> Hi Edgar,
>>>>>>>
>>>>>>> I tested the execution of my code using the option -mca coll ^inter as
>>>>>>> you suggested and the program worked fine, even when I use 1 server
>>>>>>> instance.
>>>>>>>
>>>>>>> What is the modification caused by this parameter? I did not find an
>>>>>>> explanation about the utilization of the module coll inter.
>>>>>>>
>>>>>>> Thanks a lot for your attention and for the solution.
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>> Rodrigo Oliveira
>>>>>>>
>>>>>>> On Tue, Mar 27, 2012 at 1:10 PM, Rodrigo Oliveira <rsilva.olive...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Edgar.
>>>>>>>>
>>>>>>>> Thanks for the response. I just did not understand why the Barrier
>>>>>>>> works before I remove one of the client processes.
>>>>>>>>
>>>>>>>> I tried it with 1 server and 3 clients and it worked properly. After
>>>>>>>> I removed 1 of the clients, it stops working. So, the removal is
>>>>>>>> affecting the functionality of Barrier, I guess.
>>>>>>>>
>>>>>>>> Anyone has an idea?
>>>>>>>>
>>>>>>>> On Mon, Mar 26, 2012 at 12:34 PM, Edgar Gabriel <gabr...@cs.uh.edu> wrote:
>>>>>>>>
>>>>>>>>> I do not recall on what the agreement was on how to treat the size=1
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> users mailing list
>>>>>>> us...@open-mpi.org
>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>
>>>> --
>>>> Edgar Gabriel
>>>> Associate Professor
>>>> Parallel Software Technologies Lab      http://pstl.cs.uh.edu
>>>> Department of Computer Science          University of Houston
>>>> Philip G. Hoffman Hall, Room 524        Houston, TX-77204, USA
>>>> Tel: +1 (713) 743-3857                  Fax: +1 (713) 743-3335
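
A note on the Excl call quoted above. The following is only a minimal sketch, in plain C++ with no MPI calls, of the group membership that server_comm.Get_group().Excl(1, &rank) is expected to produce: the MPI standard says the excluded rank is dropped and the remaining processes keep their relative order. The helper names exclude_rank and new_rank are hypothetical, made up for this illustration, and -1 stands in for MPI_UNDEFINED. It models the group only and says nothing about why Dup hangs.

```cpp
#include <cassert>
#include <vector>

// Plain C++ model (no MPI calls) of excluding one rank from a group of
// `size` processes. The excluded rank is dropped and the remaining
// processes keep their relative order.
// Returns, for each new rank, the old rank it came from.
std::vector<int> exclude_rank(int size, int excluded) {
    assert(size > 0 && excluded >= 0 && excluded < size);
    std::vector<int> old_rank_of;
    for (int r = 0; r < size; ++r) {
        if (r != excluded) old_rank_of.push_back(r);
    }
    return old_rank_of;
}

// New rank of `old_rank` in the reduced group, or -1 if it was excluded
// (-1 stands in for MPI_UNDEFINED in this model).
int new_rank(int size, int excluded, int old_rank) {
    const std::vector<int> old_rank_of = exclude_rank(size, excluded);
    for (int i = 0; i < static_cast<int>(old_rank_of.size()); ++i) {
        if (old_rank_of[i] == old_rank) return i;
    }
    return -1;
}
```

For the case from the thread (3 clients, one removed), exclude_rank(3, 1) leaves two processes, old ranks 0 and 2, and new_rank(3, 1, 2) gives 1. Under this model the removed process is not a member of the new group, so only the two remaining processes should be expected to take part in a later Dup.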