Edgar,

I forgot to answer your previous question. I used Open MPI 1.5.4 and the C++ API.

Thatyene Ramos

On Mon, Apr 9, 2012 at 8:00 PM, Thatyene Louise Alves de Souza Ramos <thaty...@gmail.com> wrote:

> Hi Edgar, sorry about the late response. I've been travelling without
> Internet access.
>
> Well, I took the code Rodrigo provided and modified the client to perform
> the dup after the creation of the new inter communicator, with one process
> removed. That is, I just replaced lines 54-55 in the *removeRank* method
> with my if-else block.
>
> I tried this because calling a second create after the first one did not
> work, and I thought the problem might be the communicator itself. So I
> tried duplicating the inter communicator to see whether that worked.
>
> Thanks.
>
> Thatyene Ramos.
>
>
> On Thu, Apr 5, 2012 at 5:10 PM, Edgar Gabriel <gabr...@cs.uh.edu> wrote:
>
>> so just to confirm: I ran our test suite for inter-communicator
>> collective operations and communicator duplication, and everything still
>> works. Specifically, comm_dup on an intercommunicator is not
>> fundamentally broken; it worked in my tests.
>>
>> Having your code, so I can see precisely what it does, would help me hunt
>> the problem down, since I am otherwise not able to reproduce it.
>>
>> Also, which version of Open MPI did you use?
>>
>> Thanks
>> Edgar
>>
>> On 4/4/2012 3:09 PM, Thatyene Louise Alves de Souza Ramos wrote:
>> > Hi Edgar, thank you for the response.
>> >
>> > Unfortunately, I've tried with and without this option. In both cases
>> > the result was the same... =(
>> >
>> > On Wed, Apr 4, 2012 at 5:04 PM, Edgar Gabriel <gabr...@cs.uh.edu> wrote:
>> >
>> >     did you try to start the program with the --mca coll ^inter switch
>> >     that I mentioned? Collective dup for intercommunicators should work;
>> >     it's probably again the bcast over a communicator of size 1 that is
>> >     causing the hang, and you could avoid it with the flag I mentioned
>> >     above.
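For example, an invocation along these lines (the binary name and process count here are placeholders for illustration, not from this thread; only the `--mca coll ^inter` flag is):

```shell
# Disable Open MPI's "inter" collective component; the runtime then
# falls back to its base algorithms for intercommunicator collectives.
# "./client" and "-np 4" are placeholder values.
mpirun --mca coll ^inter -np 4 ./client
```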
>> >
>> >     Also, if you could attach your test code, that would help in hunting
>> >     things down.
>> >
>> >     Thanks
>> >     Edgar
>> >
>> >     On 4/4/2012 2:18 PM, Thatyene Louise Alves de Souza Ramos wrote:
>> >     > Hi there.
>> >     >
>> >     > I've made some tests related to the problem reported by Rodrigo,
>> >     > and I think (I'd rather be wrong) that collective calls like
>> >     > Create and Dup do not work with inter communicators. I've tried
>> >     > this in the client group:
>> >     >
>> >     > MPI::Intercomm tmp_inter_comm;
>> >     >
>> >     > tmp_inter_comm = server_comm.Create(server_comm.Get_group().Excl(1, &rank));
>> >     >
>> >     > if (server_comm.Get_rank() != rank)
>> >     >     server_comm = tmp_inter_comm.Dup();
>> >     > else
>> >     >     server_comm = MPI::COMM_NULL;
>> >     >
>> >     > The server_comm is the original inter communicator with the
>> >     > server group.
>> >     >
>> >     > I've noticed that the program hangs in the Dup call. It seems
>> >     > that the tmp_inter_comm created without one process still contains
>> >     > that process, because the other processes are waiting for it to
>> >     > call Dup too.
>> >     >
>> >     > What do you think?
>> >     >
>> >     > On Wed, Mar 28, 2012 at 6:03 PM, Edgar Gabriel <gabr...@cs.uh.edu> wrote:
>> >     >
>> >     >     it just uses a different algorithm, which avoids the bcast on
>> >     >     a communicator of size 1 (which is causing the problem here).
>> >     >
>> >     >     Thanks
>> >     >     Edgar
>> >     >
>> >     >     On 3/28/2012 12:08 PM, Rodrigo Oliveira wrote:
>> >     >     > Hi Edgar,
>> >     >     >
>> >     >     > I tested the execution of my code using the option
>> >     >     > -mca coll ^inter as you suggested, and the program worked
>> >     >     > fine, even when I use 1 server instance.
>> >     >     >
>> >     >     > What modification does this parameter cause? I did not find
>> >     >     > an explanation of how the coll inter module is used.
>> >     >     >
>> >     >     > Thanks a lot for your attention and for the solution.
>> >     >     >
>> >     >     > Best regards,
>> >     >     >
>> >     >     > Rodrigo Oliveira
>> >     >     >
>> >     >     > On Tue, Mar 27, 2012 at 1:10 PM, Rodrigo Oliveira
>> >     >     > <rsilva.olive...@gmail.com> wrote:
>> >     >     >
>> >     >     >
>> >     >     >     Hi Edgar.
>> >     >     >
>> >     >     >     Thanks for the response. I just did not understand why
>> >     >     >     the Barrier works before I remove one of the client
>> >     >     >     processes.
>> >     >     >
>> >     >     >     I tried it with 1 server and 3 clients and it worked
>> >     >     >     properly. After I removed 1 of the clients, it stopped
>> >     >     >     working. So the removal is affecting the functionality
>> >     >     >     of Barrier, I guess.
>> >     >     >
>> >     >     >     Anyone has an idea?
>> >     >     >
>> >     >     >
>> >     >     >     On Mon, Mar 26, 2012 at 12:34 PM, Edgar Gabriel
>> >     >     >     <gabr...@cs.uh.edu> wrote:
>> >     >     >
>> >     >     >         I do not recall what the agreement was on how to
>> >     >     >         treat the size=1
>> >     >     >
>> >     >     >
>> >     >     >
>> >     >     >
>> >     >     >
>> >     >     > _______________________________________________
>> >     >     > users mailing list
>> >     >     > us...@open-mpi.org
>> >     >     > http://www.open-mpi.org/mailman/listinfo.cgi/users
>> >     >
>> >     >
>> >     >
>> >     >
>> >     >
>> >     >
>> >
>> >     --
>> >     Edgar Gabriel
>> >     Associate Professor
>> >     Parallel Software Technologies Lab      http://pstl.cs.uh.edu
>> >     Department of Computer Science          University of Houston
>> >     Philip G. Hoffman Hall, Room 524        Houston, TX-77204, USA
>> >     Tel: +1 (713) 743-3857                  Fax: +1 (713) 743-3335
>> >
>> >
>> >
>> >
>> >
>> >
>>
>>
>>
>
>
