George,

I did not look over all the details of your test, but it looks to me
like you are violating one of the requirements of MPI_Intercomm_create,
namely the requirement that the two groups be disjoint. In your case
the parent process(es) are part of both local intra-communicators, aren't they?

I only have MPI-1.1 at hand right now, but here is what it says:
----

Overlap of local and remote groups that are bound into an
inter-communicator is prohibited. If there is overlap, then the program
is erroneous and is likely to deadlock.

----
So the bottom line is that the two local intra-communicators being
used have to be disjoint, and the bridge communicator needs to be one
in which at least one process from each of the two disjoint groups can
talk to the other. Interestingly, I did not find a sentence saying
whether the two local leaders may be the same process or whether they
need to be separate processes...
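
To make the two conditions concrete, here is a small sketch in plain C. It
does not call MPI at all: arrays of task ids (0..4 standing for T0..T4) stand
in for the groups, and leaders are given as task ids rather than ranks. The
helper names (contains, groups_overlap, intercomm_inputs_ok) are made up for
this illustration, and reading George's "ab", "c" and "ac" as {T0,T1,T2},
{T3,T4} and {T0,T3,T4} is my assumption. In a real program the overlap test
could presumably be done with MPI_Group_intersection and MPI_Group_size.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if `task` appears in the task list `g`, else 0. */
int contains(const int *g, size_t n, int task)
{
    assert(n == 0 || g != NULL);
    for (size_t i = 0; i < n; ++i)
        if (g[i] == task)
            return 1;
    return 0;
}

/* Returns 1 if the two task lists share at least one task id, else 0. */
int groups_overlap(const int *a, size_t na, const int *b, size_t nb)
{
    for (size_t i = 0; i < na; ++i)
        if (contains(b, nb, a[i]))
            return 1;
    return 0;
}

/* Hypothetical pre-check modelling the rules discussed above:
 *   1. local and remote groups must be disjoint;
 *   2. each leader must belong to its own group;
 *   3. both leaders must belong to the bridge (peer) communicator.
 * Leaders are task ids here, not MPI ranks (a simplification). */
int intercomm_inputs_ok(const int *local, size_t nl, int local_leader,
                        const int *remote, size_t nr, int remote_leader,
                        const int *bridge, size_t nb)
{
    if (groups_overlap(local, nl, remote, nr))
        return 0;
    if (!contains(local, nl, local_leader) ||
        !contains(remote, nr, remote_leader))
        return 0;
    return contains(bridge, nb, local_leader) &&
           contains(bridge, nb, remote_leader);
}
```

With intra0 = {0,1,2} and intra1 = {0,3,4} as the two groups, the check fails
because T0 is in both. With {0,1,2} and {3,4} as the groups, leaders T0 and T3,
and {0,3,4} as the bridge, it passes.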


Thanks
Edgar


On 6/7/2011 12:57 AM, George Bosilca wrote:
> Frederic,
> 
> Attached you will find an example that is supposed to work. The main 
> difference from your code is on T3 and T4, where you have swapped the local and 
> remote comm. As depicted in the picture attached below, during the third step 
> you create the intercomm between ab and c (no overlap) using ac as the 
> bridge communicator (here the two roots, a and c, can exchange messages).
> 
> Based on the MPI 2.2 standard, especially the paragraph quoted in the PS, the 
> attached code should work. Unfortunately, I could not run it 
> successfully with either Open MPI trunk or MPICH2 1.4rc1. 
> 
>  george.
> 
> PS: Here is what the MPI standard states about MPI_Intercomm_create:
>> The function MPI_INTERCOMM_CREATE can be used to create an 
>> inter-communicator from two existing intra-communicators, in the following 
>> situation: At least one selected member from each group (the “group leader”) 
>> has the ability to communicate with the selected member from the other 
>> group; that is, a “peer” communicator exists to which both leaders belong, 
>> and each leader knows the rank of the other leader in this peer 
>> communicator. Furthermore, members of each group know the rank of their 
>> leader.
> 
> 
> On Jun 1, 2011, at 05:00 , Frédéric Feyel wrote:
> 
>> Hello,
>>
>> I have a problem using MPI_Intercomm_create.
>>
>> I have 5 tasks, say T0, T1, T2, T3, T4, resulting from two spawn
>> operations by T0.
>>
>> So I have two intra-communicators:
>>
>> intra0 contains : T0, T1, T2
>> intra1 contains : T0, T3, T4
>>
>> My goal is to use collective calls to build a single intra-communicator
>> containing T0, T1, T2, T3, T4.
>>
>> I tried to do it using MPI_Intercomm_create and MPI_Intercomm_merge calls,
>> but without success (I always get MPI internal errors).
>>
>> What I am doing :
>>
>> on T0 :
>> *******
>>
>> MPI_Intercomm_create(intra0,0,intra1,0,1,&new_com)
>>
>> on T1 and T2 :
>> **************
>>
>> MPI_Intercomm_create(intra0,0,MPI_COMM_WORLD,0,1,&new_com)
>>
>> on T3 and T4 :
>> **************
>>
>> MPI_Intercomm_create(intra1,0,MPI_COMM_WORLD,0,1,&new_com)
>>
>>
>> I'm certainly missing something. Could anybody help me to solve this
>> problem ?
>>
>> Best regards,
>>
>> Frédéric.
>>
>> PS: of course I did an extensive web search without finding anything
>> useful for my problem.
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> 

-- 
Edgar Gabriel
Assistant Professor
Parallel Software Technologies Lab      http://pstl.cs.uh.edu
Department of Computer Science          University of Houston
Philip G. Hoffman Hall, Room 524        Houston, TX-77204, USA
Tel: +1 (713) 743-3857                  Fax: +1 (713) 743-3335
