Dear all, thank you very much for the time you have spent looking at my problem.

After reading your contributions, it's not clear whether there is a bug in
Open MPI or not.

So I wrote a small self-contained test program to analyse the behavior,
and the problem is still there.

I was wondering whether the local and remote leaders of the two groups could
be the same process. Unfortunately, I get an error in both cases (local and
remote leaders identical or not).
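
To make the two cases concrete, here is the only call that changes on the
master between the two runs (an excerpt based on the attached program and
using its variable names; the full listing is at the end of this mail):

  /* Case 1: distinct leaders (this is what the attached code does).
     In full_comm2 the master has rank 0 and the first slave has rank 1. */
  MPI_Intercomm_create(full_comm1, 0,   /* local comm, local leader = master  */
                       full_comm2, 1,   /* peer comm, remote leader = a slave */
                       100, &intercomm_full);

  /* Case 2: identical leaders (the master plays both roles); the matching
     MPI_Bcast of local_leader over full_comm2 then broadcasts 0 as well. */
  MPI_Intercomm_create(full_comm1, 0,
                       full_comm2, 0,   /* remote leader = the master itself */
                       100, &intercomm_full);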

What do you think of my small test program?

Best regards,

Frédéric.


On Tue, 07 Jun 2011 10:31:51 -0500, Edgar Gabriel <gabr...@cs.uh.edu>
wrote:
> On 6/7/2011 10:23 AM, George Bosilca wrote:
>> 
>> On Jun 7, 2011, at 11:00 , Edgar Gabriel wrote:
>> 
>>> George,
>>> 
>>> I did not look over all the details of your test, but it looks to
>>> me like you are violating one of the requirements of
>>> intercomm_create, namely the requirement that the two groups have
>>> to be disjoint. In your case the parent process(es) are part of
>>> both local intra-communicators, aren't they?
>> 
>> The two groups of the two local communicators are disjoint. One
>> contains A and B, while the other contains only C. The bridge
>> communicator contains A and C.
>> 
>> I'm confident my example is supposed to work. At least for Open MPI
>> the error is under the hood, as the resulting inter-communicator is
>> valid but contains NULL endpoints for the remote process.
> 
> I'll come back to that later; I am not yet convinced that your code is
> correct :-) Your local groups might be disjoint, but I am worried about
> the ranks of the remote leader in your example. They cannot be 0 from
> both groups' perspectives.
> 
>> 
>> Regarding the fact that the two leaders should be separate processes,
>> you will not find any wording about this in the current version of
>> the standard. In 1.1 there were two contradictory sentences about
>> this: one stating that the two groups have to be disjoint, while the
>> other claimed that the two leaders can be the same process. After
>> discussion, the agreement was that the two groups have to be
>> disjoint, and the standard has been amended to match the agreement.
> 
> 
> I realized that this is a non-issue. If the two local groups are
> disjoint, there is no way that the two local leaders are the same
> process.
> 
> Thanks
> Edgar
> 
>> 
>> george.
>> 
>> 
>>> 
>>> I only have MPI-1.1 at hand right now, but here is what it says:
>>> ----
>>> 
>>> Overlap of local and remote groups that are bound into an 
>>> inter-communicator is prohibited. If there is overlap, then the
>>> program is erroneous and is likely to deadlock.
>>> 
>>> ---- So the bottom line is that the two local intra-communicators
>>> being used have to be disjoint, and the bridgecomm needs to be a
>>> communicator in which at least one process of each of the two
>>> disjoint groups is able to talk to the other. Interestingly, I did
>>> not find a sentence stating whether it is allowed to be the same
>>> process, or whether the two local leaders need to be separate
>>> processes...
>>> 
>>> 
>>> Thanks Edgar
>>> 
>>> 
>>> On 6/7/2011 12:57 AM, George Bosilca wrote:
>>>> Frederic,
>>>> 
>>>> Attached you will find an example that is supposed to work. The
>>>> main difference with your code is on T3 and T4, where you have
>>>> inverted the local and remote comms. As depicted in the picture
>>>> attached below, during the 3rd step you create the intercomm
>>>> between ab and c (no overlap) using ac as a bridge communicator
>>>> (here the two roots, a and c, can exchange messages).
>>>> 
>>>> Based on the MPI 2.2 standard, especially the paragraph quoted in
>>>> the PS, the attached code should have been working. Unfortunately,
>>>> I couldn't run it successfully with either the Open MPI trunk or
>>>> MPICH2 1.4rc1.
>>>> 
>>>> george.
>>>> 
>>>> PS: Here is what the MPI standard states about the
>>>> MPI_Intercomm_create:
>>>>> The function MPI_INTERCOMM_CREATE can be used to create an
>>>>> inter-communicator from two existing intra-communicators, in
>>>>> the following situation: At least one selected member from each
>>>>> group (the “group leader”) has the ability to communicate with
>>>>> the selected member from the other group; that is, a “peer”
>>>>> communicator exists to which both leaders belong, and each
>>>>> leader knows the rank of the other leader in this peer
>>>>> communicator. Furthermore, members of each group know the rank
>>>>> of their leader.
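
For what it's worth, here is a minimal self-contained sketch (mine, not
George's attached example) of the construction this paragraph describes,
using a static world of 3 tasks instead of spawned ones; the names
local_comm and bridge_comm are purely illustrative:

#include <stdio.h>
#include <mpi.h>

/* Run with exactly 3 processes: world ranks 0 and 1 form group "ab",
 * world rank 2 is group "c" on its own, and a bridge communicator "ac"
 * contains the two group leaders (world ranks 0 and 2). */
int main(int argc, char **argv)
{
  int wrank, wsize, remote_leader;
  MPI_Comm local_comm, bridge_comm, inter_comm;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &wrank);
  MPI_Comm_size(MPI_COMM_WORLD, &wsize);
  if (wsize != 3) MPI_Abort(MPI_COMM_WORLD, 1);

  /* Disjoint local intra-communicators: {0,1} and {2}. */
  MPI_Comm_split(MPI_COMM_WORLD, (wrank == 2) ? 1 : 0, wrank, &local_comm);

  /* Bridge communicator containing both leaders (world ranks 0 and 2);
   * world rank 1 gets MPI_COMM_NULL, which is fine because peer_comm and
   * remote_leader are significant only at the local leader. */
  MPI_Comm_split(MPI_COMM_WORLD,
                 (wrank == 0 || wrank == 2) ? 0 : MPI_UNDEFINED,
                 wrank, &bridge_comm);

  /* In bridge_comm, world rank 0 has rank 0 and world rank 2 has rank 1. */
  remote_leader = (wrank == 2) ? 0 : 1;
  MPI_Intercomm_create(local_comm, 0, bridge_comm, remote_leader,
                       42, &inter_comm);
  printf("world rank %d: intercommunicator created\n", wrank);

  MPI_Finalize();
  return 0;
}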
>>>> 
>>>> [picture attached to the original mail: step 3 builds the intercomm
>>>> between ab and c, using ac as the bridge communicator]
>>>> 
>>>> On Jun 1, 2011, at 05:00 , Frédéric Feyel wrote:
>>>> 
>>>>> Hello,
>>>>> 
>>>>> I have a problem using MPI_Intercomm_create.
>>>>> 
>>>>> I have 5 tasks, let's say T0, T1, T2, T3, T4, resulting from two
>>>>> spawn operations by T0.
>>>>> 
>>>>> So I have two intra-communicators:
>>>>> 
>>>>> intra0 contains: T0, T1, T2
>>>>> intra1 contains: T0, T3, T4
>>>>> 
>>>>> My goal is to collectively build a single intra-communicator
>>>>> containing T0, T1, T2, T3, T4.
>>>>> 
>>>>> I tried to do it using MPI_Intercomm_create and
>>>>> MPI_Intercomm_merge calls, but without success (I always get MPI
>>>>> internal errors).
>>>>> 
>>>>> What I am doing:
>>>>> 
>>>>> on T0: *******
>>>>> 
>>>>> MPI_Intercomm_create(intra0,0,intra1,0,1,&new_com)
>>>>> 
>>>>> on T1 and T2: **************
>>>>> 
>>>>> MPI_Intercomm_create(intra0,0,MPI_COMM_WORLD,0,1,&new_com)
>>>>> 
>>>>> on T3 and T4: **************
>>>>> 
>>>>> MPI_Intercomm_create(intra1,0,MPI_COMM_WORLD,0,1,&new_com)
>>>>> 
>>>>> 
>>>>> I'm certainly missing something. Could anybody help me solve
>>>>> this problem?
>>>>> 
>>>>> Best regards,
>>>>> 
>>>>> Frédéric.
>>>>> 
>>>>> PS: of course I did an extensive web search without finding
>>>>> anything useful about my problem.
>>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> -- 
>>> Edgar Gabriel
>>> Assistant Professor
>>> Parallel Software Technologies Lab      http://pstl.cs.uh.edu
>>> Department of Computer Science, University of Houston
>>> Philip G. Hoffman Hall, Room 524, Houston, TX-77204, USA
>>> Tel: +1 (713) 743-3857    Fax: +1 (713) 743-3335
>>> 
>> 
>> 
#include <stdio.h>
#include <stdlib.h>   /* for malloc */
#include <mpi.h>

int main(int argc, char **argv)
{
  int initialized;
  MPI_Comm boss_comm;

  MPI_Initialized(&initialized);
  if (!initialized) {
    MPI_Init(&argc,&argv);

    MPI_Comm_get_parent(&boss_comm);
    if (boss_comm != MPI_COMM_NULL) {
      // inside the spawned tasks
      int w_size,my_rank;
      MPI_Comm comm,intercom;

      MPI_Comm_size(MPI_COMM_WORLD,&w_size);
      MPI_Comm_rank(MPI_COMM_WORLD,&my_rank);
      printf("Slave task started, world size=%d my rank=%d\n",w_size,my_rank);

      printf("Slave task, merging the communicator...\n");
      MPI_Intercomm_merge(boss_comm,1,&comm);

      MPI_Comm_size(comm,&w_size);
      MPI_Comm_rank(comm,&my_rank);
      printf("Slave task after the merge, world size=%d my 
rank=%d\n",w_size,my_rank);

      printf("Slave, creating an intercom\n");

      int local_leader;

      // Receive from rank 0 of "comm" (the master) the rank to use as the
      // local leader of this group.
      MPI_Bcast(&local_leader,1,MPI_INT,0,comm);
      printf("Slave, local leader=%d\n",local_leader);
      MPI_Intercomm_create(comm, local_leader,  // local comm, local leader
                           MPI_COMM_WORLD, 0,   // peer comm, remote leader:
                                                // significant only at the local leader
                           100, &intercom);

    } else {

      // the first task
      // argv for the spawned programs: empty, NULL-terminated
      char **args = malloc(sizeof(char*)); args[0] = NULL;
      MPI_Comm intercomm_1,intercomm_2,comm_1,comm_2,full_comm1,full_comm2;
      MPI_Comm intercomm_full;

      printf("Master task: first spawn\n");
      MPI_Comm_dup(MPI_COMM_SELF,&comm_1);
      MPI_Comm_spawn("spawn-example", args, 2, MPI_INFO_NULL, 0, comm_1,
                     &intercomm_1, MPI_ERRCODES_IGNORE);

      printf("Master task: second spawn\n");
      MPI_Comm_dup(MPI_COMM_SELF,&comm_2);
      MPI_Comm_spawn("spawn-example", args, 2, MPI_INFO_NULL, 0, comm_2,
                     &intercomm_2, MPI_ERRCODES_IGNORE);

      printf("Master task: the two groups have been spawned\n");
      printf("Master task: merging the communicators...\n");

      printf("Master task: merging the first communicator...\n");
      MPI_Intercomm_merge(intercomm_1,0,&full_comm1);

      printf("Master task: merging the second communicator...\n");
      MPI_Intercomm_merge(intercomm_2,0,&full_comm2);

      printf("Master task: creating an intercom...\n");
      // Create an intercommunicator from two separate group communicators.
      // Those two communicators must have at least one task in common (the
      // "leader"), whose rank must be known on each side.
      // Here the leader is the master task, whose rank is zero by construction.

      int local_leader;

      // Designate the local leader of each group and broadcast it from rank 0:
      // rank 0 (the master itself) in full_comm1, rank 1 (a slave) in full_comm2.
      local_leader=0; MPI_Bcast(&local_leader,1,MPI_INT,0,full_comm1);
      local_leader=1; MPI_Bcast(&local_leader,1,MPI_INT,0,full_comm2);

      MPI_Intercomm_create(full_comm1,0, // current comm , local leader
                           full_comm2,1, // peer comm , remote leader
                           100 , &intercomm_full);


    }

  }


  MPI_Finalize();
  return 0;
}
