Hi there.

I am facing a very strange problem when calling MPI_Barrier on an
inter-communicator after the sequence of operations described below:

1) I start a server with mpirun.
2) The server spawns 2 copies of a client using MPI_Comm_spawn, creating an
inter-communicator between the two groups: the server group with 1 process
(let's call it A) and the client group with 2 processes (group B).
3) After that, I need to detach one of the processes (rank 0) in group B
from the inter-communicator AB. To do that, I perform the following steps:

Server side:
        .....
        tmp_inter_comm = client_comm.Create ( client_comm.Get_group ( ) );
        client_comm.Free ( );
        client_comm = tmp_inter_comm;
        .....
        client_comm.Barrier();
        .....

Client side:
        ....
        rank = 0;
        tmp_inter_comm = server_comm.Create (
            server_comm.Get_group ( ).Excl ( 1, &rank ) );
        server_comm.Free ( );
        server_comm = tmp_inter_comm;
        .....
        if (server_comm != MPI::COMM_NULL)
            server_comm.Barrier();


The problem: everything works fine until the call to Barrier. At that
point, the server exits the barrier, but the client in group B does
not. Note that only one process remains in B, because I used Excl to
remove one process (rank 0) from the original group.
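
To make the intended membership explicit, here is a minimal sketch in plain
C++ (no MPI calls) that models which ranks of the original group B should
remain after the Excl call. The helper name excl_ranks is hypothetical and
only for illustration; it is not part of the MPI API, and it says nothing
about how the library builds the new inter-communicator.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper (not an MPI call): models the membership effect of
// MPI::Group::Excl by returning the ranks of a group of the given size
// that remain after removing the listed ranks, preserving their order.
std::vector<int> excl_ranks(int group_size, const std::vector<int>& excluded) {
    std::vector<int> remaining;
    for (int r = 0; r < group_size; ++r) {
        // Keep rank r only if it is not in the exclusion list.
        if (std::find(excluded.begin(), excluded.end(), r) == excluded.end()) {
            remaining.push_back(r);
        }
    }
    return remaining;
}
```

For group B with 2 processes and rank 0 excluded, excl_ranks(2, {0}) leaves
only the original rank 1, which is the single client I expect to take part
in the Barrier together with the server.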

P.S.: This occurs with Open MPI version 1.5.4 and the C++ API.

I am very concerned about this problem because this solution plays a very
important role in my master's thesis.

Is this an Open MPI problem, or am I doing something wrong?

Thanks in advance

Rodrigo Oliveira
