<laugh> I never argue the standard with George, so I'll take his word for it.

I tried your program and it worked just fine for me without the sleep. However,
I still think there is something wrong in it: adjusting the number of
processes caused it to hang, for one. I'm afraid I don't have time to debug it
further; I can only suggest you take the code I sent you as a working example
to use in your debugging.


On Aug 31, 2014, at 11:02 PM, George Bosilca <bosi...@icl.utk.edu> wrote:

> Based on the MPI standard (MPI 3.0, section 10.5.4, page 399), there is no
> need to disconnect the child processes from the parent in order to cleanly
> finalize. From this perspective the original example is correct, but
> sub-optimal: the parent processes calling MPI_Finalize might block until all
> connected processes (in this case, all the child processes) have called
> MPI_Finalize. To be precise, the disconnect has a single role: to divide the
> application into separate groups of connected processes in order to prevent
> error propagation (from MPI_Abort, for example).
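> 
> To make that concrete, here is a minimal child-side sketch (an illustration,
> not text from the standard): once the child disconnects, it sits in its own
> connected group, and the parent's MPI_Finalize no longer has to wait for it.
> 
>     MPI_Comm parent;
>     MPI_Comm_get_parent(&parent);
>     if (parent != MPI_COMM_NULL) {
>         /* ... exchange messages with the parent ... */
>         MPI_Comm_disconnect(&parent);  /* split off into a separate connected group */
>     }
>     MPI_Finalize();  /* no longer tied to the parent's finalize */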
> 
>   George.
> 
> 
> 
> On Mon, Sep 1, 2014 at 12:58 AM, Ralph Castain <r...@open-mpi.org> wrote:
> You need to disconnect the parent/child from each other prior to finalizing - 
> see the attached example
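> 
> (In case the attachment doesn't come through in the archive, here is a
> minimal sketch of the pattern, reconstructed from the description above
> rather than taken from the attached file: both sides disconnect the
> intercommunicator before finalizing.)
> 
>     #include <mpi.h>
> 
>     int main(int argc, char *argv[])
>     {
>         MPI_Comm parent;
> 
>         MPI_Init(&argc, &argv);
>         MPI_Comm_get_parent(&parent);
> 
>         if (parent != MPI_COMM_NULL) {
>             /* child: synchronize with the parent, then drop the connection */
>             MPI_Barrier(parent);
>             MPI_Comm_disconnect(&parent);
>         } else {
>             /* parent: spawn one child, synchronize, then drop the connection */
>             MPI_Comm child;
>             MPI_Comm_spawn(argv[0], MPI_ARGV_NULL, 1, MPI_INFO_NULL, 0,
>                            MPI_COMM_SELF, &child, MPI_ERRCODES_IGNORE);
>             MPI_Barrier(child);
>             MPI_Comm_disconnect(&child);
>         }
> 
>         /* each side is now in its own connected group, so neither side's
>            MPI_Finalize can block waiting for the other */
>         MPI_Finalize();
>         return 0;
>     }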
> 
> 
> 
> 
> On Aug 31, 2014, at 9:44 PM, Roy <open...@jsp.selfip.org> wrote:
> 
> > Hi all,
> >
> > I'm using MPI_Comm_spawn to start a new child process.
> > I found that if the parent process executes MPI_Finalize before the child
> > process does, the child process dumps core in MPI_Finalize.
> >
> > I couldn't find the correct way to get a clean shutdown of all processes
> > (parent and child).
> > What I did find is that a sleep(2) in the parent process, just before
> > calling MPI_Finalize, gives the child process the CPU cycles to finish its
> > own MPI_Finalize, and only then is there no core dump.
> >
> > Although this resolves the issue, I can't accept it as an acceptable
> > solution.
> >
> > I guess I'm doing something wrong (in implementation or design), so this
> > is why I'm sending this email to the group (and yes, I did check the FAQ
> > and searched the mailing list archive).
> >
> > Here is the entire code to reproduce the issue:
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > #include <stdio.h>
> > #include <string.h>
> > #include <unistd.h>
> > #include <mpi.h>
> > #include <stdlib.h>
> >
> > int main(int argc, char* argv[]){
> >       int  my_rank;          /* rank of process */
> >       int  p;                /* number of processes */
> >       int  source;           /* loop index over spawned ranks */
> >       int  tag = 0;          /* tag for messages */
> >       char message[100];     /* storage for message */
> >       MPI_Status status;     /* return status for receive */
> >
> >       /* start up MPI */
> >       MPI_Init(&argc, &argv);
> >
> >       /* find out process rank */
> >       MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
> >       fprintf(stderr, "My rank is : %d\n", my_rank);
> >
> >       /* find out number of processes */
> >       MPI_Comm_size(MPI_COMM_WORLD, &p);
> >
> >       MPI_Comm parent;
> >       MPI_Comm_get_parent(&parent);
> >
> >       if (parent != MPI_COMM_NULL){
> >               /* child: receive the greeting sent by the parent */
> >               MPI_Recv(message, 100, MPI_CHAR, 0, tag, parent, &status);
> >               fprintf(stderr, "Got [%s] from root\n", message);
> >
> >               /* shut down MPI */
> >               MPI_Finalize();
> >       }
> >       else{
> >               printf("Hello MPI World From process 0: Num processes: %d\n", p);
> >
> >               MPI_Comm everyone;
> >               MPI_Comm_spawn("mpitest", MPI_ARGV_NULL, 1, MPI_INFO_NULL, 0,
> >                              MPI_COMM_SELF, &everyone, MPI_ERRCODES_IGNORE);
> >
> >               /* find out number of processes in the new communicator */
> >               MPI_Comm_size(everyone, &p);
> >               fprintf(stderr, "New world size:%d\n", p);
> >
> >               for (source = 0; source < p; source++) {
> >                       sprintf(message,
> >                               "Hello MPI World from root to process %d!",
> >                               source);
> >                       /* use strlen+1 so that '\0' gets transmitted */
> >                       MPI_Send(message, strlen(message)+1, MPI_CHAR,
> >                                source, tag, everyone);
> >               }
> >
> >               /*
> >                * Why does this sleep resolve my core dump issue?
> >                * Is there some timing dependency between the parent and
> >                * the child process? Based on the documentation and the
> >                * FAQ, I could not find an answer to this.
> >                *
> >                * Without the sleep(2), the child process crashes in
> >                * MPI_Finalize with signal 11, Segmentation fault.
> >                */
> >               //sleep(2); /* un-comment this line to sleep and avoid the core dump */
> >
> >               /* shut down MPI */
> >               MPI_Finalize();
> >       }
> >       return 0;
> > }
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > Anyone to the rescue?
> >
> >
> > Thank you,
> > Roy Avidor
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> > Searchable archives: 
> > http://www.open-mpi.org/community/lists/users/2014/09/index.php
> 
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/09/25207.php
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/09/25208.php
