<laugh> I never argue the standard with George, so I'll take his word for it.
I tried your program and it worked just fine for me without the sleep. However, I still think there is something wrong in it. For one, adjusting the number of processes caused it to hang. I'm afraid I don't have time to debug it further; I can only suggest you take the code I sent you as a working example to use in your debugging.

On Aug 31, 2014, at 11:02 PM, George Bosilca <bosi...@icl.utk.edu> wrote:

> Based on the MPI standard (MPI 3.0 section 10.5.4 page 399) there is no need
> to disconnect the child processes from the parent in order to cleanly
> finalize. From this perspective, the original example is correct, but
> sub-optimal, as the parent processes calling MPI_Finalize might block until
> all connected processes (in this case all the child processes) call
> MPI_Finalize. To be more precise, the disconnect has a single role: to
> redivide the application into separate groups of connected processes in order
> to prevent error propagation (such as MPI_Abort).
>
> George.
>
> On Mon, Sep 1, 2014 at 12:58 AM, Ralph Castain <r...@open-mpi.org> wrote:
> You need to disconnect the parent/child from each other prior to finalizing -
> see the attached example
>
> On Aug 31, 2014, at 9:44 PM, Roy <open...@jsp.selfip.org> wrote:
>
> > Hi all,
> >
> > I'm using MPI_Comm_spawn to start a new child process.
> > I found that if the parent process executes MPI_Finalize before the child
> > process, the child process core dumps on MPI_Finalize.
> >
> > I couldn't find the correct way to have a clean shutdown of all processes
> > ( parent and child ).
> > What I found is that sleep(2) in the parent process, just before
> > calling MPI_Finalize, gives the child process the CPU cycles to finish
> > its own MPI_Finalize, and only then is there no core dump.
> >
> > Although this resolves the issue, I can't accept this as an acceptable solution.
> >
> > I guess I'm doing something wrong ( implementation or design ), so this is
> > why I'm sending this email to the group ( and yes, I did check the FAQ,
> > and did some searching in the distribution list archive ).
> >
> > Here is the entire code to reproduce the issue :
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > #include <stdio.h>
> > #include <string.h>
> > #include <unistd.h>
> > #include <mpi.h>
> > #include <stdlib.h>
> >
> > int main(int argc, char* argv[]){
> >     int my_rank;        /* rank of process */
> >     int p;              /* number of processes */
> >     int source;         /* rank of sender */
> >     int dest;           /* rank of receiver */
> >     int tag=0;          /* tag for messages */
> >     char message[100];  /* storage for message */
> >     MPI_Status status ; /* return status for receive */
> >
> >     /* start up MPI */
> >
> >     MPI_Init(&argc, &argv);
> >
> >     /* find out process rank */
> >     MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
> >     fprintf(stderr,"My rank is : %d\n",my_rank);
> >     /* find out number of processes */
> >     MPI_Comm_size(MPI_COMM_WORLD, &p);
> >
> >     MPI_Comm parent;
> >     MPI_Comm_get_parent(&parent);
> >
> >     if ( parent != MPI_COMM_NULL){
> >         /* create message */
> >         dest = 0;
> >         /* use strlen+1 so that '\0' get transmitted */
> >
> >         MPI_Recv(message, 100, MPI_CHAR, 0, tag,parent, &status);
> >         fprintf(stderr,"Got [%s] from root\n",message);
> >         /* shut down MPI */
> >         MPI_Finalize();
> >
> >     }
> >     else{
> >         printf("Hello MPI World From process 0: Num processes: %d\n",p);
> >         MPI_Comm everyone;
> >         MPI_Comm_spawn("mpitest",MPI_ARGV_NULL,1,MPI_INFO_NULL,0,
> >                        MPI_COMM_SELF,&everyone,
> >                        MPI_ERRCODES_IGNORE);
> >         /* find out number of processes */
> >         MPI_Comm_size(everyone, &p);
> >         fprintf(stderr,"New world size:%d\n",p);
> >         for (source = 0; source < p; source++) {
> >             sprintf(message, "Hello MPI World from root to process %d!", source);
> >             MPI_Send(message, strlen(message)+1, MPI_CHAR,source,
> >                      tag, everyone);
> >         }
> >
> >         /**
> >          * Why does this sleep resolve my core dump issues ?
> >          * Any timing between the parent to child process ?
> >          * Based on the document, and FAQ, I couldn't find an answer for
> >          * this issue.
> >          *
> >          * If you comment out the sleep(2), the child process will crash on the
> >          * MPI_Finalize with signal 11, Segmentation fault.
> >          */
> >         //sleep(2); //un-comment this line to have the sleep, and avoid the core dumps.
> >
> >         /* shut down MPI */
> >         MPI_Finalize();
> >
> >     }
> >     return 0;
> > }
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > Anyone for the rescue ?
> >
> > Thank you,
> > Roy Avidor
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> > Searchable archives:
> > http://www.open-mpi.org/community/lists/users/2014/09/index.php
>
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/09/25207.php
>
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/09/25208.php