Hi all,

I'm using MPI_Comm_spawn to start a new child process. I found that if the parent process executes MPI_Finalize before the child process does, the child process dumps core in MPI_Finalize.
I couldn't find the correct way to get a clean shutdown of all processes (parent and child). What I found is that a sleep(2) in the parent process, just before calling MPI_Finalize, gives the child process time to finish its own MPI_Finalize, and only then is there no core dump.

Although this resolves the issue, I can't accept it as a solution. I guess I'm doing something wrong (implementation or design), which is why I'm sending this email to the group (and yes, I did check the FAQ and searched the mailing list archive).

Here is the entire code to reproduce the issue:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char* argv[]){
    int my_rank;        /* rank of process */
    int p;              /* number of processes */
    int source;         /* rank of sender */
    int dest;           /* rank of receiver */
    int tag=0;          /* tag for messages */
    char message[100];  /* storage for message */
    MPI_Status status ; /* return status for receive */

    /* start up MPI */
    MPI_Init(&argc, &argv);

    /* find out process rank */
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    fprintf(stderr,"My rank is : %d\n",my_rank);

    /* find out number of processes */
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    MPI_Comm parent;
    MPI_Comm_get_parent(&parent);

    if ( parent != MPI_COMM_NULL){
        /* create message */
        dest = 0;
        /* use strlen+1 so that '\0' get transmitted */
        MPI_Recv(message, 100, MPI_CHAR, 0, tag,parent, &status);
        fprintf(stderr,"Got [%s] from root\n",message);

        /* shut down MPI */
        MPI_Finalize();
    }
    else{
        printf("Hello MPI World From process 0: Num processes: %d\n",p);

        MPI_Comm everyone;
        MPI_Comm_spawn("mpitest",MPI_ARGV_NULL,1,MPI_INFO_NULL,0,
                       MPI_COMM_SELF,&everyone, MPI_ERRCODES_IGNORE);

        /* find out number of processes */
        MPI_Comm_size(everyone, &p);
        fprintf(stderr,"New world size:%d\n",p);

        for (source = 0; source < p; source++) {
            sprintf(message, "Hello MPI World from root to process %d!", source);
            MPI_Send(message, strlen(message)+1, MPI_CHAR,source, tag, everyone);
        }

        /**
         * Why does this sleep resolve my core dump issues?
         * Is there a timing dependency between the parent and child processes?
         * Based on the documentation and FAQ, I could not find an answer to this issue.
         *
         * If you comment out the sleep(2), the child process will crash in MPI_Finalize with
         * signal 11, Segmentation fault.
         */
        //sleep(2); //un-comment this line to have the sleep, and avoid the core dumps.

        /* shut down MPI */
        MPI_Finalize();
    }
    return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Anyone to the rescue?

Thank you,
Roy Avidor