Hi Josh/All,

I just tested a simple C application with BLCR and it worked fine.

##########################################
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <limits.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <signal.h>
#include <fcntl.h>
#include <unistd.h>

char * getprocessid()
{
    FILE * read_fp;
    char buffer[BUFSIZ + 1];
    int chars_read;
    char * buffer_data="12345";
    memset(buffer, '\0', sizeof(buffer));
    read_fp = popen("uname -a", "r");
    /*
    ...
    */
    return buffer_data;
}

int main(int argc, char ** argv)
{
    int rank;
    int size;
    char * thedata;
    int n=0;
    thedata=getprocessid();
    printf(" the data is %s", thedata);
    while( n <10)
    {
        printf("value is %d\n", n);
        n++;
        sleep(1);
    }
    printf("bye\n");
}

jean@sun32:/tmp$ cr_run ./pipetest3 &
[1] 31807
jean@sun32:~$ the data is 12345value is 0
value is 1
value is 2
...
value is 9
bye

jean@sun32:/tmp$ cr_checkpoint 31807
jean@sun32:/tmp$ cr_restart context.31807
value is 7
value is 8
value is 9
bye
##############################################

It looks like it's more to do with Open MPI. Any ideas from your side?

Thank you.

Kind regards,

Jean.

--- On Mon, 29/3/10, Josh Hursey <jjhur...@open-mpi.org> wrote:

From: Josh Hursey <jjhur...@open-mpi.org>
Subject: Re: [OMPI users] Segmentation fault (11)
To: "Open MPI Users" <us...@open-mpi.org>
List-Post: users@lists.open-mpi.org
Date: Monday, 29 March, 2010, 16:08

I wonder if this is a bug with BLCR (since the segv stack is in the BLCR thread). Can you try a non-MPI version of this application that uses popen(), and see if BLCR properly checkpoints/restarts it?

If so, we can start to see what Open MPI might be doing to confuse things, but I suspect that this might be a bug with BLCR. Either way let us know what you find out.

Cheers,
Josh

On Mar 27, 2010, at 6:17 AM, jody wrote:

> I'm not sure if this is the cause of your problems:
> You define the constant BUFFER_SIZE, but in the code you use a constant
> called BUFSIZ...
> Jody
>
>
> On Fri, Mar 26, 2010 at 10:29 PM, Jean Potsam <jeanpot...@yahoo.co.uk> wrote:
> Dear All,
> I am having a problem with openmpi.
> I have installed openmpi >1.4 and blcr 0.8.1
>
> I have written a small MPI application as follows:
>
> #######################
> #include <unistd.h>
> #include <stdlib.h>
> #include <stdio.h>
> #include <string.h>
> #include <fcntl.h>
> #include <limits.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <mpi.h>
> #include <signal.h>
> #include <fcntl.h>
> #include <unistd.h>
>
> #define BUFFER_SIZE PIPE_BUF
>
> char * getprocessid()
> {
>     FILE * read_fp;
>     char buffer[BUFSIZ + 1];
>     int chars_read;
>     char * buffer_data="12345";
>     memset(buffer, '\0', sizeof(buffer));
>     read_fp = popen("uname -a", "r");
>     /*
>     ...
>     */
>     return buffer_data;
> }
>
> int main(int argc, char ** argv)
> {
>     MPI_Status status;
>     int rank;
>     int size;
>     char * thedata;
>     MPI_Init(&argc, &argv);
>     MPI_Comm_size(MPI_COMM_WORLD,&size);
>     MPI_Comm_rank(MPI_COMM_WORLD,&rank);
>     thedata=getprocessid();
>     printf(" the data is %s", thedata);
>     MPI_Finalize();
> }
> ############################
>
> I get the following result:
>
> #######################
> jean@sunn32:~$ mpicc pipetest2.c -o pipetest2
> jean@sunn32:~$ mpirun -np 1 -am ft-enable-cr -mca btl ^openib pipetest2
> [sun32:19211] *** Process received signal ***
> [sun32:19211] Signal: Segmentation fault (11)
> [sun32:19211] Signal code: Address not mapped (1)
> [sun32:19211] Failing at address: 0x4
> [sun32:19211] [ 0] [0xb7f3c40c]
> [sun32:19211] [ 1] /lib/libc.so.6(cfree+0x3b) [0xb796868b]
> [sun32:19211] [ 2] /usr/local/blcr/lib/libcr.so.0(cri_info_free+0x2a) [0xb7a5925a]
> [sun32:19211] [ 3] /usr/local/blcr/lib/libcr.so.0 [0xb7a5ac72]
> [sun32:19211] [ 4] /lib/libc.so.6(__libc_fork+0x186) [0xb7991266]
> [sun32:19211] [ 5] /lib/libc.so.6(_IO_proc_open+0x7e) [0xb7958b6e]
> [sun32:19211] [ 6] /lib/libc.so.6(popen+0x6c) [0xb7958dfc]
> [sun32:19211] [ 7] pipetest2(getprocessid+0x42) [0x8048836]
> [sun32:19211] [ 8] pipetest2(main+0x4d) [0x8048897]
> [sun32:19211] [ 9] /lib/libc.so.6(__libc_start_main+0xe5) [0xb7912455]
> [sun32:19211] [10] pipetest2 [0x8048761]
> [sun32:19211] *** End of error message ***
> #####################################################
>
>
> However, if I compile the application using gcc, it works fine. The problem
> arises with:
>     read_fp = popen("uname -a", "r");
>
> Does anyone have an idea how to resolve this problem?
>
> Many thanks
>
> Jean
>
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
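
A non-MPI popen() test of the kind discussed in this thread could also read the command's output and close the pipe with pclose(), so that the forked child is reaped before the process is checkpointed. The following is only a minimal sketch under that assumption: the helper name read_first_line is hypothetical, and it is not a reconstruction of the elided /* ... */ section of the posted code. Note that BUFSIZ is a standard macro from <stdio.h>, which is why the posted code compiles even though BUFFER_SIZE is never used.

```c
#define _POSIX_C_SOURCE 200809L   /* expose popen()/pclose() in strict modes */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the original post): run `cmd` through
 * popen(), copy the first line of its output into `out` without the
 * trailing newline, and close the pipe with pclose().
 * Returns 0 on success and -1 if popen() or pclose() fails.
 */
int read_first_line(const char *cmd, char *out, size_t out_len)
{
    FILE *fp;

    assert(out != NULL && out_len > 0);
    out[0] = '\0';

    fp = popen(cmd, "r");              /* forks a shell, as in the post */
    if (fp == NULL)
        return -1;

    /* fgets() stops after one line or out_len - 1 characters. */
    if (fgets(out, (int)out_len, fp) != NULL)
        out[strcspn(out, "\n")] = '\0';

    /* pclose() waits for the child; -1 means the wait itself failed. */
    return pclose(fp) == -1 ? -1 : 0;
}
```

In a test program, main() could declare char buf[BUFSIZ], call read_first_line("uname -a", buf, sizeof buf), print the result, and then loop with sleep(1) as in the posted example, so that there is time to run cr_checkpoint against it.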