I wonder if this is a bug with BLCR (since the segv stack is in the
BLCR thread). Can you try a non-MPI version of this application that
uses popen(), and see if BLCR properly checkpoints/restarts it?
If so, we can start to see what Open MPI might be doing to confuse
things, but I suspect that this might be a bug with BLCR. Either way
let us know what you find out.
Cheers,
Josh
On Mar 27, 2010, at 6:17 AM, jody wrote:
I'm not sure if this is the cause of your problems:
You define the constant BUFFER_SIZE, but in the code you use a
constant called BUFSIZ...
Jody
On Fri, Mar 26, 2010 at 10:29 PM, Jean Potsam
<jeanpot...@yahoo.co.uk> wrote:
Dear All,
I am having a problem with Open MPI. I have installed
Open MPI 1.4 and BLCR 0.8.1.
I have written a small MPI application, shown below:
#######################
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <limits.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <mpi.h>
#include<signal.h>
#include <fcntl.h>
#include <unistd.h>
#define BUFFER_SIZE PIPE_BUF
char * getprocessid()
{
    FILE * read_fp;
    char buffer[BUFSIZ + 1];
    int chars_read;
    char * buffer_data="12345";
    memset(buffer, '\0', sizeof(buffer));
    read_fp = popen("uname -a", "r");
    /*
    ...
    */
    return buffer_data;
}
int main(int argc, char ** argv)
{
    MPI_Status status;
    int rank;
    int size;
    char * thedata;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD,&size);
    MPI_Comm_rank(MPI_COMM_WORLD,&rank);
    thedata=getprocessid();
    printf(" the data is %s", thedata);
    MPI_Finalize();
}
############################
I get the following result:
#######################
jean@sunn32:~$ mpicc pipetest2.c -o pipetest2
jean@sunn32:~$ mpirun -np 1 -am ft-enable-cr -mca btl ^openib pipetest2
[sun32:19211] *** Process received signal ***
[sun32:19211] Signal: Segmentation fault (11)
[sun32:19211] Signal code: Address not mapped (1)
[sun32:19211] Failing at address: 0x4
[sun32:19211] [ 0] [0xb7f3c40c]
[sun32:19211] [ 1] /lib/libc.so.6(cfree+0x3b) [0xb796868b]
[sun32:19211] [ 2] /usr/local/blcr/lib/libcr.so.0(cri_info_free+0x2a) [0xb7a5925a]
[sun32:19211] [ 3] /usr/local/blcr/lib/libcr.so.0 [0xb7a5ac72]
[sun32:19211] [ 4] /lib/libc.so.6(__libc_fork+0x186) [0xb7991266]
[sun32:19211] [ 5] /lib/libc.so.6(_IO_proc_open+0x7e) [0xb7958b6e]
[sun32:19211] [ 6] /lib/libc.so.6(popen+0x6c) [0xb7958dfc]
[sun32:19211] [ 7] pipetest2(getprocessid+0x42) [0x8048836]
[sun32:19211] [ 8] pipetest2(main+0x4d) [0x8048897]
[sun32:19211] [ 9] /lib/libc.so.6(__libc_start_main+0xe5) [0xb7912455]
[sun32:19211] [10] pipetest2 [0x8048761]
[sun32:19211] *** End of error message ***
#####################################################
However, if I compile the application using gcc, it works fine. The
problem arises with:
read_fp = popen("uname -a", "r");
Does anyone have an idea how to resolve this problem?
Many thanks
Jean
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users