I wonder if this is a bug with BLCR (since the segv stack is in the BLCR thread). Can you try a non-MPI version of this application that uses popen(), and see if BLCR properly checkpoints/restarts it?

If so, we can start to see what Open MPI might be doing to confuse things, but I suspect that this might be a bug with BLCR. Either way, let us know what you find out.

Cheers,
Josh

On Mar 27, 2010, at 6:17 AM, jody wrote:

I'm not sure if this is the cause of your problems:
You define the constant BUFFER_SIZE, but in the code you use a constant called BUFSIZ...
Jody


On Fri, Mar 26, 2010 at 10:29 PM, Jean Potsam <jeanpot...@yahoo.co.uk> wrote:
Dear All,
I am having a problem with Open MPI. I have installed Open MPI 1.4 and BLCR 0.8.1.

I have written a small MPI application as follows:

#######################
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <limits.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <mpi.h>
#include <signal.h>

#define BUFFER_SIZE PIPE_BUF

char *getprocessid()
{
    FILE *read_fp;
    char buffer[BUFSIZ + 1];
    int chars_read;
    char *buffer_data = "12345";

    memset(buffer, '\0', sizeof(buffer));
    read_fp = popen("uname -a", "r");
    /*
      ...
    */
    return buffer_data;
}

int main(int argc, char **argv)
{
    MPI_Status status;
    int rank;
    int size;
    char *thedata;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    thedata = getprocessid();
    printf(" the data is %s", thedata);
    MPI_Finalize();
}
############################

I get the following result:

#######################
jean@sunn32:~$ mpicc pipetest2.c -o pipetest2
jean@sunn32:~$ mpirun -np 1 -am ft-enable-cr -mca btl ^openib pipetest2
[sun32:19211] *** Process received signal ***
[sun32:19211] Signal: Segmentation fault (11)
[sun32:19211] Signal code: Address not mapped (1)
[sun32:19211] Failing at address: 0x4
[sun32:19211] [ 0] [0xb7f3c40c]
[sun32:19211] [ 1] /lib/libc.so.6(cfree+0x3b) [0xb796868b]
[sun32:19211] [ 2] /usr/local/blcr/lib/libcr.so.0(cri_info_free +0x2a) [0xb7a5925a]
[sun32:19211] [ 3] /usr/local/blcr/lib/libcr.so.0 [0xb7a5ac72]
[sun32:19211] [ 4] /lib/libc.so.6(__libc_fork+0x186) [0xb7991266]
[sun32:19211] [ 5] /lib/libc.so.6(_IO_proc_open+0x7e) [0xb7958b6e]
[sun32:19211] [ 6] /lib/libc.so.6(popen+0x6c) [0xb7958dfc]
[sun32:19211] [ 7] pipetest2(getprocessid+0x42) [0x8048836]
[sun32:19211] [ 8] pipetest2(main+0x4d) [0x8048897]
[sun32:19211] [ 9] /lib/libc.so.6(__libc_start_main+0xe5) [0xb7912455]
[sun32:19211] [10] pipetest2 [0x8048761]
[sun32:19211] *** End of error message ***
#####################################################


However, if I compile the application using gcc, it works fine. The problem arises with:
  read_fp = popen("uname -a", "r");

Does anyone have an idea how to resolve this problem?

Many thanks

Jean


_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
