On Nov 6, 2009, at 7:59 AM, Kritiraj Sajadah wrote:

> Hi Everyone,
> I have installed Open MPI 1.3 and BLCR 0.81 on my laptop (single 
> processor).
> 
> I am trying to checkpoint a small test application:
> 
> ###########
> 
> #include <mpi.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <signal.h>
> 
> int main(int argc, char **argv)
> {
>     int rank, size;
>     MPI_Init(&argc, &argv);
>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>     MPI_Comm_size(MPI_COMM_WORLD, &size);
>     printf("I am processor no %d of a total of %d procs \n", rank, size);
>     system("sleep 10");
>     printf("I am processor no %d of a total of %d procs \n", rank, size);
>     system("sleep 10");
>     printf("I am processor no %d of a total of %d procs \n", rank, size);
>     system("sleep 10");
>     printf("mpisleep bye \n");
>     MPI_Finalize();
>     return 0;
> }
> ###################
> 
> I compile it as follows:
> 
> mpicc mpisleep.c -o mpisleep
> 
> and I run it as follows:
> 
> mpirun -am ft-enable-cr -np 2 mpisleep
> 
> When I checkpoint it (ompi-checkpoint -v 8118), the checkpoint succeeds, 
> but when I restart it, I get the following:
> 
> I am processor no 0 of a total of 2 procs 
> I am processor no 1 of a total of 2 procs 
> mpisleep bye 
> --------------------------------------------------------------------------
> mpirun noticed that process rank 1 with PID 8118 on node raj-laptop exited on 
> signal 13 (Broken pipe).
> --------------------------------------------------------------------------

Does the behavior change if you remove the 'system()' calls and replace them 
with 'sleep()'? The 'system()' call is shorthand for fork/exec, and fork/exec 
has been known to cause problems when called by an MPI process.
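
To make the difference concrete, here is a rough sketch, assuming a POSIX 
system. The names run_like_system and pause_in_process are made up for 
illustration, and the real 'system()' also adjusts signal handling, which 
this sketch leaves out. The first helper creates a child process the way 
'system()' roughly does; the second stays inside the calling process.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: approximately what system(cmd) does.
 * It forks a child, execs a shell in the child, and waits for it.
 * Returns the raw wait status, or -1 if fork/waitpid fails. */
static int run_like_system(const char *cmd)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                     /* fork failed */
    if (pid == 0) {
        /* Child: replace this process image with a shell running cmd. */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);                    /* only reached if exec failed */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}

/* Hypothetical helper: pause without creating any child process.
 * sleep() returns the unslept seconds when a signal interrupts it,
 * so keep calling it until nothing is left. */
static unsigned pause_in_process(unsigned seconds)
{
    unsigned left = seconds;
    while (left > 0)
        left = sleep(left);
    return left;
}
```

In the test program, that would mean each system("sleep 10"); line becomes 
sleep(10); (or a call like pause_in_process(10)), so the MPI process never 
forks.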

Give that a try and let me know if it helps.

-- Josh

> 
> Any suggestions are very much appreciated.
> 
> Raj
> 
> 
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

