Hi, I am working with MPI.
I have installed Open MPI 1.4.3 with BLCR support included.
I ran a simple MPI application using the following hostfile:

pc1 slots=2 max-slots=2
pc2 slots=2 max-slots=2

Then I ran the application with checkpoint support enabled:
#mpirun --hostfile myhost -np 4 --am ft-enable-cr ./mpi_app

When I took a checkpoint, I got this error:

[pc1:04836] Error: expected_component: PID information unavailable!
--------------------------------------------------------------------------
Error: The local checkpoint contains invalid or incomplete metadata for Process 3411083265.2.
       This usually indicates that the local checkpoint is invalid.
       Check the metadata file (snapshot_meta.data) in the following directory:
         /root/ompi_global_snapshot_4836.ckpt/0/opal_snapshot_2.ckpt
--------------------------------------------------------------------------
[pc1:04836] [[52049,0],0] ORTE_ERROR_LOG: Error in file snapc_full_global.c at line 1054

I would be grateful if anyone could help me.
