The document attached to the Open MPI Wiki discusses all of the MCA
parameters for checkpoint/restart.
http://svn.open-mpi.org/trac/ompi/wiki/ProcessFT_CR
There are two ways to save checkpoint file data. I would suggest that
you set these parameters in your $HOME/.openmpi/mca-params.conf file
so you don't have to pass them to mpirun every time (assuming $HOME is
shared on all machines).
1) If you save to a globally shared directory (e.g., an NFS directory),
you can set the following MCA parameter in mpirun to point to
this location. This overrides the default directory, which is $HOME.
snapc_base_global_snapshot_dir=$HOME/my/ckpt/dir
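As a rough sketch of option 1 on the command line, the parameter can be
passed with mpirun's -mca option. The application name ./my_app, the
process count, and the CKPT_DIR variable are placeholders made up for
illustration; keep whatever checkpoint-related options you already pass
to mpirun.

```shell
# Hypothetical example: ./my_app and -np 4 are placeholders.
CKPT_DIR="$HOME/my/ckpt/dir"   # globally shared (e.g., NFS) directory

# Creating the directory up front is harmless if it already exists.
mkdir -p "$CKPT_DIR"

# Launch only if mpirun is on the PATH; -mca takes a parameter name and value.
if command -v mpirun >/dev/null 2>&1; then
    mpirun -mca snapc_base_global_snapshot_dir "$CKPT_DIR" -np 4 ./my_app
fi
```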
2) You can save to the local disk and have Open MPI transfer the
files from local disk to stable storage in a two-step process. There
are three MCA parameters you will need to set for this.
Set the directory on the local disk where each checkpoint is first
written:
crs_base_snapshot_dir=/tmp
Set the global directory where all of the local checkpoints should be
saved:
snapc_base_global_snapshot_dir=$HOME/my/ckpt/dir
Activate the two-step process:
snapc_base_store_in_place=0
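Putting option 2 together, the three settings could be appended to the
per-user parameter file roughly like this. This is only a sketch: the
MCA_CONF variable is a name introduced here so the target file can be
overridden, and the unquoted here-document lets the shell expand $HOME
when the file is written, so the file ends up holding an absolute path.

```shell
# MCA_CONF is a hypothetical override; the default is the per-user
# parameter file $HOME/.openmpi/mca-params.conf.
MCA_CONF="${MCA_CONF:-$HOME/.openmpi/mca-params.conf}"
mkdir -p "$(dirname "$MCA_CONF")"

# Unquoted EOF: $HOME is expanded now, so the file holds an absolute path.
cat >> "$MCA_CONF" <<EOF
# Step 1: write each local checkpoint to node-local disk
crs_base_snapshot_dir=/tmp
# Step 2: gather the local checkpoints into this global directory
snapc_base_global_snapshot_dir=$HOME/my/ckpt/dir
# Activate the two-step process
snapc_base_store_in_place=0
EOF
```

Because the block appends, running it twice leaves duplicate entries, so
check the file afterwards.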
The C/R User Document on the wiki covers many of these and other
parameters in more detail. I would encourage you to look through
it as well.
Best,
Josh
On Sep 13, 2008, at 7:49 PM, arun dhakne wrote:
Hi,
I have BLCR installed and I am able to dump checkpoints in $HOME
using ompi-checkpoint. I was wondering whether there is some option
so that I would be able to dump the checkpoints to a custom
location, say /tmp?
--
Thanks and Regards,
Arun