Hello,

There are a few things you need to do to build Open MPI with checkpoint/restart support. By default, Open MPI is configured without checkpoint/restart support.
 1) Make sure you have BLCR successfully installed and loaded on your system(s).
 2) Configure Open MPI with the "--with-ft=cr" option, which enables checkpoint/restart fault tolerance.
    Note: you may also have to specify the install directory of BLCR with the "--with-blcr=/path/to/blcr" option.
 3) Run make and make install.
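
As a rough sketch, the configure step above could be wrapped like this. The helper name ompi_cr_configure_args and the example install prefix are only illustrations (assumptions, not part of Open MPI); the two options that matter are the "--with-ft=cr" and "--with-blcr" ones mentioned above, and "--prefix" is the usual configure option for choosing an install location.

```shell
#!/bin/sh
# Sketch only: assemble the configure flags for a checkpoint/restart build.
# ompi_cr_configure_args is a hypothetical helper, not an Open MPI tool.
#   $1: install prefix (your choice)
#   $2: optional BLCR install directory, only needed if configure
#       cannot find BLCR on its own
ompi_cr_configure_args() {
    args="--prefix=$1 --with-ft=cr"
    if [ -n "$2" ]; then
        args="$args --with-blcr=$2"
    fi
    printf '%s\n' "$args"
}

# Intended use, from the top of the Open MPI source tree
# (left as comments so that sourcing this file does not start a build):
#   ./configure $(ompi_cr_configure_args "$HOME/openmpi-cr" /path/to/blcr)
#   make
#   make install
```

Leaving out the second argument drops the "--with-blcr" flag, which should be enough when BLCR is installed somewhere configure already looks.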

The resultant build will have support for checkpoint/restart and the tools (e.g., ompi-checkpoint, ompi-restart) will become available.
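
One quick way to confirm the tools ended up where your shell can see them is a small check like the one below. This is just a sketch: check_cr_tools is a made-up helper name, and it only tests whether the commands are on your PATH, not whether checkpointing actually works.

```shell
#!/bin/sh
# Sketch only: report which of the named commands are visible on PATH.
# check_cr_tools is a hypothetical helper, not shipped with Open MPI.
check_cr_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Returns 0 only if every named command was found.
    return "$missing"
}

# Intended use after "make install":
#   check_cr_tools ompi-checkpoint ompi-restart
```

If a tool is reported as missing even after a build with "--with-ft=cr", check that the bin directory of your install location is on your PATH; otherwise the shell will still say "command not found".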

Looking at the documentation, it doesn't seem to include these steps. I'll fix that later today and post an updated file to the wiki. Sorry about that. :(

Hope this helps,
Josh

On Aug 21, 2007, at 1:09 PM, Hiep Bui Hoang wrote:

Hello,
I'm Hiep, and I'm trying to use the checkpoint/restart feature in Open MPI. I have read the information about this feature in https://svn.open-mpi.org/trac/ompi/wiki/ProcessFT_CR and Open-MPI-FT-CR-Draft-v1.pdf. I built Open MPI from the "trunk", which I got via Subversion, but I don't know how to enable checkpoint/restart fault tolerance in Open MPI.
So I get this error when I try the ompi-checkpoint command:
       bash: ompi-checkpoint: command not found
I want to ask you how to build and use the checkpoint/restart feature in Open MPI.
Please explain in detail, as I'm a new user.
Thanks!
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users