Re: [OMPI users] mca_oob_tcp_peer_try_connect error when checkpoint and restart.

2007-10-03 Thread Hiep Bui Hoang
. -- Only one local snapshot is created, on the computer where I run the mpirun and ompi-checkpoint commands, and after that local snapshot is created the checkpoint terminates with the above error. Could somebody help me solve this error? Thanks. On 10/2/07, Hiep Bui Hoang wrote: > > > Hi

[OMPI users] mca_oob_tcp_peer_try_connect error when checkpoint and restart.

2007-10-01 Thread Hiep Bui Hoang
Hi, I set up Open MPI "trunk_16171" on 3 computers connected over a LAN, set the environment parameters, and configured passwordless ssh for each node. I use Red Hat Enterprise Linux 5. The program I tried is 'send_recv'. I ran my 'send_recv' program successfully on those 3 nodes. And checkpoint/resta

Re: [OMPI users] How to build and use checkpoint/restart fault tolerance in Open MPI.

2007-08-22 Thread Hiep Bui Hoang
later today, and post an updated file to the wiki. > Sorry about that. :( > > Hope this helps, > Josh > > On Aug 21, 2007, at 1:09 PM, Hiep Bui Hoang wrote: > > > Hello, > > I'm Hiep, I'm trying to use the checkpoint/restart feature in Open MPI. > > I

[OMPI users] How to build and use checkpoint/restart fault tolerance in Open MPI.

2007-08-21 Thread Hiep Bui Hoang
Hello, I'm Hiep, and I'm trying to use the checkpoint/restart feature in Open MPI. I have read the information about this feature at https://svn.open-mpi.org/trac/ompi/wiki/ProcessFT_CR and in Open-MPI-FT-CR-Draft-v1.pdf. I built Open MPI from the "trunk", which I got via Subversion. But I don't know how to enable