Not really - the person who wrote that code for his PhD thesis has since become a professor and rarely has time to respond on the mailing list or to maintain the code. So I'm afraid we don't have anyone who knows much about it anymore.
I plan to rework the checkpoint support in upcoming months, but can't say when that will occur.

On Sep 21, 2013, at 7:51 AM, basma a.azeem <basmaabdelaz...@hotmail.com> wrote:

> Any Suggestions
>
>
> From: basmaabdelaz...@hotmail.com
> To: us...@open-mpi.org
> Subject: FT problem
> Date: Wed, 18 Sep 2013 16:42:29 +0200
>
> i am using openmpi-1.6.1
> i need to try checkpoint restart ( self , blcr )
> after i installed openmpi i had the following in my installation folder :
>
> bin\ ompi-checkpoint
> bin\ompi-restart
>
> lib\openmpi\mca_crs_self.la
> lib\openmpi\mca_crs_self.so
> lib\openmpi\mca_crs_blcr.la
> lib\openmpi\mca_crs_blcr.so
>
> although i have:
>
> ompi_info | grep FT
> FT Checkpoint support: yes (checkpoint thread: yes)
>
> ompi_info | grep crs
> MCA crs: none (MCA v2.0, API v2.0, Component v1.6.1)
>
> when i try to use checkpoint it failed:
>
> basma@basma-Satellite-A500:~$ /OpenMP/openmpi-1.6.1/builddir/bin/mpirun -np 3 -am ft-enable-cr /home/basma/NPB3.3/NPB3.3/NPB3.3-OMP/bin/lu.A
>
>
> NAS Parallel Benchmarks (NPB3.3-OMP) - LU Benchmark
>
> Size: 64x 64x 64
> Iterations: 250
> Number of available threads: 4
>
> NAS Parallel Benchmarks (NPB3.3-OMP) - LU Benchmark
>
> Size: 64x 64x 64
> Iterations: 250
> Number of available threads: 4
>
> NAS Parallel Benchmarks (NPB3.3-OMP) - LU Benchmark
>
> Size: 64x 64x 64
> Iterations: 250
> Number of available threads: 4
>
> Time step 1
> Time step 1
> Time step 1
> --------------------------------------------------------------------------
> mpirun noticed that process rank 0 with PID 2917 on node basma-Satellite-A500 exited on signal 10 (User defined signal 1).
> --------------------------------------------------------------------------
> basma@basma-Satellite-A500:~$
>
> this resulted when i run this command from shell 2 :
> basma@basma-Satellite-A500:~$ /OpenMP/openmpi-1.6.1/builddir/bin/ompi-checkpoint 2916
>
> what i did wrong?
>
> thank you