Currently you have to do as Reuti mentioned (use the queuing system,
or create a script). We do have a feature request ticket open for this
feature if you are interested in following the progress:
https://svn.open-mpi.org/trac/ompi/ticket/1961
It has been open for a while, but the feature should
On 23.07.2012 at 10:02, 陈松 wrote:
> How can I create ckpt files regularly? I mean, take a checkpoint every 100
> seconds. Are there any options to do this, or do I have to write a script myself?
Yes, or use a queuing system which supports creation of a checkpoint in fixed
time intervals.
-- Reuti
>
Hi all,
How can I create ckpt files regularly? I mean, take a checkpoint every
100 seconds. Are there any options to do this, or do I have to write a
script myself?
THANKS,
---
CHEN Song
R&D Department
National Supercomputer Center in Tianjin
Binhai New Area, Tianjin, China
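For the "write a script" route suggested in the replies, here is a minimal sketch of a periodic checkpoint driver. It assumes ompi-checkpoint is on the PATH and is invoked with the PID of mpirun, as in the "ompi-checkpoint 20109" call quoted elsewhere in this archive. The names build_command and periodic_checkpoint are hypothetical helpers made up for illustration, not part of Open MPI, and treating a non-zero exit status as "stop" is an assumption.

```python
import subprocess
import time


def build_command(mpirun_pid):
    """Return the argv used to checkpoint the job owned by this mpirun PID."""
    return ["ompi-checkpoint", str(mpirun_pid)]


def periodic_checkpoint(mpirun_pid, interval=100, max_count=None,
                        run=subprocess.call, sleep=time.sleep):
    """Call ompi-checkpoint every `interval` seconds.

    Stops after `max_count` successful checkpoints (None means no limit)
    or as soon as the command returns a non-zero exit status, which is
    assumed here to mean the job is gone or the checkpoint failed.
    Returns the number of successful checkpoints.
    """
    done = 0
    while max_count is None or done < max_count:
        sleep(interval)                       # wait before each checkpoint
        if run(build_command(mpirun_pid)) != 0:
            break                             # give up on the first failure
        done += 1
    return done
```

Calling periodic_checkpoint(20109) would then request a checkpoint of the job started by mpirun PID 20109 every 100 seconds until the command fails. The run and sleep parameters exist only so the loop can be exercised without a real MPI job.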
On Aug 27, 2010, at 3:52 AM, 陈文浩 wrote:
> Dear OMPI Users,
>
> I have installed BLCR (0.8.2) and OpenMPI (1.4.2) successfully, but now I have
> run into a problem when taking a checkpoint.
> I run CG NPB(NPROCS=16, two nodes: blade02 & blade04, CLASS=C, NFS: $HOME &
> /opt are shared)
>
> BLCR configur
Dear OMPI Users,
I have installed BLCR (0.8.2) and OpenMPI (1.4.2) successfully, but now I have
run into a problem when taking a checkpoint.
I run CG NPB(NPROCS=16, two nodes: blade02 & blade04, CLASS=C, NFS: $HOME &
/opt are shared)
BLCR configure: ./configure --prefix=/opt/blcr --enable-static
Open
Well,
as you suggested, I've installed the latest OpenMPI nightly build,
version 1.4a1r19370.
Now the checkpoint procedure works well and the related restart files are
correctly created, but process restart fails. After the restart command, the
process starts but remains frozen, doing nothing, and then dies.
Hello,
Three things...
1) Josh, the main developer for checkpoint/restart, has been away for
a few weeks and has just returned. I suspect he will get unburied from
e-mail in another day or two.
2) The 1.4 (and 1.3) branch is very much under rapid development, and
there will be times
when basic fu
There was a bug that caused ompi-checkpoint not to find the correct
place in the session directory for mpirun's contact file. This was
fixed in r19265, so you should no longer have a problem.
On Aug 20, 2008, at 2:11 AM, Matthias Hovestadt wrote:
Hi Gabriele!
In this case, mpirun works w
Hi Gabriele!
In this case, mpirun works well, but the checkpoint procedure fails:
ompi-checkpoint 20109
[node0316:20134] Error: Unable to get the current working directory
[node0316:20134] [[42404,0],0] ORTE_ERROR_LOG: Not found in file
orte-checkpoint.c at line 395
[node0316:20134] HNP with PI
Dear OpenMPI developers,
I'm testing checkpoint and restart with the OpenMPI 1.4 nightly. The test machine is
an IBM Blade System over InfiniBand with 4 processors on every communication node.
At the moment, I have some problems. My application is a simple
communication ring between processors, with parametric