Josh Hursey wrote:
On Apr 29, 2008, at 12:55 AM, Sharon Brunett wrote:
I'm finding that using ompi-checkpoint on an application which is
very CPU bound takes a very, very long time. For example, trying to
checkpoint a 4- or 8-way Pallas MPI Benchmark application can take
more than an hour. The problem is not where I'm dumping checkpoints
(I've tried local disk and an NFS mount with plenty of space, and CPU
intensive apps checkpoint quickly).
I'm using BLCR_VERSION=0.6.5 and openmpi-1.3a1r18241.
Is this condition common and, if so, are there possibly MCA parameters
which could help?
It depends on how you configured Open MPI with checkpoint/restart.
There are two modes of operation: No threads, and with a checkpoint
thread. They are described a bit more in the Checkpoint/Restart Fault
Tolerance User's Guide on the wiki:
https://svn.open-mpi.org/trac/ompi/wiki/ProcessFT_CR
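In case it helps to see the moving pieces together, the basic usage
looks roughly like this (the application name and PID below are just
placeholders, not taken from your run):

  # launch with checkpoint/restart enabled via the ft-enable-cr AMCA parameter set
  mpirun -np 4 -am ft-enable-cr ./my_app

  # from another shell, checkpoint using the PID of mpirun
  ompi-checkpoint <PID_of_mpirun>

  # later, restart from the global snapshot handle that ompi-checkpoint prints
  ompi-restart ompi_global_snapshot_<PID_of_mpirun>.ckpt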
By default we compile without the checkpoint thread. The restriction
here is that all processes must be in the MPI library in order to make
progress on the global checkpoint. For CPU-intensive applications this
may cause quite a delay in the time to start, and subsequently finish,
a checkpoint. I'm guessing that this is what you are seeing.
If you configure with the checkpoint thread (add '--enable-mpi-threads
--enable-ft-thread' to ./configure), then Open MPI will create a thread
that runs alongside each application process. This thread is fairly
lightweight and will make sure that a checkpoint progresses even when
the process is not in the Open MPI library.
Try enabling the checkpoint thread and see if that helps improve the
checkpoint time.
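As a rough sketch (the prefix and BLCR path below are placeholders for
your local install):

  # rebuild with checkpoint/restart plus the checkpoint thread enabled
  ./configure --prefix=/path/to/install \
      --with-ft=cr --with-blcr=/path/to/blcr \
      --enable-mpi-threads --enable-ft-thread
  make all install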
Josh,
First...please pardon the blunder in my earlier mail: comms-bound apps
are the ones taking a while to checkpoint, not CPU-bound ones. In any
case, I tried configuring with the above two configure options, but
still no luck on improving checkpoint times or getting checkpoints of
larger MPI task runs to complete.
It looks like the checkpointing is just hanging. For example, I can
checkpoint a 2-way comms-bound code (1 task on each of two nodes) OK.
When I ask for a 4-way run on 2 nodes, 30 minutes after issuing
'ompi-checkpoint PID' I only see 1 ckpt directory with data in it!
-bash-2.05b$ pwd
/home/sharon/ompi_global_snapshot_25400.ckpt/0
-bash-2.05b$ ls -l *
opal_snapshot_0.ckpt:
total 0

opal_snapshot_1.ckpt:
total 0

opal_snapshot_2.ckpt:
total 0

opal_snapshot_3.ckpt:
total 1868
-rw------- 1 sharon shc-support 1907476 2008-04-29 10:49 ompi_blcr_context.1850
-rw-r--r-- 1 sharon shc-support      33 2008-04-29 10:49 snapshot_meta.data
The file system getting the checkpoints is local. I've tried /scratch
and others as well.
I can checkpoint some codes (like xhpl) just fine across 8 MPI tasks ( t
nodes), dumping 254M total. Thus, the very long/stuck checkpointing
seems rather application-dependent.
Here's how I configured Open MPI:
./configure --prefix=/nfs/ds01/support/sharon/openmpi-1.3a1r18241
--enable-mpi-threads --enable-ft-thread --with-ft=cr --enable-shared
--enable-mpi-threads=posix --enable-libgcj-multifile
--enable-languages=c,c++,objc,java,f95,ada --enable-java-awt=gtk
--with-mvapi=/usr/mellanox --with-blcr=/opt/blcr
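Also, is ompi_info the right way to double-check that the FT thread
support actually made it into this build? I've been poking at something
like the following (just guessing at what to grep for):

  # look for the blcr checkpointer component and checkpoint-related MCA parameters
  ompi_info | grep -i crs
  ompi_info --param all all | grep -i opal_cr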
Thanks for any further insights you may have.
Sharon