Hi team,
Now I can checkpoint NPB by decreasing the number of processes, like this:
before: mpirun -am ft-enable-cr -np 128 is.C.128
after: mpirun -am ft-enable-cr -np 2 is.C.2
But when I checkpoint HPL, it still hangs (I waited more than
12 hours, but it still hangs the
Hi team,
I have a question about checkpoint/restart in Open MPI, and I hope you can
help.
I can use Open MPI to checkpoint the example program 'hello', which is
provided in the builddir.
But with other applications such as NPB or HPL, when I execute
the command ompi-checkpoin