Let's chat off-list about it - I don't see exactly how this works, but it may
be similar enough.
On Aug 27, 2011, at 8:30 AM, Joshua Hursey wrote:
> There is a 'self' checkpointer (CRS component) that does application level
> checkpointing - exposed at the MPI level. I don't know how differen
There is a 'self' checkpointer (CRS component) that does application level
checkpointing - exposed at the MPI level. I don't know how different what you
are working on is, but maybe something like that could be harnessed. Note that
I have not tested the 'self' checkpointer with the process migra
FWIW: I'm in the process of porting some code from a branch that allows apps to
do on-demand checkpoint/recovery style operations at the app level.
Specifically, it provides the ability to:
* request a "recovery image" - an application-level blob containing state info
required for the app to re
There are some great comments in this thread. Process migration (like
many topics in systems) can get complex fast.
The Open MPI process migration implementation is checkpoint/restart
based (currently using BLCR), and uses an 'eager' style of migration.
This style of migration stops a process comp
Don't know which SSI project you are referring to... I only know the
OpenSSI project, and I was one of the first who subscribed to its
mailing list (since 2001).
http://openssi.org/cgi-bin/view?page=openssi.html
I don't think those OpenSSI clusters are designed for tens of
thousands of nodes, and
Is anything done at the kernel level portable (e.g. to Windows)? It
*can* be, in principle at least (by putting appropriate #ifdef's in
the code), but I am wondering if it is in reality.
Also, in 2005 there was an attempt to implement SSI (Single System
Image) functionality in the then-current 2.6
Srinivas,
There's also Kernel-Level Checkpointing vs. User-Level Checkpointing -
if you can checkpoint an MPI task and restart it on a new node, then
this is also "process migration".
Of course, doing a checkpoint & restart can be slower than pure
in-kernel process migration, but the advantage is
It also depends on what part of migration interests you - are you wanting to
look at the MPI part of the problem (reconnecting MPI transports, ensuring
messages are not lost, etc.) or the RTE part of the problem (where to restart
processes, detecting failures, etc.)?
On Aug 24, 2011, at 7:04 A
Be aware that process migration is a pretty complex issue.
Josh is probably the best one to answer your question directly, but he's out
today.
On Aug 24, 2011, at 5:45 AM, srinivas kundaram wrote:
> I am a final-year grad student looking for a final-year project in OpenMPI. We
> are a group of 4
I am a final-year grad student looking for a final-year project in OpenMPI. We
are a group of 4 students.
I wanted to know how "Process Migration" of MPI processes works in
OpenMPI.
Can anyone suggest ideas for a project related to process migration in
OpenMPI, or on other topics in systems?