Re: [OMPI users] Exit Program Without Calling MPI_Finalize For Special Case

Ralph Castain Thu, 4 Jun 2009 18:01:28 -0400

If it helps, note that Open MPI already includes hooks (and just added some more) to support this area of research. Note that Open MPI does -not- kill your job when a process dies or leaves without calling MPI_Finalize. What it actually does is call an Error Manager (denoted as "errmgr") in the underlying RTE, which then decides what action to take in response to that event.

It is true that the default errmgr which ships with Open MPI releases kills the entire job, but that is by no means a requirement - it is simply the default. We deliberately designed the errmgr to be an MCA framework for exactly this reason - to allow anyone to write their own errmgr component and experiment with alternative fault responses.
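To make the "pluggable policy" idea concrete, here is a toy model in plain C. It is only a sketch of the design described above, not the ORTE errmgr API: the names errmgr_component_t, errmgr_action_t, report_proc_aborted, errmgr_default and errmgr_tolerant are all invented for illustration, and the real component interface lives in the orte/mca/errmgr directory.

```c
/* Toy model of a pluggable error manager. Every name here is
 * hypothetical; this is NOT the actual ORTE errmgr interface. */

typedef enum {
    ACTION_ABORT_JOB,   /* bring down every process in the job */
    ACTION_CONTINUE     /* let the surviving processes keep running */
} errmgr_action_t;

/* A "component" is a named policy: given the rank of a process that
 * died or left without calling MPI_Finalize, pick a response. */
typedef struct {
    const char *name;
    errmgr_action_t (*proc_aborted)(int rank);
} errmgr_component_t;

/* Policy modelled on the shipped default: any loss kills the job. */
errmgr_action_t default_proc_aborted(int rank)
{
    (void)rank;
    return ACTION_ABORT_JOB;
}

/* An experimental policy (assumption for illustration): losing rank 0
 * is fatal, losing any other rank is tolerated. */
errmgr_action_t tolerant_proc_aborted(int rank)
{
    return rank == 0 ? ACTION_ABORT_JOB : ACTION_CONTINUE;
}

const errmgr_component_t errmgr_default  = { "default",  default_proc_aborted };
const errmgr_component_t errmgr_tolerant = { "tolerant", tolerant_proc_aborted };

/* Runtime side: hand the event to whichever component was selected. */
errmgr_action_t report_proc_aborted(const errmgr_component_t *selected, int rank)
{
    return selected->proc_aborted(rank);
}
```

With this model, report_proc_aborted(&errmgr_default, 3) yields ACTION_ABORT_JOB, while report_proc_aborted(&errmgr_tolerant, 3) yields ACTION_CONTINUE. The point is only that the response to a failure is a property of the selected component, not of the runtime itself.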


You currently have two options you can pursue:

1. if you want to use 1.2.8 or 1.3.2 (the latter is a superior platform), you can write your own errmgr component and use it. Look at the orte/mca/errmgr directory and you will see a "base" that contains some common functions for startup, and a "default" that contains the default errmgr component. Either add your own component (see the Open MPI home page for a detailed writeup on how to do this), or modify the default component to suit your needs.

2. if you want to use the developer's trunk, additional capabilities to support FT research were just added to it. First, we implemented an ability to register a callback function in the errmgr so that an application can receive a callback when a specified type of error occurs - and can then take whatever action it desires. Second, we added a new "resilient mapper" component that automatically re-maps failed processes to other available nodes, and then restarts them. You could use these, for example, to write your own version of a "fault tolerant mpiexec" - an example of how to do this will be added to the developer's trunk over the weekend.
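The two trunk features in option 2 (an error callback and a resilient mapper) can be modelled roughly as below. This is a self-contained sketch under loud assumptions: register_callback, report_error, remap_proc, record_failed_rank and the err_type_t values are hypothetical names made up for illustration and do not correspond to the trunk's actual functions, and the "first live node" choice is just one possible remapping rule.

```c
#include <stddef.h>

/* Toy model only: none of these names exist in Open MPI. */

#define MAX_CALLBACKS 8

typedef enum { ERR_PROC_ABORTED, ERR_NODE_LOST } err_type_t;
typedef void (*err_callback_t)(err_type_t type, int rank, void *user_data);

typedef struct {
    err_type_t     type;       /* error type the caller is interested in */
    err_callback_t fn;         /* function to invoke when it occurs */
    void          *user_data;  /* opaque pointer passed back to fn */
} registration_t;

registration_t registry[MAX_CALLBACKS];
size_t num_registered = 0;

/* Register fn for one error type. Returns 0 on success, -1 if fn is
 * NULL or the table is full. */
int register_callback(err_type_t type, err_callback_t fn, void *user_data)
{
    if (fn == NULL || num_registered == MAX_CALLBACKS) {
        return -1;
    }
    registry[num_registered].type = type;
    registry[num_registered].fn = fn;
    registry[num_registered].user_data = user_data;
    num_registered++;
    return 0;
}

/* Deliver an error to every callback registered for that type.
 * Returns the number of callbacks invoked. */
int report_error(err_type_t type, int rank)
{
    int invoked = 0;
    for (size_t i = 0; i < num_registered; i++) {
        if (registry[i].type == type) {
            registry[i].fn(type, rank, registry[i].user_data);
            invoked++;
        }
    }
    return invoked;
}

/* "Resilient mapper" stand-in: return the index of the first node that
 * is alive and is not the node that failed, or -1 if none is left. */
int remap_proc(const int *node_alive, int num_nodes, int failed_node)
{
    for (int i = 0; i < num_nodes; i++) {
        if (i != failed_node && node_alive[i]) {
            return i;
        }
    }
    return -1;
}

/* Example application callback: remember which rank failed. */
void record_failed_rank(err_type_t type, int rank, void *user_data)
{
    (void)type;
    *(int *)user_data = rank;
}
```

A "fault tolerant mpiexec" built on this model would register record_failed_rank (or something more useful) for ERR_PROC_ABORTED, and on each callback ask remap_proc for a new node before restarting the process. Everything hard about MPI state after the restart is still left to the application.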

Note that, in either case, you will still have to deal with all the MPI issues mentioned by Dick - all OMPI does for you is provide an infrastructure so that you don't have to do all the nitty-gritty stuff of mapping process locations, launching the procs, detecting errors, etc.

Instead, you get to do the "simple" things, like figure out how to deal with failures in the middle of a collective! :-)


HTH
Ralph

On Jun 4, 2009, at 7:20 AM, Richard Treumann wrote:

Tee Wen Kai -
You asked "Just to find out more about the consequences for exiting MPI processes without calling MPI_Finalize, will it cause memory leak or other fatal problem?"
Be aware that Jeff has offered you an Open MPI implementation-oriented answer rather than an MPI standard-oriented answer.
When there is a communicator involving 2 or more tasks and any task involved in that communicator goes down, all other tasks that are members of that communicator enter a state the MPI standard says cannot be trusted. It is legitimate for the process that manages an MPI job as a single entity to recognize that the loss of a member task has made the state of all connected tasks untrustworthy and bring down all previously connected tasks too.
When you use MPI_Comm_spawn, one result is an intercommunicator connecting the task that did the spawn to the task(s) that were spawned, so the two sides are "connected". If you intend to use MPI to communicate between the spawn caller and the spawned tasks, they must remain connected. You can explicitly disconnect them, and then a failure of the spawned task is harmless to the task that spawned it, but doing the disconnect costs you the communication path.
The MPI standard does not require that connected tasks be brought down but it is a valid MPI implementation behavior. This makes some sense when you consider the fact that there is no MPI mechanism by which the other tasks can see that the communicator involving the lost task is now broken and there is no way a collective communication can work "correctly" on a communicator that has lost a member task.
For example, what would it mean to call MPI_Reduce on MPI_COMM_WORLD when a member of MPI_COMM_WORLD has been lost (especially if it is the root that was lost)? If you had an MPI application that computed for hours between the loss of one task and the next collective call on MPI_COMM_WORLD, would you prefer to pay for hours of computation and then deadlock at the collective call or just abort ASAP after the job is recognizably broken?
There is a fault tolerance working group trying to define something for MPI 3.0 but at this stage they are still trying to work out a proposal to bring before the MPI Forum. You might be interested in getting involved in that effort. They try to address questions like:
- how would a task know it should not make collective calls on the broken communicator?
- should the communicator still support point to point communications with remaining tasks?
- if a task has posted a receive and the expected sender is then lost, how should the posted receive act?
- is there a clean way to "repair" the broken communicator by spawning a replacement task?
- is there a clean way to "shrink" the broken communicator?
The Fault Tolerance Working Group has taken on a very tough problem. The list above is just a tiny sample of the challenges in making MPI fault tolerant.
Dick


Dick Treumann - MPI Team
IBM Systems & Technology Group
Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846 Fax (845) 433-8363
From: Jeff Squyres <jsquy...@cisco.com>
To: "Open MPI Users" <us...@open-mpi.org>
Date: 06/04/2009 07:32 AM
Subject: Re: [OMPI users] Exit Program Without Calling MPI_Finalize For Special Case
Sent by: users-boun...@open-mpi.org



On Jun 4, 2009, at 2:16 AM, Tee Wen Kai wrote:

> Just to find out more about the consequences for exiting MPI
> processes without calling MPI_Finalize, will it cause memory leak or
> other fatal problem?

If you're exiting the process, you won't cause any kind of problems --
the OS will clean up everything.

However, we might also have the orted clean up some things when MPI
processes unexpectedly die (e.g., filesystem temporary files in
/tmp).  So you might want to leave those around to clean themselves up
and die naturally.

--
Jeff Squyres
Cisco Systems

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

