On 03.04.2011 at 22:57, Ralph Castain wrote:

> On Apr 3, 2011, at 2:00 PM, Laurence Marks wrote:
>
>>>>
>>>> I am not using that computer. A scenario that I have come across is
>>>> that when a msub job is killed because it has exceeded its walltime,
>>>> mpi tasks spawned by ssh may not be terminated because (so I am told)
>>>> Torque does not know about them.
>>>
>>> Not true with OMPI. Torque will kill mpirun, which will in turn cause all
>>> MPI procs to die. Yes, it's true that Torque won't know about the MPI procs
>>> itself. However, OMPI is designed such that termination of mpirun by the
>>> resource manager will cause all apps to die.
>>
>> How does Torque on NodeA know that an mpi launched on NodeB by ssh
>> should be killed?
>
> Torque works at the job level. So if you get an interactive Torque session,
> Torque can only kill your session - which means it automatically kills
> everything started within that session, regardless of where it resides.
>
> Perhaps you don't fully understand how Torque works? As a brief recap, Torque
> allocates the requested number of nodes. On one of the nodes, it starts a
> "sister mom" that is responsible for that job. It also wires Torque daemons
> on each of the other nodes to the "sister mom" to create, in effect, a
> virtual machine.
>
> When the Torque session is completed, the "sister mom" notifies all the other
> Torque daemons in the VM that the session shall be terminated. At that time,
> all local procs belonging to that session are terminated. It doesn't matter
> how those procs got there - by ssh, mpirun, whatever. They -all- are killed.

Is this a new feature? On the Torque clusters I have seen, cron jobs run on
all nodes to remove processes that were not started through Torque's TM
interface, e.g. because they were started by ssh. If I understand you
correctly, you are saying that even with an ssh to a node you will still get
correct accounting.

> What Torque cannot do is kill the actual mpi processes started by mpirun. See
> below.
>
>> OMPI is designed (from what I can see) for all
>> mpirun to be started from the same node, not distributed mpi launched
>> independently from multiple nodes.
>
> Remember, mpirun launches its own set of daemons on each node. Each daemon
> then locally spawns its set of mpi processes. So mpirun knows where
> everything is and can kill it.
>
> To further ensure cleanup, each daemon monitors mpirun's existence. So Torque
> only knows about mpirun, and Torque kills mpirun when (e.g.) walltime is
> reached. OMPI's daemons see that mpirun has died and terminate their local
> processes prior to terminating themselves.

I thought Open MPI had a tight integration with Torque through the TM
interface? Hence Torque provides correct accounting and can also kill all
started orteds, as it knows about them.

http://www.open-mpi.org/faq/?category=tm

-- Reuti

> Torque cannot directly kill the mpi processes because it has no knowledge of
> their existence and relationship to the job session. Instead, since Torque
> knows about the ssh that started mpirun (since you executed it
> interactively), it kills the ssh - which causes mpirun to die, which then
> causes the mpi apps to die.
>> I am not certain that killing the
>> ssh on NodeA will in fact terminate a mpi launched on NodeB (i.e. by
>> ssh NodeB mpirun AAA...) with OMPI.
>>
>
> It most certainly will! That mpirun on nodeB is executing under the ssh from
> nodeA, so when that ssh session is killed, it automatically kills everything
> run underneath it. And when mpirun dies, so does the job it was running, as
> per above.
>
> You can prove this to yourself rather easily. Just ssh to a remote node and
> execute any command that lingers for a while - say something simple like
> "sleep". Then kill the ssh and do a "ps" on the remote node. I guarantee that
> the command will have died.
>
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
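
For anyone who wants to play with the cleanup chain described above (each
daemon watches mpirun and kills its local processes once mpirun is gone),
here is a minimal local sketch in Python. It is only an illustration under
stated assumptions: it does not use Open MPI, Torque, or ssh, and the names
watch_and_cleanup and demo are hypothetical. A plain "sleep" process stands
in for mpirun and for the MPI processes.

```python
import subprocess
import sys
import time


def watch_and_cleanup(launcher, workers, poll_interval=0.1):
    """Wait until `launcher` exits, then terminate all `workers`.

    Hypothetical stand-in for a per-node daemon: `launcher` plays the
    role of mpirun, `workers` the locally spawned MPI processes.
    """
    while launcher.poll() is None:   # launcher is still running
        time.sleep(poll_interval)
    for w in workers:
        if w.poll() is None:         # worker is still alive
            w.terminate()            # SIGTERM on POSIX systems
    for w in workers:
        w.wait()                     # reap the workers so no zombies remain
    return [w.returncode for w in workers]


def demo():
    # A long sleep stands in for both the launcher and the workers.
    sleeper = [sys.executable, "-c", "import time; time.sleep(60)"]
    launcher = subprocess.Popen(sleeper)
    workers = [subprocess.Popen(sleeper) for _ in range(2)]
    launcher.kill()                  # what the resource manager does at walltime
    codes = watch_and_cleanup(launcher, workers)
    return launcher.wait(), codes


if __name__ == "__main__":
    print(demo())
```

Polling is used here only because it is the simplest thing to read; it says
nothing about how the real daemons detect that mpirun has died.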