On 03.04.2011, at 22:57, Ralph Castain wrote:

> On Apr 3, 2011, at 2:00 PM, Laurence Marks wrote:
> 
>>>> 
>>>> I am not using that computer. A scenario that I have come across is
>>>> that when an msub job is killed because it has exceeded its walltime,
>>>> MPI tasks spawned by ssh may not be terminated because (so I am told)
>>>> Torque does not know about them.
>>> 
> Not true with OMPI. Torque will kill mpirun, which will in turn cause all 
> MPI procs to die. Yes, it's true that Torque itself won't know about the 
> MPI procs. However, OMPI is designed such that termination of mpirun by the 
> resource manager will cause all apps to die.
>> 
>> How does Torque on NodeA know that an MPI job launched on NodeB by ssh
>> should be killed?
> 
> Torque works at the job level. So if you get an interactive Torque session, 
> Torque can only kill your session - which means it automatically kills 
> everything started within that session, regardless of where it resides.
> 
> Perhaps you don't fully understand how Torque works? As a brief recap, Torque 
> allocates the requested number of nodes. On one of the nodes, it starts a 
> "sister mom" that is responsible for that job. It also wires Torque daemons 
> on each of the other nodes to the "sister mom" to create, in effect, a 
> virtual machine.
> 
> When the Torque session is completed, the "sister mom" notifies all the other 
> Torque daemons in the VM that the session shall be terminated. At that time, 
> all local procs belonging to that session are terminated. It doesn't matter 
> how those procs got there - by ssh, mpirun, whatever. They -all- are killed.

Is this a new feature? In the Torque clusters I have seen, cron jobs run on 
all nodes to remove processes which were not started through the TM interface 
of Torque, e.g. because they were launched by ssh.
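
Just to illustrate what I mean by such a cleanup, here is a minimal sketch of 
this kind of cron job (the spool path and the way users are matched to jobs 
are my own assumptions, not taken from any particular site):

  #!/bin/sh
  # Crude illustration only: kill processes of ordinary users who currently
  # have no Torque job on this node.
  JOBDIR=/var/spool/torque/mom_priv/jobs    # assumed pbs_mom spool directory
  [ -d "$JOBDIR" ] || exit 0
  for user in $(ps -eo user= | sort -u); do
      uid=$(id -u "$user" 2>/dev/null) || continue
      [ "$uid" -lt 1000 ] && continue       # skip system accounts
      # crude check: is this user mentioned in any job file on the node?
      grep -rqs "$user" "$JOBDIR" || pkill -9 -u "$user"
  done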

If I understand you correctly, you are saying that even with an ssh to a node 
you will still get correct accounting.


> What Torque cannot do is kill the actual mpi processes started by mpirun. See 
> below.
> 
>> OMPI is designed (from what I can see) for all mpirun invocations to be
>> started from the same node, not for distributed MPI jobs launched
>> independently from multiple nodes.
> 
> Remember, mpirun launches its own set of daemons on each node. Each daemon 
> then locally spawns its set of mpi processes. So mpirun knows where 
> everything is and can kill it.
> 
> To further ensure cleanup, each daemon monitors mpirun's existence. So Torque 
> only knows about mpirun, and Torque kills mpirun when (e.g.) walltime is 
> reached. OMPI's daemons see that mpirun has died and terminate their local 
> processes prior to terminating themselves.

I thought Open MPI has a tight integration with Torque by using the TM 
interface? In that case Torque provides correct accounting and can also kill 
all started orteds, as it knows about them.

http://www.open-mpi.org/faq/?category=tm
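
As a quick check whether the TM support is actually compiled in (the exact 
component names can differ between Open MPI versions; this is only how I 
would verify it):

  ompi_info | grep tm
  # with Torque support built in this should list the TM components
  # (e.g. "ras: tm" and "plm: tm", depending on the version); mpirun
  # inside a Torque job then starts its orted daemons via tm_spawn()
  # instead of ssh, so the pbs_mom daemons know about them.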

-- Reuti


> Torque cannot directly kill the MPI processes because it has no knowledge of 
> their existence or their relationship to the job session. Instead, because 
> Torque knows about the ssh that started mpirun (you executed it 
> interactively), it kills the ssh - which causes mpirun to die, which in turn 
> causes the MPI apps to die.
>> I am not certain that killing the
>> ssh on NodeA will in fact terminate an MPI job launched on NodeB (i.e. by
>> ssh NodeB mpirun AAA...) with OMPI.
>> 
> 
> It most certainly will! That mpirun on nodeB is executing under the ssh from 
> nodeA, so when that ssh session is killed, it automatically kills everything 
> run underneath it. And when mpirun dies, so does the job it was running, as 
> per above.
> 
> You can prove this to yourself rather easily. Just ssh to a remote node and 
> execute any command that lingers for a while - say something simple like 
> "sleep". Then kill the ssh and do a "ps" on the remote node. I guarantee that 
> the command will have died.
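
(For anyone who wants to try exactly this test: a small sketch of the above, 
using "-t" to force a pseudo-tty on the remote side - my assumption; without 
a tty the remote command may in fact survive the death of the local ssh, 
which is worth checking on your own installation.)

  # terminal 1 on nodeA: start a long-running command on nodeB
  ssh -t nodeB sleep 600

  # terminal 2 on nodeA: kill the local ssh client, similar to what
  # Torque does at the end of a job
  pkill -f 'ssh -t nodeB sleep'

  # terminal 2: check on nodeB whether the sleep is gone
  ssh nodeB ps -ef | grep '[s]leep 600'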
> 
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
