I can confirm that mpirun will not direct-launch the applications under Torque. This is done for wireup support - if/when Torque natively supports PMIx, then we could revisit that design.
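To make the difference concrete, here is a toy model of what Torque's TM interface gets asked to start in each case. It is only a sketch and is not Open MPI code: the function name `torque_view`, the node names, and the round-robin placement of ranks are all made up for illustration. The "orted" branch follows the behaviour described in this thread (one tm_spawn of orted on every node except the one mpirun runs on); the "direct" branch is the hypothetical case where every rank is itself tm_spawn'ed.

```python
def torque_view(nodes, nprocs, launch="orted", app="sleep"):
    """Return {node: [names of processes started via tm_spawn on that node]}.

    Toy model only; nodes[0] is assumed to be the node where mpirun runs.
    """
    if launch not in ("orted", "direct"):
        raise ValueError("launch must be 'orted' or 'direct'")

    view = {node: [] for node in nodes}

    if launch == "orted":
        # mpirun asks Torque for one daemon per remote node. The application
        # processes are later fork/exec'd by mpirun and the daemons, so they
        # never show up as tm_spawn requests.
        for node in nodes[1:]:
            view[node].append("orted")
    else:
        # Hypothetical direct launch: one tm_spawn per rank.
        # Round-robin placement is an assumption made to keep the sketch short.
        for rank in range(nprocs):
            view[nodes[rank % len(nodes)]].append(app)

    return view


if __name__ == "__main__":
    nodes = ["node0", "node1"]  # hypothetical two-node allocation
    print(torque_view(nodes, 4, "orted"))
    # {'node0': [], 'node1': ['orted']}
    print(torque_view(nodes, 4, "direct"))
    # {'node0': ['sleep', 'sleep'], 'node1': ['sleep', 'sleep']}
```

For the two-node, `-np 4` job in the original question, the model gives a single tm_spawn request in the orted case and four (two per node) in the direct case, which matches what was observed and what was asked for.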
Gilles: the benefit is two-fold:

* Torque has direct visibility of the application procs. When we launch via orted, Torque only sees the orted daemons and has no idea what is actually going on. This can be an issue for accounting, but also generally causes confusion over qsub options vs what mpirun does

* one less daemon running on the node => less jitter and performance impact on the app

Ralph

> On Jun 7, 2016, at 5:49 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>
> Ken,
>
> iirc, and under torque when Open MPI is configure'd with --with-tm
> (this is the default, so assuming your torque headers/libs can be found, you
> do not even have to specify --with-tm), mpirun does tm_spawn the orted daemon
> on all nodes except the current one.
>
> then mpirun and orted will fork and exec sleep 60.
>
> i do not think it is possible to have mpirun tm_spawn sleep 60.
>
> generally speaking, mpirun is used to run MPI apps, and some wiring is needed
> to correctly initialize MPI, hence the need for orted daemons.
>
> direct launch is an option, but it does require some kind of support from the
> batch manager.
>
> for example, under slurm
>
> srun --resv-ports a.out
>
> (i do not think that is possible any more though)
>
> or
>
> srun --mpi={pmi,pmi2,pmix(?)} a.out
>
> but i am not aware of a PMIx server inside torque.
>
> out of curiosity, what would be the benefit of tm_spawn the tasks (sleep 60)
> instead of orted ?
>
> Cheers,
>
> Gilles
>
> On 6/8/2016 9:01 AM, Ken Nielson wrote:
>> I am using openmpi version 1.10.2 with Torque 6.0.1.
>>
>> I launch a job with the following syntax:
>>
>> qsub -L tasks=2:lprocs=2:maxtpn=1 -I
>>
>> This starts an interactive job which is using two nodes.
>>
>> I then use mpirun as follows from the command line of the interactive job.
>>
>> mpirun -np 4 sleep 60
>>
>> What I would like to see happen is a call made to tm_spawn for each sleep
>> for each node. That would be two per node.
>> Instead I get a single tm_spawn
>> request which launches mpirun and mpirun launches the two sleep processes.
>>
>> Is there a command line to direct mpirun to call tm_spawn for each count in
>> np?
>>
>> --
>> Ken Nielson, Sr. Software Engineer
>> www.adaptivecomputing.com
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2016/06/29397.php
>
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2016/06/29398.php