Hi Joshua,

On Aug 21, 2014, at 12:28 AM, Joshua Ladd <jladd.m...@gmail.com> wrote:

> When launching with mpirun in a SLURM environment, srun is only being used to
> launch the ORTE daemons (orteds). Since the daemon will already exist on the
> node from which you invoked mpirun, this node will not be included in the
> list of nodes. SLURM's PMI library is not involved (that functionality is
> only necessary if you directly launch your MPI application with srun, in
> which case it is used to exchange wireup info amongst slurmds). This is the
> expected behavior.
>
> ~/ompi-top-level/orte/mca/plm/plm_slurm_module.c +294
>
>     /* if the daemon already exists on this node, then
>      * don't include it
>      */
>     if (node->daemon_launched) {
>         continue;
>     }
>
> Do you have a frontend node that you can launch from? What happens if you set
> "-np X" where X = 8*ppn? The alternative is to do a direct launch of the MPI
> application with srun.
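If I read this right, the variants you describe would look roughly like this inside an sbatch script (a sketch only: the node and task counts are placeholders for our 8-node case, ./my_mpi_app stands for the real binary, and the direct-launch line assumes an Open MPI build configured --with-pmi):

    #!/bin/bash
    #SBATCH --nodes=8
    #SBATCH --ntasks-per-node=8
    #SBATCH --exclusive

    # Pick ONE of the following lines.

    # (a) What we do today: mpirun uses srun only to start the orteds on
    #     the other 7 nodes; ranks still run on the node hosting mpirun.
    mpirun ./my_mpi_app

    # (b) The first suggestion: state the process count explicitly,
    #     -np X with X = 8 * ppn = 64 for this allocation.
    mpirun -np 64 ./my_mpi_app

    # (c) The alternative: direct launch with srun, letting SLURM's PMI do
    #     the wireup (the exact --mpi value depends on the SLURM build).
    srun --mpi=pmi2 -n 64 ./my_mpi_app
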
I understand the logic, and I understand why an orted on the first node is not needed. But since we use a batch system (SLURM), we do not want people to run their mpirun commands directly on a front-end. Typical scenario: all compute nodes are running fine, but we reboot all the login nodes to upgrade the Linux image because of a security update to the kernel. We can keep the login nodes offline for potentially hours without stopping the system from working. From our perspective, a front-end node is an additional burden. Of course the login node and the front-end node can be two separate hosts, but I am looking for a way to keep our setup as it is, without introducing structural changes.

Hi Ralph,

On Aug 21, 2014, at 12:36 AM, Ralph Castain <r...@open-mpi.org> wrote:

> Or you can add
>
>     -nolocal|--nolocal     Do not run any MPI applications on the local node
>
> to your mpirun command line and we won't run any application procs on the
> node where mpirun is executing

I tried it, but of course mpirun complains. If it cannot run locally (meaning on the first node, tesla121), then only 7 nodes remain and I requested 8 in total. So to use "--nolocal" I would need to add another node. Since we allocate nodes exclusively, and for some users we charge real money for the usage... this is not ideal, I am afraid.

srun seems the only way to go. I need to understand how to pass most of the --mca parameters to srun, and to be sure I can drive the rmaps_lama_* options as flexibly as I do with a normal mpirun (a rough sketch of what I have in mind is at the end of this mail). Then there are mxm, fca, hcoll... I am not against srun in principle; my only stopping point is that the syntax is different enough that we might receive a lot of (too many) complaints from our users in adopting this new way to submit, because they are used to the classic mpirun inside an sbatch script. Most of them will probably not switch to a different method! So our hope to "silently" profile network, energy, and I/O using SLURM plugins while also using Open MPI is lost...

F

--
Mr. Filippo SPIGA, M.Sc.
http://filippospiga.info ~ skype: filippo.spiga

«Nobody will drive us out of Cantor's paradise.» ~ David Hilbert
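P.S. Regarding the --mca parameters under a direct srun launch: my understanding (untested, and the component names are only examples) is that any MCA parameter can also be set through an OMPI_MCA_-prefixed environment variable, or in $HOME/.openmpi/mca-params.conf, so something like this should work from inside the sbatch script:

    # Sketch only: select components via environment variables instead of
    # "--mca" on the mpirun command line. The names below (mxm, fca) are
    # just examples; "ompi_info --param all all" shows what our build offers.
    export OMPI_MCA_pml=cm          # e.g. use the MXM MTL through the cm PML
    export OMPI_MCA_mtl=mxm
    export OMPI_MCA_coll=^fca       # e.g. exclude the FCA collectives component

    # Process mapping/binding would then be srun's job rather than
    # rmaps_lama_*, e.g.:
    srun --mpi=pmi2 --ntasks-per-node=8 --distribution=block ./my_mpi_app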