Hi, Filippo

When launching with mpirun in a SLURM environment, srun is only used to
launch the ORTE daemons (orteds). Since a daemon already exists on the node
from which you invoked mpirun (mpirun itself acts as that daemon), that node
is not included in the list of nodes handed to srun. SLURM's PMI library is
not involved (that functionality is only needed if you launch your MPI
application directly with srun, in which case it is used to exchange wireup
info amongst the slurmds). This is the expected behavior.

You can see this in the SLURM PLM source,
~/ompi-top-level/orte/mca/plm/plm_slurm_module.c +294:

        /* if the daemon already exists on this node, then
         * don't include it
         */
        if (node->daemon_launched) {
            continue;
        }

Do you have a frontend node that you can launch from? What happens if you
set "-np X", where X = 8*ppn? The alternative is to do a direct launch of
the MPI application with srun.
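For example, assuming 2 ranks per node (your ppr:1:socket mapping on
dual-socket nodes), the two options would look roughly like this (a sketch
only; adjust the rank counts and the --mpi value to match your Open MPI
build and SLURM's PMI configuration):

    mpirun -np 16 -bind-to socket --map-by ppr:1:socket ./run_linpack ./xhpl

    srun --mpi=pmi2 --nodes=8 --ntasks-per-node=2 --cpu_bind=none ./run_linpack ./xhpl

With the direct srun launch every rank, including those on the first node,
is started by slurmd, so acct_gather_profile should be able to profile all
8 nodes.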


Best,

Josh



On Wed, Aug 20, 2014 at 6:48 PM, Filippo Spiga <spiga.fili...@gmail.com>
wrote:

> Dear Open MPI experts,
>
> I have a problem related to the integration of Open MPI, SLURM and the PMI
> interface. I spent some time today with a colleague of mine trying to
> figure out why we were not able to obtain all of the H5 profile files
> (generated by acct_gather_profile) when using Open MPI. When I say "all" I
> mean that if I run on 8 nodes (e.g. tesla[121-128]) I always,
> systematically, miss the file for the first node in the allocation list (in
> this case tesla121).
>
> By comparing which processes are spawned on the compute nodes, I discovered
> that mpirun, running on tesla121, calls srun only to remotely spawn new MPI
> processes on the other 7 nodes (maybe this is obvious; for me it was not)...
>
> fs395      617  0.0  0.0 106200  1504 ?        S    22:41   0:00 /bin/bash
> /var/spool/slurm-test/slurmd/job390044/slurm_script
> fs395      629  0.1  0.0 194552  5288 ?        Sl   22:41   0:00  \_
> mpirun -bind-to socket --map-by ppr:1:socket --host
> tesla121,tesla122,tesla123,tesla124,tesla125,tesla126,tes
> fs395      632  0.0  0.0 659740  9148 ?        Sl   22:41   0:00  |   \_
> srun --ntasks-per-node=1 --kill-on-bad-exit --cpu_bind=none --nodes=7
> --nodelist=tesla122,tesla123,tesla1
> fs395      633  0.0  0.0  55544  1072 ?        S    22:41   0:00  |   |
> \_ srun --ntasks-per-node=1 --kill-on-bad-exit --cpu_bind=none --nodes=7
> --nodelist=tesla122,tesla123,te
> fs395      651  0.0  0.0 106072  1392 ?        S    22:41   0:00  |   \_
> /bin/bash ./run_linpack ./xhpl
> fs395      654  295 35.5 120113412 23289280 ?  RLl  22:41   3:12  |   |
> \_ ./xhpl
> fs395      652  0.0  0.0 106072  1396 ?        S    22:41   0:00  |   \_
> /bin/bash ./run_linpack ./xhpl
> fs395      656  307 35.5 120070332 23267728 ?  RLl  22:41   3:19  |
> \_ ./xhpl
>
>
> The "xhpl" processes allocated on the first node of a job are not called
> by srun and because of this the SLURM profile plugin is not activated on
> the node!!! As result I always miss the first node profile information.
> Intel MPI does not have this behavior, mpiexec.hydra uses srun on the first
> node.
>
> I came to the conclusion that SLURM is configured properly and that
> something is wrong in the way I launch Open MPI using mpirun. If I disable
> SLURM support and revert back to rsh (--mca plm rsh), everything works, but
> there is no profiling because the SLURM plug-in is not activated. During
> the configure step, Open MPI 1.8.1 detects SLURM and libpmi/libpmi2
> correctly. Honestly, I would prefer to avoid using srun as the job launcher
> if possible...
>
> Any suggestion to get this sorted out is really appreciated!
>
> Best Regards,
> Filippo
>
> --
> Mr. Filippo SPIGA, M.Sc.
> http://filippospiga.info ~ skype: filippo.spiga
>
> «Nobody will drive us out of Cantor's paradise.» ~ David Hilbert
>
>
