Ah, that's even more fun. I know with Singularity you can launch MPI applications by calling MPI outside of the container and having it link to the MPI inside the container: https://docs.sylabs.io/guides/3.3/user-guide/mpi.html  Not sure about Docker, though.
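The hybrid model those Singularity docs describe can be sketched roughly like this (a sketch only: it assumes the host has an MPI installation ABI-compatible with the one inside the image, and "mpi_app.sif" / "/opt/app/mpi_app" are hypothetical names):

```shell
# Host-side mpirun launches one container instance per rank;
# the MPI runtime inside the image must match the host's MPI
# closely enough (same family, compatible version) to wire up.
mpirun -n 4 singularity exec mpi_app.sif /opt/app/mpi_app
```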

-Paul Edmon-

On 8/12/2024 10:30 AM, Jeffrey Layton wrote:
It's in a container. Specifically horovod/horovod on the Docker hub. I'm going into the container to investigate now (I think I have a link to the dockerfile as well).

Thanks!

Jeff


On Mon, Aug 12, 2024 at 10:01 AM Paul Edmon <ped...@cfa.harvard.edu> wrote:

    Certainly a strange setup. I would probably talk with whoever is
    providing MPI for you and ask them to build it against Slurm
    properly. To get correct process binding you definitely want it
    integrated with Slurm, either via PMI2 or PMIx. If you just use
    the bare hostlist, your ranks may not end up bound to the
    specific cores they were allocated. So proceed with caution and
    validate that your ranks are laid out properly, since you will be
    relying on mpirun/mpiexec to bootstrap rather than the scheduler.
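One way to sanity-check the layout described above (a sketch, assuming Open MPI, whose mpirun supports --report-bindings; "./mpi_app" and "hosts.txt" are hypothetical):

```shell
# Slurm-integrated launch: ranks inherit Slurm's placement via PMIx.
srun --mpi=pmix ./mpi_app

# Bare-hostlist launch: have mpirun print where each rank lands,
# so you can verify the bindings match what Slurm allocated.
mpirun --report-bindings -hostfile hosts.txt -np 4 ./mpi_app
```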

    -Paul Edmon-

    On 8/12/2024 9:55 AM, Jeffrey Layton wrote:
    Paul,

    I tend not to rely on the MPI being built with Slurm :) I find
    that the systems I use haven't done that. :( I'm not exactly sure
    why, but that is the way it is :)

    Up to now, using scontrol has always worked for me. However, a
    new system is not cooperating (the job is running on the
    submission host and not on the compute nodes) and I'm trying to
    debug it. My first step was to check that the job was getting
    the compute node names (the list of nodes from Slurm is empty).
    This led to my question about the "canonical" way to get the
    hostlist (I'm checking both using the hostlist and just relying
    on Slurm being integrated into the MPI - neither works since
    the hostlist is empty).

    It looks like there is a canonical way to do it as you mentioned.
    FAQ worthy? Definitely for my own Slurm FAQ. Others will decide
    if it is worthy for Slurm docs :)

    Thanks everyone for your help!

    Jeff


    On Mon, Aug 12, 2024 at 9:36 AM Paul Edmon via slurm-users
    <slurm-users@lists.schedmd.com> wrote:

        Normally MPI will pick up the host list from Slurm itself.
        You just need to build MPI against Slurm and it will grab it
        automatically. Typically this is transparent to the user, and
        you shouldn't need to pass a host list at all. See:
        https://slurm.schedmd.com/mpi_guide.html

        If you do need it, the canonical way is to run the scontrol
        show hostnames command against $SLURM_JOB_NODELIST
        (https://slurm.schedmd.com/scontrol.html#OPT_hostnames). That
        will give you the list of hosts your job is set to run on.
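A minimal sketch of how that might look in a job script, with a fallback for Jeff's situation where the variable turns out to be empty (the filename "hostfile.txt" is a hypothetical choice):

```shell
#!/bin/bash
# Inside a Slurm batch job, expand the compact nodelist
# (e.g. "holy7c[26401-26405]") into one hostname per line.
# Outside a job, SLURM_JOB_NODELIST is unset, so guard for that.
if [ -n "${SLURM_JOB_NODELIST:-}" ]; then
    scontrol show hostnames "$SLURM_JOB_NODELIST" > hostfile.txt
    nhosts=$(wc -l < hostfile.txt)
else
    nhosts=0   # not inside a Slurm allocation
fi
echo "hosts in job: $nhosts"
```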

        -Paul Edmon-

        On 8/12/2024 8:34 AM, Jeffrey Layton via slurm-users wrote:
        Thanks! I admit I'm not that experienced in Bash. I will
        give this a whirl as a test.

        In the meantime, let me ask: what is the "canonical" way to
        create the host list? It would be nice to have this in the
        Slurm FAQ somewhere.

        Thanks!

        Jeff



        On Fri, Aug 9, 2024 at 1:32 PM Hermann Schwärzler via
        slurm-users <slurm-users@lists.schedmd.com> wrote:

            Hi Paul,

            On 8/9/24 18:45, Paul Edmon via slurm-users wrote:
            > As I recall I think OpenMPI needs a list that has an
            > entry on each line, rather than one separated by a
            > space. See:
            >
            > [root@holy7c26401 ~]# echo $SLURM_JOB_NODELIST
            > holy7c[26401-26405]
            > [root@holy7c26401 ~]# scontrol show hostnames $SLURM_JOB_NODELIST
            > holy7c26401
            > holy7c26402
            > holy7c26403
            > holy7c26404
            > holy7c26405
            >
            > [root@holy7c26401 ~]# list=$(scontrol show hostname $SLURM_NODELIST)
            > [root@holy7c26401 ~]# echo $list
            > holy7c26401 holy7c26402 holy7c26403 holy7c26404 holy7c26405

            Proper quoting does wonders here (please consult the
            bash man-page). If you try

            echo "$list"

            you will get

            holy7c26401
            holy7c26402
            holy7c26403
            holy7c26404
            holy7c26405

            So you *can* pass this around in a variable if you use
            "$variable"
            whenever you provide it to a utility.
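Hermann's point can be reproduced without Slurm at all; this sketch uses a hand-made newline-separated list (hypothetical node names) to show the word splitting:

```shell
# Build a newline-separated list, as scontrol show hostnames would.
list=$(printf 'node1\nnode2\nnode3')

unquoted=$(echo $list)    # word splitting collapses newlines to spaces
quoted="$list"            # quoting preserves the newlines

echo "$unquoted"          # -> node1 node2 node3 (one line)
echo "$quoted" | wc -l    # -> 3 (three lines)
```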

            Regards,
            Hermann

-- slurm-users mailing list -- slurm-users@lists.schedmd.com
            To unsubscribe send an email to
            slurm-users-le...@lists.schedmd.com


