Re: [slurm-users] Running pyMPI on several nodes

Pär Lundö Mon, 12 Aug 2019 23:27:56 -0700

Hi!

I have now had the chance to look into this matter more thoroughly, and it seems that the problem was due to the nodes being diskless and sharing some data (e.g. the "etc" directory). I removed that dependency and gave each node its own unique set of mounted folders, which resolved the issue. Presumably this can be done in other ways unknown to me, but it helped me and I can now run on multiple nodes via MPI.


Thank you for your help!

Best regards,
Pälle L

On 2019-07-16 15:49, Benson Muite wrote:


Hi,

Does a regular MPI program run on two nodes? For example helloworld:

https://people.sc.fsu.edu/~jburkardt/c_src/hello_mpi/hello_mpi.c

https://people.sc.fsu.edu/~jburkardt/py_src/hello_mpi/hello_mpi.py
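Before trying the MPI hello world above, it can help to confirm that srun itself starts one task per node without any MPI involved. The following is only a minimal sketch and not an MPI program: it reads environment variables that Slurm sets for each task (SLURM_PROCID, SLURM_NTASKS, SLURMD_NODENAME). The function name describe_task is made up for this example.

```python
import os
import socket


def describe_task(env=None):
    """Return a one-line description of this task from Slurm's environment."""
    env = os.environ if env is None else env
    rank = env.get("SLURM_PROCID", "?")
    ntasks = env.get("SLURM_NTASKS", "?")
    # Fall back to the local hostname when not started by Slurm.
    node = env.get("SLURMD_NODENAME") or socket.gethostname()
    return f"task {rank} of {ntasks} on {node}"


if __name__ == "__main__":
    print(describe_task())
```

Saved under a hypothetical name such as hello_slurm.py and started with "srun -N2 -n2 python3 hello_slurm.py", it should print one line per task, each naming a different node. If that already fails, the problem is probably in the Slurm setup rather than in MPI.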

Benson

On 7/16/19 4:30 PM, Pär Lundö wrote:

Hi,
Thank you for your quick answer!

I’ll look into that, but they share the same hosts file and the DHCP server sets their hostnames.

However, I came across the "TmpFS" setting in the slurm.conf file, and there was a note about it in the MPI guide on the Slurm web page. I implemented the proposed changes, but still no luck.


Best regards,
Palle

------------------------------------------------------------------------
*From:* "slurm-users" <slurm-users-boun...@lists.schedmd.com>
*Sent:* 16 July 2019 12:32
*To:* "Slurm User Community List" <slurm-users@lists.schedmd.com>
*Subject:* Re: [slurm-users] Running pyMPI on several nodes

srun: error: Application launch failed: Invalid node name specified

Hearns Law. All batch system problems are DNS problems.

Seriously though - check out your name resolution both on the headnode and the compute nodes.
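One way to carry out the name-resolution check suggested above is a small script run on the head node and on each compute node, comparing the results. This is a rough sketch using only the Python standard library; the function name check_names and its resolver parameter are made up for this example.

```python
import socket


def check_names(names, resolver=socket.gethostbyname):
    """Map each host name to its resolved IPv4 address, or None on failure."""
    results = {}
    for name in names:
        try:
            results[name] = resolver(name)
        except OSError:  # socket.gaierror and socket.herror are OSError subclasses
            results[name] = None
    return results
```

Calling check_names(["lxclient10", "lxclient11"]) (plus the head node's name) on every machine should give the same addresses everywhere. A None entry, or an address that differs between machines, points at the hosts file or DNS.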

On Tue, 16 Jul 2019 at 08:49, Pär Lundö <par.lu...@foi.se> wrote:


    Hi,

    I have now had the time to look at some of your suggestions.

    First I tried running "srun -N1 hostname" via an sbatch script
    while having two nodes up and running.
    "sinfo" shows that both nodes are up and idle prior to submitting
    the sbatch script.
    After submitting the job, I receive the following error:

    "srun: error: Task launch for 86.0 failed on node lxclient11:
    Invalid node name specified.
    srun: error: Application launch failed: Invalid node name specified
    srun: Job step aborted: Waiting up to 32 seconds for job step to
    finish.
    srun: error: TImed out waiting for job step to complete"


    From the log file at the client I get a more detailed error:
    " Launching batch job 86 for UID 1000
    [86.batch] error: Invalid host_index -1 for job 86
    [86.batch] error: Host lxclient10 not in hostlist lxclient11
    [86.batch] task_pre_launch: Using sched_affinity for tasks
    rpc_launch_tasks: Invalid node list (lxclient10 not in lxclient11)"

    My two nodes are called lxclient10 and lxclient11.
    Why is my batch job launched with UID 1000? Shouldn't it be
    launched as the slurm user (which in my case has UID 64030)?
    What does it mean that one node is not in the host list?
    The two nodes and the server share the same set of IP addresses
    in the "/etc/hosts" file.

    -> This was resolved: lxclient10 was marked as down. After
    bringing it back up, submitting the same sbatch script
    resulted in no error.
    However, running it on two nodes I get an error:
    "srun: error: Job Step 88.0 aborted before step completely launched.
    srun: error: Job step aborted: Waiting up to 32 seconds for job
    step to finish.
    srun: error: task 1 launched failed: Unspecifed error
    srun: error: lxclient10: task 0: Killed"

    And in the slurmctld.log file from the client I get an error
    similar to the one previously stated: pmix cannot bind UNIX
    socket /var/spool/slurmd/stepd.slurm.pmix.88.0: Address already
    in use (98)

    I ran the lsof command, but I don't really know what I am looking
    for. If I grep for the different node names, I can see that the
    two nodes have mounted the NFS partition and that a link is
    established.

    "As an aside, you have checked that your username exists on that
    compute server?      getent passwd par
    Also that your home directory is mounted - or something
    substituting for your home directory?"
    Yes, the user slurm exists on both nodes and has the same UID.

    "Have you tried


            srun -N# -n# mpirun python3 ....


    Perhaps you have no MPI environment being setup for the
    processes?  There was no "--mpi" flag in your "srun" command and
    we don't know if you have a default value for that or not.

    "

    In my slurm.conf file I do specify "MpiDefault=pmix". (And the
    log file shows that something is wrong with pmix, namely that
    the address is already in use.)

    One thing that struck me just now is that I run these nodes as a
    pair of diskless nodes, which boot and mount the same filesystem
    supplied by a server. They run different PIDs for different
    processes, which should not affect one another(?), right?
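A possible answer to the question above, offered as an assumption rather than a confirmed diagnosis: the socket name in the log (stepd.slurm.pmix.88.0) appears to be built from the job and step ID rather than from a PID, so two nodes writing into the same shared directory would ask for the same path. Binding a UNIX socket to a path whose file already exists fails with "Address already in use", whoever created the file. The sketch below reproduces that on a single machine, with a temporary directory standing in for the shared spool directory; try_bind is a made-up helper name.

```python
import errno
import os
import socket
import tempfile


def try_bind(path):
    """Try to bind a UNIX socket at path; return "ok" or the errno name."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
        return "ok"
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        sock.close()  # closing does not remove the socket file


# Stand-in for a spool directory that two nodes both mount.
shared_dir = tempfile.mkdtemp()
# File name modelled on the one in the log; the real name is chosen by Slurm.
path = os.path.join(shared_dir, "stepd.slurm.pmix.88.0")

first = try_bind(path)   # "node A" creates the socket file
second = try_bind(path)  # "node B" finds the file already present
print(first, second)
```

On Linux the second call reports EADDRINUSE, which is errno 98, the same number as in the log. Giving each node its own spool directory removes the clash.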


    Best regards,

    Palle

    On 2019-07-12 19:34, Pär Lundö wrote:

        Hi,

        Thank you so much for your quick responses!
        It is much appreciated.
        I don't have access to the cluster until next week, but I’ll
        be sure to follow up on all of your suggestions and get back
        to you next week.

        Have a nice weekend!
        Best regards
        Palle

        ------------------------------------------------------------------------
        *From:* "slurm-users" <slurm-users-boun...@lists.schedmd.com>
        *Sent:* 12 July 2019 17:37
        *To:* "Slurm User Community List" <slurm-users@lists.schedmd.com>
        *Subject:* Re: [slurm-users] Running pyMPI on several nodes

        Par, by 'poking around' Chris means using tools such as
        netstat and lsof.
        Also, I would look at ps -eaf --forest to make sure there are
        no 'orphaned' jobs sitting on that compute node.
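As a complement to netstat and lsof, on Linux the kernel lists the UNIX sockets bound on the local machine in /proc/net/unix, with the path in the last column. The following is a rough sketch of scanning that text for a path fragment. The function name find_unix_sockets is made up, and SAMPLE is invented illustrative text, not output from the cluster in this thread.

```python
def find_unix_sockets(proc_net_unix_text, needle):
    """Return (inode, path) pairs whose socket path contains needle."""
    matches = []
    for line in proc_net_unix_text.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 7)
        # Unbound sockets have no path column, so they have only 7 fields.
        if len(fields) == 8 and needle in fields[7]:
            matches.append((int(fields[6]), fields[7]))
    return matches


# Invented sample in the layout of /proc/net/unix (illustration only).
SAMPLE = """Num       RefCount Protocol Flags    Type St Inode Path
0000000000000000: 00000002 00000000 00010000 0001 01 23456 /var/spool/slurmd/stepd.slurm.pmix.88.0
0000000000000000: 00000003 00000000 00000000 0001 03 23999
"""

print(find_unix_sockets(SAMPLE, "pmix"))
```

On a compute node the real text can be read with open("/proc/net/unix").read(). If the socket file exists on disk but no matching entry shows up here, no local process holds it, so it is either a leftover or was created from another machine.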

        Having said that though, I have a dim memory of a classic
        PBSPro error message which says something about a network
        connection,
        but really means that you cannot open a remote session on
        that compute server.

        As an aside, you have checked that your username exists on
        that compute server?      getent passwd par
        Also that your home directory is mounted - or something
        substituting for your home directory?


        On Fri, 12 Jul 2019 at 15:55, Chris Samuel <ch...@csamuel.org> wrote:

            On 12/7/19 7:39 am, Pär Lundö wrote:

            > Presumably, the first 8 tasks originate from the first node (in this
            > case the lxclient11), and the other node (lxclient10) responds as
            > predicted.

            That looks right, it seems the other node has two
            processes fighting
            over the same socket and that's breaking Slurm there.

            > Is it necessary to have passwordless ssh communication alongside the
            > munge authentication?

            No, srun doesn't need (or use) that at all.

            > In addition I checked the slurmctld-log from both the server and client
            > and found something (noted in bold):

            This is from the slurmd log on the client from the look
            of it.

            > *[2019-07-12T14:57:53.771][83.0] task_p_pre_launch: Using sched affinity
            > for tasks lurm.pmix.83.0: Address already in use[98]*
            > [2019-07-12T14:57:53.682][83.0] error: lxclient[0] /pmix.server.c:386
            > [pmix_stepd_init] mpi/pmix: ERROR: pmixp_usock_create_srv
            > [2019-07-12T14:57:53.683][83.0] error: (null) [0] /mpi_pmix:156
            > [p_mpi_hook_slurmstepd_prefork] mpi/pmix: ERROR: pmixp_stepd_init() failed

            That indicates that something else has grabbed the socket
            it wants and
            that's why the setup of the MPI ranks on the second node
            fails.

            You'll want to poke around there to see what's using it.

            Best of luck!
            Chris

            -- Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
