We have 12,000 nodes in our system, 9,600 of which are KNL.  We can
start a parallel application within a few seconds in most cases (when
the machine is dedicated to this task), even at full scale.  So I
don't think there is anything intrinsic to Slurm that would
necessarily be limiting you, though we have seen cases in the past
where arbitrary task distribution has caused controller slow-down
issues as the detailed scheme was parsed.

Do you know if all the slurmstepds are starting quickly on the
compute nodes?  How is the OS/Slurm/executable delivered to the node?
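If the executable is coming off a shared filesystem, the sbcast staging approach discussed below in the thread can be sketched roughly like this. This is only an illustration; the node count, binary name, and /tmp destination are placeholders, and site configuration may dictate /dev/shm or another node-local path instead:

```shell
#!/bin/bash
#SBATCH --nodes=1024        # placeholder node count

# Stage the executable into node-local /tmp on every allocated node,
# so the launch doesn't hammer the parallel filesystem/NFS.
sbcast ./my_app /tmp/my_app

# Launch the step from the node-local copy rather than the shared path.
srun /tmp/my_app
```

If the binary is dynamically linked, its shared-library dependencies would need to be staged the same way (or the binary built statically), since sbcast only copies the one file it is given.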
----
Doug Jacobsen, Ph.D.
NERSC Computer Systems Engineer
Acting Group Lead, Computational Systems Group
National Energy Research Scientific Computing Center
dmjacob...@lbl.gov

------------- __o
---------- _ '\<,_
----------(_)/  (_)__________________________


On Fri, Apr 26, 2019 at 7:40 AM Riebs, Andy <andy.ri...@hpe.com> wrote:
>
> Thanks for the quick response Doug!
>
> Unfortunately, I can't be specific about the cluster size, other than to say 
> it's got more than a thousand nodes.
>
> In a separate test that I had missed, even "srun hostname" took 5 minutes to 
> run. So there was no remote file system or MPI involvement.
>
> Andy
>
> -----Original Message-----
> From: slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] On Behalf Of 
> Douglas Jacobsen
> Sent: Friday, April 26, 2019 9:24 AM
> To: Slurm User Community List <slurm-users@lists.schedmd.com>
> Subject: Re: [slurm-users] job startup timeouts?
>
> How large is very large?  Where is the executable being started?  In
> the parallel filesystem/NFS?  If that is the case you may be able to
> trim start times by using sbcast to transfer the executable (and its
> dependencies if dynamically linked) into a node-local resource, such
> as /tmp or /dev/shm depending on your local configuration.
> ----
> Doug Jacobsen, Ph.D.
> NERSC Computer Systems Engineer
> Acting Group Lead, Computational Systems Group
> National Energy Research Scientific Computing Center
> dmjacob...@lbl.gov
>
> ------------- __o
> ---------- _ '\<,_
> ----------(_)/  (_)__________________________
>
>
> On Fri, Apr 26, 2019 at 5:34 AM Andy Riebs <andy.ri...@hpe.com> wrote:
> >
> > Hi All,
> >
> > We've got a very large x86_64 cluster with lots of cores on each node, and 
> > hyper-threading enabled. We're running Slurm 18.08.7 with Open MPI 4.x on 
> > CentOS 7.6.
> >
> > We have a job that reports
> >
> > srun: error: timeout waiting for task launch, started 0 of xxxxxx tasks
> > srun: Job step 291963.0 aborted before step completely launched.
> >
> > when we try to run it at large scale. We anticipate that it could take as 
> > long as 15 minutes for the job to launch, based on our experience with 
> > smaller numbers of nodes.
> >
> > Is there a timeout setting that we're missing that can be changed to 
> > accommodate a lengthy startup time like this?
> >
> > Andy
> >
> > --
> >
> > Andy Riebs
> > andy.ri...@hpe.com
> > Hewlett-Packard Enterprise
> > High Performance Computing Software Engineering
> > +1 404 648 9024
> > My opinions are not necessarily those of HPE
> >     May the source be with you!
>
