We have 12,000 nodes in our system, 9,600 of which are KNL. We can start a parallel application within a few seconds in most cases (when the machine is dedicated to this task), even at full scale. So I don't think there is anything intrinsic to Slurm that would necessarily be limiting you, though we have seen cases in the past where arbitrary task distribution caused controller slow-downs while the detailed placement scheme was parsed.
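
To illustrate, an arbitrary-distribution launch looks roughly like this (a sketch only; the hostfile name, task count, and binary are placeholders):

    # -m arbitrary makes srun take an explicit per-task host list from
    # SLURM_HOSTFILE; at large task counts that detailed layout is what
    # has to be parsed at launch time.
    export SLURM_HOSTFILE=./hosts.txt   # one hostname per task, in task-rank order
    srun -m arbitrary -n 4096 ./my_app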
Do you know if all the slurmstepd's are starting quickly on the compute nodes? How is the OS/Slurm/executable delivered to the node?

----
Doug Jacobsen, Ph.D.
NERSC Computer Systems Engineer
Acting Group Lead, Computational Systems Group
National Energy Research Scientific Computing Center
dmjacob...@lbl.gov

------------- __o
---------- _ '\<,_
----------(_)/ (_)__________________________

On Fri, Apr 26, 2019 at 7:40 AM Riebs, Andy <andy.ri...@hpe.com> wrote:
>
> Thanks for the quick response Doug!
>
> Unfortunately, I can't be specific about the cluster size, other than to say it's got more than a thousand nodes.
>
> In a separate test that I had missed, even "srun hostname" took 5 minutes to run. So there was no remote file system or MPI involvement.
>
> Andy
>
> -----Original Message-----
> From: slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] On Behalf Of Douglas Jacobsen
> Sent: Friday, April 26, 2019 9:24 AM
> To: Slurm User Community List <slurm-users@lists.schedmd.com>
> Subject: Re: [slurm-users] job startup timeouts?
>
> How large is very large? Where is the executable being started? In the parallel filesystem/NFS? If that is the case, you may be able to trim start times by using sbcast to transfer the executable (and its dependencies, if dynamically linked) into a node-local resource, such as /tmp or /dev/shm, depending on your local configuration.
> ----
> Doug Jacobsen, Ph.D.
> NERSC Computer Systems Engineer
> Acting Group Lead, Computational Systems Group
> National Energy Research Scientific Computing Center
> dmjacob...@lbl.gov
>
> ------------- __o
> ---------- _ '\<,_
> ----------(_)/ (_)__________________________
>
> On Fri, Apr 26, 2019 at 5:34 AM Andy Riebs <andy.ri...@hpe.com> wrote:
> >
> > Hi All,
> >
> > We've got a very large x86_64 cluster with lots of cores on each node and hyper-threading enabled. We're running Slurm 18.08.7 with Open MPI 4.x on CentOS 7.6.
> >
> > We have a job that reports
> >
> > srun: error: timeout waiting for task launch, started 0 of xxxxxx tasks
> > srun: Job step 291963.0 aborted before step completely launched.
> >
> > when we try to run it at large scale. We anticipate that it could take as long as 15 minutes for the job to launch, based on our experience with smaller numbers of nodes.
> >
> > Is there a timeout setting that we're missing that can be changed to accommodate a lengthy startup time like this?
> >
> > Andy
> >
> > --
> > Andy Riebs
> > andy.ri...@hpe.com
> > Hewlett-Packard Enterprise
> > High Performance Computing Software Engineering
> > +1 404 648 9024
> > My opinions are not necessarily those of HPE
> > May the source be with you!
>
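
P.S. For concreteness, the sbcast staging suggested above would look roughly like this in a batch script (a minimal sketch; the node count, binary name, and /tmp destination are placeholders, and dynamically linked dependencies would need the same treatment):

    #!/bin/bash
    #SBATCH -N 1024
    # Stage the executable onto node-local storage on every allocated node,
    # so launch does not fan out thousands of reads to the parallel filesystem.
    sbcast ./my_app /tmp/my_app
    # Launch the step from the node-local copy instead of the shared filesystem.
    srun /tmp/my_app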
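
On the timeout question itself: as far as I know, the launch wait behind "timeout waiting for task launch" is derived from MessageTimeout in slurm.conf, so that is the first knob I would check; verify against the slurm.conf man page for your 18.08 build, and treat the value below as purely illustrative:

    # slurm.conf (keep identical on the controller and all compute nodes)
    # Default is 10 seconds; very large step launches may need more headroom.
    MessageTimeout=60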