Also see "https://slurm.schedmd.com/slurm.conf.html" for MaxArraySize/MaxJobCount.
We just went through a user-requested adjustment to MaxArraySize to bump it from 1000 to 10000; as the documentation states, since each index of an array job is essentially "a job," you must be sure to also adjust MaxJobCount (from 10000 to 100000 in our case).

Adjusting MaxJobCount requires a restart of slurmctld; though the documentation doesn't state it, so does adjusting MaxArraySize ("scontrol reconfigure" will succeed but leave the previous limit in effect; see "https://bugs.schedmd.com/show_bug.cgi?id=6553").

"MaxArraySize" is a bit of a misnomer since it's really 1 + the top of the valid range of indices -- "MaxArrayIndex" would be more apt. With MaxArraySize=10000, for example, the valid indices are 0 through 9999.

Our users were very happy with Grid Engine's allowance of any index range and striding that produces no more than "max_aj_tasks" indices. Since moving to Slurm they're forced to come up with their own index-mapping functionality at times, and the relatively low MaxArraySize versus what we had in Grid Engine (75000) has been especially frustrating for them.

So far the 10000/100000 combo hasn't come close to exhausting resources on our slurmctld nodes, but we haven't actually submitted a couple of 10000-index array jobs plus enough other jobs to hit 100000 active jobs, so current memory usage isn't an adequate measure of usage under load. Since the slurm.conf documentation states:

    Performance can suffer with more than a few hundred thousand jobs.

we're reluctant to increase MaxJobCount too much higher.


> On Feb 26, 2019, at 3:18 AM, Ole Holm Nielsen <ole.h.niel...@fysik.dtu.dk> wrote:
>
> On 2/26/19 9:07 AM, Marcus Wagner wrote:
>> Does anyone know, why per default the number of array elements is limited to 1000?
>> We have one user, who would like to have 100k array elements!
>> What is more difficult for the scheduler, one array job with 100k elements or 100k non-array jobs?
>> Where did you set the limit? Do your users use array jobs at all?
>
> Google is your friend :-)
>
> https://slurm.schedmd.com/job_array.html
>
>> A new configuration parameter has been added to control the maximum job array size: MaxArraySize. The smallest index that can be specified by a user is zero and the maximum index is MaxArraySize minus one. The default value of MaxArraySize is 1001. The maximum MaxArraySize supported in Slurm is 4000001. Be mindful about the value of MaxArraySize as job arrays offer an easy way for users to submit large numbers of jobs very quickly.
>
> /Ole
>

::::::::::::::::::::::::::::::::::::::::::::::::::::::
Jeffrey T. Frey, Ph.D.
Systems Programmer V / HPC Management
Network & Systems Services / College of Engineering
University of Delaware, Newark DE 19716
Office: (302) 831-6034   Mobile: (302) 419-4976
::::::::::::::::::::::::::::::::::::::::::::::::::::::
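
P.S. For anyone who ends up writing their own index mapping, here is a minimal sketch in Python of one way it might be done: each array task works out which contiguous block of "logical" work items it owns. The function name logical_indices and the 75000-item / 10000-task figures are purely illustrative assumptions, not anything Slurm provides; SLURM_ARRAY_TASK_ID is the environment variable Slurm sets inside each array task.

```python
import os


def logical_indices(task_id, n_items, n_tasks):
    """Return the range of work-item indices handled by one array task.

    Items 0..n_items-1 are split into n_tasks contiguous chunks whose
    sizes differ by at most one (hypothetical helper, not a Slurm API).
    """
    if not 0 <= task_id < n_tasks:
        raise ValueError("task_id must be in [0, n_tasks)")
    base, extra = divmod(n_items, n_tasks)
    # The first `extra` tasks each take one additional item.
    start = task_id * base + min(task_id, extra)
    stop = start + base + (1 if task_id < extra else 0)
    return range(start, stop)


if __name__ == "__main__":
    # Default to task 0 so the script also runs outside of a Slurm job.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    for item in logical_indices(task_id, n_items=75000, n_tasks=10000):
        print(item)  # replace with the real per-item work
```

Under these assumptions the job would be submitted with something like "sbatch --array=0-9999 job.sh", where job.sh (a hypothetical wrapper) runs the script above; that stays within MaxArraySize=10000 while still covering all 75000 items.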