Also see https://slurm.schedmd.com/slurm.conf.html for 
MaxArraySize/MaxJobCount.

We just went through a user-requested adjustment of MaxArraySize, bumping it 
from 1000 to 10000. As the documentation states, each index of an array job 
is essentially "a job," so you must be sure to also adjust MaxJobCount (from 
10000 to 100000 in our case). Adjusting MaxJobCount requires a restart of 
slurmctld; though the documentation doesn't state it, so does adjusting 
MaxArraySize (scontrol reconfigure will succeed but leave the previous limit 
in effect; see https://bugs.schedmd.com/show_bug.cgi?id=6553).
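For anyone wanting to sanity-check the two limits against each other, here is a rough sketch, not anything Slurm ships. It scans slurm.conf-style text for MaxArraySize and MaxJobCount and reports how many maximum-size arrays would fit under MaxJobCount, counting each array index as one job as described above. The function names are made up for illustration, and the fallback values (1001 and 10000) are assumptions taken from the numbers quoted in this thread.

```python
import re

# Canonical spellings; slurm.conf keywords are matched case-insensitively here.
_CANON = {"maxarraysize": "MaxArraySize", "maxjobcount": "MaxJobCount"}
_PATTERN = re.compile(r"\b(MaxArraySize|MaxJobCount)\s*=\s*(\d+)", re.IGNORECASE)


def read_limits(conf_text, defaults=None):
    """Return MaxArraySize and MaxJobCount found in slurm.conf-style text.

    Hypothetical helper: falls back to assumed defaults when a key is absent.
    """
    limits = dict(defaults or {"MaxArraySize": 1001, "MaxJobCount": 10000})
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0]  # ignore comments
        for key, value in _PATTERN.findall(line):
            limits[_CANON[key.lower()]] = int(value)
    return limits


def largest_arrays_that_fit(limits):
    """How many arrays using every index 0..MaxArraySize-1 fit in MaxJobCount."""
    return limits["MaxJobCount"] // limits["MaxArraySize"]


if __name__ == "__main__":
    sample = "MaxArraySize=10000\nMaxJobCount=100000\n"
    limits = read_limits(sample)
    print(limits, largest_arrays_that_fit(limits))
```

This only compares the configured numbers; it says nothing about what slurmctld has actually loaded, which is the point of the restart caveat above.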

The name "MaxArraySize" is a bit of a misnomer since it's really 1 + the top 
of the valid range of indices -- "MaxArrayIndex" would be more apt.  Our users 
were very happy with Grid Engine's allowance of any index range and striding 
that produces no more than "max_aj_tasks" indices.  Since moving to Slurm 
they're forced to come up with their own index-mapping functionality at times, 
and the relatively low MaxArraySize versus what we had in Grid Engine (75000) 
has been especially frustrating for them.
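The kind of index mapping users end up writing can be sketched roughly as follows. This is only an illustration, not what our users actually run: map_index and task_count are hypothetical names, and the example range (50000-74999, stride 5) is invented. The idea is to submit a dense array 0..count-1 that fits under MaxArraySize, then translate SLURM_ARRAY_TASK_ID into the index the user really wanted.

```python
import os


def task_count(first, last, step=1):
    """Number of indices in first..last (inclusive) taken with stride step."""
    if step <= 0 or last < first:
        raise ValueError("need step > 0 and last >= first")
    return (last - first) // step + 1


def map_index(task_id, first, last, step=1):
    """Map a zero-based array task ID onto the desired index range."""
    count = task_count(first, last, step)
    if not 0 <= task_id < count:
        raise IndexError(f"task ID {task_id} outside 0..{count - 1}")
    return first + task_id * step


if __name__ == "__main__":
    # Slurm sets SLURM_ARRAY_TASK_ID for each array task; default to 0 elsewhere.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    print(map_index(task_id, 50000, 74999, 5))
```

With this example range the job would be submitted with --array=0-4999, which stays under a MaxArraySize of 10000 even though the wanted indices run up to 74995.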

So far the 10000/100000 combo hasn't come close to exhausting resources on our 
slurmctld nodes, but we haven't actually submitted a couple of 10000-index 
array jobs and enough other jobs to hit 100000 active jobs, so current memory 
usage isn't an adequate measure of usage under load.  Since the slurm.conf 
documentation states:


Performance can suffer with more than a few hundred thousand jobs. 


we're reluctant to increase MaxJobCount too much higher.




> On Feb 26, 2019, at 3:18 AM, Ole Holm Nielsen <ole.h.niel...@fysik.dtu.dk> 
> wrote:
> 
> On 2/26/19 9:07 AM, Marcus Wagner wrote:
>> Does anyone know, why per default the number of array elements is limited to 
>> 1000?
>> We have one user, who would like to have 100k array elements!
>> What is more difficult for the scheduler, one array job with 100k elements 
>> or 100k non-array jobs?
>> Where did you set the limit? Do your users use array jobs at all?
> 
> Google is your friend :-)
> 
> https://slurm.schedmd.com/job_array.html
> 
>> A new configuration parameter has been added to control the maximum job 
>> array size: MaxArraySize. The smallest index that can be specified by a user 
>> is zero and the maximum index is MaxArraySize minus one. The default value 
>> of MaxArraySize is 1001. The maximum MaxArraySize supported in Slurm is 
>> 4000001. Be mindful about the value of MaxArraySize as job arrays offer an 
>> easy way for users to submit large numbers of jobs very quickly.
> 
> /Ole
> 


::::::::::::::::::::::::::::::::::::::::::::::::::::::
Jeffrey T. Frey, Ph.D.
Systems Programmer V / HPC Management
Network & Systems Services / College of Engineering
University of Delaware, Newark DE  19716
Office: (302) 831-6034  Mobile: (302) 419-4976
::::::::::::::::::::::::::::::::::::::::::::::::::::::



