Hello
   We have another batch of new users and some more batches of large array jobs 
with very short runtimes, either due to errors in the jobs or just by design. 
While trying to deal with these issues (setting ArrayTaskThrottle and user 
education), I had a thought that it would be very nice to have a per-user limit 
on how many jobs can start in a given minute. If a user posted a 200000-task 
array job with 15-second tasks, the scheduler wouldn't launch more than 100 or 
200 per minute and would be less likely to bog down; but if the tasks had 
longer runtimes (1 hour +), it would only take a few extra minutes to start 
using all the resources they are allowed, without adding much overall delay to 
the whole set of jobs.

I thought about adding something to our CLI filter, but these jobs usually 
request a runtime of 3-4 hours even though they run for <30 seconds, so the 
submit options don't identify the problem jobs ahead of time.

We currently limit our users to 80% of the available resources, which is still 
far more than enough for Slurm to bog down with fast-turnover jobs, but we have 
users who complain that they can't use the other 20% when the cluster is not 
busy, so putting in lower default restrictions is not currently an option.

Has this already been discussed and found infeasible for technical reasons? 
(I haven't found anything like this yet searching the archives.)

I think Slurm used to have a feature-request severity on their bug submission 
site. Is there a severity level they prefer for suggested requests like this?

Thanks

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com