I left out a critical let task+=1 in that pseudo-loop :)

griznog

On Sun, Sep 15, 2024 at 6:44 PM John Hanks <griz...@gmail.com> wrote:

> No ideas on fixing this in Slurm, but in userspace, when faced in the past
> with huge array jobs made of really short tasks like this, I've nudged
> users toward batching up several array elements in each job to extend its
> runtime. Say the user wants to run 50000 tasks of 30 seconds each.
> Batching those up in groups of 10 makes for 5-minute jobs, so (off the top
> of my head pseudocode):
>
> #SBATCH --array=1-50000:10
>
> starttask=${SLURM_ARRAY_TASK_ID}
> endtask=$(( starttask + SLURM_ARRAY_TASK_STEP - 1 ))
> task=${starttask}
> while [[ ${task} -le ${endtask} ]]; do
>     someapp -param=${task}
> done
>
> griznog
>
> On Mon, Sep 9, 2024 at 1:58 PM Ransom, Geoffrey M. via slurm-users
> <slurm-users@lists.schedmd.com> wrote:
>
>> Hello
>>
>> We have another batch of new users and some more batches of large array
>> jobs with very short runtimes, either from errors in the jobs or just by
>> design. While dealing with these issues (setting ArrayTaskThrottle and
>> user education), I had a thought: it would be very nice to have a
>> per-user limit on how many jobs can start in a given minute. If someone
>> submitted a 200000-element array job with 15-second tasks, the scheduler
>> wouldn't launch more than 100 or 200 per minute and would be less likely
>> to bog down, but if the tasks had longer runtimes (1 hour or more) it
>> would only take a few extra minutes to start using all the resources they
>> are allowed, adding little overall delay to the whole set of jobs.
>>
>> I thought about adding something to our CLI filter, but these jobs
>> usually ask for a runtime of 3-4 hours even though they run for under 30
>> seconds, so the submit options don't reveal the problem jobs ahead of
>> time.
>>
>> We currently limit our users to 80% of the available resources, which is
>> far more than enough for fast-turnover jobs to bog Slurm down, but we
>> have users who complain that they can't use the other 20% when the
>> cluster is not busy, so putting in lower default restrictions is not
>> currently an option.
>>
>> Has this already been discussed and found infeasible for technical
>> reasons? (I'm not finding anything like this yet searching the archives.)
>>
>> I think Slurm used to have a feature-request severity on their bug
>> submission site. Is there a severity level they prefer for suggestions
>> like this?
>>
>> Thanks
>>
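Putting the two messages together, a minimal sketch of the batching pattern
with the missing )) and the task increment folded in might look like the
following. someapp and its -param flag are placeholder names from the thread,
the --time value is illustrative, and the clamp against SLURM_ARRAY_TASK_MAX
is an extra guard (not in the original) for ranges that are not an exact
multiple of the step:

    #!/bin/bash
    #SBATCH --array=1-50000:10   # one array element per block of 10 task IDs
    #SBATCH --time=00:10:00      # size the walltime for a whole block, not one task

    # Each array element works through SLURM_ARRAY_TASK_STEP consecutive task
    # IDs, turning many ~30-second jobs into one ~5-minute job.
    starttask=${SLURM_ARRAY_TASK_ID}
    endtask=$(( starttask + SLURM_ARRAY_TASK_STEP - 1 ))

    # Extra guard (assumption): don't run past the end of the array range.
    if (( endtask > SLURM_ARRAY_TASK_MAX )); then
        endtask=${SLURM_ARRAY_TASK_MAX}
    fi

    task=${starttask}
    while (( task <= endtask )); do
        someapp -param="${task}"   # placeholder command from the thread
        (( task += 1 ))            # the increment the follow-up message adds
    done

On the throttling side of the original question, the same limit that
ArrayTaskThrottle applies can also be requested at submit time with the %
separator, e.g. --array=1-50000%200 to cap the array at 200 simultaneously
running tasks; whether it combines cleanly with the :10 step is worth
checking against your Slurm version's sbatch man page.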
-- slurm-users mailing list -- slurm-users@lists.schedmd.com To unsubscribe send an email to slurm-users-le...@lists.schedmd.com