I left out a critical

let task+=1

in that pseudo-loop 😁

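For completeness, here is the corrected sketch with the increment in place
(still off-the-top-of-my-head pseudocode; someapp and its -param flag are
stand-ins for whatever the user actually runs):

#!/bin/bash
# With a step of 10, SLURM_ARRAY_TASK_ID takes the values 1, 11, 21, ...
# and SLURM_ARRAY_TASK_STEP is 10, so each array element covers 10 tasks.
#SBATCH --array=1-50000:10

starttask=${SLURM_ARRAY_TASK_ID}
endtask=$(( starttask + SLURM_ARRAY_TASK_STEP - 1 ))

task=${starttask}
while [[ ${task} -le ${endtask} ]]; do
    someapp -param=${task}
    let task+=1   # the line that was missing
done
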
griznog

On Sun, Sep 15, 2024 at 6:44 PM John Hanks <griz...@gmail.com> wrote:

> No ideas on fixing this in Slurm, but in userspace, when faced with huge
> array jobs made up of really short tasks like this, I've nudged users
> toward batching up several array elements in each job to extend its
> runtime. Say the user wants to run 50000 tasks of 30 seconds each:
> batching those up in groups of 10 makes for 5-minute jobs, so (off the
> top of my head pseudocode):
>
> #SBATCH --array=1-50000:10
>
> starttask=${SLURM_ARRAY_TASK_ID}
> endtask=$(( ${starttask} + ${SLURM_ARRAY_TASK_STEP} - 1 ))
> task=${starttask}
> while [[ ${task} -le ${endtask} ]]; do
>     someapp -param=${task}
> done
>
> griznog
>
> On Mon, Sep 9, 2024 at 1:58 PM Ransom, Geoffrey M. via slurm-users <
> slurm-users@lists.schedmd.com> wrote:
>
>>
>>
>> Hello
>>
>> We have another batch of new users and some more batches of large
>> array jobs with very short runtimes, either from errors in the jobs or
>> just by design. While dealing with these (setting ArrayTaskThrottle and
>> doing user education), it occurred to me that a per-user limit on how
>> many jobs can start in a given minute would be very helpful. If someone
>> posted a 200000-task array job with 15-second tasks, the scheduler
>> wouldn't launch more than 100 or 200 of them per minute and would be
>> less likely to bog down, while for longer runtimes (1 hour or more) it
>> would only take a few extra minutes to ramp up to all the resources they
>> are allowed, without adding much overall delay to the whole set of jobs.
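>>
>> For comparison, the closest existing knob is the array task throttle,
>> which caps how many array tasks run at the same time rather than how
>> many start per minute; a minimal example (the script name is just a
>> placeholder):
>>
>> sbatch --array=1-200000%200 short_tasks.sh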
>>
>>
>>
>> I thought about adding something to our CLI filter, but these jobs
>> usually request a runtime of 3-4 hours even though they run for under 30
>> seconds, so the submit options don't reveal the problem jobs ahead of
>> time.
>>
>>
>>
>> We currently limit our users to 80% of the available resources, which is
>> far more than Slurm needs to bog down under fast-turnover jobs, but we
>> have users who complain that they can't use the other 20% when the
>> cluster is not busy, so putting in lower default restrictions is not
>> currently an option.
>>
>>
>>
>> Has this already been discussed and found to be infeasible for technical
>> reasons? (I haven't found anything like this yet while searching the
>> archives.)
>>
>>
>>
>> I think Slurm used to have a feature-request severity on their bug
>> submission site. Is there a severity level they prefer for suggestions
>> like this?
>>
>>
>>
>> Thanks
>>
>>
>>
>
-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com