This is where you may want to look into slurmdbd and sacct.
Then you can create a QOS that has MaxJobsPerUser to limit the total
number of running jobs on a per-user basis:
https://slurm.schedmd.com/resource_limits.html
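Roughly, on the admin side it would look something like this (it needs
accounting set up through slurmdbd, with AccountingStorageEnforce
including limits/qos; the QOS name, user name, and exact field
spellings below are only examples and may vary a little by version):

    # create a QOS that caps each user at 3 running jobs
    sacctmgr add qos limit3
    sacctmgr modify qos limit3 set MaxJobsPerUser=3

    # attach it to the user (or make it the default QOS for the association)
    sacctmgr modify user where name=guillaume set qos+=limit3

    # jobs submitted under that QOS then get the per-user cap
    sbatch --qos=limit3 my.sbatch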
Brian Andrus
On 8/27/2019 9:38 AM, Guillaume Perrault Archambault wrote:
Hi Paul,
Your comment confirms my worst fear, that I should either implement
job arrays or stick to a sequential for loop.
My problem with job arrays is that, as far as I understand them, they
cannot be used with singleton to set a max job limit.
I use singleton to limit the number of jobs a user can be running at a
time. For example, if the limit is 3 jobs per user and the user
launches 10 jobs, the sbatch submissions via my scripts may look like this:
sbatch --job-name=job1 [OPTIONS SET1] --dependency=singleton my.sbatch
sbatch --job-name=job2 [OPTIONS SET1] --dependency=singleton my.sbatch
sbatch --job-name=job3 [OPTIONS SET1] --dependency=singleton my.sbatch
sbatch --job-name=job1 [OPTIONS SET1] --dependency=singleton my.sbatch
sbatch --job-name=job2 [OPTIONS SET1] --dependency=singleton my.sbatch
sbatch --job-name=job3 [OPTIONS SET2] --dependency=singleton my.sbatch2
sbatch --job-name=job1 [OPTIONS SET2] --dependency=singleton my.sbatch2
sbatch --job-name=job2 [OPTIONS SET2] --dependency=singleton my.sbatch2
sbatch --job-name=job3 [OPTIONS SET2] --dependency=singleton my.sbatch2
sbatch --job-name=job1 [OPTIONS SET2] --dependency=singleton my.sbatch2
This way, at most 3 jobs will run at a time (i.e. one job named job1,
one named job2, and one named job3).
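In other words, my scripts rotate the job name modulo the limit,
roughly like this (simplified sketch; option_sets and the limit of 3
stand in for what my toolkit actually builds):

    limit=3
    i=0
    for opts in "${option_sets[@]}"; do
        # reuse the same 3 names cyclically so singleton caps concurrency at 3
        sbatch --job-name="job$(( i % limit + 1 ))" $opts --dependency=singleton my.sbatch
        i=$(( i + 1 ))
    done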
Notice that my example has two option sets provided to sbatch, so the
example would be suitable for conversion to two Job Arrays.
This is the problem I can't overcome.
In the job array documentation, I see:
A maximum number of simultaneously running tasks from the job array
may be specified using a "%" separator. For example "--array=0-15%4"
will limit the number of simultaneously running tasks from this job
array to 4.
But this '%' separator cannot specify a max number of tasks over two
(or more) separate job arrays, as far as I can tell.
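To illustrate (set1.sbatch and set2.sbatch are just placeholders for
my two option sets):

    sbatch --array=0-4%3 set1.sbatch   # at most 3 tasks of THIS array run at once
    sbatch --array=0-4%3 set2.sbatch   # a separate throttle of 3, so up to 6 tasks total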
And the names of the individual array tasks cannot be made to rotate
modulo the limit the way the job names do in my example above.
Perhaps I need to play more with job arrays, and try harder to find a
solution to limit number of jobs across multiple arrays. Or ask this
question in a separate post, since it's a bit off topic.
In any case, thanks so much for answering my question. I think it answers
my original post perfectly :)
Regards,
Guillaume.
On Tue, Aug 27, 2019 at 10:08 AM Paul Edmon <ped...@cfa.harvard.edu> wrote:
At least for our cluster, we generally recommend that if you are
submitting large numbers of jobs you either use a job array or you
just use a for loop over the jobs you want to submit. A fork bomb is
definitely not recommended. For highest-throughput submission, a
job array is your best bet, since a single submission generates
thousands of jobs which the scheduler can then handle sensibly.
So I highly recommend using job arrays.
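For example, something along these lines creates a thousand tasks in
one submission (run_case.sbatch is just a placeholder script); each
task then picks its work off its array index:

    # one submission, 1000 tasks, at most 50 running at a time
    sbatch --array=0-999%50 run_case.sbatch

    # inside run_case.sbatch, use $SLURM_ARRAY_TASK_ID to select that task's parameters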
-Paul Edmon-
On 8/27/19 3:45 AM, Guillaume Perrault Archambault wrote:
Hi Paul,
Thanks a lot for your suggestion.
The cluster I'm using has thousands of users, so I'm doubtful the
admins will change this setting just for me. But I'll mention it
to the support team I'm working with.
I was hoping more for something that can be done on the user end.
Is there some way for the user to measure whether the scheduler
is in RPC saturation? And then if it is, I could make sure my
script doesn't launch too many jobs in parallel.
Sorry if my question is too vague; I don't understand the backend
of the SLURM scheduler very well, so my questions use the limited
terminology of a user.
My concern is just to make sure that my scripts don't send out
more commands (simultaneously) than the scheduler can handle.
For example, as an extreme scenario, suppose a user forks off
1000 sbatch commands in parallel: is that more than the scheduler
can handle? As a user, how can I know whether it is?
Regards,
Guillaume.
On Mon, Aug 26, 2019 at 10:15 AM Paul Edmon <ped...@cfa.harvard.edu> wrote:
We've hit this before due to RPC saturation. I highly
recommend using max_rpc_cnt and/or defer for scheduling.
That should help alleviate this problem.
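For reference, those are SchedulerParameters in slurm.conf, so this is
an admin-side change; the max_rpc_cnt value below is only illustrative:

    # slurm.conf
    SchedulerParameters=defer,max_rpc_cnt=150

    # apply the change with: scontrol reconfigure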
-Paul Edmon-
On 8/26/19 2:12 AM, Guillaume Perrault Archambault wrote:
Hello,
I wrote a regression-testing toolkit to manage large numbers
of SLURM jobs and their output (the toolkit can be found
here <https://github.com/gobbedy/slurm_simulation_toolkit/>
if anyone is interested).
To make job launching faster, sbatch commands are forked, so
that numerous jobs may be submitted in parallel.
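Concretely, the parallel submission amounts to something like this
(simplified sketch; job_opts stands in for the per-job option strings
my toolkit builds):

    # fork each sbatch call into the background, then wait for all of them to return
    for opts in "${job_opts[@]}"; do
        sbatch $opts my.sbatch &
    done
    wait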
We (the cluster admin and myself) are concerned that this
may cause unresponsiveness for other users.
I cannot say for sure since I don't have visibility over all
users of the cluster, but unresponsiveness doesn't seem to
have occurred so far. That being said, the fact that it
hasn't occurred yet doesn't mean it won't in the future. So
I'm treating this as a ticking time bomb to be fixed asap.
My questions are the following:
1) Does anyone have experience with large numbers of jobs
submitted in parallel? What are the limits that can be hit?
For example is there some hard limit on how many jobs a
SLURM scheduler can handle before blacking out / slowing down?
2) Is there a way for me to find/measure/ping this resource
limit?
3) How can I make sure I don't hit this resource limit?
From what I've observed, parallel submission can improve
submission time by a factor of at least 10x. This can make a
big difference in users' workflows.
For that reason, I would like to fall back to launching jobs
sequentially only as a last resort.
Thanks in advance.
Regards,
Guillaume.