Hi David,

On Thu, Feb 23, 2023 at 10:50 AM David Laehnemann <david.laehnem...@hhu.de> wrote:
> But from your comment I understand that handling these queries in
> batches would be less work for slurmdbd, right? So instead of querying
> each jobid with a separate database query, it would do one database
> query for the whole list? Is that really easier for the system, or
> would it end up doing a call for each jobid, anyway?

From the perspective of avoiding an RPC flood, a batch query is much
better. That said, if you have an extremely large number of jobs in the
queue, you still wouldn't want to query the status too frequently.

> And just to be as clear as possible, a call to sacct would then look
> like this:
>
> sacct -X -P -n --format=JobIdRaw,State -j <jobid_1>,<jobid_2>,...

That would be one way to do it, but I think there are other approaches
that might be better. For example, there is no requirement for job names
to be unique. So if the snakemake pipeline has a configurable instance
name="foo", and snakemake is configured to use that name as the job name
when submitting jobs (e.g. sbatch -J foo ...), then the query for all
jobs in the pipeline is simply:

sacct --name=foo

> Because we can of course rewrite the respective code section, any
> insight on how to do this job accounting more efficiently (and better
> tailored to how Slurm does things) is appreciated.

I appreciate that you are interested in improving the integration to
make it more performant. We are seeing an increase in meta-scheduler use
at our site, so this is a worthwhile problem to tackle.

Thanks,

-Sean
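P.S. In case it helps with the rewrite: a minimal sketch of what the
batched status check could look like, in Python since that is what
snakemake is written in. The sacct flags are the ones from your mail;
the function name and parsing are just illustrative, not snakemake's
actual code.

    import subprocess

    def query_job_states(jobids):
        """Return {jobid: state} from a single batched sacct call."""
        out = subprocess.run(
            ["sacct", "-X", "-P", "-n",
             "--format=JobIdRaw,State",
             "-j", ",".join(jobids)],
            capture_output=True, text=True, check=True,
        ).stdout
        # -P delimits fields with "|"; -n drops the header line.
        return dict(line.split("|", 1) for line in out.splitlines() if line)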
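The name-based variant would look much the same, except that the jobs
have to be tagged at submission time. "smk-foo" is a made-up instance
name here:

    INSTANCE = "smk-foo"

    def submit(script):
        # Tag every job with the pipeline's instance name.
        subprocess.run(["sbatch", "-J", INSTANCE, script], check=True)

    def query_pipeline_states():
        # One sacct call covers every job submitted under this name.
        out = subprocess.run(
            ["sacct", "-X", "-P", "-n",
             "--format=JobIdRaw,State",
             f"--name={INSTANCE}"],
            capture_output=True, text=True, check=True,
        ).stdout
        return dict(line.split("|", 1) for line in out.splitlines() if line)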