Hi David,

On Thu, Feb 23, 2023 at 10:50 AM David Laehnemann <david.laehnem...@hhu.de> wrote:
> But from your comment I understand that handling these queries in
> batches would be less work for slurmdbd, right? So instead of querying
> each jobid with a separate database query, it would do one database
> query for the whole list? Is that really easier for the system, or
> would it end up doing a call for each jobid, anyway?

From the perspective of avoiding an RPC flood, a batch query is much
better. That said, if you have an extremely large number of jobs in the
queue, you still wouldn't want to query the status too frequently.

> And just to be as clear as possible, a call to sacct would then look
> like this:
>
> sacct -X -P -n --format=JobIdRaw,State -j <jobid_1>,<jobid_2>,...

That would be one way to do it, but I think there are other approaches
that might be better. For example, there is no requirement for job names
to be unique. So if the snakemake pipeline has a configurable instance
name="foo", and snakemake is configured to use that name as the job name
when submitting jobs (e.g. sbatch -J foo ...), then the query for all
jobs in the pipeline is simply:

sacct --name=foo

> Because we can of course rewrite the respective code section, any
> insight on how to do this job accounting more efficiently (and better
> tailored to how Slurm does things) is appreciated.

I appreciate that you are interested in improving the integration to
make it more performant. We are seeing an increase in meta-scheduler use
at our site, so this is a worthwhile problem to tackle.

Thanks,

-Sean
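P.S. In case it helps with the rewrite: a minimal sketch of what the
batched status check could look like, in Python since that is what
snakemake is written in. The sacct flags are the ones from your mail;
the function name and parsing are just illustrative, not snakemake's
actual code.

    import subprocess

    def query_job_states(jobids):
        """Return {jobid: state} from a single batched sacct call."""
        out = subprocess.run(
            ["sacct", "-X", "-P", "-n",
             "--format=JobIdRaw,State",
             "-j", ",".join(jobids)],
            capture_output=True, text=True, check=True,
        ).stdout
        # -P delimits fields with "|"; -n drops the header line.
        return dict(line.split("|", 1) for line in out.splitlines() if line)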
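The name-based variant would look much the same, except that the jobs
have to be tagged at submission time. "smk-foo" is a made-up instance
name here:

    INSTANCE = "smk-foo"

    def submit(script):
        # Tag every job with the pipeline's instance name.
        subprocess.run(["sbatch", "-J", INSTANCE, script], check=True)

    def query_pipeline_states():
        # One sacct call covers every job submitted under this name.
        out = subprocess.run(
            ["sacct", "-X", "-P", "-n",
             "--format=JobIdRaw,State",
             f"--name={INSTANCE}"],
            capture_output=True, text=True, check=True,
        ).stdout
        return dict(line.split("|", 1) for line in out.splitlines() if line)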