[slurm-users] Does Slurm store "time in current state" values anywhere ?

2019-10-03 Thread Kevin Buckley
Hi there, we're hoping to overcome an issue where some of our users are keen on writing their own meta-schedulers, so as to try and beat the actual scheduler, but can't seemingly do as good a job as a scheduler that's been developed by people who understand scheduling (no real surprises there!),

[slurm-users] Gracefully shutting down cluster

2019-10-03 Thread Will Dennis
Hi all, I want to be able to gracefully shut down Slurm and then the node itself with a command that affects the entire cluster. It is my current understanding that I can set the “RebootProgram” param in slurm.conf to be a command, and then trigger the shutdown via “scontrol reboot_nodes” which

Re: [slurm-users] Does Slurm store "time in current state" values anywhere ?

2019-10-03 Thread David Rhey
Hi, What about scontrol show job to see various things like: SubmitTime, EligibleTime, AccrueTime etc? David On Thu, Oct 3, 2019 at 4:53 AM Kevin Buckley wrote: > Hi there, > > we're hoping to overcome an issue where some of our users are keen > on writing their own meta-schedulers, so as to
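As a rough illustration of the suggestion above, the timestamps printed by `scontrol show job` could be pulled out of the command's Key=Value output and turned into elapsed times. This is a minimal sketch, not a tested tool: the sample text is made up for illustration rather than captured from a real cluster, the helper names `parse_fields` and `seconds_since` are hypothetical, and the timestamp format is assumed to be `YYYY-MM-DDTHH:MM:SS`. Whether any of these fields marks entry into the job's current state is not established in this thread.

```python
from datetime import datetime

# Illustrative sample only: hand-written in the style of
# `scontrol show job` output, not captured from a real cluster.
SAMPLE = """JobId=1234 JobName=example
   JobState=PENDING Reason=Priority Dependency=(null)
   SubmitTime=2019-10-03T04:53:00 EligibleTime=2019-10-03T04:53:00
   AccrueTime=2019-10-03T04:53:00
   StartTime=Unknown EndTime=Unknown Deadline=N/A
"""


def parse_fields(text):
    """Split whitespace-separated Key=Value tokens into a dict.

    Assumption: values contain no spaces. Fields whose values do
    contain spaces would need more careful handling.
    """
    fields = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def seconds_since(fields, key, now):
    """Return seconds between fields[key] and `now`, or None if the
    field is missing or not a timestamp (e.g. "Unknown" or "N/A")."""
    raw = fields.get(key)
    try:
        stamp = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S")
    except (TypeError, ValueError):
        return None
    return (now - stamp).total_seconds()


if __name__ == "__main__":
    fields = parse_fields(SAMPLE)
    for key in ("SubmitTime", "EligibleTime", "AccrueTime", "StartTime"):
        print(key, seconds_since(fields, key, datetime.now()))
```

In practice the text would come from running `scontrol show job` for a given job ID instead of from `SAMPLE`.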

Re: [slurm-users] Slurm very rarely assigned an estimated start time to a job

2019-10-03 Thread David Rhey
We've been working to tune our backfill scheduler here. Here is a presentation some of you might have seen at a previous SLUG on tuning the backfill scheduler. HTH! https://slurm.schedmd.com/SUG14/sched_tutorial.pdf David On Wed, Oct 2, 2019 at 1:37 PM Mark Hahn wrote: > >(most likely in the n

[slurm-users] Slurm version 19.05.3 is now available

2019-10-03 Thread Tim Wickberg
Slurm version 19.05.3 is now available, and includes a series of fixes since 19.05.2 was released nearly two months ago. Downloads are available at https://www.schedmd.com/downloads.php . Release notes follow below. - Tim -- Tim Wickberg Chief Technology Officer, SchedMD LLC Commercial Slurm

Re: [slurm-users] Does Slurm store "time in current state" values anywhere ?

2019-10-03 Thread Kevin Buckley
On 2019/10/04 03:26, David Rhey wrote: Whilst we're not looking to provide succour to meta-scheduler writers, we can see a need for some way to present and/or make use of, a "job has been in state S for time T" or "job entered current state at time T" info.
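To make the "job entered current state at time T" idea concrete, here is a small sketch of client-side bookkeeping that remembers when a job was first observed in its current state. It is an assumption-laden illustration, not something Slurm provides: the `StateTracker` class is hypothetical, and the time it records is when the state was first observed by the caller, which can lag the real transition by up to one polling interval.

```python
from datetime import datetime, timedelta


class StateTracker:
    """Remember when each job was first observed in its current state."""

    def __init__(self):
        # job_id -> (state, time the state was first observed)
        self._seen = {}

    def observe(self, job_id, state, now):
        """Record an observation and return how long the job has been
        seen in `state`. A new job or a changed state restarts the clock."""
        previous = self._seen.get(job_id)
        if previous is None or previous[0] != state:
            self._seen[job_id] = (state, now)
            return timedelta(0)
        return now - previous[1]


if __name__ == "__main__":
    tracker = StateTracker()
    t0 = datetime(2019, 10, 4, 9, 0, 0)
    print(tracker.observe(1234, "PENDING", t0))
    print(tracker.observe(1234, "PENDING", t0 + timedelta(minutes=5)))
    print(tracker.observe(1234, "RUNNING", t0 + timedelta(minutes=6)))
```

The job ID and state names passed to `observe` would come from whatever polling the site already does; nothing here queries Slurm itself.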

[slurm-users] ReqGRES value is not valid

2019-10-03 Thread Uemoto, Tomoki
Hi all, I want to configure generic consumable resources (gpu) and confirm that the resources are assigned to jobs on each node. I executed the following settings.
o gres.conf
    Name=gpu File=/dev/tty[0-3] CPUs=[0-24]
    Name=gpu File=/dev/tty[4-7] CPUs=[25-47]
o slurm.conf
    TaskPlugin=task/af

Re: [slurm-users] ReqGRES value is not valid

2019-10-03 Thread Chris Samuel
On 3/10/19 10:23 pm, Uemoto, Tomoki wrote:
> I don't know why it return value of ReqGres is 0.
Which version of Slurm are you on? Also there looks to be a typo, you've got "prun" not "srun" in your batch script.
All the best, Chris
-- Chris Samuel : http://www.csamuel.org/ : Berkeley, C

Re: [slurm-users] ReqGRES value is not valid

2019-10-03 Thread Uemoto, Tomoki
Thank you for your reply. I'm sorry, there were some mistakes.
    # srun --version
    slurm 18.08.6
    #
    $ cat gresgpu.sh
    #!/bin/bash
    #SBATCH -J gresgpu # Job name
    #SBATCH --gres=gpu:2
    #SBATCH -o job.%j.out # Name of stdout output file (%j expands to jobId)
    srun sleep 60
    $
o gres.conf