[slurm-users] can't use GRES to increase Raw Usage

2021-01-07 Thread Erik Bryer
I have 2 partitions:
PartitionName=scavenge Nodes=saga-test01,saga-test02 MaxTime=72:00:00 State=UP PriorityTier=0 PreemptMode=REQUEUE AllowQos=scavenge AllowAccounts=borrowed,gaia default=yes TRESBillingWeights="CPU=1.0,Mem=0.25G,GRES/foolsgold=20"
PartitionName=scavtres Nodes=saga-test01,s
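As a rough illustration of how the TRESBillingWeights string quoted in the message above feeds into billing, the sketch below computes a weighted sum of allocated resources. It assumes Slurm's default behaviour of summing the weighted TRES (that is, PriorityFlags=MAX_TRES is not set) and that the Mem weight applies per GB. The job allocation and the function name `billing` are hypothetical, chosen only for illustration; Slurm itself may store the billing value as an integer.

```python
# Weights copied from the scavenge partition's TRESBillingWeights.
# Assumption: the "G" suffix means the Mem weight applies per GB.
WEIGHTS = {"CPU": 1.0, "Mem": 0.25, "GRES/foolsgold": 20.0}


def billing(alloc, weights=WEIGHTS):
    """Return the weighted sum of allocated TRES.

    Assumes the default summing behaviour (no MAX_TRES priority flag).
    TRES without a configured weight contribute nothing.
    """
    return sum(weights.get(name, 0.0) * amount for name, amount in alloc.items())


# Hypothetical job: 2 CPUs, 1 GB of memory, 1 foolsgold GRES.
job = {"CPU": 2, "Mem": 1.0, "GRES/foolsgold": 1}
print(billing(job))  # 2*1.0 + 1*0.25 + 1*20 = 22.25
```

Under these assumptions a single foolsgold GRES dominates the bill, so if Raw Usage does not rise when the GRES is requested, the weight is probably not being applied to the job at all.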

[slurm-users] can't allocate 1 gpu per job

2021-01-06 Thread Erik Bryer
I have 4 gres gpus called foolsgold that I am trying to allocate, 1-to-a-job. But allocating 1 gpu allocates all gpus to that job, it seems. My batch script is:
#!/bin/bash
#SBATCH --partition=scavenge
#SBATCH --qos=scavenge
#SBATCH --account=borrowed
#SBATCH --nodes=1
#SBATCH --tasks=1
#SBATCH -

Re: [slurm-users] trying to add gres

2021-01-05 Thread Erik Bryer
users@lists.schedmd.com Subject: Re: [slurm-users] trying to add gres On 24/12/20 4:42 pm, Erik Bryer wrote: > I made sure my slurm.conf is synchronized across machines. My intention > is to add some arbitrary gres for testing purposes. Did you update your gres.conf on all the nodes to match? A

[slurm-users] trying to add gres

2020-12-24 Thread Erik Bryer
Hello List, I am trying to change:
NodeName=saga-test02 CPUS=2 SocketsPerBoard=1 CoresPerSocket=2 ThreadsPerCore=1 RealMemory=1800 State=UNKNOWN
to
NodeName=saga-test02 CPUS=2 SocketsPerBoard=1 CoresPerSocket=2 ThreadsPerCore=1 RealMemory=1800 State=UNKNOWN Gres=gpu:foolsgold:4
But I get this er

[slurm-users] getting fairshare

2020-12-16 Thread Erik Bryer
$ sshare -a Account User RawShares NormShares RawUsage EffectvUsage FairShare -- -- --- --- - -- root 0.00 158 1.00 root

Re: [slurm-users] gres names

2020-12-16 Thread Erik Bryer
I just found an error in my attempt. I ran on saga-test02 while I'd made the change to saga-test01. Things are working better now. Thanks, Erik From: Erik Bryer Sent: Wednesday, December 16, 2020 8:51 AM To: Slurm User Community List Subject: Re: [slurm-

Re: [slurm-users] gres names

2020-12-16 Thread Erik Bryer
ember 16, 2020 12:07 AM To: Slurm User Community List Subject: Re: [slurm-users] gres names Hi Erik, Erik Bryer writes: > Thanks for your reply. I can't find NVML in the logs going back to > 11/22. dmesg goes back to the last boot, but has no mention of > NVML. Regarding make one

Re: [slurm-users] gres names

2020-12-15 Thread Erik Bryer
___ From: slurm-users on behalf of Michael Di Domenico Sent: Tuesday, December 15, 2020 1:24 PM To: Slurm User Community List Subject: Re: [slurm-users] gres names you can either make them up on your own or they get spit out by NVML in the slurmd.log file On Tue, Dec 15, 2020 at 12:55 PM Er

[slurm-users] gres names

2020-12-15 Thread Erik Bryer
Hi, Where do I get the gres names, e.g. "rtx2080ti", to use for my gpus in my node definitions in slurm.conf? Thanks, Erik

Re: [slurm-users] FairShare

2020-12-02 Thread Erik Bryer
I read that link. If Fair Share is so rational (low users get high scores, and high users get low scores), then why do ajoel's and xtsao's Fair Share scores differ this much? Their Level Fair Share scores make more sense. >sray ajoel 10.05 42449

Re: [slurm-users] FairShare

2020-12-02 Thread Erik Bryer
I'm not talking about the Level Fair Share. That's easy to compute. I'm talking about Fair Share -- what sshare prints out on the rightmost side. From: slurm-users on behalf of Ryan Cox Sent: Wednesday, December 2, 2020 10:31 AM To: Slurm User Community List ; M

Re: [slurm-users] can't lengthen my jobs log

2020-11-12 Thread Erik Bryer
That worked pretty well in that I got more data than I ever have before by a lot. It only goes back about 18 days, but I'm not sure why. The slurmdbd.conf back then contained no directives on retaining logs, which is supposed to mean it defaults to retaining them indefinitely. On another test cl

[slurm-users] sprio not working

2020-10-27 Thread Erik Bryer
Hi, When I type $ sprio JOBID PARTITION PRIORITY SITE AGE I get the headers only. The result is the same if I type sprio --users=ebryer. If I type $ sprio -j 4014879 Unable to find jobs matching user/id(s) specified It is as though I typed an invalid jobid. I checked th

[slurm-users] fake_gpus.conf, spoofing GPUs

2020-10-27 Thread Erik Bryer
There is a section in gres_gpu.c that refers to fake_gpus which appears to be for spoofing GPUs. I tried creating a file /etc/slurm/fake_gpus.conf containing "gtx1080ti|1|1||/dev/nvidia0", but "scontrol show node saga-test01" showed GRES=(null); it didn't work. Does anyone have experience wit