I have 2 partitions:
PartitionName=scavenge Nodes=saga-test01,saga-test02 MaxTime=72:00:00 State=UP PriorityTier=0 PreemptMode=REQUEUE AllowQos=scavenge AllowAccounts=borrowed,gaia Default=YES TRESBillingWeights="CPU=1.0,Mem=0.25G,GRES/foolsgold=20"
PartitionName=scavtres Nodes=saga-test01,s
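For reference, with the scavenge partition's TRESBillingWeights the billing TRES is a weighted sum over the allocation. A worked example using the weights above: a job with 2 CPUs, 4G of memory and 1 foolsgold GPU bills

2*1.0 + 4*0.25 + 1*20 = 23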
I have 4 gres gpus called foolsgold that I am trying to allocate, 1-to-a-job.
But allocating 1 gpu allocates all gpus to that job, it seems. My batch script
is:
#!/bin/bash
#SBATCH --partition=scavenge
#SBATCH --qos=scavenge
#SBATCH --account=borrowed
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH -
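A minimal sketch of such a script, assuming the intent is one GPU per job via --gres (the --gres line and the srun payload are assumptions, not the original script):

#!/bin/bash
#SBATCH --partition=scavenge
#SBATCH --qos=scavenge
#SBATCH --account=borrowed
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --gres=gpu:foolsgold:1    # request one GPU of type foolsgold (assumed)
srun hostname                     # placeholder payload

If scontrol show job still reports all four GPUs allocated, the node-side gres.conf (see the reply below) is the first thing to check.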
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] trying to add gres
On 24/12/20 4:42 pm, Erik Bryer wrote:
> I made sure my slurm.conf is synchronized across machines. My intention
> is to add some arbitrary gres for testing purposes.
Did you update your gres.conf on all the nodes to match?
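For example, a minimal gres.conf sketch that would match the Gres=gpu:foolsgold:4 node definition used later in this thread (the device paths are an assumption for illustration):

# /etc/slurm/gres.conf on each node carrying the GRES
NodeName=saga-test0[1-2] Name=gpu Type=foolsgold File=/dev/nvidia[0-3]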
Hello List,
I am trying to change:
NodeName=saga-test02 CPUs=2 SocketsPerBoard=1 CoresPerSocket=2 ThreadsPerCore=1 RealMemory=1800 State=UNKNOWN
to
NodeName=saga-test02 CPUs=2 SocketsPerBoard=1 CoresPerSocket=2 ThreadsPerCore=1 RealMemory=1800 State=UNKNOWN Gres=gpu:foolsgold:4
But I get an error.
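For what it's worth, a node definition change like this has to reach every node, and in Slurm of this vintage adding a Gres typically needs daemon restarts rather than just scontrol reconfigure. A sketch, assuming systemd units:

# after copying the edited slurm.conf (and gres.conf) to all nodes
systemctl restart slurmctld                      # on the controller
systemctl restart slurmd                         # on saga-test01 and saga-test02
scontrol show node saga-test02 | grep -i gres    # verify Gres=gpu:foolsgold:4 shows up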
$ sshare -a
             Account       User  RawShares  NormShares    RawUsage  EffectvUsage  FairShare
-------------------- ---------- ---------- ----------- ----------- ------------- ----------
root                                              0.00         158          1.00
root
I just found an error in my attempt. I ran on saga-test02 while I'd made the
change to saga-test01. Things are working better now.
Thanks,
Erik
From: Erik Bryer
Sent: Wednesday, December 16, 2020 8:51 AM
To: Slurm User Community List
Subject: Re: [slurm-users] gres names
Sent: Wednesday, December 16, 2020 12:07 AM
To: Slurm User Community List
Subject: Re: [slurm-users] gres names
Hi Erik,
Erik Bryer writes:
> Thanks for your reply. I can't find NVML in the logs going back to
> 11/22. dmesg goes back to the last boot, but has no mention of
> NVML. Regarding make one
From: slurm-users on behalf of Michael Di Domenico
Sent: Tuesday, December 15, 2020 1:24 PM
To: Slurm User Community List
Subject: Re: [slurm-users] gres names
you can either make them up on your own or they get spit out by NVML
in the slurmd.log file
On Tue, Dec 15, 2020 at 12:55 PM Erik Bryer wrote:
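A sketch of pulling the NVML-detected names out of a node's log, assuming AutoDetect=nvml is set in gres.conf and the usual log location (both are assumptions; the exact message format varies by version):

# on the GPU node, after slurmd has started
grep -i nvml /var/log/slurmd.log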
Hi,
Where do I get the gres names, e.g. "rtx2080ti", to use for my gpus in my node
definitions in slurm.conf?
Thanks,
Erik
I read that link. If Fair Share is so rational (low users get high scores, and
high users get low scores), then why do ajoel's and xtsao's Fair Share scores
differ so much? Their Level Fair Share scores make more sense.
>sray ajoel 10.05 42449
I'm not talking about the Level Fair Share; that's easy to compute. I'm talking
about Fair Share -- what sshare prints in the rightmost column.
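For context, and assuming this cluster runs the default Fair Tree algorithm: at each level of the account tree

LevelFS = S / U    (S = NormShares, U = EffectvUsage)

but the FairShare value in sshare's rightmost column is not LevelFS itself. Fair Tree sorts users by LevelFS down the tree and assigns the final factor by rank (roughly rank divided by the number of users), so two users with similar LevelFS can still land far apart in FairShare.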
From: slurm-users on behalf of Ryan Cox
Sent: Wednesday, December 2, 2020 10:31 AM
To: Slurm User Community List ; M
That worked pretty well: I got far more data than I ever have before. It only
goes back about 18 days, though, and I'm not sure why. The slurmdbd.conf back
then contained no retention directives, which is supposed to mean records are
kept indefinitely. On another test cl
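If the 18-day horizon does turn out to be purging, these are the slurmdbd.conf directives that control it (values here are placeholders, not recommendations; absent directives mean records are kept forever):

# /etc/slurm/slurmdbd.conf -- record retention
PurgeEventAfter=12month
PurgeJobAfter=12month
PurgeStepAfter=2month
PurgeUsageAfter=24month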
Hi,
When I type
$ sprio
JOBID PARTITION PRIORITY SITE AGE
I get the headers only. The result is the same if I type sprio --users=ebryer.
If I type
$ sprio -j 4014879
Unable to find jobs matching user/id(s) specified
It is as though I typed an invalid jobid. I checked th
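One likely cause, though this is a guess from the symptoms: sprio only reports pending jobs, so a job that is already running or finished yields exactly that "Unable to find jobs" message. A quick check:

squeue --states=PENDING --users=ebryer    # is anything actually pending?
sprio -j <jobid>                          # only meaningful while <jobid> is pending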
There is a section in gres_gpu.c that refers to fake_gpus which appears to
be for spoofing GPUs. I tried creating a file /etc/slurm/fake_gpus.conf
containing "gtx1080ti|1|1||/dev/nvidia0", but "scontrol show node saga-test01"
showed GRES=(null); it didn't work. Does anyone have experience wit
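A commonly mentioned workaround for testing without real GPUs, offered as community folklore rather than documented behaviour, is to point gres.conf File entries at character devices that do exist on the node:

# /etc/slurm/gres.conf -- fake GPU entries for testing (device paths are placeholders)
NodeName=saga-test01 Name=gpu Type=gtx1080ti File=/dev/tty[0-3]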