Thanks, Jeff! We upgraded Slurm to 18.08.4 and Weight now works. But is it also possible to use this parameter together with the priority/multifactor plugin?

Thanks in advance. Regards,
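(For context on the multifactor question: as far as I can tell, node Weight is applied when the scheduler selects nodes for a job and is independent of how job priorities are computed, so it should coexist with PriorityType=priority/multifactor. A minimal, untested slurm.conf sketch, with the priority weights as placeholders and the node lines taken from the test setup below:)

====

# Job ordering uses the multifactor priority plugin; node Weight still
# steers each job toward the lowest-weight eligible nodes.
PriorityType=priority/multifactor
PriorityWeightAge=1000
PriorityWeightFairshare=10000
PriorityWeightQOS=1000

NodeName=devcn002 RealMemory=3007 Sockets=2 CoresPerSocket=1 Weight=1
NodeName=devcn050 RealMemory=3007 Sockets=2 CoresPerSocket=1 Weight=200
NodeName=devcn001 RealMemory=2000 Sockets=2 CoresPerSocket=1 Weight=500

====

The weight that slurmctld actually sees for a node can be checked with "scontrol show node devcn002 | grep Weight".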
On Tue, Dec 3, 2019 at 17:37, Sarlo, Jeffrey S (<jsa...@central.uh.edu>) wrote:

> Which version of Slurm are you using? I know that in the early 18.08
> releases, prior to 18.08.4, there was a bug with weights not working.
> Once we got past 18.08.4, weights worked for us.
>
> Jeff
>
> University of Houston - HPC
>
> *From:* slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] *On Behalf Of* Sistemas NLHPC
> *Sent:* Tuesday, December 03, 2019 12:33 PM
> *To:* Slurm User Community List
> *Subject:* Re: [slurm-users] Slurm configuration, Weight Parameter
>
> Hi Renfro,
>
> I am testing this configuration, kept as clean and simple as possible:
>
> ====
>
> NodeName=devcn050 RealMemory=3007 Features=3007MB Weight=200 State=idle Sockets=2 CoresPerSocket=1
> NodeName=devcn002 RealMemory=3007 Features=3007MB Weight=1 State=idle Sockets=2 CoresPerSocket=1
> NodeName=devcn001 RealMemory=2000 Features=2000MB Weight=500 State=idle Sockets=2 CoresPerSocket=1
>
> PartitionName=slims Nodes=devcn001,devcn002,devcn050 Default=yes Shared=yes State=up
>
> ====
>
> Is an extra plugin or parameter needed in your config for the Weight option?
>
> The configuration does not work as expected.
>
> Regards,
>
> On Sat, Nov 30, 2019 at 10:30, Renfro, Michael (<ren...@tntech.edu>) wrote:
>
> We've been using that weighting scheme for a year or so, and it works as
> expected. I'm not sure how Slurm would react to multiple NodeName=DEFAULT
> lines like you have, but here are our node settings and a subset of our
> partition settings.
>
> In our environment, we often have lots of idle cores on GPU nodes, since
> those jobs tend to be GPU-bound rather than CPU-bound. So in one of our
> interactive partitions, we let non-GPU jobs take up to 12 cores of a GPU
> node. Additionally, we have three memory configurations in our main batch
> partition, and we want to bias jobs toward the smaller-memory nodes by
> default. The same principle applies to our GPU partition, where the
> smaller-memory GPU nodes get jobs before the larger-memory GPU node.
>
> =====
>
> NodeName=gpunode[001-003] CoresPerSocket=14 RealMemory=382000 Sockets=2 ThreadsPerCore=1 Weight=10011 Gres=gpu:2
> NodeName=gpunode004 CoresPerSocket=14 RealMemory=894000 Sockets=2 ThreadsPerCore=1 Weight=10021 Gres=gpu:2
> NodeName=node[001-022] CoresPerSocket=14 RealMemory=62000 Sockets=2 ThreadsPerCore=1 Weight=10201
> NodeName=node[023-034] CoresPerSocket=14 RealMemory=126000 Sockets=2 ThreadsPerCore=1 Weight=10211
> NodeName=node[035-040] CoresPerSocket=14 RealMemory=254000 Sockets=2 ThreadsPerCore=1 Weight=10221
>
> PartitionName=any-interactive Default=NO MinNodes=1 MaxNodes=4 MaxTime=02:00:00 AllowGroups=ALL PriorityJobFactor=3 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO DefMemPerCPU=2000 AllowAccounts=ALL AllowQos=ALL LLN=NO MaxCPUsPerNode=12 ExclusiveUser=NO OverSubscribe=NO OverTimeLimit=0 State=UP Nodes=node[001-040],gpunode[001-004]
>
> PartitionName=batch Default=YES MinNodes=1 MaxNodes=40 DefaultTime=1-00:00:00 MaxTime=30-00:00:00 AllowGroups=ALL PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO DefMemPerCPU=2000 AllowAccounts=ALL AllowQos=ALL LLN=NO ExclusiveUser=NO OverSubscribe=NO OverTimeLimit=0 State=UP Nodes=node[001-040]
>
> PartitionName=gpu Default=NO MinNodes=1 DefaultTime=1-00:00:00 MaxTime=30-00:00:00 AllowGroups=ALL PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO DefMemPerCPU=2000 AllowAccounts=ALL AllowQos=ALL LLN=NO MaxCPUsPerNode=16 QoS=gpu ExclusiveUser=NO OverSubscribe=NO OverTimeLimit=0 State=UP Nodes=gpunode[001-004]
>
> =====
>
> On Nov 29, 2019, at 8:09 AM, Sistemas NLHPC <siste...@nlhpc.cl> wrote:
>
> Hi All,
>
> Thanks, all, for your posts.
>
> Reading the Slurm documentation and other sites such as the Niflheim wiki,
> https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#node-weight (Ole Holm Nielsen),
> the "Weight" parameter assigns a value to each node, which lets you set a
> preference among nodes. But I have not obtained positive results.
>
> Thanks in advance
>
> Regards
>
> On Sat, Nov 23, 2019 at 14:18, Chris Samuel (<ch...@csamuel.org>) wrote:
>
> > On 23/11/19 9:14 am, Chris Samuel wrote:
> >
> > > My gut instinct (and I've never tried this) is to make the 3GB nodes be
> > > in a separate partition that is guarded by AllowQos=3GB and have a QOS
> > > called "3GB" that uses MinTRESPerJob to require jobs to ask for more
> > > than 2GB of RAM to be allowed into the QOS.
> >
> > Of course there's nothing to stop a user requesting more memory than
> > they need to get access to these nodes, but that's a social issue, not a
> > technical one. :-)
> >
> > --
> > Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA
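(For reference, a rough, untested sketch of the AllowQos + MinTRESPerJob approach Chris describes, using the 3 GB nodes from the test config above as examples. The partition name is made up, the exact sacctmgr option spelling and the availability of MinTRESPerJob depend on your Slurm version, so check the sacctmgr and slurm.conf man pages; TRES memory values are in MB.)

====

# Accounting side: create a QOS that only admits jobs requesting more than 2 GB of RAM
sacctmgr add qos 3GB
sacctmgr modify qos 3GB set MinTRESPerJob=mem=2049

# slurm.conf side: guard the 3 GB nodes with that QOS (partition name is just an example)
PartitionName=bigmem Nodes=devcn002,devcn050 AllowQos=3GB State=UP

====

Jobs would then have to be submitted with --qos=3GB and request more than 2 GB of memory to be allowed onto those nodes.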