Re: [slurm-users] Topology configuration questions:

2019-01-22 Thread Ryan Novosielski
Prentice (and others) — if the NodeWeight/topology plugin interaction bothers you, feel free to tack onto bug 6384. https://bugs.schedmd.com/show_bug.cgi?id=6384 > On Jan 22, 2019, at 1:15 PM, Prentice Bisbal wrote: > > Killian, > > Thanks for the input. Unfortunately, all of this information

Re: [slurm-users] Topology configuration questions:

2019-01-22 Thread Prentice Bisbal
Killian, Thanks for the input. Unfortunately, all of this information from you, Ryan and others, is really ruining my plans, since it makes it look like my plan to fix a problem with my cluster will not be as easy to fix as I'd hoped. One of the issues with my "Frankencluster" is that I'd like

Re: [slurm-users] Topology configuration questions:

2019-01-22 Thread Prentice Bisbal
Ryan, Thanks for looking into this. I hadn't had a chance to revisit the documentation since posing my question. Thanks for doing that for me. Prentice Bisbal Lead Software Engineer Princeton Plasma Physics Laboratory http://www.pppl.gov On 1/18/19 2:58 PM, Ryan Novosielski wrote: The docume

Re: [slurm-users] Topology configuration questions:

2019-01-18 Thread Ryan Novosielski
The documentation indicates you need it everywhere: https://slurm.schedmd.com/topology.conf.html "Changes to the configuration file take effect upon restart of Slurm daemons, daemon receipt of the SIGHUP signal, or execution of the command "scontrol reconfigure" unless otherwise noted." I have

Re: [slurm-users] Topology configuration questions:

2019-01-18 Thread Ryan Novosielski
> On Jan 18, 2019, at 11:53 AM, Kilian Cavalotti > wrote: > > On Fri, Jan 18, 2019 at 6:31 AM Prentice Bisbal wrote: >>> Note that if you care about node weights (eg. NodeName=whatever001 >>> Weight=2, etc. in slurm.conf), using the topology function will disable it. >>> I believe I was pro

Re: [slurm-users] Topology configuration questions:

2019-01-18 Thread Kilian Cavalotti
On Fri, Jan 18, 2019 at 6:31 AM Prentice Bisbal wrote: > > Note that if you care about node weights (eg. NodeName=whatever001 > > Weight=2, etc. in slurm.conf), using the topology function will disable it. > > I believe I was promised a warning about that in the future in a > > conversation wit

Re: [slurm-users] Topology configuration questions:

2019-01-18 Thread Prentice Bisbal
On 01/17/2019 07:55 PM, Fulcomer, Samuel wrote: We use topology.conf to segregate architectures (Sandy->Skylake), and also to isolate individual nodes with 1Gb/s Ethernet rather than IB (older GPU nodes with deprecated IB cards). In the latter case, topology.conf had a switch entry for each no

Re: [slurm-users] Topology configuration questions:

2019-01-18 Thread Prentice Bisbal
On 01/17/2019 06:36 PM, Ryan Novosielski wrote: I don’t actually know the answer to this one, but we have it provisioned to all nodes. Note that if you care about node weights (eg. NodeName=whatever001 Weight=2, etc. in slurm.conf), using the topology function will disable it. I believe I

Re: [slurm-users] Topology configuration questions:

2019-01-17 Thread Fulcomer, Samuel
ent:* Thursday, January 17, 2019 5:58 PM > *To:* Slurm User Community List > *Subject:* Re: [slurm-users] Topology configuration questions: > > We use topology.conf to segregate architectures (Sandy->Skylake), and also > to isolate individual nodes with 1Gb/s Ethernet rather than IB (o

Re: [slurm-users] Topology configuration questions:

2019-01-17 Thread Nicholas McCollum
odes using the --constraint=whatever flag. Nicholas McCollum Alabama Supercomputer Authority From: "Fulcomer, Samuel" Sent: Thursday, January 17, 2019 5:58 PM To: Slurm User Community List Subject: Re: [slurm-users] Topology configuration questions: We

Re: [slurm-users] Topology configuration questions:

2019-01-17 Thread Fulcomer, Samuel
We use topology.conf to segregate architectures (Sandy->Skylake), and also to isolate individual nodes with 1Gb/s Ethernet rather than IB (older GPU nodes with deprecated IB cards). In the latter case, topology.conf had a switch entry for each node. It used to be the case that SLURM was unhappy wi
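As a rough illustration of the "switch entry for each node" approach described above, here is a minimal Python sketch that prints one topology.conf line per isolated node. The node names and the "eth-" switch-name prefix are hypothetical and not taken from the thread; only the SwitchName=/Nodes= line format comes from the topology.conf documentation.

```python
def per_node_switch_lines(nodes, prefix="eth-"):
    """Return one topology.conf switch line per node.

    Each node gets its own single-node leaf switch, so the scheduler
    treats it as isolated from every other node in the topology.
    The prefix is a hypothetical naming choice.
    """
    return [f"SwitchName={prefix}{node} Nodes={node}" for node in nodes]


# Hypothetical Ethernet-only GPU nodes (names invented for this example).
isolated_nodes = ["gpu001", "gpu002"]
lines = per_node_switch_lines(isolated_nodes)
print("\n".join(lines))
```

The generated lines would be appended to the site's existing topology.conf; whether single-node switches are the right way to isolate nodes depends on the Slurm version and the rest of the configuration.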

Re: [slurm-users] Topology configuration questions:

2019-01-17 Thread Ryan Novosielski
I don’t actually know the answer to this one, but we have it provisioned to all nodes. Note that if you care about node weights (eg. NodeName=whatever001 Weight=2, etc. in slurm.conf), using the topology function will disable it. I believe I was promised a warning about that in the future in a

Re: [slurm-users] Topology configuration questions:

2019-01-17 Thread Ryan Novosielski
> On Jan 17, 2019, at 4:49 PM, Prentice Bisbal wrote: > > From https://slurm.schedmd.com/topology.html: > >> Note that compute nodes on switches that lack a common parent switch can be >> used, but no job will span leaf switches without a common parent (unless the >> TopologyParam=TopoOptional

Re: [slurm-users] Topology configuration questions:

2019-01-17 Thread Prentice Bisbal
And a follow-up question: Does topology.conf need to be on all the nodes, or just the slurm controller? It's not clear from that web page. I would assume only the controller needs it. Prentice On 1/17/19 4:49 PM, Prentice Bisbal wrote: From https://slurm.schedmd.com/topology.html: Note that

[slurm-users] Topology configuration questions:

2019-01-17 Thread Prentice Bisbal
From https://slurm.schedmd.com/topology.html: Note that compute nodes on switches that lack a common parent switch can be used, but no job will span leaf switches without a common parent (unless the TopologyParam=TopoOptional option is used). For example, it is legal to remove the line "Switch
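To make the quoted rule concrete, the following is a small Python sketch (not Slurm code) that groups leaf switches by their topmost ancestor. Leaf switches that end up in different groups have no common parent, so under the quoted note a job would not span them unless TopologyParam=TopoOptional is set. The switch names and layout are hypothetical, and the sketch assumes each switch has at most one parent and that the hierarchy has no cycles.

```python
def find_root(switch, parent_of):
    """Follow parent links upward until reaching a switch with no parent."""
    while switch in parent_of:
        switch = parent_of[switch]
    return switch


def group_leaves_by_root(children_of, leaf_switches):
    """Map each top-level switch to the leaf switches beneath it.

    children_of: dict of switch name -> list of child switch names
    leaf_switches: list of switch names that hold compute nodes directly
    """
    # Invert the child lists into a child -> parent lookup.
    parent_of = {
        child: parent
        for parent, kids in children_of.items()
        for child in kids
    }
    groups = {}
    for leaf in leaf_switches:
        groups.setdefault(find_root(leaf, parent_of), []).append(leaf)
    return groups


# Hypothetical layout: s0 and s1 share parent s2; s3 has no parent.
children_of = {"s2": ["s0", "s1"]}
leaf_switches = ["s0", "s1", "s3"]
islands = group_leaves_by_root(children_of, leaf_switches)
print(islands)
```

In this layout s0 and s1 fall under s2 while s3 forms its own group, which mirrors the situation the documentation describes: nodes on s3 remain usable, but a job would not be placed across s3 and the s0/s1 pair.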