Re: [slurm-users] Controller / backup controller q's

2018-05-29 Thread Patrick Goetz
On 05/25/2018 11:19 AM, Will Dennis wrote: > Not yet time for us... There's problems with U18.04 that render it unusable for our environment. What problems have you run into with 18.04?

Re: [slurm-users] Controller / backup controller q's

2018-05-25 Thread Will Dennis
On Friday, May 25, 2018 5:31 AM, Pär Lindfors wrote: > Time to start upgrading to Ubuntu 18.04 now then? :-) Not yet time for us... There's problems with U18.04 that render it unusable for our environment. > For a 10 node cluster it might make more sense to run slurmctld and slurmdbd on the

Re: [slurm-users] Controller / backup controller q's

2018-05-25 Thread Will Dennis
list. If the workloads were Dockerized, I’d probably run them via Kubernetes rather than Slurm... -Will From: slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] On Behalf Of John Hearns Sent: Friday, May 25, 2018 5:44 AM To: Slurm User Community List Subject: Re: [slurm-users] Controller

Re: [slurm-users] Controller / backup controller q's

2018-05-25 Thread John Hearns
Will, I know I will regret chiming in here. Are you able to say what cluster manager or framework you are using? I don't see a problem in running two different distributions. But as Pär says, look at your development environment. For my part, I would ask: have you thought about containerisation? ie

Re: [slurm-users] Controller / backup controller q's

2018-05-25 Thread Pär Lindfors
Hi Will, On 05/24/2018 05:43 PM, Will Dennis wrote: > (we were using CentOS 7.x originally, now the compute nodes are on Ubuntu 16.04.) Currently, we have a single controller (slurmctld) node, an accounting db node (slurmdbd), and 10 compute/worker nodes (slurmd.) Time to start upgrading

Re: [slurm-users] Controller / backup controller q's

2018-05-25 Thread Benjamin Redling
On 24.05.2018 at 17:43, Will Dennis wrote: > 3) What are the steps to replace a primary controller, given that a backup controller exists? (Hopefully this is already documented somewhere that I haven’t found yet) Why not drive such a small cluster with a single primary controller in a mig

[slurm-users] Controller / backup controller q's

2018-05-24 Thread Will Dennis
Hi all, We are building out a new Slurm cluster for a research group here; unfortunately this has taken place over a long period of time, and there's been some architectural changes made in the middle, most importantly the host OS on the Slurm nodes (we were using CentOS 7.x originally, now the