Re: [slurm-users] Database cluster

Daniel L'Hommedieu Tue, 23 Jan 2024 05:40:21 -0800

Hi Diego.

In our setup, the database is critical.  We have some wrapper scripts that 
consult the database for information, and we also set environment variables on 
login, based on user/partition associations.  If the database is down, none of 
those things work.
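
As an illustration only (this is not the actual site script), a login-time 
lookup of this kind might look roughly like the sketch below.  It assumes the 
associations are read with sacctmgr, which talks to slurmdbd, and the 
SITE_PARTITIONS and SITE_ACCOUNTS variable names are hypothetical.  The point 
is that the lookup has nothing to return when the database is unreachable.

```python
import subprocess


def parse_associations(text):
    """Parse pipe-separated User|Account|Partition rows (sacctmgr -n -P style)."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = line.split("|")
        if len(fields) != 3:
            continue  # skip anything that is not a three-column row
        user, account, partition = fields
        rows.append({"user": user, "account": account, "partition": partition})
    return rows


def fetch_associations(user, timeout=5):
    """Return association rows for a user, or None if the lookup fails.

    Assumption: sacctmgr is on PATH and can reach slurmdbd. Any failure
    (missing binary, non-zero exit, timeout) is reported as None.
    """
    cmd = [
        "sacctmgr", "-n", "-P", "show", "associations",
        f"user={user}", "format=User,Account,Partition",
    ]
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout, check=True
        )
    except (OSError, subprocess.SubprocessError):
        return None
    return parse_associations(result.stdout)


def login_environment(rows):
    """Build hypothetical login environment variables from association rows."""
    partitions = sorted({r["partition"] for r in rows if r["partition"]})
    accounts = sorted({r["account"] for r in rows if r["account"]})
    return {
        "SITE_PARTITIONS": ",".join(partitions),
        "SITE_ACCOUNTS": ",".join(accounts),
    }
```

A login hook built this way would call fetch_associations() and, on None, 
either fall back to defaults or skip setting the variables; in both cases the 
user gets a degraded session while the database is down.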


I doubt there is appetite in the organization to change the way our setup 
works, but if we can improve database reliability, that would be a good 
solution.  Mostly I want to protect against hardware failure, which is why 
I’m interested in a cluster solution such as XtraDB.

Thanks.

Daniel

> On Jan 23, 2024, at 03:23, Diego Zuccato <diego.zucc...@unibo.it> wrote:
> 
> IIUC the database is not "critical": if it goes down, you lose access to some 
> statistics. But job data gets cached anyway and the db will be updated when 
> it comes back online.
> 
> Diego
> 
> On 22/01/2024 18:23, Daniel L'Hommedieu wrote:
>> Community:
>> What do you do to ensure database reliability in your SLURM environment?  We 
>> can have multiple controllers and multiple slurmdbds, but my understanding 
>> is that slurmdbd can be configured with a single MySQL server, so what do 
>> you do?  Do you have that “single MySQL server” be a cluster, such as 
>> Percona XtraDB?  Do you use MySQL replication, then manually switch 
>> slurmdbd to a replication slave if the master goes down?  Do you do 
>> something else?
>> Thanks.
>> Daniel
> 
> -- 
> Diego Zuccato
> DIFA - Dip. di Fisica e Astronomia
> Servizi Informatici
> Alma Mater Studiorum - Università di Bologna
> V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
> tel.: +39 051 20 95786
>
