IIUC the database is not "critical": if it goes down, you lose access to
some statistics, but job data gets cached anyway and the DB is updated
once it comes back online.
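
If you want to keep an eye on how much is being cached while the DB is
unreachable, something like the sketch below might help. It assumes that
sdiag prints a line of the form "DBD Agent queue size: N" (please check
against your Slurm version); the function name dbd_queue_size is made up
for this example and is not part of Slurm.

```python
import re


def dbd_queue_size(sdiag_output):
    """Return the DBD agent queue size found in sdiag-style text, or None.

    Assumption: the text contains a line like "DBD Agent queue size: 42".
    """
    match = re.search(r"DBD Agent queue size:\s*(\d+)", sdiag_output)
    return int(match.group(1)) if match else None


# Hypothetical usage on a controller node (not run here):
#   import subprocess
#   text = subprocess.run(["sdiag"], capture_output=True, text=True).stdout
#   print(dbd_queue_size(text))
```

A value that keeps growing would suggest records are piling up and waiting
for the DB to come back.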
Diego
On 22/01/2024 18:23, Daniel L'Hommedieu wrote:
Community:
What do you do to ensure database reliability in your SLURM environment? We
can have multiple controllers and multiple slurmdbds, but my understanding is
that slurmdbd can be configured with a single MySQL server, so what do you do?
Do you have that “single MySQL server” be a cluster, such as Percona XtraDB?
Do you use MySQL replication, then manually switch slurmdbd to a replication
slave if the master goes down? Do you do something else?
Thanks.
Daniel
--
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786