On 3/21/19 6:56 PM, Ryan Novosielski wrote:
On Mar 21, 2019, at 12:21 PM, Loris Bennett <loris.benn...@fu-berlin.de> wrote:
Our last cluster only hit around 2.5 million jobs after
around 6 years, so database conversion was never an issue. For sites
with higher throughput, things may be different, but I would hope that
at those places, the managers would know the importance of planned
updates and testing.
I’d be curious about any database tuning that you, or anyone else here,
might have done. SchedMD’s guidance is minimal.
I’ve never been impressed with the performance on ours, and I’ve also seen other
sites reporting >24 hour database conversion times.
Database tuning is actually documented by SchedMD, but you have to find
the appropriate pages first ;-)
I have collected Slurm database information in my Wiki page
https://wiki.fysik.dtu.dk/niflheim/Slurm_database. You may want to
focus on these sections:
* MySQL configuration (InnoDB configuration)
* Setting database purge parameters (prune unwanted old database entries)
* Backup and restore of database (hopefully everyone does this already)
* Upgrade of MySQL/MariaDB (MySQL versions)
* Migrate the slurmdbd service to another server (I decided to do that
recently)
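
To make the first two items more concrete, here is a minimal sketch in Python that renders the two configuration fragments involved: a [mysqld] section with InnoDB options and a set of purge lines for slurmdbd.conf. The option names are real MySQL/MariaDB and slurmdbd.conf options, but the helper functions and every value shown (buffer pool size, log file size, lock timeout, retention period) are illustrative assumptions, not recommendations. Check the Wiki page and SchedMD's documentation for values that suit your site.

```python
def innodb_settings(buffer_pool_mb: int = 4096) -> str:
    """Render a [mysqld] fragment with InnoDB options worth reviewing.

    All values are placeholders (assumptions); size the buffer pool
    according to the memory available on your database server.
    """
    lines = [
        "[mysqld]",
        f"innodb_buffer_pool_size={buffer_pool_mb}M",
        "innodb_log_file_size=64M",
        "innodb_lock_wait_timeout=900",
    ]
    return "\n".join(lines) + "\n"


def purge_settings(months: int = 24) -> str:
    """Render slurmdbd.conf purge lines with one common retention period.

    The retention period is an assumption; choose it according to how
    long your site needs to keep accounting records.
    """
    keys = [
        "PurgeEventAfter",
        "PurgeJobAfter",
        "PurgeResvAfter",
        "PurgeStepAfter",
        "PurgeSuspendAfter",
    ]
    return "".join(f"{key}={months}months\n" for key in keys)


if __name__ == "__main__":
    # Print both fragments so they can be reviewed before being copied
    # into my.cnf and slurmdbd.conf by hand.
    print(innodb_settings())
    print(purge_settings())
```

The sketch only prints text and does not touch any configuration file. Any change to the InnoDB log file size or to the purge settings should be tested on a copy of the database first.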
I hope this sheds some light on what needs to be considered.
/Ole