Ole,
Fair enough, it is actually slurmctld that does the caching. Technical
typo on my part there.
I was just trying to let the user know that there is a limited window in
which the database has to come back if no accounting information is to
be lost during an outage.
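
To keep an eye on that window, one option is to watch how many accounting messages slurmctld is currently holding for slurmdbd. The sketch below is only an illustration: it assumes `sdiag` is on the PATH and prints a line of the form "DBD Agent queue size: N", and the helper names are made up for this example.

```python
import re
import subprocess


def parse_dbd_queue_size(sdiag_text):
    """Return the 'DBD Agent queue size' value found in sdiag output, or None.

    Assumption: sdiag prints a line such as 'DBD Agent queue size: 42'.
    """
    match = re.search(r"DBD Agent queue size:\s*(\d+)", sdiag_text)
    return int(match.group(1)) if match else None


def current_dbd_queue_size():
    """Run sdiag on a host with the Slurm client tools and parse its output."""
    result = subprocess.run(
        ["sdiag"], capture_output=True, text=True, check=True
    )
    return parse_dbd_queue_size(result.stdout)
```

A queue size that keeps growing while the database is down suggests the cache is filling up; how close it is to the limit depends on the site's configuration.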
Brian Andrus
On 11/1/2022 1:43 AM, Ole Holm Nielsen wrote:
Hi Brian,
On 11/1/22 05:28, Brian Andrus wrote:
It caches up to a point. As I understand it, that is about an hour
(depending on size and how busy the cluster is, as well as available
memory, etc).
Have you found any documentation of slurmdbd caching? It's well-known
that slurmctld caches information while slurmdbd is down, see for
example page 30 in the talk "Field Notes Mark 2: Random Musings From
Under A New Hat"[1] by Tim Wickberg, SchedMD:
For slurmdbd, the critical element in the failure domain is
MySQL, not slurmdbd. slurmdbd itself is stateless.
● slurmctld will cache accounting records (up to a limit) if
slurmdbd is unavailable. This can be hours+ to days+
depending on your system without data loss.
The statelessness of slurmdbd makes me think that it can't cache any
data.
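
As a rough illustration of the "up to a limit" part: as far as I recall, the slurm.conf man page describes a MaxDBDMsgs parameter whose default is 10000 or MaxJobCount * 2 + node count * 4, whichever is greater. The sketch below just evaluates that formula; the function name is hypothetical, and the formula should be checked against the man page for the Slurm version in use.

```python
def default_max_dbd_msgs(max_job_count, node_count):
    """Estimate the default number of messages slurmctld queues for slurmdbd.

    Assumption: default = max(10000, MaxJobCount * 2 + node count * 4),
    as recalled from the slurm.conf man page; verify for your version.
    """
    return max(10000, max_job_count * 2 + node_count * 4)


if __name__ == "__main__":
    # Example values only, not taken from any real cluster.
    print(default_max_dbd_msgs(max_job_count=10000, node_count=100))
```

How long that many messages last depends on how busy the cluster is, which fits the "hours+ to days+" range quoted from the slides.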
Thanks,
Ole
[1] https://slurm.schedmd.com/publications.html
On 10/31/2022 9:20 PM, Richard Chang wrote:
Hi,
Just for my info, I would like to know what happens when SlurmDBD
loses its connection to the backend database, for example MariaDB.
Does it cache the accounting info and keep it until the DB comes
back up, or does it panic and shut down?