On 2018-05-16 06:04 -0500, Dave Sherohman wrote:
> I'm running a galera cluster of three mariadb servers.  It's been
> brought down twice because one node has detected an inconsistency,
> killed itself, and then systemd automatically restarted it.  This is all
> good so far.
>
> The problem comes in when it tries to restart and, because it shut down
> in an inconsistent state, it wants to do a full state transfer to get
> back in sync with the rest of the cluster, which involves copying
> (potentially) 24G of data.  Performing this transfer takes long enough
> that systemd times out, assumes the restart failed, and kills it, so
> it's not possible to bring the node back online without bypassing
> systemd and running mysqld_safe manually.
>
> Based on some web searches, I've tried using `systemctl edit mysql` to
> set "TimeoutStartUSec=infinity", but this does not appear to actually
> have any effect, even after reloading the systemd daemon:
>
> # systemctl edit mysql
> <add the setting using vim>
> # systemctl show mysql -p TimeoutStartUSec
> TimeoutStartUSec=1min 30s
> # systemctl daemon-reload
> # systemctl show mysql -p TimeoutStartUSec
> TimeoutStartUSec=1min 30s
> # systemctl daemon-reexec
> # systemctl show mysql -p TimeoutStartUSec
> TimeoutStartUSec=1min 30s
>
> What do I need to do to make this actually work?

Have you tried setting TimeoutStartSec rather than TimeoutStartUSec?

Though I have to admit that I did not perform a web search but cheated
by looking at the systemd.service(5) manpage, which mentions the former
but not the latter.

Cheers,
Sven
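
P.S. In case it helps, here is a rough sketch of what the drop-in could look like when written by hand instead of through `systemctl edit`. It assumes the unit really is called mysql.service and that the override lives in the usual /etc/systemd/system/mysql.service.d/override.conf; the helper name write_timeout_dropin is made up for illustration and is not part of systemd. The directive goes in the [Service] section, and TimeoutStartUSec is only the name under which `systemctl show` reports the value.

```shell
#!/bin/sh
# Sketch only: write a drop-in that sets the start timeout for a unit.
# write_timeout_dropin is a hypothetical helper, not a systemd command.
write_timeout_dropin() {
    dropin_dir=$1    # assumed location, e.g. /etc/systemd/system/mysql.service.d
    timeout=$2       # e.g. infinity, or a time span such as 30min
    mkdir -p "$dropin_dir" || return 1
    # The unit file directive is TimeoutStartSec= and belongs in [Service].
    printf '[Service]\nTimeoutStartSec=%s\n' "$timeout" \
        > "$dropin_dir/override.conf"
}

# Possible usage as root, followed by a reload so systemd rereads the unit:
#   write_timeout_dropin /etc/systemd/system/mysql.service.d infinity
#   systemctl daemon-reload
#   systemctl show mysql -p TimeoutStartUSec
```

If "infinity" is not accepted by the installed systemd version, a generous finite value such as 30min would be the obvious thing to try instead.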