On Fri, May 17, 2019 at 6:28 AM Mick <michaelkintz...@gmail.com> wrote:
>
> Count yourself lucky. You could have discovered your disk wouldn't spin up
> again, your PSU packed up, or even the MoBo chipset decided to retire from
> active service. Eventually, any of these hardware problems would manifest
> themselves, but a reboot could reveal their demise sooner and hopefully at a
> point where you were somewhat prepared for it.
>

++

You can't completely prevent reboots (not unless you are willing to spend big and go mainframe or something like that - and those create a different set of issues). What you can do is take steps to reduce the risk that an unplanned reboot will cause problems.

One of the best ways to ensure you're prepared for disaster is to make disaster routine. Regular reboots can be a part of this, because you can do them at a time when you have time to deal with problems, and when you're looking for problems.

This is why I've largely made the move to containers. I still have a few processes running on my host, but almost everything has moved into containers that each do one thing. When I update a container I take a snapshot, run updates, shut it down, take another snapshot, start it up, and test the service it runs. Since each container only does one thing, I know exactly what to test. If it works I'm good, and if it doesn't work I can roll it back and not worry about what that might break on the 47 other services running on the same host. Every update involves an effective reboot for that one service, so I know that in the event of a host reboot they will generally all come up fine.

I of course update the host regularly and reboot it for kernel updates, which seem to come about twice a week these days anyway. Obviously I don't run updates the day before I leave on vacation, unless they are security critical, and then I exercise some care.

The downside is that I end up with a lot more hosts to keep up to date, because I can't just run emerge -u world once on one host and have every service I run updated. However, I gladly accept the extra work because the work itself becomes much simpler and more predictable. If I'm updating my imapd container and imapd still works, then I'm fine. I don't have to worry about suddenly realizing two days later that postgrey is bouncing a ton of mail or whatever.
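
For what it's worth, the per-container routine described above (snapshot, update, shut down, snapshot, start, test, roll back on failure) could be sketched roughly like this in Python. This is only an illustration under assumptions: Container, its methods, and the service_ok check are hypothetical names made up here, and a real version would call whatever snapshot and container tooling you actually use.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Container:
    """Hypothetical stand-in for one single-purpose container."""
    name: str
    version: int = 1          # pretend "installed software state"
    running: bool = True
    snapshots: List[int] = field(default_factory=list)

    def snapshot(self) -> int:
        # Record the current state and return an id for it.
        self.snapshots.append(self.version)
        return len(self.snapshots) - 1

    def rollback(self, snap_id: int) -> None:
        # Restore the state recorded by an earlier snapshot.
        self.version = self.snapshots[snap_id]

    def update(self) -> None:
        # Stand-in for running the package updates inside the container.
        self.version += 1

    def stop(self) -> None:
        self.running = False

    def start(self) -> None:
        self.running = True


def update_container(c: Container,
                     service_ok: Callable[[Container], bool]) -> bool:
    """Update one container; roll back only that container on failure."""
    before = c.snapshot()   # snapshot before touching anything
    c.update()              # run updates
    c.stop()                # shut it down
    c.snapshot()            # snapshot the updated, stopped state
    c.start()               # the "effective reboot" for this one service
    if service_ok(c):       # test the one thing this container does
        return True
    # Test failed: go back to the pre-update snapshot and restart.
    c.stop()
    c.rollback(before)
    c.start()
    return False


# Example use with a trivial, made-up service check.
imapd = Container("imapd")
updated = update_container(imapd, lambda c: c.running)
```

The point of passing in service_ok is that each container has exactly one check that matters, so a failure never touches anything else on the host.
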
If something obscure like a text editor breaks in my imapd container and I don't catch it, that might be an annoyance, but it doesn't really impact me much since it isn't critical to the operation of that container.

-- 
Rich