On 2018-04-17 17:32:45 -0400, Bruce Momjian wrote:
> On Mon, Apr 9, 2018 at 03:42:35PM +0200, Tomas Vondra wrote:
> > That doesn't seem like a very practical way. It's better than nothing,
> > of course, but I wonder how would that work with containers (where I
> > think you may not have access to the kernel log at all). Also, I'm
> > pretty sure the messages do change based on kernel version (and possibly
> > filesystem) so parsing it reliably seems rather difficult. And we
> > probably don't want to PANIC after I/O error on an unrelated device, so
> > we'd need to understand which devices are related to PostgreSQL.

You can certainly have access to the kernel log in containers. I'd
assume such a script wouldn't check various system logs but instead
tail /dev/kmsg or such. Otherwise the variance between installations
would be too big.

There aren't *that* many different types of error messages, and they
don't change that often. If we just detected errors for the most common
filesystems we'd probably be good. Detecting a few general storage layer
messages wouldn't be that hard either; most things have been unified
over the last ~8-10 years.

> Replying to your specific case, I am not sure how we would use a script
> to check for I/O errors/space-exhaustion if the postgres user doesn't
> have access to it.

Not sure what you mean?

Space exhaustion can be checked when allocating space, FWIW. We'd just
need to use posix_fallocate et al.

> Does O_DIRECT work in such container cases?

Yes.

Greetings,

Andres Freund
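
To illustrate the posix_fallocate remark above, here is a minimal sketch
of detecting space exhaustion at allocation time rather than at write or
fsync time. The helper name reserve_space() and its signature are made up
for illustration; this is not PostgreSQL code, only an assumption about
how such a check could look.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose posix_fallocate() */
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): make sure the first `len`
 * bytes of `path` are backed by allocated storage, creating the file if
 * it does not exist.
 *
 * Returns 0 on success, or an errno value (e.g. ENOSPC) on failure.
 */
static int
reserve_space(const char *path, off_t len)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    int rc;

    if (fd < 0)
        return errno;

    /*
     * posix_fallocate() returns the error number directly and does not
     * set errno. ENOSPC here means the space is not available, so the
     * caller finds out before writing any data.
     */
    rc = posix_fallocate(fd, 0, len);

    close(fd);
    return rc;
}
```

A caller would treat a return value of ENOSPC from
reserve_space("some/file", 8192) as out-of-space and fail cleanly at that
point. Whether the filesystem reserves blocks natively or the C library
falls back to writing into each block depends on the platform.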