On Sun, Oct 31, 2021 at 10:59 AM Tom Lane <t...@sss.pgh.pa.us> wrote: > The general policy at the moment is that a normally-functioning server > should emit *no* log traffic by default (other than a few messages > at startup and shutdown). log_checkpoints is a particularly poor > candidate for an exception to that policy, because it would produce so > much traffic. No DBA would be likely to consider it as anything but > log spam.
That's absolutely false. On any system where there's anything actually happening, there's going to be tons of stuff in the log because there are going to be failed connection attempts, queries that result in errors, and all kinds of other things like that. By any reasonable metric, the log volume of log_checkpoints=on is tiny. It can't log anything more than once per checkpoint interval, which means you're not even talking about 1 message per minute. You can easily have hundreds or thousands of errors or other log messages from user activity *per second* and even on a relatively quiet system that stuff is going to completely dwarf what you get from log_checkpoints. If log_checkpoints=on is your biggest source of log output, what you need isn't so much 'log_checkpoints=off' as 'pg_ctl stop'. > > It seems the checkpoint stats, that are emitted to server logs when > > the GUC log_checkpoints is enabled, are so important that a postgres > > database provider would ever want to disable the GUC. > > This statement seems ridiculous on its face. If users need to wait > with bated breath for a checkpoint report, we have something else > we need to fix. Besides appearing to be unwarranted mockery of what Bharath wrote, this statement also seems to reflect a complete lack of understanding of what is involved with maintaining a production system. Most of the time, things just work, so you don't need to look at the logs at all. But when things do go wrong, then you need some way to figure out what the problem is. System-wide performance problems not linked to an individual query are most often caused by either of two things: checkpoints, and big autovacuum jobs. If you've set up logging for those things in advance, then if and when the problem happens, you will have a chance of being able to understand it and solve it quickly. If you have not, the very first piece of advice you're going to get from me is to (1) check whether there are any autovacuum workers still running and if so figure out what they're doing and (2) set log_checkpoints=on and log_autovacuum_min_duration to the smallest value that isn't going to fill up your logs with too much garbage. I can tell you from long experience with this sort of situation that users do not love it. It means that they often cannot troubleshoot problems of this type that happened in the past, because there's simply no information in the log file that is of any use, and they have to wait for the problem to recur... and I don't think it is overstating anything to say that some of them probably DO wait with bated breath, because they'd like whatever the issue is to get fixed! I think shipping with log_checkpoints=on and log_autovacuum_min_duration=10m or so would be one of the best things we could possibly do to allow ex-post-facto troubleshooting of system-wide performance issues. The idea that users care more about the inconvenience of a handful of extra log messages than they do about being able to find problems when they have them matches no part of my experience. I suspect that many users would be willing to incur *way more* useless log messages than those settings would ever generate if it meant that they could actually find problems when and if they have them. And these messages would in fact be the most valuable thing in the log for a lot of users. What reasonable DBA cares more about the fact that the application attempted an insert that violated a foreign key constraint than they do about a checkpoint that took 20 minutes to fsync everything? The former is expected; if you thought that foreign key violations would never occur, you wouldn't need to incur the expense of having the system enforce them. The latter is unexpected and basically undiscoverable with default settings. -- Robert Haas EDB: http://www.enterprisedb.com