Hi John, Many thanks for all the great feedback!
> Okay, the changes look good. To go further, I think we need to combine into > two patches, one with 0001-0003 and one with 0004: > > 1. Correct false statements about "shutdown" etc. This should contain changes > that can safely be patched all the way to v11. > 2. Change bad advice (single-user mode) into good advice. We can target head > first, and then try to adopt as far back as we safely can (and test). Done. > > > In swapping this topic back in my head, I also saw [2] where Robert > > > suggested > > > > > > "that old prepared transactions and stale replication > > > slots should be emphasized more prominently. Maybe something like: > > > > > > HINT: Commit or roll back old prepared transactions, drop stale > > > replication slots, or kill long-running sessions. > > > Ensure that autovacuum is progressing, or run a manual database-wide > > > VACUUM." > > > > It looks like the hint regarding replication slots was added at some > > point. Currently we have: > > > > ``` > > errhint( [...] > > "You might also need to commit or roll back old prepared > > transactions, or drop stale replication slots."))); > > ``` > > Yes, the exact same text as it appeared in the [2] thread above, which > prompted Robert's comment I quoted. I take the point to mean: All of these > things need to be taken care of *first*, before vacuuming, so the hint should > order things so that it is clear. > > > Please let me know if you think > > we should also add a suggestion to kill long-running sessions, etc. > > +1 for also adding that. OK, done. I included this change as a separate patch. It can be squashed with another one if necessary. While on it, I noticed that multixact.c still talks about a "shutdown". I made corresponding changes in 0001. > - errmsg("database is not accepting commands to avoid wraparound data loss in > database \"%s\"", > + errmsg("database is not accepting commands that generate new XIDs to avoid > wraparound data loss in database \"%s\"", > > I'm not quite on board with the new message, but welcome additional opinions. > For one, it's a bit longer and now ambiguous. I also bet that "generate XIDs" > doesn't really communicate anything useful. The people who understand > exactly what that means, and what the consequences are, are unlikely to let > the system get near wraparound in the first place, and might even know enough > to ignore the hint. > > I'm thinking we might need to convey something about "writes". While it's > less technically correct, I imagine it's more useful. Remember, many users > have it drilled in their heads that they need to drop immediately to > single-user mode. I'd like to give some idea of what they can and cannot do. This particular wording was chosen for consistency with multixact.c: ``` errmsg("database is not accepting commands that generate new MultiXactIds to avoid wraparound data loss in database \"%s\"", ``` The idea of using "writes" is sort of OK, but note that the same message will appear for a query like: ``` SELECT pg_current_xact_id(); ``` ... which doesn't do writes. The message will be misleading in this case. On top of that, although a PostgreSQL user may not be aware of MultiXactIds, arguably there are many users that are aware of XIDs. Not to mention the fact that XIDs are well documented. I didn't make this change in v4 since it seems to be controversial and probably not the highest priority at the moment. I suggest we discuss it separately. > I propose for discussion that 0004 should show in the docs all the queries > for finding prepared xacts, repl slots etc. If we ever show the info at > runtime, we can dispense with the queries, but there seems to be no urgency > for that... Good idea. > + Previously it was required to stop the postmaster and VACUUM the > database > + in a single-user mode. There is no need to use a single-user mode > anymore. > > I think we need to go further and actively warn against it: It's slow, > impossible to monitor, disables replication and disables safeguards against > wraparound. (Other bad things too, but these are easily understandable for > users) > > Maybe mention also that it's main use in wraparound situations is for a way > to perform DROPs and TRUNCATEs if that would help speed up resolution. Fixed. -- Best regards, Aleksander Alekseev
v4-0001-Correct-the-docs-and-messages-about-preventing-XI.patch
Description: Binary data
v4-0002-Don-t-recommend-running-VACUUM-in-a-single-user-m.patch
Description: Binary data
v4-0003-Modify-the-hints-about-preventing-XID-wraparound.patch
Description: Binary data