On Thu, Dec 9, 2021 at 8:56 PM Andres Freund <and...@anarazel.de> wrote:
> I think we should move *away* from single user mode, rather than the
> opposite. It's a substantial code burden and it's hard to use.

Yes. This thread seems to be largely devoted to the topic of making single-user vacuum work better, but I don't see anyone asking the question "why do we have a message that tells people to vacuum in single user mode in the first place?". It's basically bad advice, with one small exception that I'll talk about in a minute.

Suppose we had a message in the tree that said "HINT: Consider angering a live anaconda to fix this problem." If that were so, the correct thing to do wouldn't be to add a section to our documentation explaining how to deal with angry anacondas. The correct thing to do would be to remove the hint as bad advice that we never should have offered in the first place. And so here. We should not try to make vacuum in single-user mode work better or differently, or at least that shouldn't be our primary objective. We should just stop telling people to do it. We should probably add messages and documentation *discouraging* the use of single user mode for recovering from wraparound trouble, exactly the opposite of what we do now.

There's nothing we can do in single-user mode that we can't do equally well in multi-user mode. If people try to fix wraparound problems in multi-user mode, they still have read-only access to their database, they can use parallelism, they can use command line utilities like vacuumdb, and they can use psql, which has line editing, allows remote access, and is a way nicer user experience than running postgres --single. We need a really compelling reason to tell people to give up all those advantages, and there is no such reason. It makes just as much sense as telling people to deal with wraparound problems by angering a live anaconda.

I did say there was an exception, and it's this: the last time I studied this issue back in 2019,[1] vacuum insisted on trying to truncate tables even when the system was in wraparound danger.
Then it would fail, because truncating the table required allocating an XID, which would fail if we were short on XIDs. By putting the system in single user mode, you could continue to allocate XIDs and thus VACUUM would work. However, if you think about this for even 10 seconds, you can see that it's terrible. If we're so short of XIDs that we are scared to allocate them for fear of causing an actual wraparound, putting the system into a mode where that protection is bypassed is a super-terrible idea. People will be able to run vacuum, yes, but if they have too many tables, they will actually experience wraparound and thus data loss before they process all the tables they have.

What we ought to do to solve this problem is NOT TRUNCATE when the number of remaining XIDs is small, so that we don't consume any of the remaining XIDs until we get the system out of wraparound danger. I think the "failsafe" stuff Peter added in v14 fixes that, though. If not, we should adjust it so it does. And then we should KILL WITH FIRE the message telling people to use single user mode -- and once we do that, the question of what the behavior ought to be when someone does run VACUUM in single user mode becomes a lot less important.

This problem is basically self-inflicted. We have given people bad advice (use single user mode) and then they suffer when they take it. Ameliorating the suffering isn't the worst idea ever, but it's basically fixing the wrong problem.

--
Robert Haas
EDB: http://www.enterprisedb.com

[1] http://postgr.es/m/CA+Tgmob1QCMJrHwRBK8HZtGsr+6cJANRQw2mEgJ9e=d+z7c...@mail.gmail.com