Artur Zakirov <artur.zaki...@adjust.com> writes: > I attached the patch which fixes it in a different way. It calls > SignalBackends() in AtCommit_Notify(). It is possible to call SignalBackends() > outside of ProcessCompletedNotifies() after the commit > 51004c7172b5c35afac050f4d5849839a06e8d3b, which removes necessity of checking > of SignalBackends()'s result.
Hm. So that forecloses back-patching this to earlier than v13. On the other hand, given that we've been ignoring the bug for awhile, maybe a fix that only works in v13+ is good enough. (Or maybe by now it'd be safe to back-patch the v13-era async.c changes? Don't really want to, though.) The larger problem with this patch is exactly what Andres said: if a replication worker or other background process is sending notifies, but no normal backend is listening, then nothing will ever call asyncQueueAdvanceTail() so the message queue will bloat until it overflows and things start failing. That's not OK. Perhaps it could be fixed by moving the "if (backendTryAdvanceTail)" stanza into AtCommit_Notify. That's not ideal, because really it's best to not do that till after we've read our own notifies, but it might be close enough. At that point ProcessCompletedNotifies's only responsibility would be to call asyncQueueReadAllNotifications, which would allow some simplifications. There are some sort-of-cosmetic-but-important-for-future-proofing issues too: * I think that it's safe to move these actions to AtCommit_Notify, given where that is called in the CommitTransaction sequence, but there is nothing in xact.c suggesting that that call is in any way ordering-critical (because today, it isn't). I think we need some comments there to prevent somebody from breaking this in future. Maybe about like smgrDoPendingDeletes(true); + /* + * Send out notification signals to other backends (and do other + * post-commit NOTIFY cleanup). This must not happen until after + * our transaction is fully done from the viewpoint of other + * backends. + */ AtCommit_Notify(); + /* + * Everything after this should be purely-internal-to-this-backend + * cleanup. + */ AtEOXact_GUC(true, 1); * You need to spend more effort on the comments in async.c too. Some of these changes are wrong plus there are places that should be changed and weren't. Also, postgres.c's comment about ProcessCompletedNotifies is invalidated by this patch. * There is some verbiage about NOTIFY in bgworker.sgml, which looks like it may be wrong now, and it certainly will be wrong after this patch. We don't want to be encouraging bgworkers to call ProcessCompletedNotifies. Maybe we should rename it, to force the issue? regards, tom lane