On Tue, 28 Jan 2020 at 16:01, Michael Paquier <mich...@paquier.xyz> wrote: > > On Tue, Jan 21, 2020 at 02:07:30PM +0900, Michael Paquier wrote: > > On Mon, Jan 20, 2020 at 11:00:14AM -0800, Andres Freund wrote: > >> Hm. I'm think testing this with real LSNs is a better idea. What if the > >> node actually already is past FF/FFFFFFFF at this point? Quite unlikely, > >> I know, but still. I.e. why not get the current LSN after the INSERT, > >> and assert that the slot, after the restart, is that? > > > > Sure. If not disabling autovacuum in the test, we'd just need to make > > sure if that advancing is at least ahead of the INSERT position. > > Actually, as the advancing happens only up to this position we just > need to make sure that the LSN reported by the slot is the same as the > position advanced to. I have switched the test to just do that > instead of using a fake LSN. > > > Anyway, I am still not sure if we should got down the road to just > > mark the slot as dirty if any advancing is done and let the follow-up > > checkpoint to the work, if the advancing function should do the slot > > flush, or if we choose one and make the other an optional choice (not > > for back-branches, obviously. Based on my reading of the code, my > > guess is that a flush should happen at the end of the advancing > > function. > > I have been chewing on this point for a couple of days, and as we may > actually crash between the moment the slot is marked as dirty and the > moment the slot information is made consistent, we still have a risk > to have the slot go backwards even if the slot information is saved. > The window is much narrower, but well, the docs of logical decoding > mention that this risk exists. And the patch becomes much more > simple without changing the actual behavior present since the feature > has been introduced for logical slots. There could be a point in > having a new option to flush the slot information, or actually a > separate function to flush the slot information, but let's keep that > for a future possibility. > > So attached is an updated patch which addresses the problem just by > marking a physical slot as dirty if any advancing is done. Some > documentation is added about the fact that an advance is persistent > only at the follow-up checkpoint. And the tests are fixed to not use > a fake LSN but instead advance to the latest LSN position produced. > > Any objections?
LGTM. Thankyou. -- Craig Ringer http://www.2ndQuadrant.com/ 2ndQuadrant - PostgreSQL Solutions for the Enterprise