On Sun, Nov 17, 2019 at 12:56:52PM +0100, Dmitry Dolgov wrote:
> > On Wed, Jul 31, 2019 at 11:51:17PM -0700, Noah Misch wrote:
> > > Also, if I understand the data-loss hazard properly, it's what you
> > > said in the other thread: the latest_page_number could advance after
> > > we make our decision about what to truncate, and then maybe we could
> > > truncate newly-added data. We surely don't want to lock out the
> > > operations that can advance last_page_number, so how does serializing
> > > vac_truncate_clog help?
> > >
> > > I wonder whether what we need to do is add some state in shared
> > > memory saying "it is only safe to create pages before here", and
> > > make SimpleLruZeroPage or its callers check that. The truncation
> > > logic would advance that value, but only after completing a scan.
> > > (Oh ..., hmm, maybe the point of serializing truncations is to
> > > ensure that we know that we can safely advance that?)
> >
> > vac_truncate_clog() already records "it is only safe to create pages before
> > here" in ShmemVariableCache->xidWrapLimit, which it updates after any
> > unlinks. The trouble comes when two vac_truncate_clog() run in parallel
> > and you get a sequence of events like this:
> >
> > vac_truncate_clog() instance 1 starts, considers segment ABCD eligible to unlink
> > vac_truncate_clog() instance 2 starts, considers segment ABCD eligible to unlink
> > vac_truncate_clog() instance 1 unlinks segment ABCD
> > vac_truncate_clog() instance 1 calls SetTransactionIdLimit()
> > vac_truncate_clog() instance 1 finishes
> > some backend calls SimpleLruZeroPage(), creating segment ABCD
> > vac_truncate_clog() instance 2 unlinks segment ABCD
> >
> > Serializing vac_truncate_clog() fixes that.
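
To spell out what "serializing" buys: only the truncation path would take a
lock, held from before we decide which segments are unlinkable until after
the unlinks and the limit update are done. Backends that extend the SLRU
through SimpleLruZeroPage() never take it, so the operations that advance
latest_page_number are not locked out. A minimal sketch of that shape (the
lock and function names below are placeholders for illustration, not from an
actual patch):

#include "postgres.h"

#include "storage/lwlock.h"

/*
 * Placeholder lock, assumed to be allocated in shared memory at startup;
 * the name is invented for this sketch.
 */
static LWLock *SlruTruncationLock = NULL;

/*
 * Sketch of "serialize vac_truncate_clog()": hold one exclusive lock from
 * before the eligibility decision until after the unlinks and the
 * SetTransactionIdLimit() call.  Instance 2 in the sequence quoted above
 * then cannot unlink a segment based on a cutoff it computed before
 * instance 1 finished and SimpleLruZeroPage() recreated that segment.
 */
static void
vac_truncate_clog_sketch(void)
{
    LWLockAcquire(SlruTruncationLock, LW_EXCLUSIVE);

    /*
     * ... existing vac_truncate_clog() body: scan pg_database for the
     * minimum datfrozenxid/datminmxid, then TruncateCLOG(),
     * TruncateCommitTs(), TruncateMultiXact(), and finally
     * SetTransactionIdLimit() / SetMultiXactIdLimit() ...
     */

    LWLockRelease(SlruTruncationLock);
}

Any lock with a suitable lifetime would do; the essential property is just
that one truncator's eligibility decision and unlinks appear atomic to every
other truncator.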

> I'm probably missing something, so just wanted to clarify. Do I
> understand correctly, that thread [1] and this one are independent, and
> it is assumed in the scenario above that we're at the end of XID space,
> but not necessarily having rounding issues? I'm a bit confused, since
> the reproduce script does something about cutoffPage, and I'm not sure
> if it's important or not.

I think the repro recipe contained an early fix for the other thread's bug.
While they're independent in principle, I likely never reproduced this bug
without having a fix in place for the other thread's bug. The bug in the
other thread was easier to hit.

> > > Can you post whatever script you used to detect/reproduce the problem
> > > in the first place?
> >
> > I've attached it, but it's pretty hackish. Apply the patch on commit
> > 7170268, make sure your --bindir is in $PATH, copy "conf-test-pg" to
> > your home directory, and run "make trunc-check". This incorporates
> > xid-burn acceleration code that Jeff Janes shared with -hackers at
> > some point.

> I've tried to use it to understand the problem better, but somehow
> couldn't reproduce via suggested script. I've applied it to 7170268
> (tried also apply rebased to the master with the same results) with the
> conf-test-pg in place, but after going through all steps there are no
> errors like:

That is unfortunate.

> Is there anything extra one needs to do to reproduce the problem, maybe
> adjust delays or something?

It wouldn't surprise me. I did all my testing on one or two systems; the
hard-coded delays suffice there, but I'm sure there exist systems needing
different delays. Though I did reproduce this bug, I'm motivated by the
abstract problem more than any particular way to reproduce it. Commit 996d273
inspired me; by removing a GetCurrentTransactionId(), it allowed the global
xmin to advance at times it previously could not. That subtly changed the
concurrency possibilities. I think safe, parallel SimpleLruTruncate() is
difficult to maintain and helps too rarely to justify such maintenance.
That's why I propose eliminating the concurrency.

> [1]:
> https://www.postgresql.org/message-id/flat/20190202083822.GC32531%40gust.leadboat.com