On Tue, Mar 8, 2022 at 11:59 PM Tomas Vondra <tomas.von...@enterprisedb.com> wrote:
>
> On 3/7/22 22:25, Tomas Vondra wrote:
> >>
> >> Interesting. I can think of one reason that might cause this - we log
> >> the first sequence increment after a checkpoint. So if a checkpoint
> >> happens in an unfortunate place, there'll be an extra WAL record. On
> >> slow / busy machines that's quite possible, I guess.
> >>
> >
> > I've tweaked the checkpoint_interval to make checkpoints more aggressive
> > (set it to 1s), and it seems my hunch was correct - it produces failures
> > exactly like this one. The best fix probably is to just disable decoding
> > of sequences in those tests that are not aimed at testing sequence decoding.
> >
>
> I've pushed a fix for this, adding "include-sequences=0" to a couple
> test_decoding tests, which were failing with concurrent checkpoints.
>
> Unfortunately, I realized we have a similar issue in the "sequences"
> tests too :-( Imagine you do a series of sequence increments, e.g.
>
> SELECT nextval('s') FROM generate_sequences(1,100);
>
> If there's a concurrent checkpoint, this may add an extra WAL record,
> affecting the decoded output (and also the data stored in the sequence
> relation itself). Not sure what to do about this ...
>
I am not sure what to do about it either, but if we can increase
checkpoint_timeout (or other checkpoint-related parameters) for these
tests, that would reduce the chances of such failures. Another idea is
to perform a checkpoint right before the tests start, which reduces the
chance of another checkpoint happening while they run.

--
With Regards,
Amit Kapila.
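
For illustration, the second idea might look roughly like this at the top
of a test file. This is only a sketch: the sequence name s comes from the
quoted example, generate_series() is assumed in place of the
generate_sequences() written there, and an explicit checkpoint only
narrows the window for a concurrent one rather than closing it.

```sql
-- Sketch only, not a tested patch.
CREATE SEQUENCE s;

-- Force a checkpoint now, so that the next timed checkpoint is
-- checkpoint_timeout away instead of possibly landing mid-test.
CHECKPOINT;

-- The first nextval() after the checkpoint is WAL-logged as usual;
-- the aim is that no further checkpoint interrupts the series.
SELECT nextval('s') FROM generate_series(1,100);
```

The first idea would instead be a setting such as checkpoint_timeout = 1h
in the configuration used for the test run (the value is an arbitrary
example). Checkpoints can also be triggered by WAL volume via
max_wal_size, so neither approach fully rules out a concurrent checkpoint.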