On Mon, Apr 18, 2022 at 11:19 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Sat, Apr 16, 2022 at 10:42 PM Tom Lane <t...@sss.pgh.pa.us> wrote: > > > > My pet dinosaur prairiedog just failed in the contrib/test_decoding > > tests [1]: > > > > diff -U3 > /Users/buildfarm/bf-data/HEAD/pgsql.build/contrib/test_decoding/expected/stream.out > /Users/buildfarm/bf-data/HEAD/pgsql.build/contrib/test_decoding/results/stream.out > > --- > /Users/buildfarm/bf-data/HEAD/pgsql.build/contrib/test_decoding/expected/stream.out > 2022-04-15 07:59:17.000000000 -0400 > > +++ > /Users/buildfarm/bf-data/HEAD/pgsql.build/contrib/test_decoding/results/stream.out > 2022-04-15 09:06:36.000000000 -0400 > > @@ -77,10 +77,12 @@ > > streaming change for transaction > > streaming change for transaction > > streaming change for transaction > > + closing a streamed block for transaction > > + opening a streamed block for transaction > > streaming change for transaction > > closing a streamed block for transaction > > committing streamed transaction > > -(13 rows) > > +(15 rows) > > > > Looking at the postmaster log, it's obvious where this extra transaction > > came from: auto-analyze ran on pg_type concurrently with the test step > > just before this one. That could only happen if the tests ran long > enough > > for autovacuum_naptime to elapse, but prairiedog is a pretty slow > machine. > > (And I hasten to point out that some other animals, such as those running > > valgrind or CLOBBER_CACHE_ALWAYS, are even slower.) > > > > We've seen this sort of problem before [2], and attempted to fix it [3] > > by making these tests ignore empty transactions. But of course > > auto-analyze's transaction wasn't empty, so that didn't help. > > > > The possible reason here is that this extra (auto-analyze) transaction > causes the logical decoding work mem to reach before the last change > of the test's transaction. As can be seen from the logs, it just > closed the stream before the last change and then opened a new stream > for the last change. Now, it is true that the auto-analyze changes > won't be decoded as they don't perform DML operation on any > non-catalog table but it could generate some invalidation message > which needs to be processed even though we won't send anything related > to it to the downstream. > This analysis seems right to me. > This needs to be verified once by doing some manual testing as it may > not be easily reproducible every time. If this happens to be true then > I think your suggestion related to increasing autovacuum_naptime would > work. > I will try to reproduce this, maybe by reducing the autovacuum_naptime or parallelly running some script that continuously performs DDL-only transactions. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com