On Tue, Oct 13, 2020 at 9:25 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
>
> Amit Kapila <amit.kapil...@gmail.com> writes:
> > I have pushed this but it failed in one of the BF. See
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=florican&dt=2020-10-13%2003%3A07%3A25
> > The failure is shown below and I am analyzing it. See, if you can
> > provide any insights.
>
> It's not very clear what spill_count actually counts (and the
> documentation sure does nothing to clarify that), but if it has anything
> to do with WAL volume, the explanation might be that florican is 32-bit.
> All the animals that have passed that test so far are 64-bit.
>

It is based on the size of the change. In this case, it is the size of
the tuples inserted. See ReorderBufferChangeSize() to know how we compute
the size of each change. Once the total_size for changes crosses
logical_decoding_work_mem (64kB in this case), we will spill. So
'spill_count' is the number of times the size of changes in that
transaction crossed the threshold and led to a spill of the
corresponding changes.

> > The reason for this problem could be that there is some transaction
> > (say by autovacuum) which happened interleaved with this transaction
> > and committed before this one.
>
> I can believe that idea too, but would it not have resulted in a
> diff in spill_txns as well?
>

We count 'spill_txns' once for a transaction that is ever spilled. I
think 'spill_txns' wouldn't vary for this particular test even if the
autovacuum transaction happens before the main transaction of the test,
because in that case wait_for_decode_stats won't finish until it sees
the main transaction ('spill_txns' won't be positive by that time).

> In short, I'm not real convinced that a stable result is possible in this
> test. Maybe you should just test for spill_txns and spill_count being
> positive.
>

Yeah, that seems like the best we can do here.

-- 
With Regards,
Amit Kapila.
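
To illustrate the spill accounting described above, here is a toy model in Python. It is only a sketch, not PostgreSQL code: the function name simulate_spills, the per-change byte sizes, and the "reset the buffer to zero on every spill" behaviour are assumptions made for illustration. The real sizes come from ReorderBufferChangeSize() and can differ between platforms. The point is that a slightly different per-change size changes 'spill_count' while 'spill_txns' stays the same.

```python
def simulate_spills(change_sizes, work_mem_bytes=64 * 1024):
    """Toy model of one transaction's spill counters (hypothetical helper).

    Each change adds its size to the buffered total; when the total
    reaches the limit, the buffered changes are "spilled" and the
    in-memory total is reset to zero.
    """
    buffered = 0
    spill_count = 0   # number of times this transaction was spilled
    spill_txns = 0    # counted at most once per transaction
    for size in change_sizes:
        buffered += size
        if buffered >= work_mem_bytes:
            if spill_count == 0:
                spill_txns = 1
            spill_count += 1
            buffered = 0
    return spill_txns, spill_count


# Same number of changes, two made-up per-change sizes (in bytes).
narrow = simulate_spills([100] * 5000)
wide = simulate_spills([120] * 5000)
print("narrow:", narrow)
print("wide:  ", wide)
```

In this model both runs report the same 'spill_txns' but a different 'spill_count', which is why a test that only checks that both counters are positive is less sensitive to the exact change size.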