On Wed, May 6, 2020 at 2:28 AM Masahiko Sawada
<masahiko.saw...@2ndquadrant.com> wrote:
> I've attached the patch fixes this issue.
>
> With this patch, we don't skip only index cleanup phase when
> performing an aggressive vacuum. The reason why I don't skip only
> index cleanup phase is that index vacuum phase can be called multiple
> times, which takes a very long time. Since the purpose of this index
> cleanup is to process recyclable pages it's enough to do only index
> cleanup phase.
That's only true in nbtree, though. The way that I imagined we'd want to
fix this is by putting control in each index access method. So, we'd
revise the way that amvacuumcleanup() works -- the amvacuumcleanup
routine for each index AM would sometimes be called in a mode that means
"I don't really want you to do any cleanup, but if you absolutely have to
for your own reasons then you can". This could be represented using
something similar to IndexVacuumInfo.analyze_only (there is a rough
sketch of what I mean at the end of this mail).

This approach has an obvious disadvantage: the patch really has to teach
*every* index AM to do something with that state (most will simply do no
work). It seems logical to have the index AM control what happens,
though. This allows the logic to live inside _bt_vacuum_needs_cleanup()
in the case of nbtree, so there is only one place where we make decisions
like this. Most other AMs don't have this problem. GiST has a similar
issue with recyclable pages, except that it doesn't use 32-bit XIDs, so
it doesn't need to care about this stuff at all.

Besides, it seems like it might be a good idea to do other basic
maintenance of the index when we're "skipping" index cleanup. We probably
should always do things that require only a small, fixed amount of work,
such as maintaining metadata in the metapage.

There may be practical reasons why this approach isn't suitable for
backpatching, even if it is a superior approach.

What do you think? Also, what do you think about this, Robert and Andres?

--
Peter Geoghegan
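P.S. To make that concrete, here is a rough, untested sketch of the sort
of change I have in mind. The new IndexVacuumInfo field name is just a
placeholder, and the existing fields and nbtree code are abbreviated; the
point is only that btvacuumcleanup() gets to consult
_bt_vacuum_needs_cleanup() before honoring the caller's request to skip
cleanup.

/* src/include/access/genam.h */
typedef struct IndexVacuumInfo
{
    Relation    index;          /* the index being vacuumed */
    bool        analyze_only;   /* ANALYZE (without any actual vacuum) */
    bool        cleanup_not_requested; /* NEW (placeholder name): caller
                                        * doesn't need cleanup, but the AM
                                        * may still do some for its own
                                        * reasons */
    /* ... existing fields unchanged ... */
} IndexVacuumInfo;

/* src/backend/access/nbtree/nbtree.c */
IndexBulkDeleteResult *
btvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
{
    /* No-op in ANALYZE ONLY mode, as before */
    if (info->analyze_only)
        return stats;

    /*
     * If the caller would prefer that we do nothing, only proceed when
     * nbtree's own logic says that cleanup is required anyway (e.g. there
     * are deleted pages that should be recycled before their 32-bit XIDs
     * wrap around).
     */
    if (info->cleanup_not_requested && !_bt_vacuum_needs_cleanup(info))
        return stats;

    /* ... existing cleanup work (btvacuumscan() etc.) ... */
    return stats;
}

An AM with no equivalent problem would simply test the flag and return
right away, which is the "most will simply do no work" case.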