On Wed, Feb 10, 2021 at 4:12 PM Peter Geoghegan <p...@bowt.ie> wrote: > > On Tue, Feb 9, 2021 at 6:14 PM Masahiko Sawada <sawada.m...@gmail.com> wrote: > > Thanks. I think that's very good if we resolve this recycling stuff > > first then try the new approach to skip index vacuum in more cases. > > That way, even if the vacuum strategy stuff took a very long time to > > get committed over several major versions, we would not be affected by > > deleted nbtree page recycling problem (at least for built-in index > > AMs). Also, the approach of 6655a7299d8 itself is a good improvement > > and seems straightforward to me. > > I'm glad that you emphasized this issue, because I came up with a > solution that turns out to not be very invasive. At the same time it > has unexpected advantages, liking improving amcheck coverage for > deleted pages.
Sorry for the late response. I've attached the patch that adds a check whether or not to do index vacuum (and heap vacuum) if 1% of all heap pages have LP_DEAD line pointer. While developing this feature, I realized the following two things: 1. Whereas skipping index vacuum and heap vacuum is a very attractive improvement, if we skip that by default I wonder if we need a way to disable it. Vacuum plays a role in cleaning and diagnosing tables in practice. So in a case where the table is bad state and the user wants to clean all heap pages, it would be good to have a way to disable this skipping behavior. One solution would be that index_cleanup option has three different behaviors: on, auto (or smart), and off. We enable this skipping behavior by default in ‘auto’ mode, but specifying "INDEX_CLEANUP true” means to enforce index vacuum and therefore disabling it. --- 2. @@ -1299,6 +1303,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats, { lazy_record_dead_tuple(dead_tuples, &(tuple.t_self)); all_visible = false; + has_dead_tuples = true; continue; } I added the above change in the patch to count the number of heap pages having at least one LP_DEAD line pointer. But it's weird to me that we have never set has_dead_tuples true when we found an LP_DEAD line pointer. Currently, we set it to false true in 'tupgone' case but it seems to me that we should do that in this case as well since we use this flag in the following check: else if (PageIsAllVisible(page) && has_dead_tuples) { elog(WARNING, "page containing dead tuples is marked as all-visible in relation \"%s\" page %u", vacrelstats->relname, blkno); PageClearAllVisible(page); MarkBufferDirty(buf); visibilitymap_clear(onerel, blkno, vmbuffer, VISIBILITYMAP_VALID_BITS); } Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
skip_index_vacuum.patch
Description: Binary data