On Sun, Feb 21, 2021 at 10:28 PM Masahiko Sawada <sawada.m...@gmail.com> wrote: > Sorry for the late response.
Me too! > 1. Whereas skipping index vacuum and heap vacuum is a very attractive > improvement, if we skip that by default I wonder if we need a way to > disable it. Vacuum plays a role in cleaning and diagnosing tables in > practice. So in a case where the table is bad state and the user wants > to clean all heap pages, it would be good to have a way to disable > this skipping behavior. One solution would be that index_cleanup > option has three different behaviors: on, auto (or smart), and off. We > enable this skipping behavior by default in ‘auto’ mode, but > specifying "INDEX_CLEANUP true” means to enforce index vacuum and > therefore disabling it. Sounds reasonable to me. Maybe users should express the skipping behavior that they desire in terms of the *proportion* of all heap blocks with LP_DEAD line pointers that we're willing to have while still skipping index vacuuming + lazy_vacuum_heap() heap scan. In other words, it can be a "scale" type GUC/param (though based on heap blocks *at the end* of the first heap scan, not tuples at the point the av launcher considers launching AV workers). > @@ -1299,6 +1303,7 @@ lazy_scan_heap(Relation onerel, VacuumParams > *params, LVRelStats *vacrelstats, > { > lazy_record_dead_tuple(dead_tuples, &(tuple.t_self)); > all_visible = false; > + has_dead_tuples = true; > continue; > } > > I added the above change in the patch to count the number of heap > pages having at least one LP_DEAD line pointer. But it's weird to me > that we have never set has_dead_tuples true when we found an LP_DEAD > line pointer. I think that you're right. However, in practice it isn't harmful because has_dead_tuples is only used when "all_visible = true", and only to detect corruption (which should never happen). I think that it should be fixed as part of this work, though. Lots of stuff in this area is kind of weird already. Sometimes this is directly exposed to users, even. This came up recently, when I was working on VACUUM VERBOSE stuff. (This touched the precise piece of code you've patched in the quoted diff snippet, so perhaps you know some of the story I will tell you now already.) I recently noticed that VACUUM VERBOSE can report a very low tups_vacuumed/"removable heap tuples" when run against tables where most pruning is opportunistic pruning rather than VACUUM pruning (which is very common), provided there are no HOT updates (which is common but not very common). This can be very confusing, because VACUUM VERBOSE will report a "tuples_deleted" for the heap relation that is far far less than the "tuples_removed" it reports for indexes on the same table -- even though both fields have values that are technically accurate (e.g., not very affected by concurrent activity during VACUUM, nothing like that). This came to my attention when I was running BenchmarkSQL for the 64-bit XID deleted pages patch. One of the BenchmarkSQL tables (though only one -- the table whose UPDATEs are not HOT safe, which is unique among BenchmarkSQL/TPC-C tables). I pushed a commit with comment changes [1] to make that aspect of VACUUM VERBOSE a little less confusing. (I was actually running a quick-and-dirty hack that made log_autovacuum show VACUUM VERBOSE index stuff -- I would probably have missed the weird difference between heap tups_vacuumed and index tuples_removed without this custom log_autovacuum hack.) Just to be clear (I think you agree already): we should base any triggering logic for skipping index vacuuming/lazy_vacuum_heap() on logic that does not care *when* heap pages first contained LP_DEAD line pointers (could be that they were counted in tups_vacuumed due to being pruned during this VACUUM operation, could be from an earlier opportunistic pruning, etc). > Currently, we set it to false true in 'tupgone' case but > it seems to me that we should do that in this case as well since we > use this flag in the following check: > > else if (PageIsAllVisible(page) && has_dead_tuples) > { > elog(WARNING, "page containing dead tuples is marked as > all-visible in relation \"%s\" page %u", > vacrelstats->relname, blkno); > PageClearAllVisible(page); > MarkBufferDirty(buf); > visibilitymap_clear(onerel, blkno, vmbuffer, > VISIBILITYMAP_VALID_BITS); > } The "tupgone = true"/HEAPTUPLE_DEAD-race case is *extremely* weird. It has zero test coverage according to coverage.postgresql.org [2], despite being very complicated. 3 points on the "tupgone = true" weirdness (I'm writing this as a record for myself, almost): 1. It is the reason why lazy_vacuum_heap() must be prepared to set tuples LP_UNUSED that are not already LP_DEAD. So when lazy_vacuum_page() says "the first dead tuple for this page", that doesn't necessarily mean LP_DEAD items! (Though the other cases are not even tested, I think -- the lack of "tupgone = true" test coverage also means we don't cover corresponding lazy_vacuum_page() cases.) 2. This is also why we need XLOG_HEAP2_CLEANUP_INFO records (i.e. why XLOG_HEAP2_CLEAN records are not sufficient to all required recovery conflicts during VACUUM). 3. And it's also why log_heap_clean() is needed for both lazy_scan_heap()'s pruning and lazy_vacuum_heap() unused-marking. Many years ago, Noah Misch tried to clean this up -- that included renaming lazy_vacuum_heap() to lazy_heap_clear_dead_items(), which would only deal with LP_DEAD items: https://www.postgresql.org/message-id/flat/20130108024957.GA4751%40tornado.leadboat.com Of course, this effort to eliminate the "tupgone = true"/XLOG_HEAP2_CLEANUP_INFO special case didn't go anywhere at the time. [1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=7cde6b13a9b630e2f04d91e2f17dedc2afee21c6 [2] https://coverage.postgresql.org/src/backend/access/heap/vacuumlazy.c.gcov.html -- Peter Geoghegan