Hi David,

Thanks for taking a look!

On Tue, Aug 29, 2023 at 5:07 AM David Geier <geidav...@gmail.com> wrote:
>
> Hi Melanie,
>
> On 8/29/23 01:49, Melanie Plageman wrote:
> > While working on a set of patches to combine the freeze and visibility
> > map WAL records into the prune record, I wrote the attached patches
> > reusing the tuple visibility information collected in heap_page_prune()
> > back in lazy_scan_prune().
> >
> > heap_page_prune() collects the HTSV_Result for every tuple on a page
> > and saves it in an array used by heap_prune_chain(). If we make that
> > array available to lazy_scan_prune(), it can use it when collecting
> > stats for vacuum and determining whether or not to freeze tuples.
> > This avoids calling HeapTupleSatisfiesVacuum() again on every tuple in
> > the page.
> >
> > It also gets rid of the retry loop in lazy_scan_prune().
>
> How did you test this change?
I didn't add a new test, but you can confirm some existing test
coverage if you, for example, set every HTSV_Result in the array to
HEAPTUPLE_LIVE and run the regression tests. Tests relying on vacuum
removing the right tuples may fail. For example,

  select count(*) from gin_test_tbl where i @> array[1, 999];

in src/test/regress/sql/gin.sql fails for me, since it sees a tuple it
shouldn't.

> Could you measure any performance difference?
>
> If so could you provide your test case?

I created a large table, updated a tuple on every page in the
relation, and then vacuumed it. I saw a consistent slight improvement
in vacuum execution time. I profiled a bit with perf stat as well. The
difference is relatively small for this kind of example, but I can
work on a more compelling, realistic example. I think eliminating the
retry loop is also useful, as I have heard that users have had to
cancel vacuums which were in this retry loop for too long.

- Melanie