On Sat, 15 Aug 2020 at 12:19, Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Fri, Aug 14, 2020 at 4:06 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > > > On Mon, Aug 10, 2020 at 10:24 AM Masahiko Sawada > > <masahiko.saw...@2ndquadrant.com> wrote: > > > > > > It's true that heap_page_is_all_visible() is called from only > > > lazy_vacuum_page() but I'm concerned it would lead misleading since > > > it's not actually removing tuples but just checking after vacuum. I > > > guess that the errcontext should show what the process is actually > > > doing and therefore help the investigation, so I thought VACUUM_HEAP > > > might not be appropriate for this case. But I see also your point. > > > Other vacuum error context phases match with vacuum progress > > > information phrases. So in heap_page_is_all_visible (), I agree it's > > > better to update the offset number and keep the phase VACUUM_HEAP > > > rather than do nothing. > > > > > > > Okay, I have changed accordingly and this means that the offset will > > be displayed for the vacuum phase as well. Apart from this, I have > > fixed all the comments raised by me in the attached patch. One thing > > we need to think is do we want to set offset during heap_page_prune > > when called from lazy_scan_heap? I think the code path for > > heap_prune_chain is quite deep, so an error can occur in that path. > > What do you think? > > > > The reason why I have not included heap_page_prune related change in > the patch is that I don't want to sprinkle this in every possible > function (code path) called via vacuum especially if the probability > of an error in that code path is low. But, I am fine if you and or > others think that it is a good idea to update offset in > heap_page_prune as well.
I agree to not try sprinkling it many places than necessity. Regarding heap_page_prune(), I'm concerned a bit that heap_page_prune() is typically the first function to check the tuple visibility within the vacuum code. I've sometimes observed an error with the message like "DETAIL: could not open file “pg_xact/00AB”: No such file or directory" due to a tuple header corruption. I suspect this message was emitted while checking tuple visibility in heap_page_prune(). So I guess the likelihood of an error in that code is not so low. On the other hand, if we want to update the offset number in heap_page_prune() we will need to expose some vacuum structs defined as a static including LVRelStats. But I don't think it's a good idea. The second idea I came up with is that we set another errcontext for heap_page_prune(). Since heap_page_prune() could be called also by a regular page scanning it would work fine for both cases, although there will be extra overheads for both. What do you think? Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services