On Fri, 7 Aug 2020 at 13:04, Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Fri, Aug 7, 2020 at 8:10 AM Masahiko Sawada > <masahiko.saw...@2ndquadrant.com> wrote: > > > > On Fri, 7 Aug 2020 at 10:49, Amit Kapila <amit.kapil...@gmail.com> wrote: > > > > > > On Thu, Aug 6, 2020 at 7:51 PM Justin Pryzby <pry...@telsasoft.com> wrote: > > > > > > > > On Thu, Aug 06, 2020 at 07:39:21PM +0530, Amit Kapila wrote: > > > > > On Wed, Jul 29, 2020 at 1:09 AM Justin Pryzby <pry...@telsasoft.com> > > > > > wrote: > > > > > > > > > > > > > > > > > > lazy_check_needs_freeze iterates over blocks and this patch changes > > > > > > it to > > > > > > update vacrelstats. I think it should do what > > > > > > lazy_{vacuum/cleanup}_heap/page/index do and call > > > > > > update_vacuum_error_info() at > > > > > > its beginning (even though only the offset is changed), and then > > > > > > restore_vacuum_error_info() at its end (to "revert back" to the > > > > > > item number it > > > > > > started with). > > > > > > > > > > > > > > > > I see that Mahendra has changed patch as per this suggestion but I am > > > > > not convinced that it is a good idea to sprinkle > > > > > update_vacuum_error_info()/restore_vacuum_error_info() at places more > > > > > than required. I see that it might look a bit clean from the > > > > > perspective that if tomorrow we use the function > > > > > lazy_check_needs_freeze() for a different purpose then we don't need > > > > > to worry about the wrong phase information. If we are worried about > > > > > that then we should have an assert in that function to ensure that the > > > > > current phase is VACUUM_ERRCB_PHASE_SCAN_HEAP. > > > > > > > > The motivation was to restore the offnum, which is set to Invalid at > > > > the start > > > > of lazy_scan_heap(), and then set valid within lazy_check_needs_freeze, > > > > but > > > > should be restored or re-set to Invalid when returns to > > > > lazy_scan_heap(). If > > > > you think it's important, we could just set vacrelstats->offnum = > > > > Invalid > > > > before returning, > > > > > > > > > > Yeah, I would prefer that and probably a comment to indicate why we > > > are doing that. > > > > +1 > > > > I'm concerned that we call the update and restore in > > heap_page_is_all_visible(). Unlike lazy_check_needs_freeze(), this > > function is called for every vacuumed page. I commented that if we > > want to update the offset number during iterating tuples in the > > function we should change the phase to SCAN_HEAP at the beginning of > > the function because it's actually not vacuuming. > > > > AFAICS, heap_page_is_all_visible() is called from only one place and > that is lazy_vacuum_page(), so I think if there is any error in > heap_page_is_all_visible(), it should be considered as VACUUM_HEAP > phase error.
It's true that heap_page_is_all_visible() is called from only lazy_vacuum_page() but I'm concerned it would lead misleading since it's not actually removing tuples but just checking after vacuum. I guess that the errcontext should show what the process is actually doing and therefore help the investigation, so I thought VACUUM_HEAP might not be appropriate for this case. But I see also your point. Other vacuum error context phases match with vacuum progress information phrases. So in heap_page_is_all_visible (), I agree it's better to update the offset number and keep the phase VACUUM_HEAP rather than do nothing. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services