Thanks Justin. On Sat, 1 Aug 2020 at 11:47, Justin Pryzby <pry...@telsasoft.com> wrote: > > On Fri, Jul 31, 2020 at 04:55:14PM -0500, Justin Pryzby wrote: > > Bcc: > > Subject: Re: display offset along with block number in vacuum errors > > Reply-To: > > In-Reply-To: > > <cakytnapljjaarw0uebby6g1o0lrzks7ra5n46bfh+nfwsoy...@mail.gmail.com> > > whoops > > > On Wed, Jul 29, 2020 at 12:35:17AM +0530, Mahendra Singh Thalor wrote: > > > > Here: > > > > > > > > @@ -1924,14 +1932,22 @@ lazy_vacuum_page(Relation onerel, BlockNumber > > > > blkno, Buffer buffer, > > > > BlockNumber tblk; > > > > OffsetNumber toff; > > > > ItemId itemid; > > > > + LVSavedErrInfo loc_saved_err_info; > > > > > > > > tblk = > > > > ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]); > > > > if (tblk != blkno) > > > > break; /* past end of > > > > tuples for this block */ > > > > toff = > > > > ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]); > > > > + > > > > + /* Update error traceback information */ > > > > + update_vacuum_error_info(vacrelstats, > > > > &loc_saved_err_info, VACUUM_ERRCB_PHASE_VACUUM_HEAP, > > > > + blkno, > > > > toff); > > > > itemid = PageGetItemId(page, toff); > > > > ItemIdSetUnused(itemid); > > > > unused[uncnt++] = toff; > > > > + > > > > + /* Revert to the previous phase information for error > > > > traceback */ > > > > + restore_vacuum_error_info(vacrelstats, > > > > &loc_saved_err_info); > > > > } > > > > > > > > I'm not sure why you use restore_vacuum_error_info() at all. It's > > > > already > > > > called at the end of lazy_vacuum_page() (and others) to allow functions > > > > to > > > > clean up after their own state changes, rather than requiring callers > > > > to do it. > > > > I don't think you should use it in a loop, nor introduce another > > > > LVSavedErrInfo. > > > > > > > > Since phase and blkno are already set, I think you should just set > > > > vacrelstats->offnum = toff, rather than calling > > > > update_vacuum_error_info(). > > > > Similar to whats done in lazy_vacuum_heap(): > > > > > > > > tblk = > > > > ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]); > > > > vacrelstats->blkno = tblk; > > > > > > Fixed. > > > > I rearead this thread and I think the earlier suggestion from Masahiko was > > right. The loop around dead_tuples only does ItemIdSetUnused() which > > updates > > the page, which has already been read from disk. On my suggestion, your v2 > > patch sets offnum directly, but now I think it's not useful to set at all, > > since the whole page is manipulated by PageRepairFragmentation() and > > log_heap_clean(). An error there would misleadingly say "..at offset number > > MM", but would always show the page's last offset, and not the offset where > > an > > error occured. > > This makes me question whether offset numbers are ever useful during > VACUUM_HEAP, since the real work is done a page at a time (not tuple) or by > internal functions that don't update vacrelstats->offno. Note that my initial > problem report that led to the errcontext implementation was an ERROR in heap > *scan* (not vacuum). So an offset number at that point would've been > sufficient. > https://www.postgresql.org/message-id/20190808012436.gg11...@telsasoft.com > > I mentioned that lazy_check_needs_freeze() should save and restore the > errinfo, > so an error in heap_page_prune() (for example) doesn't get the wrong offset > associated with it. > > So see the attached changes on top of your v2 patch.
Actually I was waiting for review comments from committer and other people also and was planning to send a patch after that. I already fixed your comments in my offline patch and was waiting for more comments. Anyway, thanks for delta patch. Here, attaching v3 patch for review. > > postgres=# DROP TABLE tt; CREATE TABLE tt(a int) WITH (fillfactor=90); INSERT > INTO tt SELECT generate_series(1,99999); VACUUM tt; UPDATE tt SET a=1 WHERE > ctid='(345,10)'; UPDATE pg_class SET relfrozenxid=(relfrozenxid::text::int + > (1<<30))::int::text::xid WHERE oid='tt'::regclass; VACUUM tt; > ERROR: found xmin 1961 from before relfrozenxid 1073743785 > CONTEXT: while scanning block 345 of relation "public.tt", item offset 205 > > Hmm.. is it confusing that the block number is 0-indexed but the offset is > 1-indexed ? > > -- > Justin -- Thanks and Regards Mahendra Singh Thalor EnterpriseDB: http://www.enterprisedb.c
v03_0001-Added-offset-with-block-number-in-vacuum-errcontext.patch
Description: Binary data