Re: HOT chain validation in verify_heapam()

Andres Freund Thu, 23 Mar 2023 10:26:22 -0700

Hi,

On 2023-03-23 11:41:52 -0400, Robert Haas wrote:
> On Wed, Mar 22, 2023 at 5:56 PM Andres Freund <and...@anarazel.de> wrote:
> > I also think it's not quite right that some of checks inside if
> > (ItemIdIsRedirected()) continue in case of corruption, others don't. While
> > there's a later continue, that means the corrupt tuples get added to the
> > predecessor array.  Similarly, in the non-redirect portion, the successor
> > array gets filled with corrupt tuples, which doesn't seem quite right to me.
> 
> I'm not entirely sure I agree with this. I mean, we should skip
> further checks of the data is so corrupt that further checks are not
> sensible e.g. if the line pointer is just gibberish we can't validate
> anything about the tuple, because we can't even find the tuple.
> However, we don't want to skip further checks as soon as we see any
> kind of a problem whatsoever. For instance, even if the tuple data is
> utter gibberish, that should not and does not keep us from checking
> whether the update chain looks sane. If the tuple header is garbage
> (e.g. xmin and xmax are outside the bounds of clog) then at least some
> of the update-chain checks are not possible, because we can't know the
> commit status of the tuples, but garbage in the tuple data isn't
> really a problem for update chain validation per se.


The cases I was complaining about where metadata that's important. We
shouldn't enter a tuple into lp_valid[] or successor[] if it failed validity
checks - the subsequent error reports that that generates won't be helpful -
or we'll crash.

E.g. continuing after:

                                rditem = PageGetItemId(ctx.page, rdoffnum);
                                if (!ItemIdIsUsed(rditem))
                                        report_corruption(&ctx,
                                                                          
psprintf("line pointer redirection to unused item at offset %u",
                                                                                
           (unsigned) rdoffnum));

means we'll look into the tuple in the "update chain validation" loop for
unused items. Where it afaict will lead to a crash or bogus results, because:
                                /* Can only redirect to a HOT tuple. */
                                next_htup = (HeapTupleHeader) 
PageGetItem(ctx.page, next_lp);
                                if (!HeapTupleHeaderIsHeapOnly(next_htup))
                                {
                                        report_corruption(&ctx,
                                                                          
psprintf("redirected line pointer points to a non-heap-only tuple at offset %u",
                                                                                
           (unsigned) nextoffnum));
                                }

will just dereference the unused item.



> -                 * Redirects are created by updates, so successor should be
> -                 * the result of an update.
> +                 * Redirects are created by HOT updates, so successor should
> +                 * be the result of an HOT update.
> +                 *
> +                 * XXX: HeapTupleHeaderIsHeapOnly() should always imply
> +                 * HEAP_UPDATED. This should be checked even when the tuple
> +                 * isn't a target of a redirect.
> 
> Hmm, OK. So the question is where to put this check. Maybe inside
> check_tuple_header(), making it independent of the update chain
> validation stuff?

Yes, check_tuple_header sounds sensible to me.

Greetings,

Andres Freund

Re: HOT chain validation in verify_heapam()

Reply via email to