On Mon, Oct 11, 2021 at 10:46 AM Mark Dilger <mark.dil...@enterprisedb.com> wrote: > > On Oct 11, 2021, at 10:10 AM, Peter Geoghegan <p...@bowt.ie> wrote: > > This mostly looks good to me. Just one thing occurs to me: I suspect > > that we don't need to call pg_is_in_recovery() from SQL at all. What's > > wrong with just letting verify_heapam() (the C function from amcheck > > proper) show those notice messages where appropriate? > > I thought a big part of the debate upthread was over exactly this point, that > pg_amcheck should not attempt to check (a) temporary relations, (b) indexes > that are invalid or unready, and (c) unlogged relations during recovery.
Again, I consider (a) and (b) very similar to each other, but very dissimilar to (c). Only (a) and (b) are *inherently* not verifiable by amcheck. To me, giving pg_amcheck responsibility for only calling amcheck functions when (a) and (b) are sane is akin to expecting pg_amcheck to only call bt_index_check() with a B-Tree index. Giving pg_amcheck these responsibilities is not a case of "pg_amcheck presuming to know what's best for the user, or too much about amcheck", because amcheck itself pretty clearly expects this from the user (and always has). The user is no worse off for having used pg_amcheck rather than calling amcheck functions from SQL themselves. pg_amcheck is literally just fulfilling basic expectations held by amcheck, that are pretty much documented as such. Sure, the user might not be happy with --parent-check throwing an error on a replica. But in practice most users won't want to do that anyway. Even on a primary it's usually not possible as a practical matter, because the locking implications are *bad* -- it's just too disruptive, for too little extra coverage. And so when --parent-check fails on a replica, it really is very likely that the user should just not do that. Which is easy: just remove --parent-check, and try again. Most scenarios where --parent-check is useful involve the user already knowing that there is some corruption. In other words, scenarios where almost nothing could be considered overkill. Presumably this is very rare. > I don't like having pg_amcheck parse the error message that comes back from > amcheck. > How shall we proceed? What's the problem with just having pg_amcheck pass through the notice to the user, without it affecting anything else? Why should a simple notice message need to affect its return code, or anything else? It's not like I feel very strongly about this question. Ultimately it probably doesn't matter very much -- if pg_amcheck just can't deal with these notice messages for some reason, then I can let it go. But what's the reason? If there is a good reason, then maybe we should just not have the notice messages (so we would just remove the existing one from verify_nbtree.c, while still interpreting the case in the same way -- index has no storage to check, and so is trivially verified). -- Peter Geoghegan