On Mon, Oct 4, 2021 at 3:36 PM Mark Dilger <mark.dil...@enterprisedb.com> wrote: > >> I believe this is a bug in amcheck's btree checking functions. Peter, can > >> you take a look? > > > > Why do you say that? > > Because REINDEX CONCURRENTLY and the bt_index_parent_check() function seem to > have lock upgrade hazards that are unrelated to pg_amcheck.
The problem with that argument is that the bt_index_parent_check() function isn't doing anything particularly special, apart from dropping the lock. That has been its behavior for many years now. > On to pg_amcheck's behavior.... > > I see no evidence in the OP's complaint that pg_amcheck is misbehaving. It > launches a worker to check each relation, prints for the user's benefit any > errors those checks raise, and finally returns 0 if they all pass and 2 > otherwise. Since not all relations could be checked, 2 is returned. > Returning 0 would be misleading, as it implies everything was checked and > passed, and it can't honestly say that. The return value 2 does not mean > that anything failed. It means that not all checks passed. When a 2 is > returned, the user is expected to read the output and decide what, if > anything, they want to do about it. In this case, the user might decide to > wait until the reindex finishes and check again, or they might decide they > don't care. > > It is true that pg_amcheck is calling bt_index_parent_check() on an invalid > index, but so what? If it chose not to do so, it would still need to print a > message about the index being unavailable for checking, and it would still > have to return 2. Why would it have to print such a message? You seem to be presenting this as if there is some authoritative, precise, relevant definition of "the relations that pg_amcheck sees". But to me the relevant details look arbitrary at best. > It can't return 0, and it is unhelpful to leave the user in the dark about > the fact that not all indexes are in the right state for checking. Why is that unhelpful? More to the point, *why* would this alternative behavior constitute "leaving the user in the dark"? What about the case where the pg_class entry isn't visible to our MVCC snapshot? Why is "skipping" such a relation not just as unhelpful? > I think this bug report is really a feature request. The OP appears to want > an option to toggle on/off the printing of such information, perhaps with not > printing it as the default. And I don't understand why you think that clearly-accidental implementation details (really just bugs) should be treated as axiomatic truths about how pg_amcheck must work. Should we now "fix" pg_dump so that it matches pg_amcheck? All of the underlying errors are cases that were clearly intended to catch user error -- every single one. But apparently pg_amcheck is incapable of error, by definition. Like HAL 9000. -- Peter Geoghegan