On Mon, Apr 20, 2020 at 12:40 PM Mark Dilger <mark.dil...@enterprisedb.com> wrote:
> Ok, I'll work in that direction and repost when I have something along those
> lines.

Great, thanks!

It also occurs to me that the B-Tree checks that amcheck already has
have one remaining blind spot: while the heapallindexed verification
option can detect the absence of an index tuple that the dummy CREATE
INDEX that we perform under the hood says should be in the index, it
cannot do the opposite. It cannot detect the presence of a malformed
tuple that shouldn't be there at all, unless the index tuple itself is
corrupt. That could miss an inconsistent page image when a few tuples
have been VACUUMed away, but still appear in the index.

In order to do that, we'd have to have something a bit like the
validate_index() heap scan that CREATE INDEX CONCURRENTLY uses. We'd
have to get a list of heap TIDs that any index tuple might be pointing
to, and then make sure that there were no TIDs in the index that were
not in that list -- tuples that were pointing to nothing in the heap
at all. This could use the index_bulk_delete() interface.

This is the kind of verification option that I might work on for
debugging purposes, but not the kind of thing I could really recommend
to ordinary users outside of exceptional cases. It argues for more or
less providing all of the verification functionality we have through
both high level and low level interfaces. A check like this isn't
likely to be all that valuable most of the time, and users shouldn't
have to figure that out for themselves the hard way. (BTW, I think
that this could be implemented in an index-AM-agnostic way, so perhaps
you can consider adding it too, if you have time.)

One last thing for now: take a look at amcheck's
bt_tuple_present_callback() function. It has comments about HOT chain
corruption that you may find interesting. Note that this check played
a role in the "freeze the dead" corruption bug [1] -- it detected that
our initial fix for that was broken.

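To make the two directions of checking concrete, here is a toy sketch
in Python. It is not amcheck code and does not use any PostgreSQL API:
the Tid type and both function names are hypothetical, and exact sets
stand in for whatever data structure a real implementation would use.
The first function mirrors what heapallindexed can already report
(heap TIDs with no index tuple); the second is the missing reverse
check (index TIDs that point to nothing in the heap).

```python
from typing import Iterable, List, NamedTuple


class Tid(NamedTuple):
    """Hypothetical stand-in for a heap TID: (block number, offset number)."""
    block: int
    offset: int


def find_missing_index_tids(heap_tids: Iterable[Tid],
                            index_tids: Iterable[Tid]) -> List[Tid]:
    """Heap -> index direction: heap TIDs that no index tuple points to."""
    indexed = set(index_tids)
    return sorted(t for t in set(heap_tids) if t not in indexed)


def find_dangling_index_tids(heap_tids: Iterable[Tid],
                             index_tids: Iterable[Tid]) -> List[Tid]:
    """Index -> heap direction: index TIDs that are absent from the heap."""
    # Pass 1: collect every heap TID an index tuple could legitimately point to.
    live = set(heap_tids)
    # Pass 2: walk the index TIDs and report any that are not in that list.
    return sorted(t for t in set(index_tids) if t not in live)


# Toy data (assumed): (0,2) was never indexed; (2,5) was removed from the
# heap but its index tuple is still present.
heap = [Tid(0, 1), Tid(0, 2), Tid(1, 1)]
index = [Tid(0, 1), Tid(1, 1), Tid(2, 5)]

missing = find_missing_index_tids(heap, index)
dangling = find_dangling_index_tids(heap, index)
print("missing from index:", missing)
print("dangling in index:", dangling)
```

In this sketch the two checks are symmetric set differences. A real
implementation would also have to account for concurrent activity,
which the toy version ignores entirely.
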
It seems like it would be a good idea to go back through the
reproducers we've seen for some of the more memorable corruption bugs,
and actually make sure that your tool detects them where that isn't
clear. History doesn't repeat itself, but it often rhymes.

[1] https://postgr.es/m/cah2-wznm4rcrhfaiwkpwtpew2bxdtgrozk7jwwgucxeh3d1...@mail.gmail.com
--
Peter Geoghegan