On Mon, Oct 5, 2020 at 5:24 PM Mark Dilger <mark.dil...@enterprisedb.com> wrote:
> > I don't see how verify_heapam will avoid raising an error during basic
> > validation from PageIsVerified(), which will violate the guarantee
> > about not throwing errors. I don't see that as a problem myself, but
> > presumably you will.
>
> My concern is not so much that verify_heapam will stop with an error, but
> rather that it might trigger a panic that stops all backends. Stopping with
> an error merely because it hits corruption is not ideal, as I would rather it
> completed the scan and reported all corruptions found, but that's minor
> compared to the damage done if verify_heapam creates downtime in a production
> environment offering high availability guarantees. That statement might seem
> nuts, given that the corrupt table itself would be causing downtime, but that
> analysis depends on assumptions about table access patterns, and there is no
> a priori reason to think that corrupt pages are necessarily ever being
> accessed, or accessed in a way that causes crashes (rather than merely wrong
> results) outside verify_heapam scanning the whole table.

That seems reasonable to me. I think that it makes sense to never take
down the server in a non-debug build with verify_heapam. That's not
what I took away from your previous remarks on the issue, but perhaps
it doesn't matter now.

--
Peter Geoghegan