On 3/6/19 6:42 PM, Andres Freund wrote:
> ...
>
> To me the right way seems to be to IO lock the page via PG after such a
> failure, and then retry. Which should be relatively easily doable for
> the basebackup case, but obviously harder for the pg_verify_checksums
> case.
Actually, what do you mean by "IO lock the page"? Just waiting for the current IO to complete (essentially BM_IO_IN_PROGRESS)? Or essentially acquiring a lock and holding it for the duration of the check?
The former does not really help, because another I/O request might be initiated right after the current one completes, interfering with the retry.
The latter might work, assuming the check is fast (which it probably is). I wonder if this might cause issues due to loading possibly corrupted data (with invalid checksums) into shared buffers. But then again, we could just hack a special version of ReadBuffer_common() which would:
(a) check if the page is in shared buffers, and if it is then consider the checksum correct (because the on-disk copy may be stale compared to the in-memory one, and the page was read successfully earlier, so it was OK at that moment)
(b) if it's not in shared buffers already, try reading it and verify the checksum, and then just evict it right away (so as not to pollute shared buffers)
Or did you have something else in mind?

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services