On 22.10.2020 04:25, Michael Paquier wrote:
> On Thu, Oct 22, 2020 at 12:47:03AM +0300, Anastasia Lubennikova wrote:
>> We can also read such pages via shared buffers to be 100% sure.
>
> Yeah, but this has its limits as well. One can use
> ignore_checksum_failure, but this can actually be very dangerous, as
> you can end up loading into shared buffers a page whose header
> looks sane but whose contents are largely randomly
> corrupted, spreading corruption around and leading to more fancy
> logic failures in Postgres, with more panic from customers. Not using
> ignore_checksum_failure is safer, but it makes analyzing the
> damage in a given file harder, as a seqscan would stop at the first
> failure in that file. pg_prewarm can help here, but
> doing that is not the goal of the tool either.

I was thinking about applying this only to pages with LSN > startLSN.
Most such pages are valid and already in memory, because they were
changed just recently, so there is no need for pg_prewarm here. If such
an LSN appeared because of data corruption, page verification from inside
ReadBuffer() will report an error first. In the proposed function, we can
handle this error in any fashion we want. Something like:

    if (PageGetLSN(page) > startptr)
    {
        if (!read_page_via_buffercache())
        {
            /* throw a warning about the corrupted page */
            /* handle the checksum error as needed */
        }
        else
        {
            /* page is valid, no worries */
        }
    }

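To make the control flow above a bit more concrete, here is a minimal standalone sketch of the same decision. It is not PostgreSQL code: Lsn, Verdict, CacheReadFn, check_page() and fake_cache_read() are all hypothetical names made up for illustration, and the callback only stands in for "read the page through the buffer cache and report whether verification passed".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; none of these are PostgreSQL APIs. */
typedef uint64_t Lsn;

typedef enum
{
    CHECK_DIRECT,       /* LSN <= start LSN: left to the ordinary check */
    RECHECK_OK,         /* newer page, re-read via the cache was valid */
    RECHECK_CORRUPTED   /* newer page, re-read via the cache failed */
} Verdict;

/* Assumed to return true when the page read through the cache is valid. */
typedef bool (*CacheReadFn) (uint32_t blkno);

static Verdict
check_page(uint32_t blkno, Lsn page_lsn, Lsn start_lsn,
           CacheReadFn read_via_cache)
{
    assert(read_via_cache != NULL);

    if (page_lsn > start_lsn)
    {
        if (!read_via_cache(blkno))
        {
            /* The caller decides how to handle this; here we only warn. */
            fprintf(stderr, "WARNING: block %u failed verification\n",
                    (unsigned) blkno);
            return RECHECK_CORRUPTED;
        }
        return RECHECK_OK;      /* page is valid, no worries */
    }

    /* Pages at or below the start LSN are not re-read in this sketch. */
    return CHECK_DIRECT;
}

/* Toy callback for trying the sketch out: pretends block 2 is corrupted. */
static bool
fake_cache_read(uint32_t blkno)
{
    return blkno != 2;
}
```

With fake_cache_read() as the callback, a block whose LSN is above the start LSN is re-read, and only block 2 ends up reported as corrupted. Blocks at or below the start LSN never reach the callback at all.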
--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company