On Mon, Nov 02, 2020 at 11:34:57AM -0800, Andres Freund wrote: > On 2020-11-02 12:35:30 -0500, Robert Haas wrote: >> On Thu, Oct 29, 2020 at 2:17 PM Andres Freund <and...@anarazel.de> wrote: >>> I think this needs to be quickly reworked or reverted. > > I think it's fairly clear by now that revert is appropriate for now.
Yep, that's clear. I'll deal with that tomorrow. That's more than a simple rework. >> I don't like this patch, either. In addition to what Andres mentioned, >> CheckBuffer() is a completely-special case mechanism which can't be >> reused by anything else. In particular, the amcheck for heap stuff >> which I recently committed (866e24d47db1743dfcff5bd595b57e3a143f2cb1) >> would really like a way to examine a buffer without risking an error >> if PageIsVerified() should happen to fail, but this patch is of >> absolutely no use in getting there, because CheckBuffer() doesn't give >> the caller any way to access the contents of the buffer. It can only >> do the checks that it knows how to do, and that's it. That doesn't >> seem like a good design. > > Wouldn't this be better served by having a ReadBufferExtended() flag, > preventing erroring out and zeroing the buffer? I'm not sure that > handling both the case where the buffer contents need to be valid and > the one where it doesn't will make for a good API. If you grep for ReadBuffer_common() is some of the emails I sent.. I was rather interested in something like that. >> I don't like the fact that CheckBuffer() silently skips dirty buffers, >> either. The comment should really say that it checks the state of a >> buffer without loading it into shared buffers, except sometimes it >> doesn't actually check it. > > Yea, I don't see a good reason for that either. There's reasons for > dirty buffers that aren't WAL logged - so if the on-disk page is broken, > a standby taken outside pg_basebackup would possibly still end up with a > corrupt on-disk page. Similar with a crash restart. Er, if you don't skip dirty buffers, wouldn't you actually report some pages as broken if attempting to run those in a standby who may have some torn pages from a previous base backup? You could still run into problems post-promotion, until the first checkpoint post-recovery happens, no? >> If the buffer is in shared buffers, we could take a share-lock >> on the buffer and copy the contents of the page as it exists on disk, >> and then still check it. > > Don't think we need a share lock. That still allows the buffer to be > written out (and thus a torn read). What we need is to set > BM_IO_IN_PROGRESS on the buffer in question - only one backend can set > that. And then unset that again, without unsetting > BM_DIRTY/BM_JUST_DIRTIED. If that can work, we could make use of some of that for base backups for a single retry of a page that initially failed. -- Michael
signature.asc
Description: PGP signature