On Mon, Apr 25, 2022 at 9:54 PM Michael Paquier <mich...@paquier.xyz> wrote: > On Mon, Apr 25, 2022 at 10:11:10AM -0500, David Christensen wrote: > > On Mon, Apr 25, 2022 at 1:11 AM Michael Paquier <mich...@paquier.xyz> wrote: > >> I don't think that there is any need to rely on a new logic if there > >> is already some code in place able to do the same work. See > >> verify_dir_is_empty_or_create() in pg_basebackup.c, as one example, > >> that relies on pg_check_dir(). I think that you'd better rely at > >> least on what pgcheckdir.c offers. > > > > Yeah, though I am tending towards what another user had suggested and > > just accepting an existing directory rather than requiring it to be > > empty, so thinking I might just skip this one. (Will review the file > > you've pointed out to see if there is a relevant function though.) > > OK. FWIW, pg_check_dir() is used in initdb and pg_basebackup because > these care about the behavior to use when specifying a target path. > You could reuse it, but use a different policy depending on its > returned result for the needs of what you see fit in this case.
I have a new version of the patch (pending tests) that uses pg_check_dir's return value to handle things appropriately, so at least some code reuse now. It did end up simplifying a lot. > >> + PageSetLSN(page, record->ReadRecPtr); > >> + /* if checksum field is non-zero then we have > >> checksums enabled, > >> + * so recalculate the checksum with new LSN (yes, this > >> is a hack) > >> + */ > >> Yeah, that looks like a hack, but putting in place a page on a cluster > >> that has checksums enabled would be more annoying with > >> zero_damaged_pages enabled if we don't do that, so that's fine by me > >> as-is. Perhaps you should mention that FPWs don't have their > >> pd_checksum updated when written. > > > > Can make a mention; you thinking just a comment in the code is > > sufficient, or want there to be user-visible docs as well? > > I was thinking about a comment, at least. New patch version has significantly more comments. > > This was to make the page as extracted equivalent to the effect of > > applying the WAL record block on replay (the LSN and checksum both); > > since recovery calls this function to mark the LSN where the page came > > from this is why I did this as well. (I do see perhaps a case for > > --raw output that doesn't munge the page whatsoever, just made > > comparisons easier.) > > Hm. Okay. The argument goes both ways, I guess, depending on what we > want to do with those raw pages. Still you should not need pd_lsn if > the point is to be able to stick the pages back in place to attempt to > get back as much data as possible when loading it back to the shared > buffers? Yeah, I can see that too; I think there's at least enough of an argument for a flag to apply the fixups or just extract only the raw page pre-modification. Not sure which should be the "default" behavior; either `--raw` or `--fixup-metadata` or something could work. (Naming suggestions welcomed.) David