On Thu, Sep 30, 2021 at 12:49:36PM +0900, Michael Paquier wrote:
> On Wed, Sep 29, 2021 at 07:43:41PM -0500, Justin Pryzby wrote:
> > Forking this thread in which Thomas implemented syncfs for the startup
> > process (61752afb2).
> > https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BSG9jSW3ekwib0cSdC0yD-jReJ21X4bZAmqxoWTLTc2A%40mail.gmail.com
> >
> > Is there any reason that initdb/pg_basebackup/pg_checksums/pg_rewind
> > shouldn't use syncfs() ?
>
> That makes sense.
>
> > do_syncfs() is in src/backend/ so would need to be duplicated^Wimplemented
> > in common.
>
> The fd handling in the backend makes things tricky if trying to plug
> in a common interface, so I'd rather do that as this is frontend-only
> code.
>
> > They can't use the GUC, so need to add an cmdline option or look at an
> > environment variable.
>
> fsync_pgdata() is going to manipulate many inodes anyway, because
> that's a code path designed to do so.  If we know that syncfs() is
> just going to be better, I'd rather just call it by default if
> available and not add new switches to all the frontend tools in need
> of flushing the data folder, switches that are not documented in your
> patch.

It is a draft/POC, after all.

The argument against using syncfs by default is that it could be worse than
recursive fsync if a tiny 200MB postgres instance lives on a shared filesystem
along with other, larger applications (maybe a larger postgres instance),
since syncfs() flushes the whole filesystem and not just the data directory.

There's also an argument that syncfs might be unreliable in the case of a write
error.  (But I agreed with Thomas' earlier assessment: that claim carries little
weight, since fsync() itself wasn't reliable for 20-some years.)

I didn't pursue this patch, as it's easier for me to use /bin/sync -f.  Someone
should adopt it if interested.

-- 
Justin
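
For anyone picking this up, a minimal sketch of the frontend-side call under
discussion might look like the following.  It is not the code from the patch:
sync_dir_with_syncfs() is a hypothetical name, and it assumes Linux with
glibc's syncfs() wrapper.  On other platforms it only reports ENOSYS, where a
real tool would presumably fall back to the existing recursive fsync.

```c
#define _GNU_SOURCE				/* needed for the syncfs() declaration */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from the patch): flush the filesystem that
 * contains "path" with one syncfs() call instead of fsync'ing every file
 * below it.  Returns 0 on success, -1 on failure with errno set.
 */
int
sync_dir_with_syncfs(const char *path)
{
	int			fd;
	int			ret = 0;
	int			save_errno;

	fd = open(path, O_RDONLY);
	if (fd < 0)
	{
		perror("open");
		return -1;
	}

#ifdef __linux__
	/* Flushes ALL dirty data on this filesystem, not only files under path. */
	if (syncfs(fd) < 0)
	{
		perror("syncfs");
		ret = -1;
	}
#else
	/* Assumption: no syncfs() here; the caller should use recursive fsync. */
	errno = ENOSYS;
	ret = -1;
#endif

	/* Preserve the errno of the failing call across close(). */
	save_errno = errno;
	close(fd);
	errno = save_errno;

	return ret;
}
```

A data directory with several tablespaces on different filesystems would need
one such call per filesystem, which is part of why this is only a sketch.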