Re: should frontend tools use syncfs() ?

Justin Pryzby Wed, 13 Apr 2022 04:54:29 -0700

On Thu, Sep 30, 2021 at 12:49:36PM +0900, Michael Paquier wrote:
> On Wed, Sep 29, 2021 at 07:43:41PM -0500, Justin Pryzby wrote:
> > Forking this thread in which Thomas implemented syncfs for the startup
> > process (61752afb2).
> > https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BSG9jSW3ekwib0cSdC0yD-jReJ21X4bZAmqxoWTLTc2A%40mail.gmail.com
> > 
> > Is there any reason that initdb/pg_basebackup/pg_checksums/pg_rewind
> > shouldn't use syncfs()?
> 
> That makes sense.
> 
> > do_syncfs() is in src/backend/ so would need to be
> > duplicated^Wimplemented in common.
> 
> The fd handling in the backend makes things tricky if trying to plug
> in a common interface, so I'd rather do that as this is frontend-only
> code.
> 
> > They can't use the GUC, so need to add a cmdline option or look at an
> > environment variable.
> 
> fsync_pgdata() is going to manipulate many inodes anyway, because
> that's a code path designed to do so.  If we know that syncfs() is
> just going to be better, I'd rather just call it by default if
> available and not add new switches to all the frontend tools in need
> of flushing the data folder, switches that are not documented in your
> patch.


It is a draft/POC, after all.

The argument against using syncfs by default is that it could be worse than
recursive fsync if a tiny 200MB postgres instance lives on a shared filesystem
along with other, larger applications (maybe a larger postgres instance).
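
To make the trade-off concrete, here is a minimal sketch of the two strategies
side by side. It is not the code from the patch or from fsync_pgdata(); the
names sync_data_dir() and fsync_one() and the use_syncfs flag are made up for
illustration, and it assumes Linux with glibc for syncfs().

```c
#define _GNU_SOURCE             /* exposes syncfs() and nftw() on glibc */
#include <fcntl.h>
#include <ftw.h>
#include <unistd.h>

/* nftw() callback (hypothetical helper): fsync one file or directory. */
static int
fsync_one(const char *path, const struct stat *sb, int type, struct FTW *ftw)
{
    int         fd;
    int         rc;

    (void) sb;
    (void) ftw;

    if (type != FTW_F && type != FTW_D)
        return 0;               /* skip symlinks and unreadable entries */

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;              /* nonzero stops the walk */
    rc = fsync(fd);
    close(fd);
    return rc;
}

/*
 * Flush a data directory (hypothetical interface).  With use_syncfs set,
 * one syncfs() call flushes the whole filesystem containing datadir,
 * including dirty data of unrelated applications.  Otherwise walk the
 * tree and fsync each entry, which touches many inodes but only ours.
 * Returns 0 on success, -1 on failure.
 */
int
sync_data_dir(const char *datadir, int use_syncfs)
{
#ifdef __linux__
    if (use_syncfs)
    {
        int         fd = open(datadir, O_RDONLY);
        int         rc;

        if (fd < 0)
            return -1;
        rc = syncfs(fd);
        close(fd);
        return rc;
    }
#else
    (void) use_syncfs;          /* no syncfs(); always walk the tree */
#endif
    return nftw(datadir, fsync_one, 16, FTW_PHYS) == 0 ? 0 : -1;
}
```

A frontend tool would set the flag from a command-line option or an environment
variable, since it has no access to the GUC.  The cost of the syncfs() branch
depends on everything else that is dirty on that filesystem, which is the
shared-filesystem concern above.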

There's also an argument that syncfs might be unreliable in the case of a write
error.  (But I agreed with Thomas' earlier assessment: that claim carries little
weight, since fsync() itself wasn't reliable for 20-some years).

I didn't pursue this patch, as it's easier for me to use /bin/sync -f.  Someone
should adopt it if interested.

-- 
Justin
