On Sun, Jan 27, 2019 at 1:59 PM Michael Paquier <mich...@paquier.xyz> wrote:
> On Sat, Jan 26, 2019 at 01:45:46PM +0100, Magnus Hagander wrote:
> > One workaround you could perhaps look at here is to run pg_basebackup
> > with --no-sync. That way there will be no fsyncs issued while running. You
> > will then of course have to take care of syncing all the files to disk
> > after it's done, but a network filesystem might be happier in dealing with
> > a large "batch-sync" like that rather than piece-by-piece sync.
>
> Hm. Aren't we actually wrong in letting the WAL receive method use
> the value of do_sync depending on the command line arguments, with
> true being the default for pg_basebackup? In plain format, we flush
> the full data directory anyway when the backup ends. In tar format,
> each individual tar file is flushed one-by-one after being received,
> and we issue a final sync on the parent directory at the end. So
> what's missing is just to make sure that the fully generated
> pg_wal.tar is synced once completed. This would be way cheaper than
> letting the stream process issue syncs for each segment, which does
> not matter much in the event of a host crash because the base backup
> may finish in an inconsistent state, and one should not use it.

Yeah, that could be done without giving up any of the guarantees -- we
only give the guarantee at the end of the completed backup. I wouldn't
necessarily say we're wrong now, but it could definitely be a nice
performance improvement.

And for plain format, we'd do the same -- sync after each file segment,
and then a final one of the directory when done, right?
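To make the pattern concrete, here is a minimal sketch in plain C (not
pg_basebackup's actual code; the "backup" directory and tar file names
are made up for illustration) of a batch sync at the end of a backup:
skip per-file fsyncs while writing, then flush each completed file and
finally the parent directory so the directory entries themselves reach
stable storage:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Flush one path (file or, on Linux, directory) to stable storage. */
    static int
    fsync_path(const char *path)
    {
        int fd = open(path, O_RDONLY);

        if (fd < 0)
            return -1;
        if (fsync(fd) < 0)
        {
            close(fd);
            return -1;
        }
        return close(fd);
    }

    int
    main(void)
    {
        /* Hypothetical output of a completed tar-format backup. */
        const char *files[] = {"backup/base.tar", "backup/pg_wal.tar"};

        for (int i = 0; i < 2; i++)
        {
            if (fsync_path(files[i]) < 0)
            {
                perror(files[i]);
                return 1;
            }
        }

        /*
         * Finally sync the parent directory so the file entries
         * themselves are durable, not just their contents.
         */
        if (fsync_path("backup") < 0)
        {
            perror("backup");
            return 1;
        }
        return 0;
    }

On a network filesystem this issues all the flushes in one burst once
the backup is complete, which is the behavior --no-sync plus a manual
sync gives today.

--
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/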