On 10/11/2023 05:54, Andres Freund wrote:
> In this case I had used wal_sync_method=open_datasync - it's often faster and
> if we want to scale WAL writes more we'll have to use it more widely (you
> can't have multiple fdatasyncs in progress and reason about which one affects
> what, but you can have multiple DSYNC writes in progress at the same time).

Not sure I understand that. If you issue an fdatasync, it will sync all writes that were complete before the fdatasync started. Right? If you have multiple fdatasyncs in progress, that's true for each fdatasync. Or is there a bottleneck in the kernel with multiple in-progress fdatasyncs or something?
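For illustration, a minimal sketch of the two flushing styles being compared; this is not PostgreSQL code, the function names are made up and error handling is omitted:

#include <fcntl.h>
#include <unistd.h>

/*
 * Style 1: plain writes followed by fdatasync().  The fdatasync() covers
 * every write that completed before it was issued, so with several
 * fdatasyncs in flight it's hard to attribute a particular write's
 * durability to a particular fdatasync() call.
 */
static void
flush_with_fdatasync(int fd, const char *buf, size_t len, off_t offset)
{
    pwrite(fd, buf, len, offset);
    fdatasync(fd);
}

/*
 * Style 2: the fd is opened with O_DSYNC, so each pwrite() is durable by
 * the time it returns.  Several such writes can be in flight at once, and
 * each completion says exactly which bytes are on stable storage.
 */
static int
open_wal_dsync(const char *path)
{
    return open(path, O_WRONLY | O_DSYNC);
}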

> After a bit of confused staring and debugging I figured out that the problem
> is that the RequestXLogSwitch() within the code for starting a basebackup was
> triggering writing back the WAL in individual 8kB writes via
> GetXLogBuffer()->AdvanceXLInsertBuffer(). With open_datasync each of these
> writes is durable - on this drive each takes about 1ms.

I see. So the assumption in AdvanceXLInsertBuffer() is that XLogWrite() is relatively fast. But with open_datasync, it's not.
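To put rough numbers on it: at ~1ms per durable 8kB write, flushing N dirty pages one page at a time costs about N milliseconds, so a segment's worth of pages (16MB / 8kB = 2048, with the default segment size) is on the order of two seconds, versus a few milliseconds if the same range were covered by a handful of large writes. A toy sketch of the two patterns, with made-up names rather than the actual AdvanceXLInsertBuffer()/XLogWrite() code:

#include <unistd.h>

#define PAGE_SZ 8192

/* each pwrite() on an O_DSYNC fd is a separate durable I/O, ~1ms apiece above */
static void
flush_page_at_a_time(int fd, const char *pages, int npages, off_t start)
{
    for (int i = 0; i < npages; i++)
        pwrite(fd, pages + (size_t) i * PAGE_SZ, PAGE_SZ,
               start + (off_t) i * PAGE_SZ);
}

/* one durable I/O covering the whole range */
static void
flush_in_one_write(int fd, const char *pages, int npages, off_t start)
{
    pwrite(fd, pages, (size_t) npages * PAGE_SZ, start);
}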

> To fix this, I suspect we need to make
> GetXLogBuffer()->AdvanceXLInsertBuffer() flush more aggressively. In this
> specific case, we even know for sure that we are going to fill a lot more
> buffers, so no heuristic would be needed. In other cases however we need some
> heuristic to know how much to write out.

+1. Maybe use the same logic as in XLogFlush().

I wonder if the 'flexible' argument to XLogWrite() is too inflexible. It would be nice to pass a hard minimum XLogRecPtr that it must write up to, but still allow it to write more than that if it's convenient.
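Purely as an interface sketch of what I mean, with made-up names rather than the current XLogWrite() signature (XLogRecPtr and TimeLineID are the usual typedefs):

/*
 * Hypothetical request: the callee must write everything up to
 * must_write_upto, and may extend the write anywhere up to
 * opportunistic_upto if that allows fewer, larger writes.
 */
typedef struct XLogWriteRequest
{
    XLogRecPtr  must_write_upto;    /* hard minimum, always honored */
    XLogRecPtr  opportunistic_upto; /* may stop anywhere in between */
} XLogWriteRequest;

void XLogWriteUpto(XLogWriteRequest req, TimeLineID tli);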

--
Heikki Linnakangas
Neon (https://neon.tech)


