On Mon, Jun 7, 2021 at 8:48 PM Bossart, Nathan <bossa...@amazon.com> wrote:
>
> On 12/25/20, 12:09 PM, "Andres Freund" <and...@anarazel.de> wrote:
> > When running write heavy transactional workloads I've many times
> > observed that one needs to run the benchmarks for quite a while till
> > they get to their steady state performance. The most significant reason
> > for that is that initially WAL files will not get recycled, but need to
> > be freshly initialized. That's 16MB of writes that need to synchronously
> > finish before a small write transaction can even start to be written
> > out...
> >
> > I think there's two useful things we could do:
> >
> > 1) Add pg_wal_preallocate(uint64 bytes) that ensures (bytes +
> >    segment_size - 1) / segment_size WAL segments exist from the current
> >    point in the WAL. Perhaps with the number of bytes defaulting to
> >    min_wal_size if not explicitly specified?
> >
> > 2) Have checkpointer (we want walwriter to run with low latency to flush
> >    out async commits etc) occasionally check if WAL files need to be
> >    pre-allocated.
> >
> >    Checkpointer already tracks the amount of WAL that's expected to be
> >    generated till the end of the checkpoint, so it seems like it's a
> >    pretty good candidate to do so.
> >
> >    To keep checkpointer pre-allocating when idle we could signal it
> >    whenever a record has crossed a segment boundary.
> >
> >
> > With a plain pgbench run I see a 2.5x reduction in throughput in the
> > periods where we initialize WAL files.
>
> I've been exploring this independently a bit and noticed this message.
> Attached is a proof-of-concept patch for a separate "WAL allocator"
> process that maintains a pool of WAL-segment-sized files that can be
> claimed whenever a new segment file is needed. An early version of
> this patch attempted to spread the I/O like non-immediate checkpoints
> do, but I couldn't point to any real benefit from doing so, and it
> complicated things quite a bit.
>
> I like the idea of trying to bake this into an existing process such
> as the checkpointer. I'll admit that creating a new process just for
> WAL pre-allocation feels a bit heavy-handed, but it was a nice way to
> keep this stuff modularized. I can look into moving this
> functionality into the checkpointer process if this is something that
> folks are interested in.
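
A side note on item (1) in the quoted proposal: the segment count there is a round-up division of the requested bytes by the segment size. A minimal sketch in C of that arithmetic follows. The name wal_segments_for_bytes is a hypothetical helper for illustration only, not an existing PostgreSQL function and not taken from the attached patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only): number of WAL segments needed
 * to cover "bytes", rounding up, as in the quoted expression
 * (bytes + segment_size - 1) / segment_size.
 *
 * Assumption: bytes is small enough that bytes + segment_size - 1 does
 * not overflow uint64_t.
 */
static uint64_t
wal_segments_for_bytes(uint64_t bytes, uint64_t segment_size)
{
    assert(segment_size > 0);
    return (bytes + segment_size - 1) / segment_size;
}
```

With the 16MB segment size mentioned above, a request for 1 byte rounds up to one segment, and a request for 80MB maps to five segments.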

Thanks for posting the patch. It no longer applies on HEAD:

Applying: wal segment pre-allocation
error: patch failed: src/backend/access/transam/xlog.c:3283
error: src/backend/access/transam/xlog.c: patch does not apply

Could you rebase the patch and post it again? That might help if someone
picks it up for review.

Regards,
Vignesh