On Tue, Jan 21, 2025 at 03:31:27AM +0000, Andy Fan wrote:
> Come from [0] and thanks for working on this. Here are some design
> review/question after my first going through the patch.

Thanks for taking a look.

> 1. walwriter vs checkpointer? I prefer to walwriter for now because..
>
> a. checkpointer is hard to do it in a timely manner either because
> checkpoint itself may take a long time or the checkpoint_timeout
> is much bigger than commit_delay. but walwriter could do this timely.
> I think this is an important consideration for this feature.
>
> b. We want walwriter to run with low latency to flush out async
> commits. This is true, but preallocating a wal doesn't increase the
> latency too much. After all, even user uses the aysnc commit, the walfile
> allocating is done by walwriter already in our current implementation.

I attempted to deal with this by having pre-allocation requests set the
checkpointer's latch and performing the pre-allocation within the
checkpointer's main loop and during write delays. However, checkpointing
does a number of other things that could just as easily delay
pre-allocation, so it's probably worth considering the WAL writer.

> 2. How many xlogfile should be preallocated by checkpointer/walwriter
> once. In your patch it is controled by wal-preallocate-max-size. How
> about just preallocate *the next one* xlogfile for the simplification
> purpose?

We could probably start with something like that. IIRC it was difficult to
create workloads where you'd need more than 1-2 at a time, provided
whatever is pre-allocating refills the pool quickly.

> 3. Why is the purpose of preallocated_segments directory? what in my
> mind is we just prellocate the normal filename so that XLogWrite could
> open it directly. This is same as what wal_recycle does and we can reuse
> the same strategy to clean up them if they are not needed anymore.

The purpose is to limit the use of pre-allocated segments to only
situations where WAL recycling is not sufficient. Basically, if writing a
record would require a new segment to be created, we can quickly pull a
pre-allocated one instead of creating it ourselves.
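
To make the "pull a pre-allocated one" idea concrete, here is a rough toy
model in Python. It is only a sketch of the control flow described above,
not code from the patch: the function names, the spare_N file naming, and
the tiny segment size are all made up for illustration, and only the
preallocated_segments directory name comes from this thread.

```python
import os

SEGMENT_SIZE = 16 * 1024  # toy size (assumption); real segments default to 16 MB


def create_segment(path, size=SEGMENT_SIZE):
    """Slow path: write and fsync a zero-filled file of full segment size."""
    with open(path, "wb") as f:
        f.write(b"\0" * size)
        f.flush()
        os.fsync(f.fileno())


def refill_pool(pool_dir, target=1):
    """Background step: top the pool back up to `target` spare segments.

    Spares use hypothetical names spare_0 .. spare_{target-1}.
    Returns the number of spares created by this call.
    """
    os.makedirs(pool_dir, exist_ok=True)
    created = 0
    for i in range(target):
        path = os.path.join(pool_dir, f"spare_{i}")
        if not os.path.exists(path):
            create_segment(path)
            created += 1
    return created


def install_segment(wal_dir, pool_dir, segname):
    """Make sure `segname` exists in wal_dir and report how it got there."""
    dest = os.path.join(wal_dir, segname)
    if os.path.exists(dest):
        # E.g. recycling already produced a file with this name,
        # so the pool is left untouched.
        return "existing"
    spares = sorted(os.listdir(pool_dir)) if os.path.isdir(pool_dir) else []
    for spare in spares:
        try:
            # Fast path: a rename instead of writing a whole segment.
            os.rename(os.path.join(pool_dir, spare), dest)
            return "preallocated"
        except FileNotFoundError:
            continue  # someone else took this spare; try the next one
    # Pool is empty: fall back to creating the segment ourselves.
    create_segment(dest)
    return "created"
```

In this model the pool is only consulted when the segment does not already
exist, and whatever runs refill_pool() only has to replace the spares that
were actually consumed.
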
Besides simplifying matters, this prevents a lot of unnecessary
pre-allocation, since many workloads will almost never need anything
beyond the recycled segments.

--
nathan