On Tue, Jan 21, 2025 at 03:31:27AM +0000, Andy Fan wrote:
> Coming from [0], and thanks for working on this.  Here are some design
> review questions after my first pass through the patch.

Thanks for taking a look.

> 1. walwriter vs checkpointer?  I prefer walwriter for now because:
> 
> a. It is hard for the checkpointer to do this in a timely manner, either
> because the checkpoint itself may take a long time or because
> checkpoint_timeout is much bigger than commit_delay.  The walwriter
> could do this promptly.  I think this is an important consideration for
> this feature.
> 
> b. We want walwriter to run with low latency to flush out async
> commits.  This is true, but preallocating a WAL file doesn't increase
> the latency much.  After all, even when the user uses async commit, WAL
> file allocation is already done by walwriter in our current
> implementation.

I attempted to deal with this by having pre-allocation requests set the
checkpointer's latch and performing the pre-allocation within the
checkpointer's main loop and during write delays.  However, checkpointing
does a number of other things that could just as easily delay
pre-allocation, so it's probably worth considering the WAL writer.

> 2. How many xlog files should be preallocated by checkpointer/walwriter
> at once?  In your patch it is controlled by wal-preallocate-max-size.
> How about just preallocating *the next* xlog file, for the sake of
> simplicity?

We could probably start with something like that.  IIRC it was difficult to
create workloads where you'd need more than 1-2 at a time, provided
whatever is pre-allocating refills the pool quickly.

> 3. What is the purpose of the preallocated_segments directory?  What I
> have in mind is that we just preallocate under the normal filename so
> that XLogWrite can open it directly.  This is the same as what
> wal_recycle does, and we can reuse the same strategy to clean them up
> if they are not needed anymore.

The purpose is to limit the use of pre-allocated segments to only
situations where WAL recycling is not sufficient.  Basically, if writing a
record would require a new segment to be created, we can quickly pull a
pre-allocated one instead of creating it ourselves.  Besides simplifying
matters, this prevents a lot of unnecessary pre-allocation, since many
workloads will almost never need anything beyond the recycled segments.
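
To make the ordering concrete, here is a rough, self-contained sketch of
the fallback logic described above.  It is an in-memory model only, not
code from the patch: SegmentPool, SegmentSource, acquire_next_segment()
and the refill_requests counter are all hypothetical names invented for
illustration, and the "refill request" stands in for whatever wakes the
pre-allocating process (for example, setting its latch).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: where the next WAL segment came from. */
typedef enum SegmentSource
{
	SEG_FROM_RECYCLED,		/* already in place thanks to WAL recycling */
	SEG_FROM_PREALLOCATED,	/* pulled from the pre-allocated pool */
	SEG_CREATED_INLINE		/* had to be created by the writer itself */
} SegmentSource;

/* Hypothetical in-memory stand-in for the on-disk state. */
typedef struct SegmentPool
{
	int		recycled;			/* recycled segments ready for use */
	int		preallocated;		/* segments waiting in the pool */
	int		refill_requests;	/* times the pre-allocator was asked to refill */
} SegmentPool;

/*
 * Pick the source of the next segment.  The pre-allocated pool is only
 * consulted when recycling has nothing to offer, so workloads that are
 * served by recycled segments never consume (or trigger) pre-allocation.
 */
static SegmentSource
acquire_next_segment(SegmentPool *pool)
{
	assert(pool != NULL);

	if (pool->recycled > 0)
	{
		pool->recycled--;
		return SEG_FROM_RECYCLED;
	}

	if (pool->preallocated > 0)
	{
		pool->preallocated--;
		pool->refill_requests++;	/* assumption: ask for a replacement */
		return SEG_FROM_PREALLOCATED;
	}

	/* Pool is empty: fall back to creating the segment ourselves. */
	pool->refill_requests++;		/* assumption: ask for a refill here too */
	return SEG_CREATED_INLINE;
}
```

In this model, a pool that starts with one recycled and one pre-allocated
segment serves three consecutive requests from the recycled segment, then
the pool, then inline creation, which is the only slow path.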

-- 
nathan

