Missed this question!

> On Mar 25, 2025, at 09:56, Phillip Diffley <phillip6...@gmail.com> wrote:
> But when processing data from a replication slot, we confirm rows that have 
> been processed and can be deleted from the WAL based on the LSN (eg. with 
> pg_replication_slot_advance). How does postgres identify what parts of the 
> WAL can be freed?

Basically, if no part of the system "needs" a particular LSN position, the 
segments that include that LSN position and earlier can be freed.

The various things that can "need" a particular LSN point are:

1. Replication slots, if the consumer on the other side has not yet confirmed 
that it has received that position (under whatever synchronous commit rules 
that slot is operating under).
2. The wal_keep_size setting.
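3. The max_wal_size setting (a soft limit that mainly governs how often 
checkpoints run; WAL written since the last checkpoint is kept for crash 
recovery).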
4. The archive_command, if a WAL segment hasn't been successfully archived yet.
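
As a rough illustration of the rule above (not PostgreSQL's actual code), 
here is a small Python sketch that takes the oldest LSN each consumer still 
needs and works out the first WAL segment that has to be kept.  The names 
in the `needs` dict and the LSN values are made up for the example, and the 
16MB segment size is just the default wal_segment_size.

```python
# Toy model only: names and values below are hypothetical.
SEGMENT_SIZE = 16 * 1024 * 1024  # default wal_segment_size (16MB)


def parse_lsn(text):
    """Convert an LSN in 'hi/lo' hex form (e.g. '0/3000028') to an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def oldest_needed_lsn(needs):
    """Return the smallest LSN that any consumer still needs."""
    return min(parse_lsn(lsn) for lsn in needs.values())


def first_needed_segment(needs):
    """Segment number holding the oldest needed LSN; earlier ones can go."""
    return oldest_needed_lsn(needs) // SEGMENT_SIZE


# Hypothetical positions for three of the "needs" listed above.
needs = {
    "replication_slot": "0/5000060",
    "unarchived_segment": "0/3000028",
    "last_checkpoint": "0/4000100",
}

keep_from = first_needed_segment(needs)
print(f"segments before number {keep_from} can be removed or recycled")
```

In this example the unarchived segment is what holds everything back, even 
though the replication slot has already confirmed a later position.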

One thing to remember is that the WAL does *not* contain contiguous blocks of 
operations for a single transaction.  The operations are written to the WAL by 
every session as they do operations, so the WAL is a jumble of different 
transactions.  One of the jobs of the logical replication framework is to sort 
that out so it can present only the operations that belong to committed 
transactions to the output plugin.  (This is why there's an internal structure 
called the "reorder buffer": it reorders WAL operations into transaction 
blocks.)
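
To make the reordering idea concrete, here is a minimal Python sketch.  It 
is a toy model, not the real reorder buffer: the (xid, kind, payload) record 
shape and the sample data are assumptions for the example, and it ignores 
things like subtransactions and spilling large transactions to disk.

```python
from collections import defaultdict


def reorder(wal_records):
    """Yield (xid, changes) for each committed transaction, in commit order.

    wal_records is an iterable of hypothetical (xid, kind, payload) tuples,
    where kind is "change", "commit", or "abort".
    """
    pending = defaultdict(list)  # xid -> changes seen so far
    for xid, kind, payload in wal_records:
        if kind == "change":
            pending[xid].append(payload)
        elif kind == "commit":
            # Only now is the whole transaction handed to the output plugin.
            yield xid, pending.pop(xid, [])
        elif kind == "abort":
            # Changes of aborted transactions are simply discarded.
            pending.pop(xid, None)


# Interleaved records from three sessions, as they might appear in the WAL.
wal = [
    (101, "change", "INSERT a"),
    (102, "change", "INSERT b"),
    (101, "change", "UPDATE a"),
    (102, "abort", None),
    (103, "change", "DELETE c"),
    (101, "commit", None),
    (103, "commit", None),
]

committed = list(reorder(wal))
for xid, changes in committed:
    print(xid, changes)
```

Transaction 102 never reaches the output because it aborted, and the 
changes for 101 come out together even though they were interleaved with 
other sessions' records in the input.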
