Hi,

On 2026-03-04 08:38:24 +0100, Anthonin Bonnefoy wrote:
> From ad0a3cfe10bdd2cccc4274849c4a77898b06e13c Mon Sep 17 00:00:00 2001
> From: Anthonin Bonnefoy <[email protected]>
> Date: Wed, 2 Jul 2025 09:58:52 +0200
> Subject: Don't keep closed WAL segments in page cache after replay
>
> On a standby, the recovery process reads the WAL segments, applies
> changes and closes the segment. When closed, the segments will still be
> in page cache memory until they are evicted due to inactivity. The
> segments may be re-read if archive_mode is set to always, wal_summarizer
> is enabled or if the standby is used for replication and has an active
> walsender.
>
> The presence of a replication slots is also a likely indicator that
> a walsender will be started, and need to read the WAL segments.
>
> Outside of those circumstances, the WAL segments won't be re-read and
> keeping them in the page cache generates unnecessary memory pressure.
> A POSIX_FADV_DONTNEED is sent before closing a replayed WAL segment to
> immediately free any cached pages.
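
For context, the mechanism the commit message describes amounts to one
advisory call before close(). The following is only a minimal sketch in
Python, not the patch's C code: close_dropping_cache() and demo() are
hypothetical names made up for illustration, and os.posix_fadvise() is
Python's wrapper around posix_fadvise(), available on Unix only. The call is
a hint, so the sketch ignores a failure rather than treating it as an error.

```python
import os
import tempfile


def close_dropping_cache(fd: int) -> None:
    """Ask the kernel to drop cached pages for fd, then close it.

    Hypothetical helper for illustration; not taken from the patch.
    """
    try:
        # offset=0 and length=0 cover the whole file.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    except OSError:
        # Advisory only: assume a failed hint should not block the close.
        pass
    finally:
        os.close(fd)


def demo() -> bool:
    """Read a scratch file, then close it with the DONTNEED hint.

    Returns True if the descriptor is closed afterwards.
    """
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\0" * 8192)  # stand-in for a replayed segment
        path = tmp.name
    try:
        fd = os.open(path, os.O_RDONLY)
        os.read(fd, 8192)  # pull the data into the page cache
        close_dropping_cache(fd)
        try:
            os.fstat(fd)  # fails with EBADF once fd is closed
        except OSError:
            return True
        return False
    finally:
        os.unlink(path)
```

The sketch only shows where the hint sits relative to close(). It measures
nothing about page cache residency or memory pressure.
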
I am quite sceptical that this is a good idea. Have you actually measured
benefits? I skimmed the thread and didn't see anything.

It's pretty cheap for the kernel to replace a clean page from the page cache
with different content.

If you [crash-]restart the replica this will make it way more expensive. If
you have twophase commits where we need to read 2PC details from the WAL,
this will make it more expensive. If somebody takes a base backup, this ...

I think you'd have to have pretty convincing benchmarks showing that this is
a good idea before we should even remotely consider applying this.

Greetings,

Andres Freund
