On Thu, 26 Jan 2023 at 19:45, Peter Geoghegan <p...@bowt.ie> wrote:
>
> On Thu, Jan 26, 2023 at 9:53 AM Andres Freund <and...@anarazel.de> wrote:
> > I assume the case you're thinking of is that pruning did *not* do any changes,
> > but in the process of figuring out that nothing needed to be pruned, we did a
> > MarkBufferDirtyHint(), and as part of that emitted an FPI?
>
> Yes.
>
> > > That's going to be very significantly more aggressive. For example
> > > it'll impact small tables very differently.
> >
> > Maybe it would be too aggressive, not sure. The cost of a freeze WAL record is
> > relatively small, with one important exception below, if we are 99.99% sure
> > that it's not going to require an FPI and isn't going to dirty the page.
> >
> > The exception is that a newer LSN on the page can cause the ringbuffer
> > replacement to trigger more aggressive WAL flushing. No meaningful
> > difference if we modified the page during pruning, or if the page was already
> > in s_b (since it likely won't be written out via the ringbuffer in that case),
> > but if checksums are off and we just hint-dirtied the page, it could be a
> > significant issue.
>
> Most of the overhead of FREEZE WAL records (with freeze plan
> deduplication and page-level freezing in) is generic WAL record header
> overhead. Your recent adversarial test case is going to choke on that,
> too. At least if you set checkpoint_timeout to 1 minute again.

Could someone explain to me why we don't currently (optionally)
include the functionality of page freezing in the PRUNE records? I
think pruning and freezing are quite closely related (in that they
both execute in VACUUM and are required for long-term system
stability), and are even more related now that we have opportunistic
page-level freezing. I think adding a "freeze this page as well" flag
to PRUNE records would go a long way toward reducing the WAL overhead
of aggressive and more opportunistic freezing.

-Matthias
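
A rough sketch of the idea above, to make the overhead argument concrete. Everything here is hypothetical: the struct, the flag name, the helper functions, and the 24-byte per-record overhead are assumptions chosen for illustration, not the actual heapam WAL record layout or any existing PostgreSQL API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit: "this prune record also carries freeze plans". */
#define SKETCH_PRUNE_HAS_FREEZE 0x01

/* Assumed fixed per-record overhead in bytes (illustrative value only). */
#define SKETCH_RECORD_OVERHEAD 24

/* Hypothetical, simplified stand-in for a combined prune/freeze record. */
typedef struct SketchPruneRecord
{
    uint8_t  flags;    /* SKETCH_PRUNE_* bits */
    uint16_t nprune;   /* number of pruned line pointers */
    uint16_t nfreeze;  /* number of freeze plans; 0 unless flag is set */
} SketchPruneRecord;

/* Build a record, setting the freeze flag only when there is freeze work. */
static SketchPruneRecord
sketch_make_prune(uint16_t nprune, uint16_t nfreeze)
{
    SketchPruneRecord rec = {0, nprune, nfreeze};

    if (nfreeze > 0)
        rec.flags |= SKETCH_PRUNE_HAS_FREEZE;

    /* Invariant: freeze plans are present only if the flag is set. */
    assert((rec.flags & SKETCH_PRUNE_HAS_FREEZE) != 0 || rec.nfreeze == 0);
    return rec;
}

/* WAL bytes when prune and freeze are logged as two separate records. */
static size_t
sketch_separate_bytes(size_t prune_payload, size_t freeze_payload)
{
    return (SKETCH_RECORD_OVERHEAD + prune_payload) +
           (SKETCH_RECORD_OVERHEAD + freeze_payload);
}

/* WAL bytes when the freeze payload rides along in the prune record. */
static size_t
sketch_combined_bytes(size_t prune_payload, size_t freeze_payload)
{
    return SKETCH_RECORD_OVERHEAD + prune_payload + freeze_payload;
}
```

Under these assumptions the combined form saves exactly one fixed per-record overhead for every page that is both pruned and frozen. For small freeze payloads that overhead is the dominant cost, which is the case Peter describes above.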