Hi,

The ringbuffers we use for seqscans, vacuum, copy etc. can cause very drastic slowdowns (see e.g. [1]), and can cause some workloads to practically never end up utilizing shared buffers. ETL workloads, for example, regularly fight with that problem.
While I think there's a number of improvements [2] we could make to the ringbuffer logic, I think we should also just allow them to be made configurable. That'll allow a decent number of systems to perform better (especially on slightly bigger systems, the current ringbuffers are *way* too small), make the thresholds more discoverable (e.g. the NBuffers / 4 threshold is very confusing), and make it easier to experiment with better default values.

I think it would make sense to have seqscan_ringbuffer_threshold and {bulkread,bulkwrite,vacuum}_ringbuffer_size. They often sensibly are set in proportion to shared_buffers, so I suggest defining them as floats, where negative values divide shared_buffers, positive values are absolute sizes, and 0 disables the use of ringbuffers. I.e. to maintain the current default, seqscan_ringbuffer_threshold would be -4.0, but it could also be set to an absolute 4GB (converted to pages); a sketch of the intended interpretation is at the end of this mail. We'd probably also want a GUC show function that displays proportional values in a nice way.

We probably should also just increase all the ringbuffer sizes by an order of magnitude or two, especially the one for VACUUM.

Greetings,

Andres Freund

[1] https://postgr.es/m/20190507201619.lnyg2nyhmpxcgeau%40alap3.anarazel.de

[2] The two most important things imo:

a) Don't evict buffers when falling off the ringbuffer as long as there are unused buffers on the freelist. Possibly just set their usagecount to zero as long as that is the case. A sketch of that logic follows below.

b) The biggest performance pain comes from ringbuffers where it's likely that buffers are dirty (vacuum, copy), because evicting a dirty buffer requires that the corresponding WAL first be flushed. That often ends up turning many individual buffer evictions into an fdatasync each, slowing things down to a crawl. And the contention caused by that is a significant concurrency issue too. By doing the writes, but not the flushes, shortly after the insertion into the ring, we can reduce the cost significantly.
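
To illustrate a), here's a minimal sketch of what the decision could look like when a buffer falls off the ring. have_free_buffer() exists in freelist.c today, and BufferDesc is the regular buffer descriptor type; the other helpers are placeholders I made up for the sketch, not anything in the tree:

    /*
     * Sketch of [2a]: when a buffer is about to fall off the ring, only
     * reuse (and thereby evict) it if the freelist has no unused buffers
     * left.  Otherwise keep the page cached, drop its usage_count to
     * zero so the clock sweep can reclaim it cheaply later, and take a
     * fresh buffer from the freelist instead.
     */
    static BufferDesc *
    ring_reuse_or_release(BufferDesc *ring_buf)
    {
        if (have_free_buffer())
        {
            /* don't evict; just make the old ring member easy to reclaim */
            set_usage_count(ring_buf, 0);        /* placeholder helper */
            return take_buffer_from_freelist();  /* placeholder helper */
        }

        /* no unused buffers left: current behaviour, recycle ring_buf */
        return ring_buf;
    }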
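
And, to make the proposed float semantics concrete, here's a minimal sketch of how such a GUC could be interpreted and displayed. NBuffers and BLCKSZ are the existing symbols; the GUC variable and the function names are hypothetical, not from any patch:

    /*
     * Hypothetical GUC variable: -4.0 means NBuffers / 4 (the current
     * seqscan threshold), a positive value is an absolute size in bytes.
     */
    static double seqscan_ringbuffer_threshold = -4.0;

    /*
     * Interpret a ringbuffer GUC as a number of buffers: negative values
     * are a fraction of shared_buffers, positive values an absolute size
     * in bytes, 0 disables the ringbuffer.
     */
    static int
    ringbuffer_guc_to_nbuffers(double val)
    {
        if (val == 0.0)
            return 0;                        /* ringbuffer disabled */
        else if (val < 0.0)
            return (int) (NBuffers / -val);  /* proportion of shared_buffers */
        else
            return (int) (val / BLCKSZ);     /* absolute size -> pages */
    }

    /*
     * Hypothetical show hook, so that proportional values display in a
     * nicer way than a bare float.
     */
    static const char *
    show_seqscan_ringbuffer_threshold(void)
    {
        static char buf[64];

        if (seqscan_ringbuffer_threshold < 0.0)
            snprintf(buf, sizeof(buf), "shared_buffers / %g",
                     -seqscan_ringbuffer_threshold);
        else if (seqscan_ringbuffer_threshold == 0.0)
            return "disabled";
        else
            snprintf(buf, sizeof(buf), "%g bytes",
                     seqscan_ringbuffer_threshold);
        return buf;
    }

The show hook is what'd make the default display as something like "shared_buffers / 4" rather than a bare "-4".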