Tom Lane wrote: > Fujii Masao <masao.fu...@gmail.com> writes: > > When VACUUM tries to truncate the trailing empty pages, it scans > > shared_buffers > > to invalidate the pages-to-truncate during holding an AccessExclusive lock > > on > > the relation. So if shared_buffers is huge, other transactions need to wait > > for > > a very long time before accessing to the relation. Which would cause the > > response-time spikes, for example, I observed such spikes several times on > > the server with shared_buffers = 300GB while running the benchmark. > > Therefore, I'm thinking to propose $SUBJECT and enable it to avoid such > > spikes > > for that relation. > > I think that the real problem here is having to do a scan of all of shared > buffers. VACUUM's not the only thing that has to do that, there's also > e.g. DROP and TRUNCATE. So rather than a klugy solution that only fixes > VACUUM (and not very well, requiring user intervention and an unpleasant > tradeoff), we ought to look at ways to avoid needing a whole-pool scan to > find the pages belonging to one relation. In the past we've been able to > skate by without a decent solution for that because shared buffers were > customarily not all that big. But if we're going to start considering > huge buffer pools to be a case we want to have good performance for, > that's got to change.
Andres was working on a radix tree structure to fix this problem, but that seems to be abandoned now, and it seems a major undertaking. While I agree that the proposed solution is a wart, it seems much better than no solution at all. Can we consider Fujii's proposal as a temporary measure until we fix shared buffers? I'm +1 on it myself. We've seen this problem also affecting a production workload pretty severely, though shared_buffers is not as big. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services