Hi,

While benchmarking on hydra (cf. http://archives.postgresql.org/message-id/20160406104352.5bn3ehkcsceja65c%40alap3.anarazel.de), which has quite slow IO, I was once more annoyed by how incredibly long the vacuum at the end of a pgbench -i takes.
The issue is that, even for an entirely shared_buffers-resident scale, essentially no data ends up cached in shared buffers. The COPY that loads the data uses a 16MB ringbuffer, so it immediately writes out and evicts everything it just loaded. VACUUM then uses a 256KB ringbuffer, so it reads and rewrites the data in small chunks, again evicting nearly all buffers. Then the subsequent primary-key creation has to read that data *again*. That's fairly idiotic.

While it's not easy to fix this in the general case (we introduced those ringbuffers for a reason, after all), I think we should at least add a special case for loads where shared_buffers isn't fully used yet. Why not skip the ringbuffer whenever there are buffers on the freelist (rough sketch below)? If the buffers taken from the freelist are then added to the ring, there should be few cases that regress.

Additionally, maybe we ought to increase the ringbuffer sizes again one of these days? 256KB for VACUUM is pretty damn low.

Greetings,

Andres Freund
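PS: For concreteness, and entirely untested, here's roughly what I have in mind near the top of StrategyGetBuffer() in src/backend/storage/buffer/freelist.c (exact signatures differ a bit between branches, so treat this as a sketch):

    /*
     * Sketch only: while the freelist is non-empty, i.e. shared_buffers
     * isn't fully used yet, don't recycle a buffer from the ring; fall
     * through to the freelist path below instead.  The unlocked peek at
     * firstFreeBuffer is just a heuristic.
     */
    if (strategy != NULL &&
        StrategyControl->firstFreeBuffer < 0)
    {
        buf = GetBufferFromRing(strategy, buf_state);
        if (buf != NULL)
            return buf;
    }

If I remember the code correctly, the freelist path further down already calls AddBufferToRing() for strategy allocations, so the ring would simply get populated from otherwise-unused buffers instead of being recycled while most of shared_buffers sits empty.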