Thank you, Andres. I see: it combines an array that is fast for a small number of buffers with a hash table that in theory scales well to a very large number of buffers, and it avoids a plain per-buffer array that would be fast but would multiply the memory usage by the number of backends.
> Index prefetching patch:
> uncorrelated: 228.936 ms
> correlated: 71.684 ms

I did some tests.

> Possible improvements to refcount tracking:
>
> - increase REFCOUNT_ARRAY_ENTRIES - there's a very significant cliff at 8
>   right now, and with vectorized lookup it might not hurt too much to go to 16
>   or so

Yes, that is true, but only up to 16: the index prefetch test I was running reached 90 or so pins, and that was clipped by max_pinned_buffers. Also, I noticed a commit from about three months ago that removed the mid-loop return, which effectively adds the first few pins right to left instead of left to right. Maybe that works well with vectorisation, and I can see it as an optimization for the (pin/unpin)+ sequence, but what about the pin(pin/unpin)+ sequence? The previous code would always find the buffer on the first or second iteration; the new implementation has to go to the 7th or 8th iteration (unless I am missing something important).

> - To make the cliff at REFCOUNT_ARRAY_ENTRIES smaller, replace dynahash with
>   simplehash. That should reduce the perf penalty a good bit.

This is also true; we could even remove the refcount array completely.

> Unfortunately it's not just the refcount tracking, it's also resowner
> management that gets more expensive.

I didn't read this sentence until I came back to reply, but it is exactly what I noticed: once we fix the reference counting, the resowner still puts a floor on the cost. That matters even more when a buffer is pinned multiple times, because the resowner adds one entry for each pin of the buffer. There is another problem: ResOwnerReleaseBuffer unlocks buffers even when it is not the owner of the lock. I think that deserves a separate thread.
