On Thu, Apr 21, 2022 at 6:58 PM Yura Sokolov <y.soko...@postgrespro.ru> wrote: > At the master state: > - SharedBufHash is not declared as HASH_FIXED_SIZE > - get_hash_entry falls back to element_alloc too fast (just if it doesn't > found free entry in current freelist partition). > - get_hash_entry has races. > - if there are small number of spare items (and NUM_BUFFER_PARTITIONS is > small number) and HASH_FIXED_SIZE is set, it becomes contended and > therefore slow. > > HASH_REUSE solves (for shared buffers) most of this issues. Free list > became rare fallback, so HASH_FIXED_SIZE for SharedBufHash doesn't lead > to performance hit. And with fair number of spare items, get_hash_entry > will find free entry despite its races.
Hmm, I see. The idea of trying to arrange to reuse entries rather than pushing them onto a freelist and immediately trying to take them off again is an interesting one, and I kind of like it. But I can't imagine that anyone would commit this patch the way you have it. It's way too much action at a distance. If any ereport(ERROR,...) could happen between the HASH_REUSE operation and the subsequent HASH_ENTER, it would be disastrous, and those things are separated by multiple levels of call stack across different modules, so mistakes would be easy to make. If this could be made into something dynahash takes care of internally without requiring extensive cooperation with the calling code, I think it would very possibly be accepted. One approach would be to have a hash_replace() call that takes two const void * arguments, one to delete and one to insert. Then maybe you propagate that idea upward and have, similarly, a BufTableReplace operation that uses that, and then the bufmgr code calls BufTableReplace instead of BufTableDelete. Maybe there are other better ideas out there... -- Robert Haas EDB: http://www.enterprisedb.com