Peter Eisentraut <peter.eisentr...@2ndquadrant.com> writes:
> I can't detect any performance improvement with the patch applied to
> current master, using the test case from Yoshikazu Imai (2019-03-19).
FWIW, I tried this patch against current HEAD (959d00e9d). Using the test case described by Amit at <be25cadf-982e-3f01-88b4-443a6667e...@lab.ntt.co.jp> I do measure an undeniable speedup, close to 35%. However ... I submit that that's a mighty extreme test case. (I had to increase max_locks_per_transaction to get it to run at all.) We should not be using that sort of edge case to drive performance optimization choices.

If I reduce the number of partitions in Amit's example from 8192 to something more real-world, like 128, I do still measure a performance gain, but it's ~1.5%, which is below what I'd consider a reproducible win. I'm accustomed to seeing changes up to 2% in narrow benchmarks like this one, even when "nothing changes" except unrelated code.

Trying a standard pgbench test case (pgbench -M prepared -S with one client and an -s 10 database), it seems that the patch is about 0.5% slower than HEAD. Again, that's below the noise threshold, but it's not promising for the net effects of this patch on workloads that aren't specifically about large and prunable partition sets.

I'm also fairly concerned about the effects of the patch on sizeof(LOCALLOCK) --- on a 64-bit machine it goes from 72 to 88 bytes, a 22% increase. That's a lot if you're considering cases with many locks.

On the whole I don't think there's an adequate case for committing this patch.

I'd also point out that this is hardly the only place where we've seen hash_seq_search on nearly-empty hash tables become a bottleneck. So I'm not thrilled about attacking that with one-table-at-a-time patches. I'd rather see us do something to let hash_seq_search win across the board.

I spent some time wondering whether we could adjust the data structure so that all the live entries in a hashtable are linked into one chain, but I don't quite see how to do it without adding another list link to struct HASHELEMENT, which seems pretty expensive. I'll sketch the idea I had, just in case it triggers a better idea in someone else.

Assuming we are willing to add another pointer to HASHELEMENT, use the two pointers to create a doubly-linked circular list that includes all live entries in the hashtable, with a list header in the hashtable's control area. (Maybe we'd use a dlist for this, but it's not essential.) Require this list to be organized so that all entries that belong to the same hash bucket are consecutive in the list, and make each non-null hash bucket header point to the first entry in the list for its bucket. To allow normal searches to detect when they've run through their bucket, add a flag to HASHELEMENT that is set only in entries that are the first, or perhaps last, of their bucket (so you'd detect end-of-bucket by checking the flag instead of testing for a null pointer). Adding a bool field is free due to alignment considerations, at least on 64-bit machines. Given this, I think normal hash operations are more-or-less the same cost as before, while hash_seq_search just has to follow the circular list.

I tried to figure out how to do the same thing with a singly-linked instead of doubly-linked list, but it doesn't quite work: if you need to remove the first element of a bucket, you have no cheap way to find its predecessor in the overall list (which belongs to some other bucket, but you don't know which one). Maybe we could just mark such entries dead (there's plenty of room for another flag bit) and plan to clean them up later? But it's not clear how to ensure that they'd get cleaned up in any sort of timely fashion.
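To make that concrete, here's a toy standalone sketch of the layout I'm describing (fixed bucket count, integer keys, no resizing; all the names are invented and none of this is the actual dynahash code; bare pointers with a sentinel header node stand in for dlist, just to keep it self-contained):

    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define NBUCKETS 8

    typedef struct Elem
    {
        struct Elem *next;          /* next live entry, across all buckets */
        struct Elem *prev;          /* the extra link the proposal adds */
        int          key;
        bool         endofbucket;   /* set on the last entry of its bucket */
    } Elem;

    typedef struct HTab
    {
        Elem    head;               /* list header, in the "control area" */
        Elem   *buckets[NBUCKETS];  /* first entry of each bucket, or NULL */
    } HTab;

    static void
    htab_init(HTab *h)
    {
        h->head.next = h->head.prev = &h->head;
        for (int i = 0; i < NBUCKETS; i++)
            h->buckets[i] = NULL;
    }

    /* Insert at the front of the key's bucket; each bucket's entries stay
     * consecutive in the global list. */
    static void
    htab_insert(HTab *h, int key)
    {
        int     b = (unsigned int) key % NBUCKETS;
        Elem   *e = malloc(sizeof(Elem));
        Elem   *at = h->buckets[b] ? h->buckets[b] : &h->head;

        e->key = key;
        e->endofbucket = (h->buckets[b] == NULL);  /* sole entry is also last */
        e->next = at;               /* splice e in just before 'at' */
        e->prev = at->prev;
        at->prev->next = e;
        at->prev = e;
        h->buckets[b] = e;          /* e is now the bucket's first entry */
    }

    /* Searches stop on the flag instead of a null test; cost is as before. */
    static Elem *
    htab_lookup(HTab *h, int key)
    {
        Elem   *e = h->buckets[(unsigned int) key % NBUCKETS];

        while (e != NULL)
        {
            if (e->key == key)
                return e;
            if (e->endofbucket)
                break;              /* ran off the end of this bucket */
            e = e->next;
        }
        return NULL;
    }

    /* Deletion is where the back-link earns its keep: unlinking a bucket's
     * first entry needs its predecessor in the global list, which may
     * belong to some other bucket. */
    static void
    htab_delete(HTab *h, int key)
    {
        int     b = (unsigned int) key % NBUCKETS;
        Elem   *e = htab_lookup(h, key);

        if (e == NULL)
            return;
        if (h->buckets[b] == e)
            h->buckets[b] = e->endofbucket ? NULL : e->next;
        else if (e->endofbucket)
            e->prev->endofbucket = true;  /* predecessor is in same bucket */
        e->prev->next = e->next;
        e->next->prev = e->prev;
        free(e);
    }

    /* The payoff: a seq scan walks the circular list, so its cost is
     * proportional to live entries, not buckets. */
    static void
    htab_seq_print(HTab *h)
    {
        for (Elem *e = h->head.next; e != &h->head; e = e->next)
            printf("%d ", e->key);
        printf("\n");
    }

    int
    main(void)
    {
        HTab    h;

        htab_init(&h);
        for (int k = 1; k <= 10; k++)
            htab_insert(&h, k);
        htab_delete(&h, 9);         /* 9 is the first entry of its bucket */
        printf("lookup(7): %s\n", htab_lookup(&h, 7) ? "found" : "missing");
        htab_seq_print(&h);
        return 0;
    }

htab_delete is exactly where the singly-linked variant falls down: without the prev link, removing a bucket's first entry means hunting for its predecessor somewhere in another bucket's run of the list.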
Another issue is that probably none of this works for the partitioned hash tables we use for some of the larger shared-memory hashes. But I'm not sure we care about hash_seq_search for those, so maybe we just say those are a different data structure.

			regards, tom lane