2014-01-27 Stephen Frost <sfr...@snowman.net>:

> * Simon Riggs (si...@2ndquadrant.com) wrote:
> > I don't see anything for 9.4 in here now.
>
> Attached is what I was toying with (thought I had attached it previously
> somewhere.. perhaps not), but in re-testing, it doesn't appear to do
> enough to move things in the right direction in all cases. I did play
> with this a fair bit yesterday and while it improved some cases by 20%
> (eg: a simple join between pgbench_accounts and pgbench_history), when
> we decide to *still* hash the larger side (as in my 'test_case2.sql'),
> it can cause a similarly-sized decrease in performance. Of course, if
> we can push that case to hash the smaller side (which I did by hand with
> cpu_tuple_cost), then it goes back to being a win to use a larger number
> of buckets.
>
> I definitely feel that there's room for improvement here, but it's not an
> easily done thing, unfortunately. To be honest, I was pretty surprised
> when I saw that the larger number of buckets performed worse, even if it
> was when we picked the "wrong" side to hash, and I plan to look into that
> more closely to try and understand what's happening. My first guess
> would be what Tom had mentioned over the summer: if the size of the
> bucket array ends up being larger than the CPU cache, we can end up
> paying a great deal more to build the hash table than it costs to scan
> through the deeper buckets that we end up with as a result (particularly
> when we're scanning a smaller table). Of course, choosing to hash the
> larger table makes that more likely.
>
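Not the actual test_case2.sql, but a rough sketch of the kind of comparison
being discussed, assuming a standard pgbench-populated database (so
pgbench_history has a meaningful number of rows). The join condition, the
work_mem setting, and the cpu_tuple_cost value are only illustrative;
whether that particular value actually flips which side gets hashed depends
on the data:

  -- assumes pgbench -i plus some pgbench runs so pgbench_history is populated
  SET work_mem = '256MB';

  -- baseline: let the planner choose which side to hash
  EXPLAIN ANALYZE
  SELECT count(*)
  FROM pgbench_accounts a
  JOIN pgbench_history h ON h.aid = a.aid;

  -- nudge the planner toward hashing the other (smaller) side;
  -- the value 0.05 is only a guess, adjust until the plan changes
  SET cpu_tuple_cost = 0.05;
  EXPLAIN ANALYZE
  SELECT count(*)
  FROM pgbench_accounts a
  JOIN pgbench_history h ON h.aid = a.aid;
  RESET cpu_tuple_cost;

  -- compare against a merge join (sort + merge) at the same work_mem
  SET enable_hashjoin = off;
  EXPLAIN ANALYZE
  SELECT count(*)
  FROM pgbench_accounts a
  JOIN pgbench_history h ON h.aid = a.aid;
  RESET enable_hashjoin;
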
This topic is interesting: we saw very bad performance when hashing large
tables with high work_mem, while a MergeJoin fed by quicksort was
significantly faster. I haven't done any deeper research yet - there is a
possibility that virtualization overhead played a part.

Regards

Pavel

>
> Thanks,
>
> Stephen
>