Stephen Frost <sfr...@snowman.net> writes:
> Looking with opannotate, there are two main hotspots in
> ExecScanHashBucket:

>  12846 17.4001 :        hashTuple = hashtable->buckets[hjstate->hj_CurBucketNo];
> and
>  22100 29.9348 :        hashTuple = hashTuple->next;

Those are, of course, pretty trivial statements; so the way I read this
is that the fundamental slowdown comes from the hash table being large
compared to the CPU's cache, so that you're incurring lots of cache
misses at these particular fetches.  (You might be able to confirm that
if you can set oprofile to count cache misses rather than wall clock
time.)

> I'm certainly curious about those, but I'm also very interested in the
> possibility of making NTUP_PER_BUCKET much smaller, or perhaps variable
> depending on the work_mem setting.

Not sure about that.  That would make the hash-bucket-header array
larger without changing the size of the rest of the hash table, thus
probably making the CPU cache situation worse not better (which would
manifest as more time at the first of these two statements relative to
the second).

Can you add some instrumentation code to get us information about the
average/max length of the bucket chains?  And maybe try to figure out
how many distinct hash values per bucket, which would give us a clue
whether your two-level-list idea is worth anything.

                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers