On Tue, Apr 28, 2020 at 06:22:20PM -0400, James Coleman wrote:
I cc'd Andres given his commit introduced simplehash, so I figured
he'd probably have a few pointers on when each one might be useful.

On Tue, Apr 28, 2020 at 8:39 AM James Coleman <jtc...@gmail.com> wrote:
...
> Any particular reasons to pick dynahash over simplehash? ISTM we're
> using simplehash elsewhere in the executor (grouping, tidbitmap, ...),
> while there are not many places using dynahash for simple short-lived
> hash tables. Of course, that alone is a weak reason to insist on using
> simplehash here, but I suppose there were reasons for not using dynahash
> and we'll end up facing the same issues here.

No particular reason; it wasn't clear to me that there was a reason to
prefer one or the other (and I'm not acquainted with the codebase
enough to know the differences), so I chose dynahash because it was
easier to find examples to follow for implementation.

Do you have any thoughts on what the trade-offs/use-cases etc. are for
dynahash versus simple hash?

From reading the commit message in b30d3ea824c it seems like simple
hash is faster and optimized for CPU cache benefits. The comments at
the top of simplehash.h also discourage its use in non
performance/space sensitive uses, but there isn't anything I can see
that explicitly tries to discuss when dynahash is useful, etc.

Given the performance notes in that commit message, I think
switching to simple hash is worth it.


I recall doing some benchmarks for that patch, but it's been so long
that I don't really remember the details. But in general, I agree
simplehash is a bit more efficient in terms of CPU / caching.
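
To illustrate the caching point, here is a minimal sketch of an
open-addressing table with linear probing, where entries live inline in
one contiguous array, so a probe usually touches adjacent memory instead
of following pointers to separately allocated elements. This is not the
simplehash.h API or PostgreSQL code: every name below (DemoEntry,
DemoTable, demo_*) is hypothetical, and the fixed capacity and lack of
deletion are simplifying assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is NOT the simplehash.h API. */
typedef struct DemoEntry
{
    uint32_t    key;
    uint32_t    value;
    bool        used;       /* false means the slot is empty */
} DemoEntry;

typedef struct DemoTable
{
    DemoEntry  *entries;    /* one contiguous array, entries stored inline */
    uint32_t    size;       /* number of slots, always a power of two */
    uint32_t    members;    /* number of occupied slots */
} DemoTable;

/* Integer mixer (the murmur3 32-bit finalizer). */
static uint32_t
demo_hash(uint32_t k)
{
    k ^= k >> 16;
    k *= 0x85ebca6bU;
    k ^= k >> 13;
    k *= 0xc2b2ae35U;
    k ^= k >> 16;
    return k;
}

/* Create a fixed-capacity table; assumption: it never grows. */
static DemoTable *
demo_create(uint32_t size)
{
    DemoTable  *tb;

    assert(size > 0 && (size & (size - 1)) == 0);
    tb = malloc(sizeof(DemoTable));
    if (tb == NULL)
        return NULL;
    tb->entries = calloc(size, sizeof(DemoEntry));
    if (tb->entries == NULL)
    {
        free(tb);
        return NULL;
    }
    tb->size = size;
    tb->members = 0;
    return tb;
}

static void
demo_destroy(DemoTable *tb)
{
    free(tb->entries);
    free(tb);
}

/*
 * Find or create the entry for key. Sets *found to true if the key was
 * already present. Returns NULL if the key is absent and the table is full.
 */
static DemoEntry *
demo_insert(DemoTable *tb, uint32_t key, bool *found)
{
    uint32_t    mask = tb->size - 1;
    uint32_t    pos = demo_hash(key) & mask;
    uint32_t    i;

    *found = false;
    for (i = 0; i < tb->size; i++)
    {
        DemoEntry  *e = &tb->entries[pos];

        if (!e->used)
        {
            e->used = true;
            e->key = key;
            e->value = 0;
            tb->members++;
            return e;
        }
        if (e->key == key)
        {
            *found = true;
            return e;
        }
        pos = (pos + 1) & mask; /* linear probing: try the next slot */
    }
    return NULL;
}

/* Return the entry for key, or NULL if it is not in the table. */
static DemoEntry *
demo_lookup(DemoTable *tb, uint32_t key)
{
    uint32_t    mask = tb->size - 1;
    uint32_t    pos = demo_hash(key) & mask;
    uint32_t    i;

    for (i = 0; i < tb->size; i++)
    {
        DemoEntry  *e = &tb->entries[pos];

        if (!e->used)
            return NULL;    /* an empty slot ends the probe sequence */
        if (e->key == key)
            return e;
        pos = (pos + 1) & mask;
    }
    return NULL;
}
```

A caller would do demo_create(8), then demo_insert(tb, 42, &found) and
set the returned entry's value; a later demo_lookup(tb, 42) returns the
same slot. A real implementation would also need growth and deletion,
which this sketch leaves out.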

I think the changes required to switch from dynahash to simplehash are
fairly small, so I think the best thing we can do is just try to do some
measurements and then decide.

But I also wonder if there might be some value in a README or added
comments that would serve as a guide to what the various hash
implementations are useful for. If there's interest, I could try to
type something short up so that we have something to make the code
base a bit more discoverable.


I wouldn't object to that. Although maybe we should simply add some
basic recommendations to the comments in dynahash/simplehash.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

