I cc'd Andres given his commit introduced simplehash, so I figured he'd probably have a few pointers on when each one might be useful.
On Tue, Apr 28, 2020 at 8:39 AM James Coleman <jtc...@gmail.com> wrote: ... > > Any particular reasons to pick dynahash over simplehash? ISTM we're > > using simplehash elsewhere in the executor (grouping, tidbitmap, ...), > > while there are not many places using dynahash for simple short-lived > > hash tables. Of course, that alone is a weak reason to insist on using > > simplehash here, but I suppose there were reasons for not using dynahash > > and we'll end up facing the same issues here. > > No particular reason; it wasn't clear to me that there was a reason to > prefer one or the other (and I'm not acquainted with the codebase > enough to know the differences), so I chose dynahash because it was > easier to find examples to follow for implementation. Do you have any thoughts on what the trade-offs/use-cases etc. are for dynahash versus simple hash? >From reading the commit message in b30d3ea824c it seems like simple hash is faster and optimized for CPU cache benefits. The comments at the top of simplehash.h also discourage it's use in non performance/space sensitive uses, but there isn't anything I can see that explicitly tries to discuss when dynahash is useful, etc. Given the performance notes in that commit message, I thinking switching to simple hash is worth it. But I also wonder if there might be some value in a README or comments addition that would be a guide to what the various hash implementations are useful for. If there's interest, I could try to type something short up so that we have something to make the code base a bit more discoverable. James