Andres Freund wrote on 2021-08-05 23:12:
Hi,

On 2021-08-05 12:27:49 -0400, John Naylor wrote:
On Wed, Aug 4, 2021 at 3:44 PM Andres Freund <and...@anarazel.de> wrote:
> On 2021-08-04 12:39:29 -0400, John Naylor wrote:
> > typedef struct cc_bucket
> > {
> >   uint32 hashes[4];
> >   catctup *ct[4];
> >   dlist_head chain;
> > } cc_bucket;
>
> I'm not convinced that the above is the right idea though. Even if the
> hash matches, you're still going to need to fetch at least
> catctup->keys[0] from a separate cacheline to be able to return the
> cache entry.

I see your point. It doesn't make sense to inline only part of the
information needed.

At least not for the unconditionally needed information.


Although I'm guessing inlining just two values in the 4-key case wouldn't
buy much.

Not so sure about that. I'd guess that two key comparisons take more cycles
than a cacheline fetch of the further keys (perhaps not if we had inlined
key comparisons). I.e. I'd expect out-of-order + speculative execution to
hide the latency of fetching the second cacheline for later key values.


> If we stuffed four values into one bucket we could potentially SIMD the
> hash and Datum comparisons ;)

;-) That's an interesting future direction to consider when we support
building with x86-64-v2. It'd be pretty easy to compare a vector of hashes
and quickly get the array index for the key comparisons (ignoring for the
moment how to handle the rare case of multiple identical hashes).
However, we currently don't memcmp() the Datums and instead call an
"eqfast" function, so I don't see how that part would work in a vector
setting.

It definitely couldn't work unconditionally - we have to deal with text and
oidvector comparisons, after all. But we could use it for the other types.
However, looking at the syscaches, I think it'd not very often be
applicable for caches with enough columns.
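
For concreteness, here is a minimal sketch of the hash-vector probe
described above, written with SSE2 intrinsics (already part of baseline
x86-64); the function name and the four-wide bucket layout are
illustrative only:

#include <emmintrin.h>          /* SSE2 intrinsics */
#include <stdint.h>

typedef uint32_t uint32;

/* Return the index of the first bucket slot whose stored hash equals
 * the probe hash, or -1 if none matches. */
static inline int
bucket_find_hash(const uint32 hashes[4], uint32 hash)
{
    __m128i probe  = _mm_set1_epi32((int) hash);
    __m128i stored = _mm_loadu_si128((const __m128i *) hashes);
    int     mask   = _mm_movemask_epi8(_mm_cmpeq_epi32(stored, probe));

    if (mask == 0)
        return -1;
    /* each matching 32-bit lane sets four bits in the byte-level mask */
    return __builtin_ctz(mask) / 4;
}

As noted, only the hash probe vectorizes this easily; the key comparisons
themselves would still go through the per-type "eqfast" functions.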


I have wondered before whether we should have syscache definitions generate
code specific to each syscache definition. I do think that'd give a good
bit of a performance boost. But I don't see a trivial way to get there
without notational overhead.

We could define syscaches in a header using a macro that's defined
differently in syscache.c than everywhere else. The header would declare a
set of functions for each syscache; syscache.c would define them to call an
always_inline function with the relevant constants.
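
A self-contained sketch of that arrangement, with hypothetical names
(SYSCACHE, SearchSysCacheAmOid; the cache ids are invented) and a fixed
two-key signature for brevity:

#include <stddef.h>
#include <stdint.h>

typedef uintptr_t Datum;
typedef struct HeapTupleData *HeapTuple;

/* The worker: every generated wrapper passes compile-time constants,
 * so the compiler can specialize the body per syscache. */
static inline __attribute__((always_inline)) HeapTuple
SearchCatCacheInternal(int cacheId, int nkeys, Datum key1, Datum key2)
{
    (void) cacheId; (void) nkeys; (void) key1; (void) key2;
    return NULL;                /* the actual lookup is elided */
}

/* In the header, SYSCACHE() expands to a declaration everywhere except
 * syscache.c, where it expands to a definition calling the worker. */
#ifdef SYSCACHE_C
#define SYSCACHE(Name, id, nkeys) \
    HeapTuple SearchSysCache##Name(Datum key1, Datum key2) \
    { return SearchCatCacheInternal((id), (nkeys), key1, key2); }
#else
#define SYSCACHE(Name, id, nkeys) \
    extern HeapTuple SearchSysCache##Name(Datum key1, Datum key2);
#endif

SYSCACHE(AmOid, 3, 1)
SYSCACHE(TypeNameNsp, 61, 2)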

Or perhaps we should move syscache definitions into the pg_*.h headers, and
generate the relevant code as part of their processing. That seems like it
could be nice from a modularity POV alone. And it could do better than the
current approach, because we could hardcode the types of the columns in the
syscache definition without increasing notational overhead.

Why not use simplehash or something like that? Open-addressing schemes
show superior cache locality.

It could be combined: use hashValue as the key in simplehash, and a dlist
for handling hashValue collisions. 99.99% of the time there will be no
collision on hashValue itself, so a lookup will almost always take two or
three memory accesses: one or two to find the dlist_head in the simplehash
by hashValue, and then one more to fetch the first element of the dlist.

And the code would remain almost the same; only the "bucket" search would
change a bit.
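
For illustration, roughly what that could look like with the existing
simplehash.h template; the cc_slot name and layout are hypothetical:

/* one slot per distinct hashValue; colliding entries hang off the dlist */
typedef struct cc_slot
{
    uint32      hashValue;      /* the catcache hash, used as the key */
    char        status;         /* per-slot state required by simplehash */
    dlist_head  entries;        /* all catctups sharing this hashValue */
} cc_slot;

#define SH_PREFIX cc
#define SH_ELEMENT_TYPE cc_slot
#define SH_KEY_TYPE uint32
#define SH_KEY hashValue
#define SH_HASH_KEY(tb, key) (key)      /* hashValue is already a hash */
#define SH_EQUAL(tb, a, b) ((a) == (b))
#define SH_SCOPE static inline
#define SH_DECLARE
#define SH_DEFINE
#include "lib/simplehash.h"

With 8-byte pointers such a slot is 24 bytes (4 for the hash, 1 for the
status byte plus padding, 16 for the dlist_head), which is where the
24-byte figure below comes from. One caveat: simplehash relocates entries
when it grows, so the dlist back-links into a slot would need care.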

(And I'd recommend using a lower fill factor for this simplehash: at most
0.85.)

Well, a simplehash entry will be 24 bytes this way. If the simplehash
template supported external key/element storage, the entry could be shrunk
to 16 bytes, and syscache entries would not need a dlist_node. (But it
doesn't support that at the moment.)

And a custom open-addressing table could get down to 8 bytes per element:
- an element is a pointer to the entry,
- a missing entry is encoded as NULL,
- with a fill factor as low as 0.66 there will be few collisions, so a
  non-empty slot will usually be the matching entry, and the memory lookup
  for the key comparison will be amortized free (see the sketch below).
Note that an 8-byte entry at fill factor 0.66 consumes an amortized
12.12 bytes (8 / 0.66), while a 16-byte entry at fill factor 0.99 consumes
an amortized 16.16 bytes (16 / 0.99).
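
A minimal sketch of that probe loop, assuming linear probing and a
power-of-two table; catctup here is pared down to just the stored hash:

#include <stddef.h>
#include <stdint.h>

typedef uint32_t uint32;

typedef struct catctup
{
    uint32      hash;       /* saved hashValue; real entries carry more */
} catctup;

/* slots: array of (mask + 1) catctup pointers, NULL meaning empty */
static catctup *
cc_lookup(catctup **slots, uint32 mask, uint32 hash)
{
    for (uint32 i = hash & mask;; i = (i + 1) & mask)
    {
        catctup    *ct = slots[i];

        if (ct == NULL)
            return NULL;    /* reached an empty slot: not present */
        if (ct->hash == hash)
            return ct;      /* at 0.66 fill this is usually the first
                             * probe, so dereferencing the entry costs
                             * what the caller's key check needs anyway */
    }
}

(A full implementation would let the caller resume the probe if the key
comparison fails, and would handle deletion tombstones.)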

regards,
Yura Sokolov

y.soko...@postgrespro.ru
funny.fal...@gmail.com

