On Thu, 8 Apr 2021 at 05:54, Tomas Vondra <tomas.von...@enterprisedb.com> wrote: > I only ran that with a single client, the machine only has 4 cores and > this should not be related to concurrency, so 1 client seems fine. The > average of 10 runs, 15 seconds each look like this:
Thanks for running these tests. The reason I added so much concurrency was that this AMD machine has some weird behaviour in regards to power management. Every bios update I seem to get changes the power management but it still is very unstable despite me having it on the most optimal settings in the bios. Working it a bit harder seems to help make it realise that there might be some urgency. > simple prepared 10/s 10/p 100/s 100/p > ------------------------------------------------------------- > master 21847 59476 23343 59380 11757 56488 > v4 21546 57757 22864 57704 11572 57350 > v4+v5 23374 56089 24410 56140 14765 55302 > > The first two columns are your bench.sql, with -M simple or prepared. > The other columns are 10 or 100 values, /s is simple, /p is prepared. > > Compared to master: > > simple prepared 10/s 10/p 100/s 100/p > ------------------------------------------------------------- > v4 98.62% 97.11% 97.95% 97.18% 98.43% 101.52% > v4+v5 106.99% 94.31% 104.57% 94.54% 125.59% 97.90% > > That seems to mostly match your observation - there's a small > performance hit (~2%), although that might be due to changes in the > layout of the binary. And v4+v5 improves that a bit (even compared to > master), although I don't see the same 20% speedup. I've spent more time hacking at this patch. I had a bit of a change of heart earlier about having this new HashedScalarArrayOpExpr node type. There were more places that I imagined that I needed to add handling for it. For example, partprune.c needed to know about it to allow partition pruning on them. While supporting that is just a few lines to make a recursive call passing in the underlying ScalarArrayOpExpr, I just didn't like the idea. Instead, I think it'll be better just to add a new field to ScalarArrayOpExpr and have the planner set that to tell the executor that it should use a hash table to perform the lookups rather than a linear search. This can just be the hash function oid, which also saves the executor from having to look that up. After quite a bit of hacking, I've ended up with the attached. I added the required JIT code to teach the jit code about EEOP_HASHED_SCALARARRAYOP. I also wrote the missing regression test for non-strict equality ops and moved the declaration for the simplehash.h code into execExprInterp.c and forward declared ScalarArrayOpExprHashTable in exprExpr.h. I also rewrote a large number of comments and fixed a few things like missing permission checks for the hash function. I've not done any further performance tests yet but will start those now. David
saop_hash_v6.patch
Description: Binary data