Re: Speed up Hash Join by teaching ExprState about hashing

2024-11-04 Thread David Rowley
On Tue, 20 Aug 2024 at 13:40, David Rowley wrote: > I made a few more tweaks to the comments and pushed the result. While working on the patch to make HashAgg, hashed subplans and hashed setops use ExprState hashing, I discovered a bug in that code when hashing 0 hashkeys (Possible with SELEC

Re: Speed up Hash Join by teaching ExprState about hashing

2024-08-19 Thread David Rowley
On Mon, 19 Aug 2024 at 18:41, David Rowley wrote: > The attached v5 patch includes this change. I made a few more tweaks to the comments and pushed the result. Thank you both for having a look at this. David

Re: Speed up Hash Join by teaching ExprState about hashing

2024-08-18 Thread David Rowley
Thanks for having a look. On Sat, 17 Aug 2024 at 23:21, Tels wrote: > Is it necessary to rotate existing_hash here before checking for isnull? > Because in case of isnull, isn't the result of the rotate thrown away? Yeah, I think that it's worthwhile moving that to after the isnull check so as not to
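
The ordering question raised above can be illustrated with a small standalone sketch. This is not the PostgreSQL code: `rotate_left32` and `hash_next_strict` are hypothetical names, and the rotate-by-one-then-XOR combine is an assumption made for illustration. It shows only the idea under discussion, namely that a strict step can test for NULL first and rotate the existing hash only when the result will actually be used.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (valid for 0 < n < 32). */
static inline uint32_t
rotate_left32(uint32_t word, int n)
{
    return (word << n) | (word >> (32 - n));
}

/*
 * Hypothetical strict "combine next key" step.
 *
 * Returns false, leaving *result untouched, when the key is NULL.
 * The rotate happens only after the NULL check, so no work is
 * thrown away on the NULL path.
 */
static bool
hash_next_strict(uint32_t existing_hash, bool isnull,
                 uint32_t key_hash, uint32_t *result)
{
    if (isnull)
        return false;       /* strict: give up before rotating */

    *result = rotate_left32(existing_hash, 1) ^ key_hash;
    return true;
}
```

Both orderings produce the same value for non-NULL keys; moving the rotate only avoids computing something the NULL path discards.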

Re: Speed up Hash Join by teaching ExprState about hashing

2024-08-17 Thread Tels
Hello David, you wrote: v4 patch attached. If nobody else wants to look at this then I'm planning on pushing it soon. Had a very brief look at this; one bit caught my attention: + EEO_CASE(EEOP_HASHDATUM_NEXT32_STRICT) + { + FunctionCallInfo fcin

Re: Speed up Hash Join by teaching ExprState about hashing

2024-08-16 Thread David Rowley
On Thu, 15 Aug 2024 at 19:50, Alexey Dvoichenkov wrote: > I gave v3 another look. One tiny thing I've noticed is that you > removed ExecHashGetHashValue() but not its forward declaration in > include/executor/nodeHash.h Fixed. > I also reviewed the JIT code this time, it looks reasonable to > me.

Re: Speed up Hash Join by teaching ExprState about hashing

2024-07-10 Thread David Rowley
On Mon, 13 May 2024 at 21:23, David Rowley wrote: > In master, if you look at ExecHashGetHashValue() in nodeHash.c, you > can see that it calls ExecEvalExpr() and then manually calls the hash > function on the returned value. This process is repeated once for each > hash key. This is inefficient f
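
The per-key pattern described above (one call to evaluate the key expression, then a separate call to hash the returned value, repeated for each hash key) can be sketched in standalone C. Everything here is a simplified stand-in rather than PostgreSQL source: `Datum32`, `HashKey`, `get_hash_value`, the toy evaluators and the rotate-and-XOR combine are all hypothetical names and assumptions chosen to keep the example self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a value/isnull pair returned by expression evaluation. */
typedef struct Datum32
{
    int32_t value;
    bool    isnull;
} Datum32;

typedef Datum32 (*eval_fn)(const int32_t *row);   /* stands in for ExecEvalExpr() */
typedef uint32_t (*hash_fn)(int32_t value);       /* stands in for the type's hash function */

typedef struct HashKey
{
    eval_fn eval;
    hash_fn hash;
} HashKey;

static uint32_t
rotl32(uint32_t word, int n)
{
    return (word << n) | (word >> (32 - n));
}

/*
 * Evaluate-then-hash loop: two indirect calls per key.
 * Returns false if strict is set and any key is NULL.
 */
static bool
get_hash_value(const HashKey *keys, int nkeys, const int32_t *row,
               bool strict, uint32_t *hashvalue)
{
    uint32_t hashkey = 0;

    for (int i = 0; i < nkeys; i++)
    {
        Datum32 d;

        hashkey = rotl32(hashkey, 1);
        d = keys[i].eval(row);              /* call 1: evaluate the key */
        if (d.isnull)
        {
            if (strict)
                return false;
            continue;                       /* NULL contributes no bits */
        }
        hashkey ^= keys[i].hash(d.value);   /* call 2: hash the result */
    }

    *hashvalue = hashkey;
    return true;
}

/* Toy evaluators and hash function used only for illustration. */
static Datum32 eval_col0(const int32_t *row) { Datum32 d = { row[0], false }; return d; }
static Datum32 eval_col1(const int32_t *row) { Datum32 d = { row[1], false }; return d; }
static Datum32 eval_null(const int32_t *row) { Datum32 d = { 0, true }; (void) row; return d; }
static uint32_t hash_identity(int32_t v) { return (uint32_t) v; }
```

The thread's proposal, as the subject line says, is to teach ExprState about hashing; the sketch only shows the shape of the loop being replaced, not the replacement.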