Yesterday Andres mentioned to me that he's getting wrong results with
TPCH-Q02. Andres did the analysis to figure out that this was caused
by adf97c156 due to how I chose to store intermediate hash values when
the Hash Join has multiple join keys.

Per Andres' investigation, it seems what's going on is that
EEOP_PARAM_SET calls ExecEvalParamSet() and that sets the param value
from the ExprState->resvalue, the same location as the previous
hashing step stored its intermediate result.

It's probably not very good practice to store intermediate things in
ExprState->resvalue and expect them to still be there after evaluation
of steps that you might not have any control over, so the attached
adjust things to add a dedicated location for the intermediate hash
value and adjusts the step generation code to store all apart from the
final hashing step into that location.

I tested this with TPCH-Q02 and the results are correct again (when
compared with SET enable_hashjoin=0). I tried both with JIT on and
off.

The Hash Join in question looks like:

 ->  Hash Join  (cost=137144.04..630694.61 rows=1 width=197)
               Hash Cond: ((part.p_partkey = partsupp.ps_partkey) AND
((SubPlan 1) = partsupp.ps_supplycost))

I plan to push this soon, but if anyone wants to look over it, I'll
leave it here for a while before doing so.

David

Attachment: 0001-Don-t-store-intermediate-hash-values-in-ExprState-re.patch
Description: Binary data

Reply via email to