Hey,
That's a great diagnosis :)
I would guess that newCondition became non-nullable for some reason
(rexSimplify runs under RexProgramBuilder so it might be able to narrow the
nullability)
you could try invoking simplify.simplifyPreservingType() on it to see if that
would help.
> I know it's necessary to preserve the nullability when simplifying a boolean
expression in project columns, but as for condition in Filter/Calc, may be we can
omit the
> nullability?
I think that could probably work - we can't change the nullability on project columns because those could be referenced (and the reference also has the type) ; but for
filter/join conditions we don't need to care with it.
It seems we already have a "matchnullability" in ReduceExpressionsRule ; for
FILTER/JOIN we should probably turn that off... :)
cheers,
Zoltan
On 3/24/20 9:15 AM, Shuo Cheng wrote:
Hi Zoltan,
I encountered the problem when running TPC tests, and have not reproduced it in
Calcite master.
But I figured it out how the problem is produced:
There is semi join with the condition:AND(EXPANDED_INDF1, EXPANDED_INDF2), type
of AND is BOOLEAN with nullable `true`
After JoinPushExpressionsRule -->> join condition: AND(INDF1, INDF2), type of
AND is BOOLEAN with nullable `true`
After SemiJoinProjectTransposeRule --> Join condition: CAST(AND(INDF1,
INDF2)), type of AND is BOOLEAN with nullable `false`
Just as what you suspected, It's in `SemiJoinProjectTransposeRule` where forced type correction is added by `RexProgramBuilder#addCondition`, which will call
`RexSimplify#simplifyPreservingType` before registering an expression.
I know it's necessary to preserve the nullability when simplifying a boolean expression in project columns, but as for condition in Filter/Calc, may be we can omit the
nullability?
Best Regards,
Shuo
On Tue, Mar 24, 2020 at 3:35 PM Zoltan Haindrich <[email protected]
<mailto:[email protected]>> wrote:
Hey Shuo!
I think that simplification should been made on join conditions - I've done
a quick check; and it seems to be working for me.
I suspected that it will be either a missing call to RexSimplify for some
reason - or it is added by a forced return type correction: IIRC there are some
cases in which
the
RexNode type should retained after simplification.
Is this reproducible on current master; could you share a testcase?
cheers,
Zoltan
On 3/24/20 7:28 AM, Shuo Cheng wrote:
> Hi, Julian, That's what we do as a workaround way. we remove CAST which
are
> only widening nullability as what CALCITE-2695 does before applying
> hash-join/sort-merge-join rule, such that equiv predicate can be split
> out. I'm not sure whether it's properly for Calcite to do the 'convert
> back' job, for example, simplify the join condition when create a Join;
Or
> maybe let other systems what use Calcite to do the "convert back" job as
an
> optimization? What do you think?
>
> On Tue, Mar 24, 2020 at 2:04 PM Julian Hyde <[email protected]
<mailto:[email protected]>> wrote:
>
>> Or convert it back to a not-nullable BOOLEAN? The join condition treats
>> UNKNOWN the same as FALSE, and besides UNKNOWN will never occur, so the
>> conditions with and without the CAST are equivalent.
>>
>> Julian
>>
>>> On Mar 23, 2020, at 9:34 PM, Shuo Cheng <[email protected]
<mailto:[email protected]>> wrote:
>>>
>>> Hi all,
>>>
>>> Considering the Join condition 'CAST(IS_NOT_DISTINCT_FROM($1, $2),
>>> BOOLEAN)', which cast the non-nullable BOOLEAN to nullable BOOLEAN,
>> Calcite
>>> can not split out equiv predicate, thus some join operation like hash
>> join
>>> / sort merge join may not be used. Maybe we can
>>> expand RelOptUtil#splitJoinCondition to support this scenario?
>>
>