Thanks Zoltan. I added a test to RelBuilderTest and I can confirm that
'_ IS TRUE' is stripped. And I agree that we cannot safely strip '_ IS
NOT FALSE'.

On Tue, Mar 31, 2020 at 5:11 AM Zoltan Haindrich <[email protected]> wrote:
>
>  > we should consider that 'b IS TRUE' and 'b IS NOT FALSE'
>
> I think we already do that... UnknownAs.FALSE essentially means that the 
> expression is enclosed in an "IS TRUE" - since we run filter/join condition 
> simplification in UAF
> mode; its allowed to remove "_ IS TRUE"
>
> At the same time I don't see a way how "X IS NOT FALSE" could be removed 
> because in case X is null; the expression should evaluate to true (this 
> expression translates to
> UnknownAs.TRUE mode) - which could be lost in case of a rewrite. We may 
> consider adding the UnknownAs mode to the filter/join node; but I think that 
> would just cause
> trouble; are there some other way which I've not considered?
>
> cheers,
> Zoltan
>
> On 3/30/20 7:19 PM, Julian Hyde wrote:
> > If we're going down the path, we should consider that 'b IS TRUE' and
> > 'b IS NOT FALSE' are somewhat like casts. Removing them from join
> > conditions does not affect the result of the join.
> >
> > And the same apply to filter conditions.
> >
> > I don't know whether removing casts, _ IS TRUE and _ IS NOT FALSE from
> > conditions genuinely make the world "simpler". But let's try it and
> > see.
> >
> > On Mon, Mar 30, 2020 at 8:06 AM Zoltan Haindrich <[email protected]> wrote:
> >>
> >> Hey Shuo!
> >>
> >> Thank you for sharing the testcase! I've seen that you were able to fix it 
> >> by calling the builder instead of copy - right now I think fixing this 
> >> thru ReduceExpressionRule
> >> might be better - as it could also fix up other cases.
> >> I've tried disabling nullability retainment for filters/join conditions - 
> >> and it seems to be working; I'll submit it under [1].
> >>
> >> Julian: I recommended to try that to provide a quick check to see if at 
> >> that point the issue could be fixed - I was confident that by disabling 
> >> "matchNullability" for
> >> "simplifyPreservingType()" will do the right thing and it doesn't add an 
> >> unnecessary cast - instead it safely removes it; however: it still added 
> >> the cast...and by doing so
> >> it didn't helped :)
> >>
> >> [1] https://issues.apache.org/jira/browse/CALCITE-3887
> >>
> >> cheers,
> >> Zoltan
> >>
> >>
> >> On 3/26/20 12:22 PM, Shuo Cheng wrote:
> >>> I think we may solve the problem from two aspects:
> >>> 1. Do not try to preserve type (nullability) of Join/Filter condition
> >>> expression when simplifying or something like pushing down.
> >>> 2. We can do some work (remove unnecessary CAST) right before create a
> >>> Join/Filter, as Julian said, something in RelBuilder could be done.
> >>> I've do some fix in above Link (remove unnecessary CAST when doing
> >>> pushDownEqualJoinConditions)
> >>>
> >>> On Thu, Mar 26, 2020 at 7:14 PM Shuo Cheng <[email protected]> wrote:
> >>>
> >>>> Sorry for the late reply, I've reproduced the problem here
> >>>> https://github.com/cshuo/calcite/commit/b9a7fb5f536825d3a577bf42a5fc6cc7d4df7929
> >>>> .
> >>>>
> >>>> On Wed, Mar 25, 2020 at 12:38 AM Julian Hyde <[email protected]> wrote:
> >>>>
> >>>>> It does seem to be something that RelBuilder could do. (RexSimplify 
> >>>>> can’t
> >>>>> really do it, because it doesn’t know how the expression is being used.)
> >>>>>
> >>>>> It’s also worth discovering why the CAST was added in the first place. 
> >>>>> It
> >>>>> doesn’t seem to be helpful. I think we should strive to eliminate all of
> >>>>> the slightly unhelpful things that Calcite does; those things can add up
> >>>>> and cause major inefficiencies in the planning process and/or 
> >>>>> sub-optimal
> >>>>> plans.
> >>>>>
> >>>>> Julian
> >>>>>
> >>>>>
> >>>>>> On Mar 24, 2020, at 1:47 AM, Zoltan Haindrich <[email protected]> wrote:
> >>>>>>
> >>>>>> Hey,
> >>>>>>
> >>>>>> That's a great diagnosis :)
> >>>>>> I would guess that newCondition became non-nullable for some reason
> >>>>> (rexSimplify runs under RexProgramBuilder so it might be able to narrow 
> >>>>> the
> >>>>> nullability)
> >>>>>> you could try invoking simplify.simplifyPreservingType() on it to see
> >>>>> if that would help.
> >>>>>>
> >>>>>>> I know it's necessary to preserve the nullability when simplifying a
> >>>>> boolean expression in project columns, but as for condition in 
> >>>>> Filter/Calc,
> >>>>> may be we can omit the
> >>>>>>> nullability?
> >>>>>> I think that could probably work - we can't change the nullability on
> >>>>> project columns because those could be referenced (and the reference 
> >>>>> also
> >>>>> has the type) ; but for filter/join conditions we don't need to care 
> >>>>> with
> >>>>> it.
> >>>>>> It seems we already have a "matchnullability" in ReduceExpressionsRule
> >>>>> ; for FILTER/JOIN we should probably turn that off...  :)
> >>>>>>
> >>>>>> cheers,
> >>>>>> Zoltan
> >>>>>>
> >>>>>>
> >>>>>> On 3/24/20 9:15 AM, Shuo Cheng wrote:
> >>>>>>> Hi Zoltan,
> >>>>>>> I encountered the problem when running TPC tests, and have not
> >>>>> reproduced it in Calcite master.
> >>>>>>> But I figured it out how the problem is produced:
> >>>>>>> There is semi join with the condition:AND(EXPANDED_INDF1,
> >>>>> EXPANDED_INDF2), type of AND is BOOLEAN with nullable `true`
> >>>>>>> After JoinPushExpressionsRule -->> join condition: AND(INDF1, INDF2),
> >>>>> type of AND is BOOLEAN with nullable `true`
> >>>>>>> After  SemiJoinProjectTransposeRule --> Join condition:
> >>>>> CAST(AND(INDF1, INDF2)), type of AND is BOOLEAN with nullable `false`
> >>>>>>> Just as what you suspected, It's in `SemiJoinProjectTransposeRule`
> >>>>> where forced type correction is added by 
> >>>>> `RexProgramBuilder#addCondition`,
> >>>>> which will call `RexSimplify#simplifyPreservingType` before registering 
> >>>>> an
> >>>>> expression.
> >>>>>>> I know it's necessary to preserve the nullability when simplifying a
> >>>>> boolean expression in project columns, but as for condition in 
> >>>>> Filter/Calc,
> >>>>> may be we can omit the nullability?
> >>>>>>> Best Regards,
> >>>>>>> Shuo
> >>>>>>> On Tue, Mar 24, 2020 at 3:35 PM Zoltan Haindrich <[email protected] <mailto:
> >>>>> [email protected]>> wrote:
> >>>>>>>      Hey Shuo!
> >>>>>>>      I think that simplification should been made on join conditions -
> >>>>> I've done a quick check; and it seems to be working for me.
> >>>>>>>      I suspected that it will be either a missing call to RexSimplify
> >>>>> for some reason - or it is added by a forced return type correction: 
> >>>>> IIRC
> >>>>> there are some cases in which
> >>>>>>>      the
> >>>>>>>      RexNode type should retained after simplification.
> >>>>>>>      Is this reproducible on current master; could you share a 
> >>>>>>> testcase?
> >>>>>>>      cheers,
> >>>>>>>      Zoltan
> >>>>>>>      On 3/24/20 7:28 AM, Shuo Cheng wrote:
> >>>>>>>       > Hi, Julian, That's what we do as a workaround way. we remove
> >>>>> CAST which are
> >>>>>>>       > only widening nullability as what CALCITE-2695 does before
> >>>>> applying
> >>>>>>>       > hash-join/sort-merge-join rule, such that equiv predicate can 
> >>>>>>> be
> >>>>> split
> >>>>>>>       > out.  I'm not sure whether it's properly for Calcite to do the
> >>>>> 'convert
> >>>>>>>       > back' job, for example, simplify the join condition when 
> >>>>>>> create
> >>>>> a Join; Or
> >>>>>>>       > maybe let other systems what use Calcite to do the "convert
> >>>>> back" job as an
> >>>>>>>       > optimization? What do you think?
> >>>>>>>       >
> >>>>>>>       > On Tue, Mar 24, 2020 at 2:04 PM Julian Hyde <
> >>>>> [email protected] <mailto:[email protected]>> wrote:
> >>>>>>>       >
> >>>>>>>       >> Or convert it back to a not-nullable BOOLEAN? The join
> >>>>> condition treats
> >>>>>>>       >> UNKNOWN the same as FALSE, and besides UNKNOWN will never
> >>>>> occur, so the
> >>>>>>>       >> conditions with and without the CAST are equivalent.
> >>>>>>>       >>
> >>>>>>>       >> Julian
> >>>>>>>       >>
> >>>>>>>       >>> On Mar 23, 2020, at 9:34 PM, Shuo Cheng <[email protected]
> >>>>> <mailto:[email protected]>> wrote:
> >>>>>>>       >>>
> >>>>>>>       >>> Hi all,
> >>>>>>>       >>>
> >>>>>>>       >>> Considering the Join condition 
> >>>>>>> 'CAST(IS_NOT_DISTINCT_FROM($1,
> >>>>> $2),
> >>>>>>>       >>> BOOLEAN)', which cast the non-nullable BOOLEAN to nullable
> >>>>> BOOLEAN,
> >>>>>>>       >> Calcite
> >>>>>>>       >>> can not split out equiv predicate, thus some join operation
> >>>>> like hash
> >>>>>>>       >> join
> >>>>>>>       >>> / sort merge join may not be used. Maybe we can
> >>>>>>>       >>> expand RelOptUtil#splitJoinCondition to support this 
> >>>>>>> scenario?
> >>>>>>>       >>
> >>>>>>>       >
> >>>>>
> >>>>>
> >>>

Reply via email to