[ 
https://issues.apache.org/jira/browse/CALCITE-7232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18030514#comment-18030514
 ] 

Steve Carlin commented on CALCITE-7232:
---------------------------------------

I have a vague idea for a phase 1 of this.  It might actually be all I need, 
but I'd like to hear from [~zabetak] (and anyone else, of course) whether 
this would be helpful for Hive.

I'd propose adding an extra parameter to RexUtil.expandSearch().  Let's call it 
the Expandinator (I'm really horrible at names, so I doubt y'all would agree to 
that).  I picture it as an interface whose methods define how certain 
expansions are done.

One method of the Expandinator might be called right here: 
[https://github.com/apache/calcite/blob/calcite-1.40.0/core/src/main/java/org/apache/calcite/rex/RexUtil.java#L636]

This method could be something like List<RexNode> expandPoints(RexBuilder, 
Sarg<C>).  The caller would take the return value of this method and add it to 
"orList".

The default implementation would keep the current behavior, so backward 
compatibility is preserved.  For the method I just mentioned, the default 
return value would be the IN expressed as a bunch of ORs.

My implementation would return my own custom IN operator.  This sorta punts on 
the initial problem mentioned here of having an IN operator supported in 
Calcite.  And to be honest, I just don't have time to wait for an IN operator 
to be defined in Calcite, so chances are this might be good enough for me.
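
To make the idea a bit more concrete, here is a rough, self-contained sketch. 
It is not Calcite code: plain strings stand in for RexNode and a list of 
integers stands in for a points-only Sarg<C>, so it runs without Calcite on the 
classpath.  The names Expandinator, expandPoints, OR_EXPANDER and IN_EXPANDER 
are all hypothetical, and the expandSearch() below is only a toy stand-in for 
the caller that fills "orList".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public class ExpandinatorSketch {

  /** Hypothetical hook that decides how a points-only search is expanded. */
  public interface Expandinator {
    // In Calcite this might look like
    //   List<RexNode> expandPoints(RexBuilder, Sarg<C>)
    // Here "ref" is the operand and "points" are the discrete values
    // (both are simplified stand-ins, not Calcite types).
    List<String> expandPoints(String ref, List<Integer> points);
  }

  /** Default behavior: one equality per point; the caller ORs them together. */
  public static final Expandinator OR_EXPANDER = (ref, points) ->
      points.stream()
          .map(p -> ref + " = " + p)
          .collect(Collectors.toList());

  /** Custom behavior: a single term using the engine's own IN operator. */
  public static final Expandinator IN_EXPANDER = (ref, points) -> {
    String values = points.stream()
        .map(String::valueOf)
        .collect(Collectors.joining(", "));
    return Collections.singletonList(ref + " IN (" + values + ")");
  };

  /** Toy stand-in for the caller: adds the returned terms to orList. */
  public static String expandSearch(String ref, List<Integer> points,
      Expandinator expandinator) {
    List<String> orList = new ArrayList<>();
    orList.addAll(expandinator.expandPoints(ref, points));
    return String.join(" OR ", orList);
  }

  public static void main(String[] args) {
    List<Integer> points = Arrays.asList(10, 20, 30);
    // prints: x = 10 OR x = 20 OR x = 30
    System.out.println(expandSearch("x", points, OR_EXPANDER));
    // prints: x IN (10, 20, 30)
    System.out.println(expandSearch("x", points, IN_EXPANDER));
  }
}
```

With the default expander the caller sees the same OR chain as before; only a 
caller that passes its own expander gets an IN term, which is the whole point 
of making it a parameter.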

> Restore use of IN operator in RexCall
> -------------------------------------
>
>                 Key: CALCITE-7232
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7232
>             Project: Calcite
>          Issue Type: Task
>            Reporter: Stamatis Zampetakis
>            Priority: Major
>
> The use of {{IN}} operator in {{RexCall}} was superseded by the introduction 
> of the {{SEARCH}} operator (CALCITE-4173) and its use is strictly forbidden 
> through 
> [assertions|https://github.com/apache/calcite/blob/6cbbf560b721cb88354c33751aa72b16a58ded23/core/src/main/java/org/apache/calcite/rex/RexCall.java#L94].
>  The {{SEARCH}} operator is more general and powerful than {{IN}} so it's a 
> perfect abstraction to use during the optimization phase.
> However, most databases don't have a {{SEARCH}} operator so the latter needs 
> to be transformed back to {{IN}} (or something else) at some point in time. 
> For instance, Apache Hive has two ways of generating an executable plan:
>  * take a {{RelNode}} and generate an AST tree
>  * take a {{RelNode}} and generate a Hive Operator tree
> both of which are eventually going to be executed.
> *If we don't allow* IN in a RexCall, then it means that we need to create 
> special code to handle SEARCH in both code paths that differ only slightly in 
> each case. (In reality the situation is more complicated for Hive because 
> there are at least two more places where we need to do a SEARCH to IN 
> transformation).
> *If we allow IN* in a RexCall, then at the end of the RelNode optimization 
> phase we can "expand" {{SEARCH}} to {{IN}} so the transformation logic only 
> appears in one place and it remains a {{RelNode}} to {{RelNode}} conversion. 
> In fact, the same transformation logic could be exploited in 
> [SqlImplementor|https://github.com/apache/calcite/blob/6cbbf560b721cb88354c33751aa72b16a58ded23/core/src/main/java/org/apache/calcite/rel/rel2sql/SqlImplementor.java#L815]
>  that does another {{RelNode}} to "something" conversion.
> The obvious downside with this proposal is that if people start mixing the IN 
> operator in various optimization rules/phases it can certainly affect the 
> quality of the plans and the planning time.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
