[ https://issues.apache.org/jira/browse/CALCITE-6887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yu Xu updated CALCITE-6887:
---------------------------
    Description:
Currently, ReduceExpressionsRule does not deduplicate the values of an IN operator, so the IN list should be reduced to its distinct values.

For example, it would be better to transform *in (1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,1)* to *in (1,2,3)*, but currently this does not happen.

*test case:*

    @Test void testReduceExpressionsWithIn() {
      final String sql = "select deptno, sal "
          + "from emp "
          + "where deptno in (1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,1) ";
      sql(sql).withRule(CoreRules.PROJECT_REDUCE_EXPRESSIONS)
          .check();
    }

*before plan:*

    LogicalProject(DEPTNO=[$7], SAL=[$5])
      LogicalFilter(condition=[IN($7, {
    LogicalValues(tuples=[[{ 1 }, { 1 }, { 2 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 3 }, { 1 }]])
    })])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])

*after plan:*

    LogicalProject(DEPTNO=[$7], SAL=[$5])
      LogicalFilter(condition=[IN($7, {
    LogicalValues(tuples=[[{ 1 }, { 1 }, { 2 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 3 }, { 1 }]])
    })])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])

  was:
Currently, ReduceExpressionsRule does not deduplicate the values of an IN operator, so the IN list should be reduced to its distinct values.

*test case:*

    @Test void testReduceExpressionsWithIn() {
      final String sql = "select deptno, sal "
          + "from emp "
          + "where deptno in (1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,1) ";
      sql(sql).withRule(CoreRules.PROJECT_REDUCE_EXPRESSIONS)
          .check();
    }

*before plan:*

    LogicalProject(DEPTNO=[$7], SAL=[$5])
      LogicalFilter(condition=[IN($7, {
    LogicalValues(tuples=[[{ 1 }, { 1 }, { 2 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 3 }, { 1 }]])
    })])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])

*after plan:*

    LogicalProject(DEPTNO=[$7], SAL=[$5])
      LogicalFilter(condition=[IN($7, {
    LogicalValues(tuples=[[{ 1 }, { 2 }, { 3 }]])
    })])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])


> In should distinct values with ReduceExpressionRule
> ---------------------------------------------------
>
>                 Key: CALCITE-6887
>                 URL: https://issues.apache.org/jira/browse/CALCITE-6887
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.38.0
>            Reporter: Yu Xu
>            Assignee: Yu Xu
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.40.0
>
>
> Currently, ReduceExpressionsRule does not deduplicate the values of an IN
> operator, so the IN list should be reduced to its distinct values.
>
> For example, it would be better to transform
> *in (1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,1)* to *in (1,2,3)*, but
> currently this does not happen.
>
> *test case:*
>
>     @Test void testReduceExpressionsWithIn() {
>       final String sql = "select deptno, sal "
>           + "from emp "
>           + "where deptno in (1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,1) ";
>       sql(sql).withRule(CoreRules.PROJECT_REDUCE_EXPRESSIONS)
>           .check();
>     }
>
> *before plan:*
>
>     LogicalProject(DEPTNO=[$7], SAL=[$5])
>       LogicalFilter(condition=[IN($7, {
>     LogicalValues(tuples=[[{ 1 }, { 1 }, { 2 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 3 }, { 1 }]])
>     })])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>
> *after plan:*
>
>     LogicalProject(DEPTNO=[$7], SAL=[$5])
>       LogicalFilter(condition=[IN($7, {
>     LogicalValues(tuples=[[{ 1 }, { 1 }, { 2 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 3 }, { 1 }]])
>     })])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP]])



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
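
The requested reduction, collapsing the IN list to its distinct values, can be sketched in plain Java as below. This is only an illustration of the idea and not Calcite's implementation: the class name InListDedup and the method distinctInValues are hypothetical, the values are modeled as plain Java objects rather than RexNode literals, and SQL concerns such as type coercion or NULL semantics are ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

/** Hypothetical helper, not part of Calcite: deduplicates IN-list values. */
public class InListDedup {

  /**
   * Returns the distinct values of an IN list, keeping the order in which
   * each value first appears.
   */
  static <T> List<T> distinctInValues(List<T> values) {
    // LinkedHashSet drops duplicates (by equals/hashCode) and preserves
    // insertion order, so the reduced list is deterministic.
    return new ArrayList<>(new LinkedHashSet<>(values));
  }

  public static void main(String[] args) {
    // The IN list from the test case in this issue.
    List<Integer> inList = Arrays.asList(
        1, 1, 2,
        1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1,
        3, 1);
    System.out.println(distinctInValues(inList)); // prints [1, 2, 3]
  }
}
```

Keeping first-occurrence order is a design choice made here so that the reduced list is stable across runs; the semantics of IN do not depend on the order of the values.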