[ https://issues.apache.org/jira/browse/HIVE-17342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17597866#comment-17597866 ]
Krisztian Kasa commented on HIVE-17342: --------------------------------------- This seems to be fixed on the master branch: The plan of the query mentioned in the description: {code:java} POSTHOOK: query: explain SELECT * FROM (select avg(d_year) as y from date_dim where d_year>1999) q WHERE 1=0 POSTHOOK: type: QUERY POSTHOOK: Input: default@date_dim #### A masked pattern was here #### Plan optimized by CBO. Stage-0 Fetch Operator limit:0 {code} It seems that Calcite has the functionality to transform Filter operators with always false predicate to a relational expression which returns empty result set. (CALCITE-1058) To exploit it Hive introduced it's implementation of this expression (HIVE-13316): [https://github.com/apache/hive/blob/64b731820cf462d3c03632fb4e99277ee539dd08/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelBuilder.java#L133] CBO plan is {code:java} POSTHOOK: query: explain cbo SELECT * FROM (select avg(d_year) as y from date_dim where d_year>1999) q WHERE 1=0 POSTHOOK: type: QUERY POSTHOOK: Input: default@date_dim #### A masked pattern was here #### CBO PLAN: HiveSortLimit(fetch=[0]) HiveProject($f0=[/(CAST($0):DOUBLE, $1)]) HiveAggregate(group=[{}], agg#0=[sum($0)], agg#1=[count($0)]) HiveFilter(condition=[>($0, 1999)]) HiveTableScan(table=[[default, date_dim]], table:alias=[date_dim]) {code} Since the {{HiveSortLimit(fetch=[0])}} is the root operator no execution is needed (HIVE-7203): [https://github.com/apache/hive/blob/64b731820cf462d3c03632fb4e99277ee539dd08/ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java#L264] > Where condition with 1=0 should be treated similar to limit 0 > ------------------------------------------------------------- > > Key: HIVE-17342 > URL: https://issues.apache.org/jira/browse/HIVE-17342 > Project: Hive > Issue Type: Improvement > Reporter: Rajesh Balamohan > Priority: Minor > > In some cases, queries may get executed with where condition mentioning to > "1=0" to get schema. E.g > {noformat} > SELECT * FROM (select avg(d_year) as y from date_dim where d_year>1999) q > WHERE 1=0 > {noformat} > Currently hive executes the query; it would be good to consider this similar > to "limit 0" which does not execute the query. > {code} > hive> explain SELECT * FROM (select avg(d_year) as y from date_dim where > d_year>1999) q WHERE 1=0; > OK > Plan optimized by CBO. > Vertex dependency in root stage > Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Reducer 2 vectorized, llap > File Output Operator [FS_13] > Group By Operator [GBY_12] (rows=1 width=76) > Output:["_col0"],aggregations:["avg(VALUE._col0)"] > <-Map 1 [CUSTOM_SIMPLE_EDGE] vectorized, llap > PARTITION_ONLY_SHUFFLE [RS_11] > Group By Operator [GBY_10] (rows=1 width=76) > Output:["_col0"],aggregations:["avg(d_year)"] > Filter Operator [FIL_9] (rows=1 width=0) > predicate:false > TableScan [TS_0] (rows=1 width=0) > > default@date_dim,date_dim,Tbl:PARTIAL,Col:NONE,Output:["d_year"] > {code} > It does generate 0 splits, but does send a DAG plan to the AM and receive 0 > rows as output. -- This message was sent by Atlassian Jira (v8.20.10#820010)