Vineet Garg created HIVE-21338: ---------------------------------- Summary: Remove order by and limit for aggregates Key: HIVE-21338 URL: https://issues.apache.org/jira/browse/HIVE-21338 Project: Hive Issue Type: Improvement Components: Query Planning Reporter: Vineet Garg Assignee: Vineet Garg
If a query is guaranteed to produce at most one row LIMIT and ORDER BY could be removed. This saves unnecessary vertex for LIMIT/ORDER BY. {code:sql} explain select count(*) cs from store_sales where ss_ext_sales_price > 100.00 order by cs limit 100 {code} {code} STAGE PLANS: Stage: Stage-1 Tez DagId: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4 Edges: Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE) Reducer 3 <- Reducer 2 (SIMPLE_EDGE) DagName: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4 Vertices: Map 1 Map Operator Tree: TableScan alias: store_sales filterExpr: (ss_ext_sales_price > 100) (type: boolean) Statistics: Num rows: 1 Data size: 112 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (ss_ext_sales_price > 100) (type: boolean) Statistics: Num rows: 1 Data size: 112 Basic stats: COMPLETE Column stats: NONE Select Operator Statistics: Num rows: 1 Data size: 112 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: count() mode: hash outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator sort order: Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE Column stats: NONE value expressions: _col0 (type: bigint) Execution mode: vectorized Reducer 2 Execution mode: vectorized Reduce Operator Tree: Group By Operator aggregations: count(VALUE._col0) mode: mergepartial outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: bigint) sort order: + Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE Column stats: NONE TopN Hash Memory Usage: 0.1 Reducer 3 Execution mode: vectorized Reduce Operator Tree: Select Operator expressions: KEY.reducesinkkey0 (type: bigint) outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE Column stats: NONE Limit Number of rows: 100 Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: 100 Processor Tree: ListSink {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)