[ https://issues.apache.org/jira/browse/HIVE-16421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pengcheng Xiong updated HIVE-16421:
-----------------------------------
    Attachment: HIVE-16421.01.patch

> Runtime filtering breaks user-level explain
> -------------------------------------------
>
>                 Key: HIVE-16421
>                 URL: https://issues.apache.org/jira/browse/HIVE-16421
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>         Attachments: HIVE-16421.01.patch
>
>
> Query:
> {noformat}
> SELECT LAG(COALESCE(t2.int_col_14, t1.int_col_80),22) OVER (ORDER BY t1.tinyint_col_52 DESC) AS int_col
> FROM table_6 t1 INNER JOIN table_14 t2 ON ((t2.decimal0101_col_55) = (t1.decimal0101_col_9));
> {noformat}
> Without runtime filtering
> {noformat}
> Plan not optimized by CBO.
>
> Vertex dependency in root stage
> Map 1 <- Map 3 (BROADCAST_EDGE)
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
>
> Stage-0
>    Fetch Operator
>       limit:-1
>       Stage-1
>          Reducer 2
>          File Output Operator [FS_364]
>             compressed:false
>             Statistics:Num rows: 74781721 Data size: 299126884 Basic stats: COMPLETE Column stats: COMPLETE
>             table:{"input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat","serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"}
>             Select Operator [SEL_362]
>                outputColumnNames:["_col0"]
>                Statistics:Num rows: 74781721 Data size: 299126884 Basic stats: COMPLETE Column stats: COMPLETE
>                PTF Operator [PTF_361]
>                   Function definitions:[{"Input definition":{"type:":"WINDOWING"}},{"order by:":"_col51(DESC)","name:":"windowingtablefunction","partition by:":"0"}]
>                   Statistics:Num rows: 74781721 Data size: 897380652 Basic stats: COMPLETE Column stats: COMPLETE
>                   Select Operator [SEL_360]
>                   |  outputColumnNames:["_col51","_col79","_col97"]
>                   |  Statistics:Num rows: 74781721 Data size: 897380652 Basic stats: COMPLETE Column stats: COMPLETE
>                   |<-Map 1 [SIMPLE_EDGE] vectorized
>                      Reduce Output Operator [RS_375]
>                         key expressions:0 (type: int), _col51 (type: tinyint)
>                         Map-reduce partition columns:0 (type: int)
>                         sort order:+-
>                         Statistics:Num rows: 74781721 Data size: 897380652 Basic stats: COMPLETE Column stats: COMPLETE
>                         value expressions:_col79 (type: int), _col97 (type: int)
>                         Map Join Operator [MAPJOIN_374]
>                         |  condition map:[{"":"Inner Join 0 to 1"}]
>                         |  HybridGraceHashJoin:true
>                         |  keys:{"Map 3":"decimal0101_col_55 (type: decimal(1,1))","Map 1":"decimal0101_col_9 (type: decimal(1,1))"}
>                         |  outputColumnNames:["_col51","_col79","_col97"]
>                         |  Statistics:Num rows: 74781721 Data size: 897380652 Basic stats: COMPLETE Column stats: COMPLETE
>                         |<-Map 3 [BROADCAST_EDGE] vectorized
>                         |  Reduce Output Operator [RS_372]
>                         |     key expressions:decimal0101_col_55 (type: decimal(1,1))
>                         |     Map-reduce partition columns:decimal0101_col_55 (type: decimal(1,1))
>                         |     sort order:+
>                         |     Statistics:Num rows: 26256 Data size: 2749496 Basic stats: COMPLETE Column stats: COMPLETE
>                         |     value expressions:int_col_14 (type: int)
>                         |     Filter Operator [FIL_371]
>                         |        predicate:decimal0101_col_55 is not null (type: boolean)
>                         |        Statistics:Num rows: 26256 Data size: 2749496 Basic stats: COMPLETE Column stats: COMPLETE
>                         |        TableScan [TS_353]
>                         |           alias:t2
>                         |           Statistics:Num rows: 29079 Data size: 117014275 Basic stats: COMPLETE Column stats: COMPLETE
>                         |<-Filter Operator [FIL_373]
>                               predicate:decimal0101_col_9 is not null (type: boolean)
>                               Statistics:Num rows: 48419 Data size: 5233788 Basic stats: COMPLETE Column stats: COMPLETE
>                               TableScan [TS_352]
>                                  alias:t1
>                                  Statistics:Num rows: 53742 Data size: 200230374 Basic stats: COMPLETE Column stats: COMPLETE
>
> {noformat}
> With runtime filtering:
> {noformat}
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
>
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
>       DagId: hive_20170411232247_e177745a-39d0-4ae7-8ca0-871a137b36fa:1
>       Edges:
>         Map 1 <- Map 3 (BROADCAST_EDGE), Reducer 4 (BROADCAST_EDGE)
>         Reducer 2 <- Map 1 (SIMPLE_EDGE)
>         Reducer 4 <- Map 3 (SIMPLE_EDGE)
>       DagName:
>       Vertices:
>         Map 1
>             Map Operator Tree:
>                 TableScan
>                   alias: t1
>                   filterExpr: (decimal0101_col_9 is not null and (decimal0101_col_9 BETWEEN DynamicValue(RS_7_t2_decimal0101_col_9_min) AND DynamicValue(RS_7_t2_decimal0101_col_9_max) and in_bloom_filter(decimal0101_col_9, DynamicValue(RS_7_t2_decimal0101_col_9_bloom_filter)))) (type: boolean)
>                   Statistics: Num rows: 53742 Data size: 5809320 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: (decimal0101_col_9 is not null and (decimal0101_col_9 BETWEEN DynamicValue(RS_7_t2_decimal0101_col_9_min) AND DynamicValue(RS_7_t2_decimal0101_col_9_max) and in_bloom_filter(decimal0101_col_9, DynamicValue(RS_7_t2_decimal0101_col_9_bloom_filter)))) (type: boolean)
>                     Statistics: Num rows: 48419 Data size: 5233908 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: decimal0101_col_9 (type: decimal(1,1)), tinyint_col_52 (type: tinyint), int_col_80 (type: int)
>                       outputColumnNames: _col0, _col1, _col2
>                       Statistics: Num rows: 48419 Data size: 5233908 Basic stats: COMPLETE Column stats: COMPLETE
>                       Map Join Operator
>                         condition map:
>                              Inner Join 0 to 1
>                         keys:
>                           0 _col0 (type: decimal(1,1))
>                           1 _col1 (type: decimal(1,1))
>                         outputColumnNames: _col1, _col2, _col3
>                         input vertices:
>                           1 Map 3
>                         Statistics: Num rows: 74781721 Data size: 897380652 Basic stats: COMPLETE Column stats: COMPLETE
>                         Reduce Output Operator
>                           key expressions: 0 (type: int), _col1 (type: tinyint)
>                           sort order: +-
>                           Map-reduce partition columns: 0 (type: int)
>                           Statistics: Num rows: 74781721 Data size: 897380652 Basic stats: COMPLETE Column stats: COMPLETE
>                           value expressions: _col2 (type: int), _col3 (type: int)
>             Execution mode: vectorized, llap
>         Map 3
>             Map Operator Tree:
>                 TableScan
>                   alias: t2
>                   filterExpr: decimal0101_col_55 is not null (type: boolean)
>                   Statistics: Num rows: 29079 Data size: 3045240 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: decimal0101_col_55 is not null (type: boolean)
>                     Statistics: Num rows: 26256 Data size: 2749612 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: int_col_14 (type: int), decimal0101_col_55 (type: decimal(1,1))
>                       outputColumnNames: _col0, _col1
>                       Statistics: Num rows: 26256 Data size: 2749612 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col1 (type: decimal(1,1))
>                         sort order: +
>                         Map-reduce partition columns: _col1 (type: decimal(1,1))
>                         Statistics: Num rows: 26256 Data size: 2749612 Basic stats: COMPLETE Column stats: COMPLETE
>                         value expressions: _col0 (type: int)
>                       Select Operator
>                         expressions: _col1 (type: decimal(1,1))
>                         outputColumnNames: _col0
>                         Statistics: Num rows: 26256 Data size: 2749612 Basic stats: COMPLETE Column stats: COMPLETE
>                         Group By Operator
>                           aggregations: min(_col0), max(_col0), bloom_filter(_col0, expectedEntries=17)
>                           mode: hash
>                           outputColumnNames: _col0, _col1, _col2
>                           Statistics: Num rows: 1 Data size: 336 Basic stats: COMPLETE Column stats: COMPLETE
>                           Reduce Output Operator
>                             sort order:
>                             Statistics: Num rows: 1 Data size: 336 Basic stats: COMPLETE Column stats: COMPLETE
>                             value expressions: _col0 (type: decimal(1,1)), _col1 (type: decimal(1,1)), _col2 (type: binary)
>             Execution mode: vectorized, llap
>         Reducer 2
>             Execution mode: llap
>             Reduce Operator Tree:
>               Select Operator
>                 expressions: KEY.reducesinkkey1 (type: tinyint), VALUE._col1 (type: int), VALUE._col2 (type: int)
>                 outputColumnNames: _col1, _col2, _col3
>                 Statistics: Num rows: 74781721 Data size: 897380652 Basic stats: COMPLETE Column stats: COMPLETE
>                 PTF Operator
>                   Function definitions:
>                       Input definition
>                         input alias: ptf_0
>                         output shape: _col1: tinyint, _col2: int, _col3: int
>                         type: WINDOWING
>                       Windowing table definition
>                         input alias: ptf_1
>                         name: windowingtablefunction
>                         order by: _col1 DESC NULLS LAST
>                         partition by: 0
>                         raw input shape:
>                         window functions:
>                             window function definition
>                               alias: LAG_window_0
>                               arguments: COALESCE(_col3,_col2), 22
>                               name: LAG
>                               window function: GenericUDAFLagEvaluator
>                               window frame: PRECEDING(MAX)~FOLLOWING(MAX)
>                               isPivotResult: true
>                   Statistics: Num rows: 74781721 Data size: 897380652 Basic stats: COMPLETE Column stats: COMPLETE
>                   Select Operator
>                     expressions: LAG_window_0 (type: int)
>                     outputColumnNames: _col0
>                     Statistics: Num rows: 74781721 Data size: 299126884 Basic stats: COMPLETE Column stats: COMPLETE
>                     File Output Operator
>                       compressed: false
>                       Statistics: Num rows: 74781721 Data size: 299126884 Basic stats: COMPLETE Column stats: COMPLETE
>                       table:
>                           input format: org.apache.hadoop.mapred.SequenceFileInputFormat
>                           output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>                           serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>         Reducer 4
>             Execution mode: vectorized, llap
>             Reduce Operator Tree:
>               Group By Operator
>                 aggregations: min(VALUE._col0), max(VALUE._col1), bloom_filter(VALUE._col2, expectedEntries=17)
>                 mode: final
>                 outputColumnNames: _col0, _col1, _col2
>                 Statistics: Num rows: 1 Data size: 336 Basic stats: COMPLETE Column stats: COMPLETE
>                 Reduce Output Operator
>                   sort order:
>                   Statistics: Num rows: 1 Data size: 336 Basic stats: COMPLETE Column stats: COMPLETE
>                   value expressions: _col0 (type: decimal(1,1)), _col1 (type: decimal(1,1)), _col2 (type: binary)
>
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink
>
> 135 rows selected (2.348 seconds)
> {noformat}