[ https://issues.apache.org/jira/browse/HIVE-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17560881#comment-17560881 ]
Stamatis Zampetakis commented on HIVE-26365:
--------------------------------------------

What happens if the MERGE statement has only INSERT branches? In that case collecting stats seems to make sense and the stats could potentially be exploited too (an insert-only MERGE is sketched after the plan below).

Remove column statistics collection task from merge statement plan
-------------------------------------------------------------------

                Key: HIVE-26365
                URL: https://issues.apache.org/jira/browse/HIVE-26365
            Project: Hive
         Issue Type: Sub-task
           Reporter: Krisztian Kasa
           Assignee: Krisztian Kasa
           Priority: Major
             Labels: pull-request-available
            Fix For: 4.0.0

         Time Spent: 10m
 Remaining Estimate: 0h

Merge statements may contain delete and update branches. An update is technically a delete followed by an insert. Column statistics such as min and max cannot be recalculated from the changed records when rows are deleted. Hive currently marks the column statistics of the target table invalid after an Update/Delete/Merge, yet for merge statements extra GBY operators and reducers are still generated for the insert branches to calculate column statistics, and the Stats Work tasks collect column statistics as well.

{code}
POSTHOOK: query: explain
merge into acidTbl_n0 as t using nonAcidOrcTbl_n0 s ON t.a = s.a
WHEN MATCHED AND s.a > 8 THEN DELETE
WHEN MATCHED THEN UPDATE SET b = 7
WHEN NOT MATCHED THEN INSERT VALUES(s.a, s.b)
POSTHOOK: type: QUERY
POSTHOOK: Input: default@acidtbl_n0
POSTHOOK: Input: default@nonacidorctbl_n0
POSTHOOK: Output: default@acidtbl_n0
POSTHOOK: Output: default@acidtbl_n0
POSTHOOK: Output: default@merge_tmp_table
STAGE DEPENDENCIES:
  Stage-5 is a root stage
  Stage-6 depends on stages: Stage-5
  Stage-0 depends on stages: Stage-6
  Stage-7 depends on stages: Stage-0
  Stage-1 depends on stages: Stage-6
  Stage-8 depends on stages: Stage-1
  Stage-2 depends on stages: Stage-6
  Stage-9 depends on stages: Stage-2
  Stage-3 depends on stages: Stage-6
  Stage-10 depends on stages: Stage-3
  Stage-4 depends on stages: Stage-6
  Stage-11 depends on stages: Stage-4

STAGE PLANS:
  Stage: Stage-5
    Tez
#### A masked pattern was here ####
      Edges:
        Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 10 (SIMPLE_EDGE)
        Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
        Reducer 4 <- Reducer 2 (SIMPLE_EDGE)
        Reducer 5 <- Reducer 2 (SIMPLE_EDGE)
        Reducer 6 <- Reducer 5 (CUSTOM_SIMPLE_EDGE)
        Reducer 7 <- Reducer 2 (SIMPLE_EDGE)
        Reducer 8 <- Reducer 7 (CUSTOM_SIMPLE_EDGE)
        Reducer 9 <- Reducer 2 (SIMPLE_EDGE)
#### A masked pattern was here ####
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: s
                  Statistics: Num rows: 4 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
                  Select Operator
                    expressions: a (type: int), b (type: int)
                    outputColumnNames: _col0, _col1
                    Statistics: Num rows: 4 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
                    Reduce Output Operator
                      key expressions: _col0 (type: int)
                      null sort order: z
                      sort order: +
                      Map-reduce partition columns: _col0 (type: int)
                      Statistics: Num rows: 4 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
                      value expressions: _col1 (type: int)
            Execution mode: vectorized, llap
            LLAP IO: all inputs
        Map 10
            Map Operator Tree:
                TableScan
                  alias: t
                  filterExpr: a is not null (type: boolean)
                  Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
                  Filter Operator
                    predicate: a is not null (type: boolean)
                    Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
                    Select Operator
                      expressions: a (type: int), ROW__ID (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                      outputColumnNames: _col0, _col1
                      Statistics: Num rows: 2 Data size: 160 Basic stats: COMPLETE Column stats: COMPLETE
                      Reduce Output Operator
                        key expressions: _col0 (type: int)
                        null sort order: z
                        sort order: +
                        Map-reduce partition columns: _col0 (type: int)
                        Statistics: Num rows: 2 Data size: 160 Basic stats: COMPLETE Column stats: COMPLETE
                        value expressions: _col1 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
            Execution mode: vectorized, llap
            LLAP IO: may be used (ACID table)
        Reducer 2
            Execution mode: llap
            Reduce Operator Tree:
              Merge Join Operator
                condition map:
                     Left Outer Join 0 to 1
                keys:
                  0 _col0 (type: int)
                  1 _col0 (type: int)
                outputColumnNames: _col0, _col1, _col2, _col3
                Statistics: Num rows: 6 Data size: 288 Basic stats: COMPLETE Column stats: COMPLETE
                Select Operator
                  expressions: _col3 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>), _col1 (type: int), _col2 (type: int), _col0 (type: int)
                  outputColumnNames: _col0, _col1, _col2, _col3
                  Statistics: Num rows: 6 Data size: 288 Basic stats: COMPLETE Column stats: COMPLETE
                  Filter Operator
                    predicate: ((_col2 = _col3) and (_col3 > 8)) (type: boolean)
                    Statistics: Num rows: 1 Data size: 88 Basic stats: COMPLETE Column stats: COMPLETE
                    Select Operator
                      expressions: _col0 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                      outputColumnNames: _col0
                      Statistics: Num rows: 1 Data size: 76 Basic stats: COMPLETE Column stats: COMPLETE
                      Reduce Output Operator
                        key expressions: _col0 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                        null sort order: z
                        sort order: +
                        Map-reduce partition columns: UDFToInteger(_col0) (type: int)
                        Statistics: Num rows: 1 Data size: 76 Basic stats: COMPLETE Column stats: COMPLETE
                  Filter Operator
                    predicate: ((_col2 = _col3) and (_col3 <= 8)) (type: boolean)
                    Statistics: Num rows: 2 Data size: 176 Basic stats: COMPLETE Column stats: COMPLETE
                    Select Operator
                      expressions: _col0 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                      outputColumnNames: _col0
                      Statistics: Num rows: 2 Data size: 152 Basic stats: COMPLETE Column stats: COMPLETE
                      Reduce Output Operator
                        key expressions: _col0 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                        null sort order: z
                        sort order: +
                        Map-reduce partition columns: UDFToInteger(_col0) (type: int)
                        Statistics: Num rows: 2 Data size: 152 Basic stats: COMPLETE Column stats: COMPLETE
                  Filter Operator
                    predicate: ((_col2 = _col3) and (_col3 <= 8)) (type: boolean)
                    Statistics: Num rows: 2 Data size: 176 Basic stats: COMPLETE Column stats: COMPLETE
                    Select Operator
                      expressions: _col2 (type: int), 7 (type: int)
                      outputColumnNames: _col0, _col1
                      Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
                      Reduce Output Operator
                        key expressions: _col0 (type: int)
                        null sort order: a
                        sort order: +
                        Map-reduce partition columns: _col0 (type: int)
                        Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
                        value expressions: _col1 (type: int)
                  Filter Operator
                    predicate: _col2 is null (type: boolean)
                    Statistics: Num rows: 4 Data size: 192 Basic stats: COMPLETE Column stats: COMPLETE
                    Select Operator
                      expressions: _col3 (type: int), _col1 (type: int)
                      outputColumnNames: _col0, _col1
                      Statistics: Num rows: 4 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
                      Reduce Output Operator
                        key expressions: _col0 (type: int)
                        null sort order: a
                        sort order: +
                        Map-reduce partition columns: _col0 (type: int)
                        Statistics: Num rows: 4 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
                        value expressions: _col1 (type: int)
                  Filter Operator
                    predicate: (_col2 = _col3) (type: boolean)
                    Statistics: Num rows: 3 Data size: 184 Basic stats: COMPLETE Column stats: COMPLETE
                    Select Operator
                      expressions: _col0 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                      outputColumnNames: _col0
                      Statistics: Num rows: 3 Data size: 184 Basic stats: COMPLETE Column stats: COMPLETE
                      Group By Operator
                        aggregations: count()
                        keys: _col0 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                        minReductionHashAggr: 0.4
                        mode: hash
                        outputColumnNames: _col0, _col1
                        Statistics: Num rows: 2 Data size: 168 Basic stats: COMPLETE Column stats: COMPLETE
                        Reduce Output Operator
                          key expressions: _col0 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                          null sort order: z
                          sort order: +
                          Map-reduce partition columns: _col0 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                          Statistics: Num rows: 2 Data size: 168 Basic stats: COMPLETE Column stats: COMPLETE
                          value expressions: _col1 (type: bigint)
        Reducer 3
            Execution mode: vectorized, llap
            Reduce Operator Tree:
              Select Operator
                expressions: KEY.reducesinkkey0 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                outputColumnNames: _col0
                Statistics: Num rows: 1 Data size: 76 Basic stats: COMPLETE Column stats: COMPLETE
                File Output Operator
                  compressed: false
                  Statistics: Num rows: 1 Data size: 76 Basic stats: COMPLETE Column stats: COMPLETE
                  table:
                      input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                      output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                      serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
                      name: default.acidtbl_n0
                  Write Type: DELETE
        Reducer 4
            Execution mode: vectorized, llap
            Reduce Operator Tree:
              Select Operator
                expressions: KEY.reducesinkkey0 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                outputColumnNames: _col0
                Statistics: Num rows: 2 Data size: 152 Basic stats: COMPLETE Column stats: COMPLETE
                File Output Operator
                  compressed: false
                  Statistics: Num rows: 2 Data size: 152 Basic stats: COMPLETE Column stats: COMPLETE
                  table:
                      input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                      output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                      serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
                      name: default.acidtbl_n0
                  Write Type: DELETE
        Reducer 5
            Execution mode: vectorized, llap
            Reduce Operator Tree:
              Select Operator
                expressions: KEY.reducesinkkey0 (type: int), VALUE._col0 (type: int)
                outputColumnNames: _col0, _col1
                Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
                File Output Operator
                  compressed: false
                  Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
                  table:
                      input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                      output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                      serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
                      name: default.acidtbl_n0
                  Write Type: INSERT
                Select Operator
                  expressions: _col0 (type: int), _col1 (type: int)
                  outputColumnNames: a, b
                  Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
                  Group By Operator
                    aggregations: min(a), max(a), count(1), count(a), compute_bit_vector_hll(a), min(b), max(b), count(b), compute_bit_vector_hll(b)
                    minReductionHashAggr: 0.5
                    mode: hash
                    outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8
                    Statistics: Num rows: 1 Data size: 328 Basic stats: COMPLETE Column stats: COMPLETE
                    Reduce Output Operator
                      null sort order:
                      sort order:
                      Statistics: Num rows: 1 Data size: 328 Basic stats: COMPLETE Column stats: COMPLETE
                      value expressions: _col0 (type: int), _col1 (type: int), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: binary), _col5 (type: int), _col6 (type: int), _col7 (type: bigint), _col8 (type: binary)
        Reducer 6
            Execution mode: vectorized, llap
            Reduce Operator Tree:
              Group By Operator
                aggregations: min(VALUE._col0), max(VALUE._col1), count(VALUE._col2), count(VALUE._col3), compute_bit_vector_hll(VALUE._col4), min(VALUE._col5), max(VALUE._col6), count(VALUE._col7), compute_bit_vector_hll(VALUE._col8)
                mode: mergepartial
                outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8
                Statistics: Num rows: 1 Data size: 328 Basic stats: COMPLETE Column stats: COMPLETE
                Select Operator
                  expressions: 'LONG' (type: string), UDFToLong(_col0) (type: bigint), UDFToLong(_col1) (type: bigint), (_col2 - _col3) (type: bigint), COALESCE(ndv_compute_bit_vector(_col4),0) (type: bigint), _col4 (type: binary), 'LONG' (type: string), UDFToLong(_col5) (type: bigint), UDFToLong(_col6) (type: bigint), (_col2 - _col7) (type: bigint), COALESCE(ndv_compute_bit_vector(_col8),0) (type: bigint), _col8 (type: binary)
                  outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11
                  Statistics: Num rows: 1 Data size: 528 Basic stats: COMPLETE Column stats: COMPLETE
                  File Output Operator
                    compressed: false
                    Statistics: Num rows: 1 Data size: 528 Basic stats: COMPLETE Column stats: COMPLETE
                    table:
                        input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                        output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                        serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
        Reducer 7
            Execution mode: vectorized, llap
            Reduce Operator Tree:
              Select Operator
                expressions: KEY.reducesinkkey0 (type: int), VALUE._col0 (type: int)
                outputColumnNames: _col0, _col1
                Statistics: Num rows: 4 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
                File Output Operator
                  compressed: false
                  Statistics: Num rows: 4 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
                  table:
                      input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                      output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                      serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
                      name: default.acidtbl_n0
                  Write Type: INSERT
                Select Operator
                  expressions: _col0 (type: int), _col1 (type: int)
                  outputColumnNames: a, b
                  Statistics: Num rows: 4 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
                  Group By Operator
                    aggregations: min(a), max(a), count(1), count(a), compute_bit_vector_hll(a), min(b), max(b), count(b), compute_bit_vector_hll(b)
                    minReductionHashAggr: 0.75
                    mode: hash
                    outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8
                    Statistics: Num rows: 1 Data size: 328 Basic stats: COMPLETE Column stats: COMPLETE
                    Reduce Output Operator
                      null sort order:
                      sort order:
                      Statistics: Num rows: 1 Data size: 328 Basic stats: COMPLETE Column stats: COMPLETE
                      value expressions: _col0 (type: int), _col1 (type: int), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: binary), _col5 (type: int), _col6 (type: int), _col7 (type: bigint), _col8 (type: binary)
        Reducer 8
            Execution mode: vectorized, llap
            Reduce Operator Tree:
              Group By Operator
                aggregations: min(VALUE._col0), max(VALUE._col1), count(VALUE._col2), count(VALUE._col3), compute_bit_vector_hll(VALUE._col4), min(VALUE._col5), max(VALUE._col6), count(VALUE._col7), compute_bit_vector_hll(VALUE._col8)
                mode: mergepartial
                outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8
                Statistics: Num rows: 1 Data size: 328 Basic stats: COMPLETE Column stats: COMPLETE
                Select Operator
                  expressions: 'LONG' (type: string), UDFToLong(_col0) (type: bigint), UDFToLong(_col1) (type: bigint), (_col2 - _col3) (type: bigint), COALESCE(ndv_compute_bit_vector(_col4),0) (type: bigint), _col4 (type: binary), 'LONG' (type: string), UDFToLong(_col5) (type: bigint), UDFToLong(_col6) (type: bigint), (_col2 - _col7) (type: bigint), COALESCE(ndv_compute_bit_vector(_col8),0) (type: bigint), _col8 (type: binary)
                  outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11
                  Statistics: Num rows: 1 Data size: 528 Basic stats: COMPLETE Column stats: COMPLETE
                  File Output Operator
                    compressed: false
                    Statistics: Num rows: 1 Data size: 528 Basic stats: COMPLETE Column stats: COMPLETE
                    table:
                        input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                        output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                        serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
        Reducer 9
            Execution mode: llap
            Reduce Operator Tree:
              Group By Operator
                aggregations: count(VALUE._col0)
                keys: KEY._col0 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>)
                mode: mergepartial
                outputColumnNames: _col0, _col1
                Statistics: Num rows: 2 Data size: 168 Basic stats: COMPLETE Column stats: COMPLETE
                Filter Operator
                  predicate: (_col1 > 1L) (type: boolean)
                  Statistics: Num rows: 1 Data size: 84 Basic stats: COMPLETE Column stats: COMPLETE
                  Select Operator
                    expressions: cardinality_violation(_col0) (type: int)
                    outputColumnNames: _col0
                    Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: COMPLETE
                    File Output Operator
                      compressed: false
                      Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: COMPLETE
                      table:
                          input format: org.apache.hadoop.mapred.TextInputFormat
                          output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                          serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                          name: default.merge_tmp_table

  Stage: Stage-6
    Dependency Collection

  Stage: Stage-0
    Move Operator
      tables:
          replace: false
          table:
              input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
              output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
              serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
              name: default.acidtbl_n0
          Write Type: DELETE

  Stage: Stage-7
    Stats Work
      Basic Stats Work:

  Stage: Stage-1
    Move Operator
      tables:
          replace: false
          table:
              input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
              output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
              serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
              name: default.acidtbl_n0
          Write Type: DELETE

  Stage: Stage-8
    Stats Work
      Basic Stats Work:

  Stage: Stage-2
    Move Operator
      tables:
          replace: false
          table:
              input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
              output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
              serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
              name: default.acidtbl_n0
          Write Type: INSERT

  Stage: Stage-9
    Stats Work
      Basic Stats Work:

  Stage: Stage-3
    Move Operator
      tables:
          replace: false
          table:
              input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
              output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
              serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
              name: default.acidtbl_n0
          Write Type: INSERT

  Stage: Stage-10
    Stats Work
      Basic Stats Work:
      Column Stats Desc:
          Columns: a, b
          Column Types: int, int
          Table: default.acidtbl_n0

  Stage: Stage-4
    Move Operator
      tables:
          replace: false
          table:
              input format: org.apache.hadoop.mapred.TextInputFormat
              output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
              serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
              name: default.merge_tmp_table

  Stage: Stage-11
    Stats Work
      Basic Stats Work:
{code}

One of the insert Reducers and the follow-up Reducer that collects the column stats:

{code}
        Reducer 5
            Execution mode: vectorized, llap
            Reduce Operator Tree:
              Select Operator
                expressions: KEY.reducesinkkey0 (type: int), VALUE._col0 (type: int)
                outputColumnNames: _col0, _col1
                Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
                File Output Operator
                  compressed: false
                  Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
                  table:
                      input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                      output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                      serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
                      name: default.acidtbl_n0
                  Write Type: INSERT
                Select Operator
                  expressions: _col0 (type: int), _col1 (type: int)
                  outputColumnNames: a, b
                  Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
                  Group By Operator
                    aggregations: min(a), max(a), count(1), count(a), compute_bit_vector_hll(a), min(b), max(b), count(b), compute_bit_vector_hll(b)
                    minReductionHashAggr: 0.5
                    mode: hash
                    outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8
                    Statistics: Num rows: 1 Data size: 328 Basic stats: COMPLETE Column stats: COMPLETE
                    Reduce Output Operator
                      null sort order:
                      sort order:
                      Statistics: Num rows: 1 Data size: 328 Basic stats: COMPLETE Column stats: COMPLETE
                      value expressions: _col0 (type: int), _col1 (type: int), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: binary), _col5 (type: int), _col6 (type: int), _col7 (type: bigint), _col8 (type: binary)
        Reducer 6
            Execution mode: vectorized, llap
            Reduce Operator Tree:
              Group By Operator
                aggregations: min(VALUE._col0), max(VALUE._col1), count(VALUE._col2), count(VALUE._col3), compute_bit_vector_hll(VALUE._col4), min(VALUE._col5), max(VALUE._col6), count(VALUE._col7), compute_bit_vector_hll(VALUE._col8)
                mode: mergepartial
                outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8
                Statistics: Num rows: 1 Data size: 328 Basic stats: COMPLETE Column stats: COMPLETE
                Select Operator
                  expressions: 'LONG' (type: string), UDFToLong(_col0) (type: bigint), UDFToLong(_col1) (type: bigint), (_col2 - _col3) (type: bigint), COALESCE(ndv_compute_bit_vector(_col4),0) (type: bigint), _col4 (type: binary), 'LONG' (type: string), UDFToLong(_col5) (type: bigint), UDFToLong(_col6) (type: bigint), (_col2 - _col7) (type: bigint), COALESCE(ndv_compute_bit_vector(_col8),0) (type: bigint), _col8 (type: binary)
                  outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11
                  Statistics: Num rows: 1 Data size: 528 Basic stats: COMPLETE Column stats: COMPLETE
                  File Output Operator
                    compressed: false
                    Statistics: Num rows: 1 Data size: 528 Basic stats: COMPLETE Column stats: COMPLETE
                    table:
                        input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                        output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                        serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
{code}
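For illustration, an insert-only MERGE of the kind raised in the comment would look roughly like the statement below. This is only a sketch that reuses the table names from the plan above; it is not part of the ticket. Since no branch deletes rows, the min/max/NDV computed over the inserted rows could in principle still be merged into the existing column statistics rather than discarded.

{code}
-- Hypothetical insert-only MERGE: a single WHEN NOT MATCHED branch,
-- no UPDATE or DELETE branches, against the same tables as the plan above.
EXPLAIN
MERGE INTO acidTbl_n0 AS t
USING nonAcidOrcTbl_n0 s
ON t.a = s.a
WHEN NOT MATCHED THEN INSERT VALUES (s.a, s.b);
{code}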