Krisztian Kasa created HIVE-28729:
-------------------------------------

             Summary: Apply nulls order setting in Reduce Sink operator of join branches
                 Key: HIVE-28729
                 URL: https://issues.apache.org/jira/browse/HIVE-28729
             Project: Hive
          Issue Type: Sub-task
            Reporter: Krisztian Kasa

{code:java}
set hive.default.nulls.last=false;

create table t1(key int, value string);

EXPLAIN
SELECT sum(hash(a.key,a.value,b.key,b.value))
FROM t1 a INNER JOIN t1 b on a.key = b.key;
{code}
{code:java}
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Tez
#### A masked pattern was here ####
      Edges:
        Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 4 (SIMPLE_EDGE)
        Reducer 3 <- Reducer 2 (CUSTOM_SIMPLE_EDGE)
#### A masked pattern was here ####
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: a
                  filterExpr: key is not null (type: boolean)
                  Statistics: Num rows: 1 Data size: 188 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: key is not null (type: boolean)
                    Statistics: Num rows: 1 Data size: 188 Basic stats: COMPLETE Column stats: NONE
                    Select Operator
                      expressions: key (type: int), value (type: string)
                      outputColumnNames: key, value
                      Statistics: Num rows: 1 Data size: 188 Basic stats: COMPLETE Column stats: NONE
                      Reduce Output Operator
                        key expressions: key (type: int)
                        null sort order: z
                        sort order: +
                        Map-reduce partition columns: key (type: int)
                        Statistics: Num rows: 1 Data size: 188 Basic stats: COMPLETE Column stats: NONE
                        value expressions: value (type: string)
            Execution mode: vectorized, llap
            LLAP IO: all inputs
        Map 4
            Map Operator Tree:
                TableScan
                  alias: b
                  filterExpr: key is not null (type: boolean)
                  Statistics: Num rows: 1 Data size: 188 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: key is not null (type: boolean)
                    Statistics: Num rows: 1 Data size: 188 Basic stats: COMPLETE Column stats: NONE
                    Select Operator
                      expressions: key (type: int), value (type: string)
                      outputColumnNames: key, value
                      Statistics: Num rows: 1 Data size: 188 Basic stats: COMPLETE Column stats: NONE
                      Reduce Output Operator
                        key expressions: key (type: int)
                        null sort order: z
                        sort order: +
                        Map-reduce partition columns: key (type: int)
                        Statistics: Num rows: 1 Data size: 188 Basic stats: COMPLETE Column stats: NONE
                        value expressions: value (type: string)
            Execution mode: vectorized, llap
            LLAP IO: all inputs
        Reducer 2
            Execution mode: llap
            Reduce Operator Tree:
              Merge Join Operator
                condition map:
                     Inner Join 0 to 1
                keys:
                  0 key (type: int)
                  1 key (type: int)
                outputColumnNames: key, value, key0, value0
                Statistics: Num rows: 1 Data size: 206 Basic stats: COMPLETE Column stats: NONE
                Select Operator
                  expressions: hash(key,value,key0,value0) (type: int)
                  outputColumnNames: $f0
                  Statistics: Num rows: 1 Data size: 206 Basic stats: COMPLETE Column stats: NONE
                  Group By Operator
                    aggregations: sum($f0)
                    minReductionHashAggr: 0.99
                    mode: hash
                    outputColumnNames: _col0
                    Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
                    Reduce Output Operator
                      null sort order:
                      sort order:
                      Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
                      value expressions: _col0 (type: bigint)
        Reducer 3
            Execution mode: vectorized, llap
            Reduce Operator Tree:
              Group By Operator
                aggregations: sum(VALUE._col0)
                mode: mergepartial
                outputColumnNames: $f0
                Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
                File Output Operator
                  compressed: false
                  Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
                  table:
                      input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                      output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
{code}
The nulls order in the RS operators of both join branches is NULLS LAST, but it should be NULLS FIRST because of the setting {{hive.default.nulls.last=false}}:
{code}
        Map 1
            Map Operator Tree:
...
                      Reduce Output Operator
                        key expressions: key (type: int)
                        null sort order: z
...
{code}
{code}
        Map 4
            Map Operator Tree:
...
                      Reduce Output Operator
                        key expressions: key (type: int)
                        null sort order: z
...
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)