[ https://issues.apache.org/jira/browse/HIVE-8908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xuefu Zhang updated HIVE-8908:
------------------------------
    Fix Version/s:     (was: spark-branch)
                   1.1.0

> Investigate test failure on join34.q [Spark Branch]
> ---------------------------------------------------
>
>                 Key: HIVE-8908
>                 URL: https://issues.apache.org/jira/browse/HIVE-8908
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: spark-branch
>            Reporter: Chao Sun
>            Assignee: Chao Sun
>             Fix For: 1.1.0
>
>         Attachments: HIVE-8908.1-spark.patch, HIVE-8908.2-spark.patch
>
>
> For this query, the plan doesn't look correct:
> {noformat}
> OK
> STAGE DEPENDENCIES:
>   Stage-4 is a root stage
>   Stage-1 depends on stages: Stage-5, Stage-4
>   Stage-2 depends on stages: Stage-1
>   Stage-0 depends on stages: Stage-2
>   Stage-3 depends on stages: Stage-0
>   Stage-5 is a root stage
>
> STAGE PLANS:
>   Stage: Stage-4
>     Spark
>       DagName: chao_20141118150101_a47a2d7b-e750-4764-be66-5ba95ebbe433:6
>       Vertices:
>         Map 4
>             Map Operator Tree:
>                 TableScan
>                   alias: x
>                   Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
>                   Filter Operator
>                     predicate: key is not null (type: boolean)
>                     Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
>                     Spark HashTable Sink Operator
>                       condition expressions:
>                         0 {_col1}
>                         1 {value}
>                       keys:
>                         0 _col0 (type: string)
>                         1 key (type: string)
>                     Reduce Output Operator
>                       key expressions: key (type: string)
>                       sort order: +
>                       Map-reduce partition columns: key (type: string)
>                       Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
>                       value expressions: value (type: string)
>             Local Work:
>               Map Reduce Local Work
>
>   Stage: Stage-1
>     Spark
>       Edges:
>         Union 2 <- Map 1 (NONE, 0), Map 3 (NONE, 0)
>       DagName: chao_20141118150101_a47a2d7b-e750-4764-be66-5ba95ebbe433:4
>       Vertices:
>         Map 1
>             Map Operator Tree:
>                 TableScan
>                   alias: x
>                   Filter Operator
>                     predicate: (key < 20) (type: boolean)
>                     Select Operator
>                       expressions: key (type: string), value (type: string)
>                       outputColumnNames: _col0, _col1
>                       Map Join Operator
>                         condition map:
>                              Inner Join 0 to 1
>                         condition expressions:
>                           0 {_col1}
>                           1 {key} {value}
>                         keys:
>                           0 _col0 (type: string)
>                           1 key (type: string)
>                         outputColumnNames: _col1, _col2, _col3
>                         input vertices:
>                           1 Map 4
>                         Select Operator
>                           expressions: _col2 (type: string), _col3 (type: string), _col1 (type: string)
>                           outputColumnNames: _col0, _col1, _col2
>                           File Output Operator
>                             compressed: false
>                             table:
>                                 input format: org.apache.hadoop.mapred.TextInputFormat
>                                 output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                                 serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                                 name: default.dest_j1
>             Local Work:
>               Map Reduce Local Work
>         Map 3
>             Map Operator Tree:
>                 TableScan
>                   alias: x1
>                   Filter Operator
>                     predicate: (key > 100) (type: boolean)
>                     Select Operator
>                       expressions: key (type: string), value (type: string)
>                       outputColumnNames: _col0, _col1
>                       Map Join Operator
>                         condition map:
>                              Inner Join 0 to 1
>                         condition expressions:
>                           0 {_col1}
>                           1 {key} {value}
>                         keys:
>                           0 _col0 (type: string)
>                           1 key (type: string)
>                         outputColumnNames: _col1, _col2, _col3
>                         input vertices:
>                           1 Map 4
>                         Select Operator
>                           expressions: _col2 (type: string), _col3 (type: string), _col1 (type: string)
>                           outputColumnNames: _col0, _col1, _col2
>                           File Output Operator
>                             compressed: false
>                             table:
>                                 input format: org.apache.hadoop.mapred.TextInputFormat
>                                 output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                                 serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                                 name: default.dest_j1
>             Local Work:
>               Map Reduce Local Work
>         Union 2
>             Vertex: Union 2
>
>   Stage: Stage-2
>     Dependency Collection
>
>   Stage: Stage-0
>     Move Operator
>       tables:
>           replace: true
>           table:
>               input format: org.apache.hadoop.mapred.TextInputFormat
>               output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>               serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>               name: default.dest_j1
>
>   Stage: Stage-3
>     Stats-Aggr Operator
>
>   Stage: Stage-5
>     Spark
>       DagName: chao_20141118150101_a47a2d7b-e750-4764-be66-5ba95ebbe433:5
>       Vertices:
>         Map 4
>             Map Operator Tree:
>                 TableScan
>                   alias: x
>                   Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
>                   Filter Operator
>                     predicate: key is not null (type: boolean)
>                     Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
>                     Spark HashTable Sink Operator
>                       condition expressions:
>                         0 {_col1}
>                         1 {value}
>                       keys:
>                         0 _col0 (type: string)
>                         1 key (type: string)
>                     Reduce Output Operator
>                       key expressions: key (type: string)
>                       sort order: +
>                       Map-reduce partition columns: key (type: string)
>                       Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
>                       value expressions: value (type: string)
>             Local Work:
>               Map Reduce Local Work
>
> Time taken: 0.127 seconds, Fetched: 156 row(s)
> {noformat}
> Note that Stage-4 and Stage-5 are identical. Also, in Stage-4 there's a
> parallel RS operator with the HTS operator, which is strange.