[ https://issues.apache.org/jira/browse/HIVE-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965652#action_12965652 ]
Sreekanth Ramakrishnan commented on HIVE-1695: ---------------------------------------------- Attaching a sample plan based on the above approach {noformat} ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF test a) (TOK_TABREF test1 b) (= (. (TOK_TABLE_OR_COL a) key) (. (TOK_TABLE_OR_COL b) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_HINTLIST (TOK_HINT TOK_MAPJOIN (TOK_HINTARGLIST b))) (TOK_SELEXPR (. (TOK_TABLE_OR_COL a) key))) (TOK_ORDERBY (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL a) key))))) STAGE DEPENDENCIES: Stage-4 is a root stage Stage-1 depends on stages: Stage-4 Stage-0 is a root stage STAGE PLANS: Stage: Stage-4 Map Reduce Local Work Alias -> Map Local Tables: b Fetch Operator limit: -1 Alias -> Map Local Operator Tree: b TableScan alias: b HashTable Sink Operator condition expressions: 0 {key} 1 handleSkewJoin: false keys: 0 [Column[key]] 1 [Column[key]] Position of Big Table: 0 Stage: Stage-1 Map Reduce Alias -> Map Operator Tree: a TableScan alias: a Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {key} 1 handleSkewJoin: false keys: 0 [Column[key]] 1 [Column[key]] outputColumnNames: _col0 Position of Big Table: 0 Reduce Output Operator key expressions: expr: _col0 type: int sort order: + tag: -1 value expressions: expr: _col0 type: int Local Work: Map Reduce Local Work Reduce Operator Tree: Extract File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Stage: Stage-0 Fetch Operator limit: -1 {noformat} Currently, trying to fix the issue in the map failure due to the above plan. Stil figuring out how to add a Select Operator at the end of the map join operator. > MapJoin followed by ReduceSink should be done as single MapReduce Job > --------------------------------------------------------------------- > > Key: HIVE-1695 > URL: https://issues.apache.org/jira/browse/HIVE-1695 > Project: Hive > Issue Type: Improvement > Components: Query Processor > Reporter: Amareshwari Sriramadasu > > Currently MapJoin followed by ReduceSink runs as two MapReduce jobs : One map > only job followed by a Map-Reduce job. It can be combined into single > MapReduce Job. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.