[ https://issues.apache.org/jira/browse/HIVE-6144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887641#comment-13887641 ]

Lefty Leverenz commented on HIVE-6144:
--------------------------------------

*hive.auto.convert.join.use.nonstaged* is now documented in the Configuration Properties wikidoc. It should also be documented in the Join Optimization doc's "Optimize Auto Join Conversion" section, preferably with guidance and an example.

Quick ref:

* [Optimize Auto Join Conversion |https://cwiki.apache.org/confluence/display/Hive/LanguageManual+JoinOptimization#LanguageManualJoinOptimization-OptimizeAutoJoinConversion]
* [Configuration Properties |https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties] -- search for "hive.auto.convert.join" or the whole string (they're all together)

> Implement non-staged MapJoin
> ----------------------------
>
>                 Key: HIVE-6144
>                 URL: https://issues.apache.org/jira/browse/HIVE-6144
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>             Fix For: 0.13.0
>
>         Attachments: HIVE-6144.1.patch.txt, HIVE-6144.2.patch.txt,
> HIVE-6144.3.patch.txt, HIVE-6144.4.patch.txt, HIVE-6144.5.patch.txt,
> HIVE-6144.6.patch.txt, HIVE-6144.7.patch.txt, HIVE-6144.8.patch.txt,
> HIVE-6144.9.patch.txt
>
>
> For a map join, all data in the small aliases is hashed and stored in a
> temporary file by MapRedLocalTask. For aliases without a filter or
> projection, however, that step seems unnecessary. For example,
> {noformat}
> select a.* from src a join src b on a.key=b.key;
> {noformat}
> produces a plan like this:
> {noformat}
> STAGE PLANS:
>   Stage: Stage-4
>     Map Reduce Local Work
>       Alias -> Map Local Tables:
>         a
>           Fetch Operator
>             limit: -1
>       Alias -> Map Local Operator Tree:
>         a
>           TableScan
>             alias: a
>             HashTable Sink Operator
>               condition expressions:
>                 0 {key} {value}
>                 1
>               handleSkewJoin: false
>               keys:
>                 0 [Column[key]]
>                 1 [Column[key]]
>               Position of Big Table: 1
>
>   Stage: Stage-3
>     Map Reduce
>       Alias -> Map Operator Tree:
>         b
>           TableScan
>             alias: b
>             Map Join Operator
>               condition map:
>                    Inner Join 0 to 1
>               condition expressions:
>                 0 {key} {value}
>                 1
>               handleSkewJoin: false
>               keys:
>                 0 [Column[key]]
>                 1 [Column[key]]
>               outputColumnNames: _col0, _col1
>               Position of Big Table: 1
>               Select Operator
>                 File Output Operator
>       Local Work:
>         Map Reduce Local Work
>
>   Stage: Stage-0
>     Fetch Operator
> {noformat}
> Table src (alias a) is fetched and stored as-is in MRLocalTask. With this
> patch, the plan can look like this:
> {noformat}
>   Stage: Stage-3
>     Map Reduce
>       Alias -> Map Operator Tree:
>         b
>           TableScan
>             alias: b
>             Map Join Operator
>               condition map:
>                    Inner Join 0 to 1
>               condition expressions:
>                 0 {key} {value}
>                 1
>               handleSkewJoin: false
>               keys:
>                 0 [Column[key]]
>                 1 [Column[key]]
>               outputColumnNames: _col0, _col1
>               Position of Big Table: 1
>               Select Operator
>                 File Output Operator
>       Local Work:
>         Map Reduce Local Work
>           Alias -> Map Local Tables:
>             a
>               Fetch Operator
>                 limit: -1
>           Alias -> Map Local Operator Tree:
>             a
>               TableScan
>                 alias: a
>           Has Any Stage Alias: false
>
>   Stage: Stage-0
>     Fetch Operator
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
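
As a rough sketch of the kind of example requested in the comment, the session below turns on the non-staged path and prints the plan for the query from the issue description. It assumes a Hive 0.13.0 or later session running on the MapReduce engine and the `src` test table used in the description. Whether the join is actually converted depends on the table sizes and the other `hive.auto.convert.join` settings in effect, so the output may differ from the plans quoted above.

```sql
-- Sketch only: assumes Hive 0.13.0+ and the `src` test table from the issue description.

-- Let the optimizer convert eligible common joins into map joins.
SET hive.auto.convert.join=true;

-- Ask Hive to skip the separate MapRedLocalTask staging step for small
-- aliases that need no filter or projection (the property added by HIVE-6144).
SET hive.auto.convert.join.use.nonstaged=true;

-- Print the plan without running the query.
EXPLAIN
SELECT a.* FROM src a JOIN src b ON a.key = b.key;
```

If the setting takes effect, the small alias should appear under "Local Work" inside the map-join stage instead of in its own "Map Reduce Local Work" stage, as in the second plan in the description.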