[ https://issues.apache.org/jira/browse/HIVE-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rui Li updated HIVE-9097:
-------------------------
    Attachment: HIVE-9097.1-spark.patch

The patch splits the original spark task into two tasks so that conditional map joins can be inserted to process skewed data.
Changes to golden files are all in query plan.

> Support runtime skew join for more queries [Spark Branch]
> ---------------------------------------------------------
>
>                 Key: HIVE-9097
>                 URL: https://issues.apache.org/jira/browse/HIVE-9097
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Rui Li
>            Assignee: Rui Li
>         Attachments: HIVE-9097.1-spark.patch
>
>
> After HIVE-8913, runtime skew join is enabled for spark. But currently the
> optimization only supports the simplest case where join is the leaf
> ReduceWork in a work graph. This is because the results from the original
> join and the conditional map join have to be unioned to feed to downstream
> works, which can be a little tricky for spark.
> This JIRA is to research and find a way to relax the above restriction. A
> possible solution is to break the original task into two tasks on the join
> work, and insert the conditional task in between.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
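
For illustration only, the splitting idea described in the issue can be sketched with a toy model. This is not the code in the patch and does not use any Hive classes: `Task`, `split_at_join`, and `chain` are hypothetical names, a task is modelled as a plain ordered list of work names, and data flow (the union of the two join outputs) is ignored. The sketch only shows the task being cut at the join work and a conditional task being placed between the two halves.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Task:
    """Toy stand-in for a task: a named, ordered chain of works (hypothetical)."""
    name: str
    works: List[str]
    children: List["Task"] = field(default_factory=list)


def split_at_join(task: Task, join_work: str) -> Task:
    """Cut `task` right after `join_work` and insert a conditional task.

    Returns the head of the new chain: upstream -> conditional -> downstream.
    """
    idx = task.works.index(join_work)

    # First half: everything up to and including the join work.
    upstream = Task(f"{task.name}-1", task.works[: idx + 1])

    # Conditional task that would handle the skewed keys with a map join.
    conditional = Task(f"{task.name}-skew", [f"conditional-mapjoin({join_work})"])
    upstream.children.append(conditional)

    rest = task.works[idx + 1:]
    if rest:
        # Second half: the works downstream of the join, as a separate task.
        downstream = Task(f"{task.name}-2", rest, list(task.children))
        conditional.children.append(downstream)
    else:
        # The join was already the leaf work, so there is nothing to split off.
        conditional.children.extend(task.children)
    return upstream


def chain(task: Optional[Task]) -> List[str]:
    """Follow the first child of each task and collect the task names."""
    names = []
    while task is not None:
        names.append(task.name)
        task = task.children[0] if task.children else None
    return names


original = Task("stage", ["map1", "map2", "join", "groupby"])
head = split_at_join(original, "join")
print(chain(head))
```

In this toy model the join no longer has to be the last work of its task: whatever follows it moves into the second task, which runs after the conditional task.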