-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30388/#review70150
-----------------------------------------------------------
Initial feedback.


ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkMapJoinResolver.java
<https://reviews.apache.org/r/30388/#comment115156>

    childrenBackupTasks or backupChildrenTasks? I suggest more consistent
    variable/method names. Since the noun is "task", I suggest "child".


ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java
<https://reviews.apache.org/r/30388/#comment115157>

    In Spark branch -> For Spark


- Xuefu Zhang


On Jan. 29, 2015, 1:05 a.m., Chao Sun wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30388/
> -----------------------------------------------------------
> 
> (Updated Jan. 29, 2015, 1:05 a.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Bugs: HIVE-9103
>     https://issues.apache.org/jira/browse/HIVE-9103
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This patch adds a backup task to the map join task. The backup task, which
> uses a common join, is triggered if the map join task fails.
> 
> Note that no matter how many map joins there are in the SparkTask, we only
> generate one backup task. This means that if the original task fails at the
> very last map join, the whole task will be re-executed.
> 
> The handling of the backup task is a little different from what MR does,
> mostly because we convert JOIN to MAPJOIN during the operator plan
> optimization phase, at which time no task/work exists yet. In this patch, we
> clone the whole operator tree before the JOIN operator is converted. The
> cloned operator tree is then processed to generate a separate work tree for
> a separate backup SparkTask.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkMapJoinResolver.java 69004dc 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/StageIDsRearranger.java 79c3e02 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkJoinOptimizer.java d57ceff 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java 9ff47c7 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkSortMergeJoinFactory.java 6e0ac38 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java b838bff 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 773cfbd 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/OptimizeSparkProcContext.java f7586a4 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 3a7477a 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 0e85990 
>   ql/src/test/results/clientpositive/spark/auto_join25.q.out ab01b8a 
> 
> Diff: https://reviews.apache.org/r/30388/diff/
> 
> 
> Testing
> -------
> 
> auto_join25.q
> 
> 
> Thanks,
> 
> Chao Sun
> 
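
The quoted description above explains the fallback behaviour: each SparkTask
that was rewritten to use a map join gets a single common-join backup task,
which is executed only when the map-join task fails. The following minimal
Java sketch illustrates that fallback pattern only; it is not the Hive
Task/SparkTask API touched by this patch, and every class and method name in
it is hypothetical.

    // Illustrative sketch of the "backup task" fallback described in the patch.
    // All names here are made up; this is not the Hive implementation.
    import java.util.concurrent.Callable;

    public class BackupTaskSketch {

        // Runs the primary (map-join) task; if it fails, re-executes the whole
        // fragment with the backup (common-join) task.
        static <T> T runWithBackup(Callable<T> mapJoinTask, Callable<T> commonJoinBackup)
                throws Exception {
            try {
                return mapJoinTask.call();
            } catch (Exception e) {
                System.err.println("Map-join task failed (" + e.getMessage()
                        + "), falling back to the common-join backup task");
                // Only one backup exists per SparkTask, so even a failure at the
                // very last map join re-runs the whole fragment, as the
                // description notes.
                return commonJoinBackup.call();
            }
        }

        public static void main(String[] args) throws Exception {
            Callable<String> mapJoin = () -> {
                // Simulate the optimized map-join path failing, e.g. the small
                // table not fitting in memory.
                throw new RuntimeException("small-table hash table exceeded memory");
            };
            Callable<String> commonJoin = () -> "rows produced by common join";

            System.out.println(runWithBackup(mapJoin, commonJoin));
        }
    }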