-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30443/
-----------------------------------------------------------

Review request for hive and Xuefu Zhang.


Repository: hive-git


Description
-------

This patch refactors the SMB MapJoin optimization for Spark so that it runs in 
a single pass.  The core of the optimization is annotating each MapWork with 
information from its SMBMapJoinOperator and that operator's roots (TableScans).

Instead of initializing and annotating the MapWork in 
SparkSortMergeJoinFactory during a second pass, GenSparkWork and 
SparkSortMergeJoinFactory now both collect the required information during the 
single pass.  Once that pass completes, we iterate over all the 
SMBMapJoinOperators and annotate their MapWorks.
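
To illustrate the collect-then-annotate flow described above, here is a 
minimal, self-contained sketch.  It is not the code in this patch: the class 
OnePassSmbSketch, its nested SmbJoinInfo and Context types, and the method 
names are all hypothetical stand-ins, and operators and works are modelled as 
plain strings rather than Hive's real operator and MapWork classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OnePassSmbSketch {

    /** Hypothetical per-join record; NOT the real SparkSMBMapJoinInfo. */
    static final class SmbJoinInfo {
        final List<String> tableScans = new ArrayList<>(); // roots feeding the join
        String mapWork;                                     // work that runs the join
    }

    /** Hypothetical stand-in for the state shared during the traversal. */
    static final class Context {
        // join operator name -> information collected so far
        final Map<String, SmbJoinInfo> smbJoins = new LinkedHashMap<>();
        // MapWork name -> TableScans it was annotated with
        final Map<String, List<String>> annotations = new LinkedHashMap<>();

        SmbJoinInfo infoFor(String joinOp) {
            return smbJoins.computeIfAbsent(joinOp, k -> new SmbJoinInfo());
        }
    }

    // Collection step: called whenever the traversal sees a TableScan root.
    static void recordTableScan(Context ctx, String joinOp, String tableScan) {
        ctx.infoFor(joinOp).tableScans.add(tableScan);
    }

    // Collection step: called whenever the traversal creates the join's work.
    static void recordMapWork(Context ctx, String joinOp, String mapWork) {
        ctx.infoFor(joinOp).mapWork = mapWork;
    }

    // Annotation step: runs once, after the single traversal has finished.
    static void annotateAll(Context ctx) {
        for (SmbJoinInfo info : ctx.smbJoins.values()) {
            if (info.mapWork == null) {
                continue; // no work was recorded for this join; nothing to annotate
            }
            ctx.annotations
                .computeIfAbsent(info.mapWork, k -> new ArrayList<>())
                .addAll(info.tableScans);
        }
    }

    public static void main(String[] args) {
        Context ctx = new Context();
        // The order of collection calls does not matter; annotation happens last.
        recordTableScan(ctx, "SMBJOIN_1", "TS_0");
        recordMapWork(ctx, "SMBJOIN_1", "Map 1");
        recordTableScan(ctx, "SMBJOIN_1", "TS_1");
        annotateAll(ctx);
        System.out.println(ctx.annotations);
    }
}
```

The point of the sketch is only that annotation is deferred until every 
collector has run, so no second traversal of the operator tree is needed.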


Diffs
-----

  
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkSortMergeJoinFactory.java 6e0ac38
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 773cfbd
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 0eac6e1
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkWork.java cb5d4fe
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 3a7477a
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkSMBMapJoinInfo.java PRE-CREATION

Diff: https://reviews.apache.org/r/30443/diff/


Testing
-------


Thanks,

Szehon Ho
