-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7126/
-----------------------------------------------------------

(Updated Sept. 18, 2012, 5:43 p.m.)


Review request for hive.


Changes
-------

Bug fix + 3 test cases


Description
-------

This optimizer exploits intra-query correlations and merges multiple correlated 
MapReduce jobs into one job. I opened a new request because I have been working 
on hive-git.
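
As a toy illustration of the idea only (this is not the code in this patch; the class, field, and method names below are made up for the example), consider a plan in which consecutive stages shuffle on the same key. Such stages are candidates for sharing a single MapReduce job. The sketch assumes a simplified linear plan and a greedy grouping rule, which is much cruder than what a real optimizer has to check:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CorrelationSketch {
    // Hypothetical model of one MapReduce stage: a label plus the columns it shuffles on.
    static final class Stage {
        final String name;
        final List<String> partitionKeys;

        Stage(String name, List<String> partitionKeys) {
            this.name = name;
            this.partitionKeys = partitionKeys;
        }
    }

    // Greedily groups consecutive stages that shuffle on the same keys.
    // Each inner list stands for one merged MapReduce job.
    static List<List<Stage>> mergeCorrelated(List<Stage> plan) {
        List<List<Stage>> jobs = new ArrayList<>();
        for (Stage stage : plan) {
            List<Stage> last = jobs.isEmpty() ? null : jobs.get(jobs.size() - 1);
            if (last != null && last.get(0).partitionKeys.equals(stage.partitionKeys)) {
                last.add(stage);            // same shuffle key: reuse the previous job
            } else {
                List<Stage> job = new ArrayList<>();
                job.add(stage);             // different key: a new job is needed
                jobs.add(job);
            }
        }
        return jobs;
    }

    public static void main(String[] args) {
        List<Stage> plan = Arrays.asList(
            new Stage("join on key", Arrays.asList("key")),
            new Stage("group by key", Arrays.asList("key")),
            new Stage("order by cnt", Arrays.asList("cnt")));
        List<List<Stage>> jobs = mergeCorrelated(plan);
        // Prints "3 stages -> 2 jobs"
        System.out.println(plan.size() + " stages -> " + jobs.size() + " jobs");
    }
}
```

Here the join and the aggregation both partition on `key`, so they end up in one job, while the final sort needs its own shuffle.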


This addresses bug HIVE-2206.
    https://issues.apache.org/jira/browse/HIVE-2206


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 2693663 
  ql/src/java/org/apache/hadoop/hive/ql/exec/BaseReduceSinkOperator.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/CorrelationCompositeOperator.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/CorrelationLocalSimulativeReduceSinkOperator.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/CorrelationReducerDispatchOperator.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecReducer.java 283d0b6 
  ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 8669051 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 05a399d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 0c22141 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 919a140 
  ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 1a40630 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java 1469325 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CorrelationOptimizer.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CorrelationOptimizerUtils.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 6bc5fe4 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java f292131 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 8bacd3d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 63e8ff2 
  ql/src/java/org/apache/hadoop/hive/ql/plan/BaseReduceSinkDesc.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/plan/CorrelationCompositeDesc.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/plan/CorrelationLocalSimulativeReduceSinkDesc.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/plan/CorrelationReducerDispatchDesc.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java 5f38bf2 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java 16eb125 
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 9a95efd 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 142f040 
  ql/src/test/queries/clientpositive/correlationoptimizer1.q PRE-CREATION 
  ql/src/test/queries/clientpositive/correlationoptimizer2.q PRE-CREATION 
  ql/src/test/queries/clientpositive/correlationoptimizer3.q PRE-CREATION 
  ql/src/test/results/clientpositive/correlationoptimizer1.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/correlationoptimizer2.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/correlationoptimizer3.q.out PRE-CREATION 
  ql/src/test/results/compiler/plan/groupby1.q.xml 4382252 
  ql/src/test/results/compiler/plan/groupby2.q.xml eef669c 
  ql/src/test/results/compiler/plan/groupby3.q.xml 9743480 
  ql/src/test/results/compiler/plan/groupby5.q.xml 8e07860 

Diff: https://reviews.apache.org/r/7126/diff/


Testing
-------

I could not run TestHBaseMinimrCliDriver, TestHBaseCliDriver, 
TestHBaseNegativeCliDriver, testSynchronized in TestEmbeddedHiveMetaStore, 
testSynchronized in TestRemoteHiveMetaStore, testSynchronized in 
TestSetUGIOnBothClientServer, testSynchronized in TestSetUGIOnOnlyClient, 
testSynchronized in TestSetUGIOnOnlyServer, and 
testNegativeCliDriver_local_mapred_error_cache in TestNegativeCliDriver. This 
patch should pass all other tests.

When the optimizer is enabled (it is currently disabled by default), several 
test cases fail. One is optimized by the optimizer. One is not suitable for 
this correlation optimizer. Two are due to potential bugs in trunk. The other 
failures are parsing cases (XML plans); they are caused by my minor changes in 
SemanticAnalyzer, since several redundant operators are generated for the 
correlation optimizer. Overall, those failures are not very relevant to the 
patch. For details, please see 
https://issues.apache.org/jira/browse/HIVE-2206?focusedCommentId=13456171&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13456171


Thanks,

Yin Huai
