-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7126/
-----------------------------------------------------------
(Updated Sept. 24, 2012, 2:33 p.m.)


Review request for hive.


Changes
-------

bug fix + 2 new tests


Description
-------

This optimizer exploits intra-query correlations and merges multiple correlated MapReduce jobs into one job.

I opened a new request because I have been working on hive-git.


This addresses bug HIVE-2206.
    https://issues.apache.org/jira/browse/HIVE-2206


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 2693663
  ql/src/java/org/apache/hadoop/hive/ql/exec/BaseReduceSinkOperator.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/CorrelationCompositeOperator.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/CorrelationLocalSimulativeReduceSinkOperator.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/CorrelationReducerDispatchOperator.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecReducer.java 283d0b6
  ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 8669051
  ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 5f08519
  ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 0c22141
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 919a140
  ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 1a40630
  ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java 1469325
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CorrelationOptimizer.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CorrelationOptimizerUtils.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 40dd949
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java f292131
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 8bacd3d
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 33ce6ca
  ql/src/java/org/apache/hadoop/hive/ql/plan/BaseReduceSinkDesc.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/plan/CorrelationCompositeDesc.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/plan/CorrelationLocalSimulativeReduceSinkDesc.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/plan/CorrelationReducerDispatchDesc.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java 5f38bf2
  ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java 16eb125
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 9a95efd
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 142f040
  ql/src/test/queries/clientpositive/correlationoptimizer1.q PRE-CREATION
  ql/src/test/queries/clientpositive/correlationoptimizer2.q PRE-CREATION
  ql/src/test/queries/clientpositive/correlationoptimizer3.q PRE-CREATION
  ql/src/test/queries/clientpositive/correlationoptimizer4.q PRE-CREATION
  ql/src/test/queries/clientpositive/correlationoptimizer5.q PRE-CREATION
  ql/src/test/results/clientpositive/correlationoptimizer1.q.out PRE-CREATION
  ql/src/test/results/clientpositive/correlationoptimizer2.q.out PRE-CREATION
  ql/src/test/results/clientpositive/correlationoptimizer3.q.out PRE-CREATION
  ql/src/test/results/clientpositive/correlationoptimizer4.q.out PRE-CREATION
  ql/src/test/results/clientpositive/correlationoptimizer5.q.out PRE-CREATION
  ql/src/test/results/compiler/plan/groupby1.q.xml 4382252
  ql/src/test/results/compiler/plan/groupby2.q.xml eef669c
  ql/src/test/results/compiler/plan/groupby3.q.xml 9743480
  ql/src/test/results/compiler/plan/groupby5.q.xml 8e07860

Diff: https://reviews.apache.org/r/7126/diff/


Testing
-------

I could not test TestHBaseMinimrCliDriver, TestHBaseCliDriver, TestHBaseNegativeCliDriver, testSynchronized in TestEmbeddedHiveMetaStore, testSynchronized in TestRemoteHiveMetaStore, testSynchronized in TestSetUGIOnBothClientServer, testSynchronized in TestSetUGIOnOnlyClient, testSynchronized in TestSetUGIOnOnlyServer, and testNegativeCliDriver_local_mapred_error_cache in TestNegativeCliDriver, because trunk itself fails these tests on my machine.
Also, trunk returns rows in a different order for queries skewjoinopt1.q to skewjoinopt5.q, skewjoinopt10.q, skewjoinopt15.q to skewjoinopt17.q, and skewjoinopt19.q to skewjoinopt20.q, so I could not test these queries on my machine either. All other tests pass.


Thanks,

Yin Huai