-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29281/
-----------------------------------------------------------

(Updated Dec. 20, 2014, 3:32 a.m.)


Review request for hive.


Changes
-------

Removed the unnecessary type check.


Bugs: HIVE-8640
    https://issues.apache.org/jira/browse/HIVE-8640


Repository: hive-git


Description
-------

This change follows the same principle as the refactoring in HIVE-8639. The goal is to move as much of the join optimization as possible into the same traversal, and in fact into the same process(joinOp) method, both to simplify the logic and to improve compiler performance.

Although it is too hard to bring SparkMapJoinProcessor (for mapjoin hints) to the same level because of the way it was written (see HIVE-8911), it is possible to bring the bucket join and SMB join hints to the same level.

This change introduces a parallel processor called 'SparkJoinHintOptimizer', which takes a mapjoin already converted by SparkMapJoinProcessor as input and converts it to a bucket or SMB join accordingly. It runs alongside 'SparkJoinOptimizer', which takes a common join operator and handles the auto-conversion to mapjoin/bucketJoin/SMBJoin.

The one difference between mapjoin/bucketJoin and SMB join, as Chao found, is that Spark expects an RS (ReduceSink) on the small-table branches of a mapjoin/bucketJoin but not of an SMB join. So I added a class, SparkSMBJoinHintOptimizer, that first removes these RS operators before reusing the rest of the existing code.

Another issue was found in NonBlockingOpDeDupProc, which corrupts the 'mapJoinContext' data structure in the parse context. A fix is offered in HIVE-9117; it should be committed to trunk and merged first, but it is included here for reference.

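
For readers who have not opened the diff, the hint path described above can be sketched as a toy model. This is not the code in the patch and does not use Hive's operator classes: the JoinHintSketch class, its Join type, the string operator names and the "bucketed and sorted" eligibility flags are all hypothetical stand-ins chosen for illustration. It only shows the order of steps: start from a join that is already a mapjoin, and strip the RS from the small-table branches before treating it as an SMB join.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only. These types are hypothetical and are NOT Hive's operator classes.
final class JoinHintSketch {
    enum JoinKind { COMMON_JOIN, MAP_JOIN, BUCKET_MAP_JOIN, SMB_JOIN }

    static final class Join {
        JoinKind kind;
        // Assumption: eligibility is reduced to two flags for the sake of the sketch.
        final boolean bucketed;
        final boolean sorted;
        // One list of operator names per small-table branch, e.g. ["TS", "RS"].
        final List<List<String>> smallTableBranches = new ArrayList<>();

        Join(JoinKind kind, boolean bucketed, boolean sorted) {
            this.kind = kind;
            this.bucketed = bucketed;
            this.sorted = sorted;
        }

        Join addSmallTableBranch(String... ops) {
            smallTableBranches.add(new ArrayList<>(Arrays.asList(ops)));
            return this;
        }
    }

    // SMB joins do not expect an RS on the small-table branches, so drop them first.
    static void removeReduceSinks(Join join) {
        for (List<String> branch : join.smallTableBranches) {
            branch.removeIf("RS"::equals);
        }
    }

    // Hint path: only acts on a join that was already converted to a mapjoin.
    static JoinKind applyHint(Join join) {
        if (join.kind != JoinKind.MAP_JOIN) {
            return join.kind; // common joins are left to the auto-conversion path
        }
        if (join.bucketed && join.sorted) {
            removeReduceSinks(join);
            join.kind = JoinKind.SMB_JOIN;
        } else if (join.bucketed) {
            join.kind = JoinKind.BUCKET_MAP_JOIN; // RS on small-table branches is kept
        }
        return join.kind;
    }

    public static void main(String[] args) {
        Join hinted = new Join(JoinKind.MAP_JOIN, true, true).addSmallTableBranch("TS", "RS");
        System.out.println(applyHint(hinted) + " " + hinted.smallTableBranches);
    }
}
```

In the sketch the bucket path leaves the RS in place and the SMB path removes it, which is the only behavioral difference the description calls out between the two conversions.
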

Diffs (updated)
-----

  itests/src/test/resources/testconfiguration.properties fd732c1 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/NonBlockingOpDeDupProc.java 5e0959a 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkJoinHintOptimizer.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkSMBJoinHintOptimizer.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkSortMergeJoinOptimizer.java 6a47513 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 5227d92 
  ql/src/test/results/clientpositive/spark/bucketmapjoin1.q.out b18e02f 
  ql/src/test/results/clientpositive/spark/bucketmapjoin11.q.out bb7214c 
  ql/src/test/results/clientpositive/spark/bucketmapjoin12.q.out c0adef4 
  ql/src/test/results/clientpositive/spark/bucketmapjoin13.q.out 98d0706 
  ql/src/test/results/clientpositive/spark/bucketmapjoin2.q.out ea763c7 
  ql/src/test/results/clientpositive/spark/bucketmapjoin3.q.out 1b31561 
  ql/src/test/results/clientpositive/spark/bucketmapjoin4.q.out 97d2d74 
  ql/src/test/results/clientpositive/spark/bucketmapjoin5.q.out 94952a1 
  ql/src/test/results/clientpositive/spark/bucketmapjoin7.q.out ca59d02 
  ql/src/test/results/clientpositive/spark/bucketmapjoin8.q.out f419eaf 
  ql/src/test/results/clientpositive/spark/bucketmapjoin_negative2.q.out b954feb 
  ql/src/test/results/clientpositive/spark/bucketmapjoin_negative3.q.out bfe5438 
  ql/src/test/results/clientpositive/spark/smb_mapjoin9.q.out d769ebe 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_1.q.out 8d0527e 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_10.q.out 2df87cf 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_11.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_12.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_13.q.out 5637206 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_14.q.out 3aed084 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_15.q.out 6ed680d 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_16.q.out a4fd7c3 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_17.q.out 6293450 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_2.q.out 1cf144b 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_3.q.out 6b44d2c 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_4.q.out d07d65a 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_5.q.out 607b1f0 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_6.q.out 30746ff 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_7.q.out c48ed6d 

Diff: https://reviews.apache.org/r/29281/diff/


Testing
-------

Re-enabled all the smb_mapjoin.* tests.

I saw that many of the tests are again not alphabetized, so I re-ran the script to alphabetize them. As part of that, I realized that some tests, such as 'bucket_map_join_spark.*' and 'join_empty', were missing the comma delimiter separating them from the next test and so were probably not being run. I also fixed windowing.q, which is the last test. This is all unrelated to the change, but I am not sure whether these tests will trigger additional failures, given that they were unintentionally disabled.


Thanks,

Szehon Ho