-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13059/
-----------------------------------------------------------

Review request for hive, Eric Hanson and Jitendra Pandey.

Bugs: HIVE-4850
    https://issues.apache.org/jira/browse/HIVE-4850

Repository: hive-git

Description
-------

This is not the final iteration, but I thought it would be easier to discuss it with a review. This implementation works and handles multiple aliases and multiple values per key. It uses the existing hash tables saved by the local task for the map join, which are row-mode hash tables (they have row-mode keys and store row-mode writable object values).

Going forward we should avoid the size-of-big-table conversions of big table keys to row mode and the conversion of small table values to vector data. This would require either converting the hash tables on the fly to vector-friendly ones (when loaded) or changing the local task hashtable sink to create a vectorization-friendly hash table. The first approach may have memory consumption problems (potentially two hash tables end up in memory; we would have to stream the transformation or transform while reading from the serialized format... nasty).

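To make the two conversions mentioned above concrete, here is a minimal sketch of probing a row-mode hash table from a vectorized batch. It is not code from this patch: RowModeProbeSketch, OutputBatch, probe and sampleTable are hypothetical names, the key and value are assumed to be single long columns, and a plain java.util.HashMap stands in for the hash table saved by the local task.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RowModeProbeSketch {

    /** Hypothetical output batch: two primitive columns plus a row count. */
    static final class OutputBatch {
        long[] bigTableKey = new long[8];
        long[] smallTableValue = new long[8];
        int size = 0;

        void append(long key, long value) {
            if (size == bigTableKey.length) {
                // Grow both columns together so they stay the same length.
                bigTableKey = Arrays.copyOf(bigTableKey, size * 2);
                smallTableValue = Arrays.copyOf(smallTableValue, size * 2);
            }
            bigTableKey[size] = key;
            smallTableValue[size] = value;
            size++;
        }
    }

    /** Probes a row-mode table (boxed key -> list of boxed value rows) with a primitive key column. */
    static OutputBatch probe(Map<List<Object>, List<Object[]>> rowModeTable,
                             long[] keyColumn, int batchSize) {
        OutputBatch out = new OutputBatch();
        for (int i = 0; i < batchSize; i++) {
            // Conversion 1: box the primitive key into a row-mode key.
            // This happens once per big-table row, hence "size of big table".
            List<Object> rowModeKey = Collections.singletonList((Object) keyColumn[i]);
            List<Object[]> matches = rowModeTable.get(rowModeKey);
            if (matches == null) {
                continue; // inner join: rows without a match are dropped
            }
            for (Object[] smallTableRow : matches) {
                // Conversion 2: unbox the row-mode value back into vector data.
                out.append(keyColumn[i], (Long) smallTableRow[0]);
            }
        }
        return out;
    }

    /** Builds a tiny small-table hash with two values for key 1 and one value for key 3. */
    static Map<List<Object>, List<Object[]>> sampleTable() {
        Map<List<Object>, List<Object[]>> table = new HashMap<>();
        long[][] rows = {{1L, 10L}, {1L, 11L}, {3L, 30L}};
        for (long[] row : rows) {
            table.computeIfAbsent(Collections.singletonList((Object) row[0]),
                                  k -> new ArrayList<>())
                 .add(new Object[] {row[1]});
        }
        return table;
    }

    public static void main(String[] args) {
        OutputBatch out = probe(sampleTable(), new long[] {1L, 2L, 3L}, 3);
        for (int i = 0; i < out.size; i++) {
            System.out.println(out.bigTableKey[i] + " -> " + out.smallTableValue[i]);
        }
    }
}
```

A vector-friendly hash table would let the probe read the primitive key directly and copy values without the boxing and unboxing steps marked in the comments.
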
Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java 82d4b93
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 31dbf41
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java 4da1be8
  ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 29de38d
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java e579c00
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinDoubleKeys.java d774226
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectKey.java 791bb3f
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java 58a9dc0
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinSingleKey.java 4bff936
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ColumnVector.java 8b4c615
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorColumnAssign.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorColumnAssignFactory.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorExecMapper.java 083b9b9
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapOperator.java 41d2001
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 9c90230
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatch.java ff13f89
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriterFactory.java 9e189c9
  ql/src/java/org/apache/hadoop/hive/ql/plan/HashTableDummyDesc.java f15ce48

Diff: https://reviews.apache.org/r/13059/diff/

Testing
-------

Manually ran some join queries on the alltypes_orc table.

Thanks,

Remus Rusanu