-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13059/
-----------------------------------------------------------

Review request for hive, Eric Hanson and Jitendra Pandey.


Bugs: HIVE-4850
    https://issues.apache.org/jira/browse/HIVE-4850


Repository: hive-git


Description
-------

This is not the final iteration, but I thought it would be easier to discuss 
it with a review.
This implementation works and handles multiple aliases and multiple values per 
key. It uses the existing hash tables saved by the local task for the map 
join, which are row-mode hash tables (they have row-mode keys and store 
row-mode writable object values). Going forward we should avoid the 
conversions that scale with the size of the big table: converting big table 
keys to row mode and converting small table values to vector data. This would 
require either converting the hash tables on the fly to vector-friendly ones 
(when loaded) or changing the local task hashtable sink to create a 
vectorization-friendly hash table. The first approach may have memory 
consumption problems (potentially two hash tables end up in memory; we would 
have to stream the transformation or transform while reading from the 
serialized format... nasty).
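
To make the per-row conversion cost concrete, here is a minimal sketch of a 
row-mode probe over one vectorized key column. It is not the code in the 
patch: the class name RowModeProbeSketch, the Output holder, the probe() 
method and the single long key column are all assumptions made for 
illustration, and a plain java.util.HashMap stands in for the row-mode hash 
table saved by the local task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from the Hive patch.
final class RowModeProbeSketch {

    /** Output columns of the join: big-table key and matching small-table value. */
    static final class Output {
        final long[] keys;
        final long[] values;

        Output(long[] keys, long[] values) {
            this.keys = keys;
            this.values = values;
        }
    }

    /**
     * Probes a row-mode table (boxed keys, boxed values) with the first
     * `size` entries of a primitive key column, inner-join style.
     */
    static Output probe(long[] keyColumn, int size, Map<Long, List<Long>> smallTable) {
        List<Long> outKeys = new ArrayList<>();
        List<Long> outValues = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            // Conversion 1: each big-table key is boxed into a row-mode key,
            // once per big-table row.
            Long rowModeKey = Long.valueOf(keyColumn[i]);
            List<Long> matches = smallTable.get(rowModeKey);
            if (matches == null) {
                continue; // no match: the row produces no output
            }
            // A key may map to several small-table values; emit one row each.
            for (Long value : matches) {
                outKeys.add(rowModeKey);
                outValues.add(value);
            }
        }
        // Conversion 2: matched row-mode values are copied back into
        // primitive, vector-style output columns.
        return new Output(toPrimitive(outKeys), toPrimitive(outValues));
    }

    private static long[] toPrimitive(List<Long> boxed) {
        long[] result = new long[boxed.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = boxed.get(i);
        }
        return result;
    }
}
```

Both conversions sit inside the per-row loop, so their cost grows with the 
big table. A vector-friendly hash table would presumably let the probe work 
on the primitive column directly, but how that would look is left open here.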


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java 82d4b93 
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 31dbf41 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java 4da1be8 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 29de38d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java e579c00 
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinDoubleKeys.java d774226 
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectKey.java 791bb3f 
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java 58a9dc0 
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinSingleKey.java 4bff936 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ColumnVector.java 8b4c615 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorColumnAssign.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorColumnAssignFactory.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorExecMapper.java 083b9b9 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapOperator.java 41d2001 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 9c90230 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatch.java ff13f89 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriterFactory.java 9e189c9 
  ql/src/java/org/apache/hadoop/hive/ql/plan/HashTableDummyDesc.java f15ce48 

Diff: https://reviews.apache.org/r/13059/diff/


Testing
-------

Manually ran some join queries on the alltypes_orc table.


Thanks,

Remus Rusanu
