-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11770/
-----------------------------------------------------------
Review request for hive.
Description
-------
Modifies ColumnProjectionUtils such that there are two flags: one for the column
ids and one indicating whether all columns should be read. Additionally, the
patch updates all locations which use the old convention of an empty string
indicating that all columns should be read.
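As a rough illustration of the two-flag idea described above, here is a minimal
sketch. It is not the code in the patch: the class name, method names, and
configuration keys are hypothetical, and a plain Map stands in for a Hadoop
Configuration so the example is self-contained. The point it shows is that an
empty id list combined with a false "read all" flag can mean "read no columns",
which the old empty-string convention could not express.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; names and keys are hypothetical, not Hive's API.
class ProjectionFlagsSketch {
  static final String READ_ALL_COLUMNS = "sketch.read.all.columns";
  static final String READ_COLUMN_IDS = "sketch.read.column.ids";

  /** Request every column; any previously stored id list is cleared. */
  static void setReadAllColumns(Map<String, String> conf) {
    conf.put(READ_ALL_COLUMNS, "true");
    conf.put(READ_COLUMN_IDS, "");
  }

  /** Request only the given columns; an empty list means "no columns". */
  static void setReadColumnIds(Map<String, String> conf, List<Integer> ids) {
    StringBuilder joined = new StringBuilder();
    for (int id : ids) {
      if (joined.length() > 0) {
        joined.append(',');
      }
      joined.append(id);
    }
    conf.put(READ_ALL_COLUMNS, "false");
    conf.put(READ_COLUMN_IDS, joined.toString());
  }

  static boolean isReadAllColumns(Map<String, String> conf) {
    // Assumption: an untouched configuration defaults to reading everything.
    return Boolean.parseBoolean(conf.getOrDefault(READ_ALL_COLUMNS, "true"));
  }

  static List<Integer> getReadColumnIds(Map<String, String> conf) {
    List<Integer> ids = new ArrayList<>();
    String raw = conf.getOrDefault(READ_COLUMN_IDS, "");
    if (raw.isEmpty()) {
      return ids;
    }
    for (String part : raw.split(",")) {
      ids.add(Integer.parseInt(part.trim()));
    }
    return ids;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();

    // A count(1) style query needs no column data at all.
    setReadColumnIds(conf, new ArrayList<Integer>());
    System.out.println(isReadAllColumns(conf) + " " + getReadColumnIds(conf));

    // A projection of two columns.
    setReadColumnIds(conf, Arrays.asList(0, 2));
    System.out.println(isReadAllColumns(conf) + " " + getReadColumnIds(conf));

    // Explicitly ask for everything.
    setReadAllColumns(conf);
    System.out.println(isReadAllColumns(conf) + " " + getReadColumnIds(conf));
  }
}
```

With a single string, "no ids" and "all columns" collapse into the same empty
value; keeping the boolean separate is what lets a reader skip column data
entirely when nothing is projected.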
The automatic formatter generated by ant eclipse-files is fairly aggressive, so
there is some unrelated import/whitespace cleanup.
This addresses bug HIVE-4113.
https://issues.apache.org/jira/browse/HIVE-4113
Diffs
-----
hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java
da85501
hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java
bc0e04c
hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java
ac3753f
hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/InitializeInput.java
02ec37f
hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/InternalUtil.java
4167afa
hcatalog/core/src/test/java/org/apache/hcatalog/mapreduce/TestHCatMultiOutputFormat.java
b5f22af
hcatalog/core/src/test/java/org/apache/hcatalog/mapreduce/TestHCatPartitioned.java
dd2ac10
hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hcatalog/pig/TestHCatLoader.java
e907c73
ql/src/java/org/apache/hadoop/hive/ql/exec/MapredLocalTask.java 6bbcb26
ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 1a784b2
ql/src/java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java
49145b7
ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java adf4923
ql/src/java/org/apache/hadoop/hive/ql/io/RCFile.java d18d403
ql/src/java/org/apache/hadoop/hive/ql/io/RCFileRecordReader.java 9521060
ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 96ac584
ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeRecordReader.java
cbdc2db
ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java 9fc52fa
ql/src/test/org/apache/hadoop/hive/ql/io/PerformTestRCFileAndSeqFile.java
0df08e4
ql/src/test/org/apache/hadoop/hive/ql/io/TestRCFile.java e33a1ce
ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java
785f0b1
serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java
23180cf
serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarSerDe.java
11f5f07
serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStruct.java
1335446
serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStructBase.java
e1270cc
serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarSerDe.java
b717278
serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarStruct.java
0317024
serde/src/test/org/apache/hadoop/hive/serde2/TestColumnProjectionUtils.java
PRE-CREATION
serde/src/test/org/apache/hadoop/hive/serde2/TestStatsSerde.java 3ba2699
serde/src/test/org/apache/hadoop/hive/serde2/columnar/TestLazyBinaryColumnarSerDe.java
99420ca
Diff: https://reviews.apache.org/r/11770/diff/
Testing
-------
All unit tests pass with the patch. ColumnProjectionUtils has new unit tests
covering its functionality. Additionally, I manually verified that select
count(1) on RCFile/ORC tables results in less IO after the change.
Before:
hive> select count(1) from users_orc;
Job 0: Map: 1 Reduce: 1 Cumulative CPU: 17.75 sec HDFS Read: 28782851 HDFS Write: 9 SUCCESS
hive> select count(1) from users_rc;
Job 0: Map: 3 Reduce: 1 Cumulative CPU: 23.72 sec HDFS Read: 825865962 HDFS Write: 9 SUCCESS
After:
hive> select count(1) from users_orc;
Job 0: Map: 1 Reduce: 1 Cumulative CPU: 9.9 sec HDFS Read: 67325 HDFS Write: 9 SUCCESS
hive> select count(1) from users_rc;
Job 0: Map: 3 Reduce: 1 Cumulative CPU: 16.96 sec HDFS Read: 96045618 HDFS Write: 9 SUCCESS
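To make the before/after comparison easier to read, the small sketch below
computes the reduction factor and percentage saved from the HDFS Read byte
counts reported above. The class and method names are made up for this example;
the four input numbers are copied from the job output and nothing else is
measured.

```java
// Illustrative helper; not part of the patch.
class IoReductionSketch {
  /** How many times fewer bytes were read after the change. */
  static double reductionFactor(long before, long after) {
    return (double) before / after;
  }

  /** Share of the original read volume that is no longer read, in percent. */
  static double percentSaved(long before, long after) {
    return 100.0 * (before - after) / before;
  }

  public static void main(String[] args) {
    // HDFS Read values taken from the Before/After job summaries.
    long orcBefore = 28782851L;
    long orcAfter = 67325L;
    long rcBefore = 825865962L;
    long rcAfter = 96045618L;

    System.out.println(String.format("users_orc: %.1fx fewer bytes, %.2f%% saved",
        reductionFactor(orcBefore, orcAfter), percentSaved(orcBefore, orcAfter)));
    System.out.println(String.format("users_rc:  %.1fx fewer bytes, %.2f%% saved",
        reductionFactor(rcBefore, rcAfter), percentSaved(rcBefore, rcAfter)));
  }
}
```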
Thanks,
Brock Noland