-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11770/
-----------------------------------------------------------
Review request for hive.


Description
-------

Modifies ColumnProjectionUtils so that there are two flags: one for the column ids and one indicating whether all columns should be read. Additionally, the patch updates all locations that use the old method, in which an empty string indicates that all columns should be read.

The automatic formatter generated by ant eclipse-files is fairly aggressive, so there is some unrelated import/whitespace cleanup.


This addresses bug HIVE-4113.
    https://issues.apache.org/jira/browse/HIVE-4113


Diffs
-----

  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java da85501
  hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java bc0e04c
  hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java ac3753f
  hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/InitializeInput.java 02ec37f
  hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/InternalUtil.java 4167afa
  hcatalog/core/src/test/java/org/apache/hcatalog/mapreduce/TestHCatMultiOutputFormat.java b5f22af
  hcatalog/core/src/test/java/org/apache/hcatalog/mapreduce/TestHCatPartitioned.java dd2ac10
  hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hcatalog/pig/TestHCatLoader.java e907c73
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapredLocalTask.java 6bbcb26
  ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 1a784b2
  ql/src/java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java 49145b7
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java adf4923
  ql/src/java/org/apache/hadoop/hive/ql/io/RCFile.java d18d403
  ql/src/java/org/apache/hadoop/hive/ql/io/RCFileRecordReader.java 9521060
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 96ac584
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeRecordReader.java cbdc2db
  ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java 9fc52fa
  ql/src/test/org/apache/hadoop/hive/ql/io/PerformTestRCFileAndSeqFile.java 0df08e4
  ql/src/test/org/apache/hadoop/hive/ql/io/TestRCFile.java e33a1ce
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java 785f0b1
  serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java 23180cf
  serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarSerDe.java 11f5f07
  serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStruct.java 1335446
  serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStructBase.java e1270cc
  serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarSerDe.java b717278
  serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarStruct.java 0317024
  serde/src/test/org/apache/hadoop/hive/serde2/TestColumnProjectionUtils.java PRE-CREATION
  serde/src/test/org/apache/hadoop/hive/serde2/TestStatsSerde.java 3ba2699
  serde/src/test/org/apache/hadoop/hive/serde2/columnar/TestLazyBinaryColumnarSerDe.java 99420ca

Diff: https://reviews.apache.org/r/11770/diff/


Testing
-------

All unit tests pass with the patch. ColumnProjectionUtils has new unit tests covering its functionality.

Additionally, I verified manually that select count(1) from RCFile/Orc tables resulted in less IO after the change.

Before:

hive> select count(1) from users_orc;
Job 0: Map: 1 Reduce: 1 Cumulative CPU: 17.75 sec HDFS Read: 28782851 HDFS Write: 9 SUCCESS

hive> select count(1) from users_rc;
Job 0: Map: 3 Reduce: 1 Cumulative CPU: 23.72 sec HDFS Read: 825865962 HDFS Write: 9 SUCCESS

After:

hive> select count(1) from users_orc;
Job 0: Map: 1 Reduce: 1 Cumulative CPU: 9.9 sec HDFS Read: 67325 HDFS Write: 9 SUCCESS

hive> select count(1) from users_rc;
Job 0: Map: 3 Reduce: 1 Cumulative CPU: 16.96 sec HDFS Read: 96045618 HDFS Write: 9 SUCCESS


Thanks,

Brock Noland
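

P.S. For reviewers who want the idea without reading the diff, the two-flag scheme from the description can be sketched roughly as below. This is not the patch itself: the class name, key names, method names, and the "read all by default" behavior are assumptions made for illustration, and a plain Map stands in for the job configuration. The point is only that once "read everything" has its own flag, an empty column-id list can mean "read no columns", which is presumably why select count(1) reads less data after the change.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: this is NOT Hive's ColumnProjectionUtils.
// All key and method names below are hypothetical.
final class ProjectionFlagsSketch {
    // Hypothetical configuration keys, one per flag.
    static final String READ_ALL_KEY = "sketch.read.all.columns";
    static final String READ_IDS_KEY = "sketch.read.column.ids";

    private ProjectionFlagsSketch() {}

    /** Requests every column explicitly instead of encoding it as an empty id list. */
    static void setReadAllColumns(Map<String, String> conf) {
        conf.put(READ_ALL_KEY, "true");
        conf.remove(READ_IDS_KEY);
    }

    /** Requests specific columns; appending an empty list means "read no columns". */
    static void appendReadColumns(Map<String, String> conf, List<Integer> ids) {
        List<Integer> merged = getReadColumnIds(conf);
        for (Integer id : ids) {
            if (!merged.contains(id)) {
                merged.add(id); // keep first-seen order, skip duplicates
            }
        }
        StringBuilder sb = new StringBuilder();
        for (Integer id : merged) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(id);
        }
        conf.put(READ_IDS_KEY, sb.toString());
        conf.put(READ_ALL_KEY, "false");
    }

    /** Assumption of this sketch: an untouched configuration reads all columns. */
    static boolean isReadAllColumns(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault(READ_ALL_KEY, "true"));
    }

    /** Returns the requested column ids; only meaningful when isReadAllColumns is false. */
    static List<Integer> getReadColumnIds(Map<String, String> conf) {
        List<Integer> ids = new ArrayList<>();
        String raw = conf.getOrDefault(READ_IDS_KEY, "");
        if (raw.isEmpty()) {
            return ids;
        }
        for (String part : raw.split(",")) {
            ids.add(Integer.parseInt(part.trim()));
        }
        return ids;
    }

    public static void main(String[] args) {
        // A count(1)-style query needs no column data at all.
        Map<String, String> countOnly = new HashMap<>();
        appendReadColumns(countOnly, new ArrayList<>());
        System.out.println(isReadAllColumns(countOnly) + " " + getReadColumnIds(countOnly));

        // A select * style query asks for everything through the dedicated flag.
        Map<String, String> selectStar = new HashMap<>();
        setReadAllColumns(selectStar);
        System.out.println(isReadAllColumns(selectStar) + " " + getReadColumnIds(selectStar));
    }
}
```

With a single empty-string flag, the two cases in main would be indistinguishable; with two flags a reader can check the "read all" flag first and only then consult the id list.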