-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56687/#review166735
-----------------------------------------------------------


common/src/java/org/apache/hadoop/hive/common/StringInternUtils.java (line 69)
<https://reviews.apache.org/r/56687/#comment238743>

    Nit: please follow Hive coding conventions for if statements. Same in other places.
    (http://www.oracle.com/technetwork/java/javase/documentation/codeconventions-142311.html#431)


ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveLockObject.java (line 57)
<https://reviews.apache.org/r/56687/#comment238746>

    Any point in interning a timestamp? The likelihood of this hitting the pool is almost zero, correct?


ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java (lines 205 - 220)
<https://reviews.apache.org/r/56687/#comment238751>

    Do these paths eventually get interned up the chain, or are they ignored because they aren't used/accessed in PartitionDesc? It wasn't clear to me.


ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java (lines 238 - 245)
<https://reviews.apache.org/r/56687/#comment238756>

    Same for the database name and table name strings accessed via MetaStoreUtils.getSchema: are they getting interned someplace?


ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java (lines 324 - 334)
<https://reviews.apache.org/r/56687/#comment238758>

    Same for this.


ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java (lines 367 - 379)
<https://reviews.apache.org/r/56687/#comment238761>

    Same for this.


ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java (line 91)
<https://reviews.apache.org/r/56687/#comment238763>

    What about this?


- Mohit Sabharwal


On Feb. 23, 2017, 9:01 p.m., Misha Dmitriev wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56687/
> -----------------------------------------------------------
>
> (Updated Feb. 23, 2017, 9:01 p.m.)
>
>
> Review request for hive, Chaoyu Tang, Mohit Sabharwal, and Sergio Pena.
>
>
> Bugs: https://issues.apache.org/jira/browse/HIVE-15882
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> See the description of the problem in
> https://issues.apache.org/jira/browse/HIVE-15882
>
> Interning strings per this review removes most of the overhead due to
> duplicate strings.
>
> Also, where maps in several places are created from other maps, use the
> original map's size for the new map. This is to avoid the situation when a
> map with default capacity (typically 16) is created to hold just 2-3 entries,
> and the rest of the internal 16-entry array is wasted.
>
>
> Diffs
> -----
>
>   common/src/java/org/apache/hadoop/hive/common/StringInternUtils.java
>     PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
>     e81cbce3e333d44a4088c10491f399e92a505293
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/Entity.java
>     08420664d59f28f75872c25c9f8ee42577b23451
>   ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java
>     e91064b9c75e8adb2b36f21ff19ec0c1539b03b9
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java
>     51530ac16c92cc75d501bfcb573557754ba0c964
>   ql/src/java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java
>     55b3b551a1dac92583b6e03b10beb8172ca93d45
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveLockObject.java
>     82dc89803be9cf9e0018720eeceb90ff450bfdc8
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java
>     c0edde9e92314d86482b5c46178987e79fae57fe
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java
>     c6ae6f290857cfd10f1023058ede99bf4a10f057
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java
>     24d16812515bdfa90b4be7a295c0388fcdfe95ef
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java
>     ede4fcbe342052ad86dadebcc49da2c0f515ea98
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/NullScanTaskDispatcher.java
>     0882ae2c6205b1636cbc92e76ef66bb70faadc76
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java
>     68b0ad9ea63f051f16fec3652d8525f7ab07eb3f
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java
>     d4bdd96eaf8d179bed43b8a8c3be0d338940154a
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MsckDesc.java
>     b7a7e4b7a5f8941b080c7805d224d3885885f444
>   ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java
>     73981e826870139a42ad881103fdb0a2ef8433a2
>
> Diff: https://reviews.apache.org/r/56687/diff/
>
>
> Testing
> -------
>
> I've measured how much memory this change plus another one (interning
> Properties in PartitionDesc) save in my HS2 benchmark - the result is 37%.
> See the details in HIVE-15882.
>
>
> Thanks,
>
> Misha Dmitriev
>
>
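
For readers following the thread, the two ideas in the quoted description (interning duplicate strings, and sizing a copied map from the map it is copied from) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code under review: the class InternSketch and the method internedCopy are hypothetical names and do not come from the patch or from StringInternUtils.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the contents of StringInternUtils.
class InternSketch {

    // Copies a map while interning keys and values. The copy is created with
    // the source map's size as its initial capacity, so a 2-3 entry map does
    // not allocate the default 16-slot table.
    static Map<String, String> internedCopy(Map<String, String> source) {
        Map<String, String> copy = new HashMap<>(source.size());
        for (Map.Entry<String, String> e : source.entrySet()) {
            String key = e.getKey();
            String value = e.getValue();
            copy.put(key == null ? null : key.intern(),
                     value == null ? null : value.intern());
        }
        return copy;
    }

    public static void main(String[] args) {
        // Two equal strings built separately are distinct heap objects.
        String a = new String("default.sales");
        String b = new String("default.sales");
        System.out.println(a == b);                   // prints "false"
        // After interning, both refer to the single pooled instance.
        System.out.println(a.intern() == b.intern()); // prints "true"

        Map<String, String> spec = new HashMap<>();
        spec.put(new String("ds"), new String("2017-02-23"));
        System.out.println(internedCopy(spec).size()); // prints "1"
    }
}
```

Interning saves memory only for values that recur, which is the concern behind the HiveLockObject comment above: a value that is unique on almost every call, such as a timestamp, would rarely find a match in the pool.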