[ https://issues.apache.org/jira/browse/HIVE-16151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959989#comment-15959989 ]
Sergey Shelukhin edited comment on HIVE-16151 at 4/6/17 11:50 PM:
------------------------------------------------------------------

Need to look at that null check and where it happens; it may be avoidable. 4% seems like too much for something this obscure. We could even just allocate all the memory at once as before, but in small chunks, eliminating the check entirely, since with a good hash function we expect every sub-array to be used anyway.

was (Author: sershe):
Need to look at that null check and where it happens; it may be avoidable. 4% seems like too much for something this obscure. We could even just allocate all the memory at once as before, but in small chunks, eliminating the check.

> BytesBytesHashTable allocates large arrays
> ------------------------------------------
>
>                 Key: HIVE-16151
>                 URL: https://issues.apache.org/jira/browse/HIVE-16151
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Prasanth Jayachandran
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-16151.patch
>
>
> These arrays cause GC pressure and also impose key-count limits on the
> table. Wrt the latter, we won't be able to get rid of it without a 64-bit
> hash function, but for now we can get rid of the former. If we need the
> latter, we'd add murmur64 and probably account for it differently on resize
> (we don't want to blow up the hashtable by 4 bytes/key in the common case
> where the number of keys is less than ~1.5B :))
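
For illustration, a minimal sketch of the chunked layout the comment describes: one large long[] is replaced by fixed-size power-of-two chunks, and allocating every chunk eagerly up front ("all the memory at once as before, but in small chunks") removes the need for a null check on the read path. The class and constant names (ChunkedLongArray, CHUNK_SHIFT) are hypothetical, not taken from the patch.

{code:java}
// Hypothetical sketch, not from the HIVE-16151 patch. Replaces a single
// large long[] (which stresses the GC and caps capacity at Integer.MAX_VALUE)
// with fixed-size chunks that are indexed by shift/mask arithmetic.
public final class ChunkedLongArray {
  private static final int CHUNK_SHIFT = 16;               // 64K entries per chunk
  private static final int CHUNK_SIZE = 1 << CHUNK_SHIFT;
  private static final int CHUNK_MASK = CHUNK_SIZE - 1;

  private final long[][] chunks;

  public ChunkedLongArray(long capacity) {
    int chunkCount = (int) ((capacity + CHUNK_SIZE - 1) >>> CHUNK_SHIFT);
    chunks = new long[chunkCount][];
    // Eager allocation of every chunk: same total memory as a single big
    // array, but no null check is needed when reading or writing later.
    for (int i = 0; i < chunkCount; ++i) {
      chunks[i] = new long[CHUNK_SIZE];
    }
  }

  public long get(long index) {
    return chunks[(int) (index >>> CHUNK_SHIFT)][(int) (index & CHUNK_MASK)];
  }

  public void set(long index, long value) {
    chunks[(int) (index >>> CHUNK_SHIFT)][(int) (index & CHUNK_MASK)] = value;
  }
}
{code}

With 64K entries per chunk, each sub-array is 512KB, well below the sizes at which collectors such as G1 treat allocations as humongous, which is the GC-pressure point raised in the description.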
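The key-count limit mentioned in the description follows from hash width rather than from memory: a 32-bit hash can only address about 2^31 distinct slots, so growing the table past that point stops spreading keys out. An illustrative contrast, assuming power-of-two slot counts (the method names are hypothetical, and this is not taken from the Hive source):

{code:java}
// Illustrative only; assumes power-of-two slot counts, as is common for
// open-addressing hash tables.
final class SlotSelection {
  // With a 32-bit hash, at most ~2^31 distinct slots can be addressed;
  // this is the key-count limitation referenced above.
  static int slot32(int hash32, int numSlotsPow2) {
    return hash32 & (numSlotsPow2 - 1);
  }

  // A 64-bit hash (e.g. a Murmur64-style function) lifts that cap, at the
  // cost of 4 extra bytes per stored hash -- the resize-accounting concern
  // about blowing up the table in the common small-key-count case.
  static long slot64(long hash64, long numSlotsPow2) {
    return hash64 & (numSlotsPow2 - 1);
  }
}
{code}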