[ https://issues.apache.org/jira/browse/HIVE-16151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959989#comment-15959989 ]

Sergey Shelukhin edited comment on HIVE-16151 at 4/6/17 11:50 PM:
------------------------------------------------------------------

Need to look at that null check and where it happens; it may be avoidable. A 4% 
hit seems too much for something this obscure.
We could even just allocate all the memory at once as before, but in small 
chunks, eliminating the check entirely, since with a good hash function we 
expect every sub-array to be used anyway.
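For illustration, here is a minimal sketch of the chunked layout described above: all sub-arrays allocated eagerly up front, so the fast path needs no per-access null check. Names, chunk size, and the class itself are made up for the example; this is not the actual BytesBytesMultiHashMap code.

{code:java}
// Sketch only: replaces one large long[] with fixed-size sub-arrays,
// all allocated eagerly at construction time.
public class ChunkedLongArray {
  private static final int CHUNK_SHIFT = 16;            // 64K slots per chunk (illustrative)
  private static final int CHUNK_SIZE = 1 << CHUNK_SHIFT;
  private static final int CHUNK_MASK = CHUNK_SIZE - 1;

  private final long[][] chunks;

  public ChunkedLongArray(long capacity) {
    int chunkCount = (int) ((capacity + CHUNK_MASK) >>> CHUNK_SHIFT);
    chunks = new long[chunkCount][];
    // Allocate everything up front; with a good hash function every
    // sub-array will be touched anyway, and get/set below need no
    // null check on the chunk.
    for (int i = 0; i < chunkCount; ++i) {
      chunks[i] = new long[CHUNK_SIZE];
    }
  }

  public long get(long slot) {
    return chunks[(int) (slot >>> CHUNK_SHIFT)][(int) (slot & CHUNK_MASK)];
  }

  public void set(long slot, long value) {
    chunks[(int) (slot >>> CHUNK_SHIFT)][(int) (slot & CHUNK_MASK)] = value;
  }
}
{code}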


was (Author: sershe):
Need to look at that null check and where it happens; it may be avoidable. A 4% 
hit seems too much for something this obscure.
We could even just allocate all the memory at once as before, but in small 
chunks, eliminating the check.

> BytesBytesHashTable allocates large arrays
> ------------------------------------------
>
>                 Key: HIVE-16151
>                 URL: https://issues.apache.org/jira/browse/HIVE-16151
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Prasanth Jayachandran
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-16151.patch
>
>
> These arrays cause GC pressure and also impose key-count limitations on the 
> table. We won't be able to get rid of the latter without a 64-bit hash 
> function, but for now we can get rid of the former. If we do need to lift the 
> key-count limit, we'd add murmur64 and probably account for it differently for 
> resize (we don't want to blow up the hashtable by 4 bytes/key in the common 
> case where the number of keys is less than ~1.5B :))



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
