[ 
https://issues.apache.org/jira/browse/HIVE-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934382#action_12934382
 ] 

He Yongqiang commented on HIVE-1802:
------------------------------------

I think we only need to serialize here, no? Can we make it simpler? I mean 
handling only the case where there is a single key, its type is Text, and the 
operator is a group-by. In that case, we can avoid an array copy. 

But if it is a join, or there are multiple keys in the group-by, we need to do 
an array copy anyway. The problem with BinarySortableSerDe is that it uses 
write() to write bytes one at a time. Can we make BinarySortableSerDe use an 
array copy instead? Maybe we can use some Java NIO classes, like ByteBuffer?
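
To make the suggestion concrete, here is a minimal sketch of the difference 
between writing a key one byte at a time and copying it in bulk with a 
ByteBuffer. The class and method names (KeyCopySketch, writePerByte, 
writeBulk) are made up for illustration and are not Hive code. The sketch 
also ignores the escaping and sort-order encoding that the real serde has to 
perform, so it only shows the copy pattern, not a drop-in replacement.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; not part of Hive.
public class KeyCopySketch {

    // Per-byte path: one write(int) call for every byte of the key.
    static byte[] writePerByte(byte[] src, int offset, int length) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(length);
        for (int i = offset; i < offset + length; i++) {
            out.write(src[i]);
        }
        return out.toByteArray();
    }

    // Bulk path: a single ByteBuffer.put(byte[], int, int) copies the range.
    static byte[] writeBulk(byte[] src, int offset, int length) {
        ByteBuffer buf = ByteBuffer.allocate(length);
        buf.put(src, offset, length);
        // The heap buffer was allocated with exactly 'length' bytes,
        // so its backing array holds the key and nothing else.
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] key = "group_key".getBytes(StandardCharsets.UTF_8);
        byte[] perByte = writePerByte(key, 0, key.length);
        byte[] bulk = writeBulk(key, 0, key.length);
        // Both paths should produce identical bytes.
        System.out.println(Arrays.equals(perByte, bulk));
    }
}
```

Both methods produce the same bytes; the bulk version just replaces the 
per-byte loop with one range copy. Whether that is a measurable win inside 
the serde would need to be benchmarked.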

> Encode MapReduce Shuffling Keys Differently for  Single string/bigint Key
> -------------------------------------------------------------------------
>
>                 Key: HIVE-1802
>                 URL: https://issues.apache.org/jira/browse/HIVE-1802
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Siying Dong
>            Assignee: Siying Dong
>         Attachments: HIVE-1802.1.patch
>
>
> Delimiters are not needed if we only have one shuffling key, and for the 
> same reason escaping delimiters is not needed either. We can save some CPU 
> time on serialization and shuffle slightly less data, which reduces memory 
> footprint and network traffic.
> Also, there is a bug: for group-by, we mistakenly add a -1 to the end of 
> the key and pay for one more unnecessary mem-copy. This can be easily fixed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
