[ 
https://issues.apache.org/jira/browse/HIVE-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12935043#action_12935043
 ] 

He Yongqiang commented on HIVE-1802:
------------------------------------

>>For any Group by, we needed 2 mem-copies: one from the Text objects to the 
>>buffer, and one to add an extra tag to the end of the buffer.
I think for Join we will need an array copy and a tag at the end.

I mean that optimizing BinarySortableSerDe might be a better way to handle the 
cases that need an array copy.
The code can be cleaner and simpler if we only optimize the single Text key 
case in Group by, and put the other optimizations in BinarySortableSerDe.
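
To make the copy counts above concrete, here is a rough sketch in plain Java. 
It is not Hive code: the class TagAppend and both method names are made up for 
illustration, and the tag is assumed to be a single byte. It only contrasts 
copying the key into a buffer and then copying again to append a tag with 
reserving room for the tag up front.

```java
import java.util.Arrays;

public class TagAppend {
    // Two copies: key bytes into a buffer, then the buffer into a larger
    // array that has room for the tag. (Hypothetical helper, not Hive's API.)
    static byte[] twoCopies(byte[] text, int length, byte tag) {
        byte[] buffer = Arrays.copyOf(text, length);            // copy 1
        byte[] keyWithTag = Arrays.copyOf(buffer, length + 1);  // copy 2
        keyWithTag[length] = tag;
        return keyWithTag;
    }

    // One copy: allocate space for the tag first, then copy the key once.
    static byte[] oneCopy(byte[] text, int length, byte tag) {
        byte[] keyWithTag = new byte[length + 1];
        System.arraycopy(text, 0, keyWithTag, 0, length);       // the only copy
        keyWithTag[length] = tag;
        return keyWithTag;
    }

    public static void main(String[] args) {
        byte[] key = {10, 20, 30};
        // Both paths produce the same bytes; they differ only in copy count.
        System.out.println(Arrays.equals(twoCopies(key, 3, (byte) 1),
                                         oneCopy(key, 3, (byte) 1)));
    }
}
```

When no tag is needed at all (the single Text key case in Group by), even the 
remaining copy could in principle be skipped by handing over the key bytes 
directly.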

> Encode MapReduce Shuffling Keys Differently for Single string/bigint Key
> ------------------------------------------------------------------------
>
>                 Key: HIVE-1802
>                 URL: https://issues.apache.org/jira/browse/HIVE-1802
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Siying Dong
>            Assignee: Siying Dong
>         Attachments: HIVE-1802.1.patch, HIVE-1802.2.patch
>
>
> Delimiters are not needed if we only have one shuffling key, and at the same 
> time escaping of delimiters is not needed. We can save some CPU time on 
> serializing and shuffle slightly less data, which saves memory footprint 
> and network traffic.
> Also, there is a bug: for group-by, we mistakenly add a -1 to the end of 
> the key and pay for one more unnecessary mem-copy. This can be easily fixed.
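
The idea in the description can be sketched as follows. This is only an 
illustration under stated assumptions, not the patch itself: the class 
ShuffleKeyEncoding, the DELIM and ESCAPE byte values, and the escaping scheme 
are all hypothetical and do not reproduce Hive's actual key format.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class ShuffleKeyEncoding {
    // Hypothetical delimiter and escape bytes, chosen only for illustration.
    static final byte DELIM = 0x01;
    static final byte ESCAPE = 0x02;

    // Multi-key encoding: every field is scanned byte by byte, special
    // bytes are escaped, and a delimiter is written after each field.
    static byte[] encodeDelimited(String... fields) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String field : fields) {
            for (byte b : field.getBytes(StandardCharsets.UTF_8)) {
                if (b == DELIM || b == ESCAPE) {
                    out.write(ESCAPE);
                }
                out.write(b);
            }
            out.write(DELIM);
        }
        return out.toByteArray();
    }

    // Single-key encoding: with only one field there is nothing to separate,
    // so the raw bytes are used with no escaping scan and no delimiter.
    static byte[] encodeSingle(String field) {
        return field.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String key = "a\u0001b";  // contains the delimiter byte
        System.out.println(encodeDelimited(key).length);  // prints 5
        System.out.println(encodeSingle(key).length);     // prints 3
    }
}
```

In this toy scheme the single-key path skips both the per-byte escaping check 
and the trailing delimiter, which is where the CPU and shuffle-size savings 
would come from.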

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
