[ 
https://issues.apache.org/jira/browse/HIVE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162705#comment-13162705
 ] 

Krishna Kumar commented on HIVE-2600:
-------------------------------------

He Yongqiang,

  In UberCompressor, I have decided that the compression mechanism can change 
on a per-block basis. So the file as a whole will declare the compression 
codec as "UberCompressionCodec" in the file header, and each column block will 
indicate the mechanism used for that block.
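
To illustrate the per-block idea, here is a minimal sketch. It is not the code 
or the on-disk format from the patches: the one-byte mechanism tag, the class 
and method names, and the "pick whichever is smaller" rule are all assumptions 
made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch: every encoded column block starts with a one-byte tag
// naming the mechanism used for that block. Not the HIVE-2604 format.
public class PerBlockMechanismSketch {
    static final byte RAW = 0;      // block stored unchanged
    static final byte DEFLATE = 1;  // block compressed with java.util.zip.Deflater

    // Encode one block, choosing the mechanism that yields fewer bytes
    // (an assumed heuristic; the choice could equally be config driven).
    static byte[] encodeBlock(byte[] block) {
        byte[] deflated = deflate(block);
        boolean useDeflate = deflated.length < block.length;
        byte[] payload = useDeflate ? deflated : block;
        byte[] out = new byte[payload.length + 1];
        out[0] = useDeflate ? DEFLATE : RAW;
        System.arraycopy(payload, 0, out, 1, payload.length);
        return out;
    }

    // Decode one block by dispatching on its leading mechanism tag.
    static byte[] decodeBlock(byte[] encoded) {
        byte[] payload = Arrays.copyOfRange(encoded, 1, encoded.length);
        switch (encoded[0]) {
            case RAW:
                return payload;
            case DEFLATE:
                return inflate(payload);
            default:
                throw new IllegalArgumentException("unknown mechanism tag " + encoded[0]);
        }
    }

    static byte[] deflate(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!deflater.finished()) {
            bos.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return bos.toByteArray();
    }

    static byte[] inflate(byte[] data) {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && inflater.needsInput()) {
                    throw new IllegalArgumentException("truncated block");
                }
                bos.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt block", e);
        } finally {
            inflater.end();
        }
        return bos.toByteArray();
    }
}
```

Because the tag travels with each block, a reader only needs the file header 
to know that the blocks are self-describing; it does not need to know in 
advance which mechanism any particular block used.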

Carl,

  ColumnarSerde serializes all types as strings. Other Serdes can serialize the 
bytes as they wish. With HIVE-2604, I have added a dummy serde called 
UberCompressorSerde, which is used to serialize the objects into bytes 
(BytesRefArrayWritable). The codec can then recover the objects and feed them 
to type-specific compressors.
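
As a rough illustration of why recovering typed values helps, the sketch below 
takes integer cells serialized as text, parses them back into longs, and 
encodes them with delta + zigzag + varint. This is only an example of one 
possible type-specific compressor; the class name, method names and the 
encoding choice are assumptions and do not come from Hive or the patches.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a type-specific compressor for an integer column.
public class TypedColumnSketch {

    // Recover typed values from text cells, then delta + zigzag + varint encode.
    static byte[] compressIntColumn(List<byte[]> textCells) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long previous = 0;
        for (byte[] cell : textCells) {
            long value = Long.parseLong(new String(cell, StandardCharsets.UTF_8));
            long delta = value - previous;
            previous = value;
            // Zigzag maps small negative deltas to small unsigned numbers.
            long zigzag = (delta << 1) ^ (delta >> 63);
            // Varint: 7 payload bits per byte, high bit set while more bytes follow.
            while ((zigzag & ~0x7FL) != 0) {
                out.write((int) ((zigzag & 0x7F) | 0x80));
                zigzag >>>= 7;
            }
            out.write((int) zigzag);
        }
        return out.toByteArray();
    }

    // Inverse of compressIntColumn: varint decode, undo zigzag, undo deltas.
    static List<Long> decompressIntColumn(byte[] encoded) {
        List<Long> values = new ArrayList<>();
        long previous = 0;
        int i = 0;
        while (i < encoded.length) {
            long zigzag = 0;
            int shift = 0;
            int b;
            do {
                b = encoded[i++] & 0xFF;
                zigzag |= (long) (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            long delta = (zigzag >>> 1) ^ -(zigzag & 1);
            previous += delta;
            values.add(previous);
        }
        return values;
    }
}
```

For slowly changing values, most deltas fit in a single byte, which a generic 
byte-oriented codec working on the string form cannot exploit as directly.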

Just to make sure we are on the same page, please note that both the above 
points relate to a specific implementation of a schema-aware compressor. This 
jira in itself only introduces the interface, and the invocations of that 
interface. I'd like to move any implementation discussion threads to HIVE-2604.

                
> Enable/Add type-specific compression for rcfile
> -----------------------------------------------
>
>                 Key: HIVE-2600
>                 URL: https://issues.apache.org/jira/browse/HIVE-2600
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Query Processor, Serializers/Deserializers
>            Reporter: Krishna Kumar
>            Assignee: Krishna Kumar
>            Priority: Minor
>         Attachments: HIVE-2600.v0.patch, HIVE-2600.v1.patch
>
>
> Enable schema-aware compression codecs which can perform type-specific 
> compression on a per-column basis. I see this in three parts:
> 1. Add interfaces for the rcfile to communicate column information to the 
> codec
> 2. Add an "uber compressor" which can perform column-specific compression on 
> a per-block basis. Initially, this can be config driven, but we can go for a 
> dynamic implementation later.
> 3. A bunch of type-specific compressors
> This jira is for the first part of the effort.

