Explore mechanisms for better compression with RC Files
-------------------------------------------------------
Key: HIVE-2097
URL: https://issues.apache.org/jira/browse/HIVE-2097
Project: Hive
Issue Type: Improvement
Components: Query Processor, Serializers/Deserializers
Reporter: Krishna Kumar
Priority: Minor
This issue explores ways to optimize the compression mechanisms used by RCFile.
Some initial ideas:
1. More efficient serialization/deserialization based on type-specific and
   storage-specific knowledge.
   For instance, store sorted numeric values efficiently using delta coding
   techniques.
2. More efficient compression based on type-specific and storage-specific
   knowledge.
   Enable compression codecs to be specified per type or per individual
   column.
3. Reorder the on-disk storage for better compression efficiency.
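
To make idea 1 concrete, here is a minimal sketch of delta coding for a sorted numeric column. It is written in Python for brevity and is not Hive or RCFile code: the helper names (delta_encode, delta_decode, pack_int64), the synthetic column, and the use of zlib as a stand-in for a Hadoop compression codec are all assumptions made for illustration.

```python
import struct
import zlib


def delta_encode(values):
    """Replace each value with its difference from the previous value."""
    deltas = []
    previous = 0
    for value in values:
        deltas.append(value - previous)
        previous = value
    return deltas


def delta_decode(deltas):
    """Rebuild the original values by keeping a running sum of the deltas."""
    values = []
    running = 0
    for delta in deltas:
        running += delta
        values.append(running)
    return values


def pack_int64(values):
    """Serialize integers as fixed-width little-endian signed 64-bit values."""
    return struct.pack("<%dq" % len(values), *values)


# Hypothetical sorted column: ascending values with small, irregular gaps.
column = []
value = 1_000_000
for i in range(10_000):
    value += 1 + (i * 7) % 5
    column.append(value)

deltas = delta_encode(column)

# Compress both serialized forms with the same general-purpose codec.
plain_size = len(zlib.compress(pack_int64(column)))
delta_size = len(zlib.compress(pack_int64(deltas)))

print("compressed bytes without delta coding:", plain_size)
print("compressed bytes with delta coding:   ", delta_size)
```

Because the deltas of a sorted column are small and repetitive, a general-purpose codec has far more redundancy to exploit than it does in the raw values. A real implementation would likely pair the deltas with a variable-length integer encoding rather than fixed 64-bit words.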
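
Idea 2 could be prototyped along the following lines: compress each column with several candidate codecs and keep the smallest result. This is only a sketch under stated assumptions. The codecs are Python standard-library modules standing in for Hadoop codecs, and the pick_codec helper and the three sample columns are hypothetical; none of this reflects an existing Hive configuration option.

```python
import bz2
import lzma
import zlib

# Candidate codecs (assumption: stdlib codecs stand in for Hadoop codecs).
CODECS = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}


def pick_codec(column_bytes):
    """Return the codec name with the smallest output, plus all measured sizes."""
    sizes = {name: len(compress(column_bytes)) for name, compress in CODECS.items()}
    best = min(sizes, key=sizes.get)
    return best, sizes


# Hypothetical columns with different value distributions.
COUNTRIES = [b"US", b"IN", b"DE", b"BR"]
columns = {
    "user_id": b"\n".join(str(i).encode() for i in range(5000)),
    "country": b"\n".join(COUNTRIES[(i * 31 + i // 7) % 4] for i in range(5000)),
    "price_cents": b"\n".join(str((i * 7919) % 10007).encode() for i in range(5000)),
}

choices = {}
all_sizes = {}
for name, data in columns.items():
    choices[name], all_sizes[name] = pick_codec(data)

# Total size when every column uses its own best codec.
per_column_total = sum(all_sizes[name][choices[name]] for name in columns)

# Total size when one codec is forced on all columns.
single_codec_totals = {
    codec: sum(all_sizes[name][codec] for name in columns) for codec in CODECS
}

for name in columns:
    print(name, "->", choices[name], all_sizes[name])
print("per-column total:", per_column_total)
print("single-codec totals:", single_codec_totals)
```

Choosing per column can never produce a larger total than forcing a single codec on every column, since each column independently takes its minimum. The trade-off not shown here is compression and decompression speed, which a real selection policy would also need to weigh.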