[ https://issues.apache.org/jira/browse/HIVE-2097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Krishna Kumar updated HIVE-2097:
--------------------------------

    Attachment: datacomp.tar.gz

unrefactored source for all the implemented compression codecs

> Explore mechanisms for better compression with RC Files
> -------------------------------------------------------
>
>                 Key: HIVE-2097
>                 URL: https://issues.apache.org/jira/browse/HIVE-2097
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor, Serializers/Deserializers
>            Reporter: Krishna Kumar
>            Assignee: Krishna Kumar
>            Priority: Minor
>         Attachments: datacomp.tar.gz
>
>
> Optimization of the compression mechanisms used by RC File to be explored.
> Some initial ideas
>
> 1. More efficient serialization/deserialization based on type-specific and storage-specific knowledge.
>    For instance, storing sorted numeric values efficiently using some delta coding techniques
> 2. More efficient compression based on type-specific and storage-specific knowledge
>    Enable compression codecs to be specified based on types or individual columns
> 3. Reordering the on-disk storage for better compression efficiency.
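
The delta coding idea in item 1 of the description can be illustrated with a minimal sketch. This is not the code in datacomp.tar.gz and not Hive's RC File implementation; the function names are hypothetical, and the sketch assumes the column holds non-negative integers sorted in ascending order. Each value is stored as its difference from the previous value, and each difference is written as a variable-length integer so that small gaps take fewer bytes.

```python
def encode_varint(n):
    """Encode a non-negative integer in 7-bit groups, low group first."""
    if n < 0:
        raise ValueError("varint input must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # high bit clear: last byte of this value
            return bytes(out)


def encode_sorted_column(values):
    """Delta-encode an ascending list of non-negative integers (hypothetical helper)."""
    out = bytearray()
    prev = 0
    for v in values:
        # An unsorted input yields a negative delta, which encode_varint rejects.
        out += encode_varint(v - prev)
        prev = v
    return bytes(out)


def decode_sorted_column(data):
    """Invert encode_sorted_column by summing the decoded deltas."""
    values = []
    total = 0
    current = 0
    shift = 0
    for byte in data:
        current |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            total += current
            values.append(total)
            current = 0
            shift = 0
    return values


column = [1000000, 1000003, 1000010, 1000011]
encoded = encode_sorted_column(column)
decoded = decode_sorted_column(encoded)
```

In this sketch only the first value pays for its full magnitude; the later values cost one byte each because their gaps are small. Whether that gain holds for real data depends on how closely spaced the sorted values are.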