[ https://issues.apache.org/jira/browse/HIVE-2623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206967#comment-13206967 ]
Krishna Kumar commented on HIVE-2623:
-------------------------------------
I will attach CPU usage information for compression/decompression.
- Note that this compression/decompression is trivial compared to the work
done by bzip2 and similar codecs: the numbers are converted to bits one item
at a time, with no lookback, sorting, etc.
- The numbers above are specific to a particular distribution, for which the
encoding is optimal. I have a few other integer compressors coded (Elias delta,
Elias omega, VInt, continuation bit, Golomb/Rice), each of which will have its
own sweet spot. (The last is parametric, with the parameter estimated per
block, which means that it has several optimal distributions.) I still need to
do the analysis and get the numbers for these too.
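
To illustrate what "converted to bits one item at a time" means for the Elias gamma code named in the issue description, here is a minimal Java sketch. It is not the code from the attached patches: the class name EliasGammaSketch and its methods are hypothetical, and bits are held in a String of '0'/'1' characters purely for readability, where a real compressor would pack them into bytes. Each positive integer n is written as floor(log2 n) zeros followed by the binary form of n, so small values get short codes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from the HIVE-2623 patches.
final class EliasGammaSketch {

    // Encodes one positive integer as a string of '0'/'1' characters.
    // The code is (bitLength - 1) zeros followed by the binary digits of n.
    static String encode(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("Elias gamma needs n >= 1, got " + n);
        }
        String binary = Integer.toBinaryString(n); // always starts with '1'
        StringBuilder out = new StringBuilder();
        for (int i = 1; i < binary.length(); i++) {
            out.append('0'); // unary prefix announcing the length
        }
        return out.append(binary).toString();
    }

    // Decodes a concatenation of gamma codes. Assumes well-formed input.
    static int[] decode(String bits) {
        List<Integer> values = new ArrayList<>();
        int pos = 0;
        while (pos < bits.length()) {
            int zeros = 0;
            while (bits.charAt(pos) == '0') { // count the unary prefix
                zeros++;
                pos++;
            }
            // The value occupies the next (zeros + 1) bits, leading '1' included.
            values.add(Integer.parseInt(bits.substring(pos, pos + zeros + 1), 2));
            pos += zeros + 1;
        }
        int[] result = new int[values.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = values.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 2, 3, 4, 9}) {
            System.out.println(n + " -> " + encode(n));
        }
    }
}
```

Each value is encoded independently of its neighbours, which is why the CPU cost stays low. The code length of 2*floor(log2 n) + 1 bits is also why the encoding favours distributions dominated by small values.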
> Add Integer type compressors
> ----------------------------
>
> Key: HIVE-2623
> URL: https://issues.apache.org/jira/browse/HIVE-2623
> Project: Hive
> Issue Type: Sub-task
> Components: Contrib
> Reporter: Krishna Kumar
> Assignee: Krishna Kumar
> Priority: Minor
> Attachments: HIVE-2623.v0.patch, HIVE-2623.v1.patch,
> HIVE-2623.v2.patch, data.tar.gz
>
>
> Type-specific compressors for integers.
> Starting with elias gamma which prefers small values as per a power-law like
> distribution.