[ https://issues.apache.org/jira/browse/HIVE-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13733115#comment-13733115 ]
Owen O'Malley commented on HIVE-4123:
-------------------------------------
This is looking good, Prasanth.
A couple more comments:
* You need to handle the date type.
* You should update checkEncoding to accept only the encodings that are
appropriate for each type (direct for binary, boolean, struct, and byte;
direct_v2, dictionary, or dictionary_v2 for string; and direct or direct_v2 for
most of the rest).
* You should probably make a factory for creating the integer reader so that
the code lives in only one place.
* The formatting on some of the new classes seems to use 8 spaces for
indentation.
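
To make the checkEncoding and factory suggestions above concrete, here is a
rough sketch. It is not the patch's code: the class, enum, and method names
(EncodingSketch, Type, Encoding, allowedEncodings, createIntegerReader) are
hypothetical stand-ins, the default branch assumes that "most of the rest"
covers the integer-like types including date, and the mapping of each encoding
to an RLE version is also an assumption.

```java
import java.util.EnumSet;
import java.util.Set;

class EncodingSketch {
    // Hypothetical stand-ins for the ORC column types and encoding kinds.
    enum Type { BINARY, BOOLEAN, STRUCT, BYTE, STRING, INT, LONG, DATE }
    enum Encoding { DIRECT, DIRECT_V2, DICTIONARY, DICTIONARY_V2 }

    // Returns the encodings listed in the review comment for each type.
    static Set<Encoding> allowedEncodings(Type type) {
        switch (type) {
            case BINARY:
            case BOOLEAN:
            case STRUCT:
            case BYTE:
                return EnumSet.of(Encoding.DIRECT);
            case STRING:
                return EnumSet.of(Encoding.DIRECT_V2, Encoding.DICTIONARY,
                                  Encoding.DICTIONARY_V2);
            default:
                // Assumption: the remaining types (including date) take
                // direct or direct_v2.
                return EnumSet.of(Encoding.DIRECT, Encoding.DIRECT_V2);
        }
    }

    // Rejects an encoding that is not appropriate for the column type.
    static void checkEncoding(Type type, Encoding encoding) {
        if (!allowedEncodings(type).contains(encoding)) {
            throw new IllegalArgumentException(
                "Unsupported encoding " + encoding + " for type " + type);
        }
    }

    // Minimal placeholder for the real integer reader interface.
    interface IntegerReader {
        String describe();
    }

    // Single place that decides which integer reader to construct.
    // Assumption: the v2 encodings use the new RLE reader, the others the old one.
    static IntegerReader createIntegerReader(Encoding encoding) {
        switch (encoding) {
            case DIRECT:
            case DICTIONARY:
                return () -> "RLE v1";
            case DIRECT_V2:
            case DICTIONARY_V2:
                return () -> "RLE v2";
            default:
                throw new IllegalArgumentException("Unknown encoding " + encoding);
        }
    }
}
```

With a factory like this, each column reader would call createIntegerReader
after checkEncoding instead of repeating the version switch itself.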
> The RLE encoding for ORC can be improved
> ----------------------------------------
>
> Key: HIVE-4123
> URL: https://issues.apache.org/jira/browse/HIVE-4123
> Project: Hive
> Issue Type: New Feature
> Components: File Formats
> Affects Versions: 0.12.0
> Reporter: Owen O'Malley
> Assignee: Prasanth J
> Labels: orcfile
> Fix For: 0.12.0
>
> Attachments: HIVE-4123.1.git.patch.txt, HIVE-4123.2.git.patch.txt,
> HIVE-4123.3.patch.txt, HIVE-4123.4.patch.txt, HIVE-4123.5.txt,
> HIVE-4123.6.txt, ORC-Compression-Ratio-Comparison.xlsx
>
>
> The run length encoding of integers can be improved:
> * tighter bit packing
> * allow delta encoding
> * allow longer runs