[ 
https://issues.apache.org/jira/browse/HIVE-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194323#comment-14194323
 ] 

Prasanth J commented on HIVE-8521:
----------------------------------

[~owen.omalley] I took a pass over the document. It mostly looks good. A few 
things:
1) Section 4.4: "Runs start with an initial byte of 0x00 to 0xf7". Shouldn't 
the upper bound be 0x7f?
2) Section 4.5.1: "encoded if they type is signed" should be "the type"
3) Section 4.5.2: DEAD BEEF hex code :)
4) Section 4.5.3: I think we should revert the percentile back to 95. Since the 
patch length field has only 5 bits, we cannot encode patch lengths >32, which 
could happen with the 90th percentile (512 * 0.1 = ~51 elements may need to be 
patched).
5) Section 5: The default stripe size is now 64MB. Do we need to mention that 
in this section?
6) Section 5.1: "DICTIONARY_DATA", "DIRECT_V2", and "DICTIONARY_V2" each have a 
stray "\" before the "_"
7) Section 5.2.7: "definition was change" should be "changed"
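
To make the arithmetic in point 4 concrete, here is a rough sketch. It assumes 
the patch length is stored directly as an unsigned 5-bit count and that a run 
holds at most 512 values (the figure used in the estimate above). The function 
and constant names are hypothetical and are not taken from the spec or the 
Hive code.

```python
PATCH_LENGTH_BITS = 5   # width of the patch length field (from point 4)
MAX_RUN_LENGTH = 512    # assumed run size, as in the 512 * 0.1 estimate


def max_patch_count(bits: int) -> int:
    """Largest count an unsigned field of `bits` bits can hold (assumption:
    the count is stored directly, with no offset)."""
    return (1 << bits) - 1


def worst_case_patches(run_length: int, percentile: int) -> int:
    """Number of values above the given percentile; in the worst case every
    one of them needs a patch entry."""
    return run_length * (100 - percentile) // 100


for percentile in (90, 95):
    needed = worst_case_patches(MAX_RUN_LENGTH, percentile)
    fits = needed <= max_patch_count(PATCH_LENGTH_BITS)
    print(f"{percentile}th percentile: up to {needed} patches, fits: {fits}")
```

Under these assumptions the 90th percentile can need more patch entries than 
the field can count, while the 95th percentile stays within range.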

> Document the ORC format
> -----------------------
>
>                 Key: HIVE-8521
>                 URL: https://issues.apache.org/jira/browse/HIVE-8521
>             Project: Hive
>          Issue Type: Bug
>          Components: Documentation, File Formats
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>         Attachments: orc-spec.pdf
>
>
> It is past time that we document the ORC file format. I've started and should 
> have a first pass this week.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
