[ https://issues.apache.org/jira/browse/HIVE-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194323#comment-14194323 ]
Prasanth J commented on HIVE-8521:
----------------------------------

[~owen.omalley] I took a pass over the document. Mostly looks good. A few things:

1) Section 4.4: "Runs start with an initial byte of 0x00 to 0xf7". Shouldn't it be 0x7f?
2) Section 4.5.1: "encoded if they type is signed" should be "the type"
3) Section 4.5.2: DEAD BEEF hex code :)
4) Section 4.5.3: I think we should revert the percentile back to 95. Since we only have 5 bits for the patch length, we will not be able to encode lengths >32, which could happen if we consider the 90th percentile (512 * 0.1 = 51 elements can be patched).
5) Section 5: The default stripe size is now 64MB. Do we need to mention that in this section?
6) Section 5.1: "DICTIONARY_DATA", "DIRECT_V2", "DICTIONARY_V2" have a stray "\" before _
7) Section 5.2.7: "definition was change" should be "changed"

> Document the ORC format
> -----------------------
>
>                 Key: HIVE-8521
>                 URL: https://issues.apache.org/jira/browse/HIVE-8521
>             Project: Hive
>          Issue Type: Bug
>          Components: Documentation, File Formats
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>         Attachments: orc-spec.pdf
>
>
> It is past time that we document the ORC file format. I've started and should
> have a first pass this week.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
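
The arithmetic in point 4 of the comment can be checked with a short Python sketch. It assumes, as the comment does, a run of at most 512 values and a 5-bit patch length field. The helper names `max_patched` and `fits_patch_length` are hypothetical and are not taken from the ORC code base, and the field is assumed to store the patch count directly (values 0 through 31).

```python
# Sketch only: checks the patch-count arithmetic from point 4 of the comment.
# RUN_LENGTH and PATCH_LENGTH_BITS come from the comment; the helper names
# are hypothetical and not part of any ORC implementation.

RUN_LENGTH = 512         # maximum values in one run, per "512 * 0.1"
PATCH_LENGTH_BITS = 5    # width of the patch length field, per the comment


def max_patched(percentile: int, run_length: int = RUN_LENGTH) -> int:
    """Upper bound on values lying above `percentile` in a full run.

    Integer arithmetic avoids floating-point rounding surprises.
    """
    return run_length * (100 - percentile) // 100


def fits_patch_length(count: int, bits: int = PATCH_LENGTH_BITS) -> bool:
    """True if `count` is representable in an unsigned field of `bits` bits.

    Assumes the count is stored directly, so 5 bits cover 0 through 31.
    """
    return 0 <= count < (1 << bits)


if __name__ == "__main__":
    for pct in (90, 95):
        n = max_patched(pct)
        print(f"{pct}th percentile: up to {n} patches, "
              f"fits in {PATCH_LENGTH_BITS} bits: {fits_patch_length(n)}")
```

Under these assumptions the 90th percentile can leave up to 51 values to patch, which overflows the field, while the 95th percentile leaves at most 25, which fits.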