[ https://issues.apache.org/jira/browse/HIVE-5562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814298#comment-13814298 ]
Hive QA commented on HIVE-5562:
-------------------------------

{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12612113/HIVE-5562.2.patch.txt

{color:green}SUCCESS:{color} +1 4552 tests passed

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/142/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/142/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12612113

> Provide stripe level column statistics in ORC
> ---------------------------------------------
>
>                 Key: HIVE-5562
>                 URL: https://issues.apache.org/jira/browse/HIVE-5562
>             Project: Hive
>          Issue Type: New Feature
>          Components: File Formats
>    Affects Versions: 0.13.0
>            Reporter: Prasanth J
>            Assignee: Prasanth J
>              Labels: orcfile
>             Fix For: 0.13.0
>
>         Attachments: HIVE-5562.1.patch.txt, HIVE-5562.2.patch.txt
>
>
> ORC maintains two levels of column statistics: index statistics (for every
> row group) and file-level column statistics for the entire file. It is useful
> to have stripe-level column statistics, which are intermediate between index
> and file statistics. The reason to maintain stripe-level statistics is that
> the current input split computation logic is based on stripe boundaries. So
> if stripe-level statistics are available and a stripe doesn't satisfy a
> predicate condition, then that entire stripe (and hence its split) can be
> eliminated from split computation.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
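
The stripe elimination described in the issue summary above can be illustrated with a small, self-contained sketch. This is not the code from the attached patch and does not use the ORC reader API; the class `StripePruningSketch`, the type `StripeStats`, and both methods are hypothetical names introduced only for illustration. It assumes each stripe records the minimum and maximum of one long column and that the predicate has the form "column > threshold".

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: hypothetical types, not the Hive/ORC implementation.
class StripePruningSketch {

    /** Hypothetical per-stripe min/max statistics for a single long column. */
    static final class StripeStats {
        final long min;
        final long max;

        StripeStats(long min, long max) {
            this.min = min;
            this.max = max;
        }
    }

    /**
     * Returns true if the stripe could contain a row with column > threshold.
     * A stripe whose maximum is not above the threshold cannot match.
     */
    static boolean mightMatchGreaterThan(StripeStats stats, long threshold) {
        return stats.max > threshold;
    }

    /** Returns the indexes of the stripes that still need a split. */
    static List<Integer> stripesToRead(List<StripeStats> stripes, long threshold) {
        List<Integer> keep = new ArrayList<>();
        for (int i = 0; i < stripes.size(); i++) {
            if (mightMatchGreaterThan(stripes.get(i), threshold)) {
                keep.add(i);
            }
        }
        return keep;
    }

    public static void main(String[] args) {
        List<StripeStats> stripes = new ArrayList<>();
        stripes.add(new StripeStats(0, 99));
        stripes.add(new StripeStats(100, 199));
        stripes.add(new StripeStats(200, 299));

        // Predicate "column > 150": the first stripe is eliminated.
        System.out.println(stripesToRead(stripes, 150));
    }
}
```

The check is conservative: a stripe that survives may still contain no matching rows, but a stripe that is dropped is guaranteed to contain none, so skipping its split does not change query results.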