[ https://issues.apache.org/jira/browse/HIVE-15390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15779695#comment-15779695 ]
Abhishek Somani commented on HIVE-15390:
----------------------------------------

Review Board: https://reviews.apache.org/r/55045/

Not sure why the precommit tests are not running. Uploading the same patch with a patch number to try and trigger the tests.

> Orc reader unnecessarily reading stripe footers with
> hive.optimize.index.filter set to true
> -------------------------------------------------------------------------------------------
>
>                 Key: HIVE-15390
>                 URL: https://issues.apache.org/jira/browse/HIVE-15390
>             Project: Hive
>          Issue Type: Bug
>          Components: ORC
>    Affects Versions: 1.2.1
>            Reporter: Abhishek Somani
>            Assignee: Abhishek Somani
>         Attachments: HIVE-15390.patch
>
>
> In a split given to a task, the task's orc reader is unnecessarily reading
> stripe footers for stripes that are not its responsibility to read. This is
> happening with hive.optimize.index.filter set to true.
> Assuming one split per task (no tez grouping considered), a task should not
> need to read beyond the split's end offset. Even in some split computation
> strategies where a split's end offset can be in the middle of a stripe, it
> should not need to read more than one stripe beyond the split's end offset (to
> fully read a stripe that started in it). However I see that some tasks make
> unnecessary filesystem calls to read all the stripe footers in a file from
> the split start offset till the end of the file.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
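
For illustration only, the expectation in the issue description can be sketched in a few lines of Python. This is not Hive or ORC code. The `Stripe` type and both function names are hypothetical, and the ownership rule (a stripe belongs to the split in which it starts) is an assumption drawn from the description's remark about fully reading "a stripe that started in it". The second function only models the reported symptom, not the actual reader logic.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stripe:
    """Hypothetical stand-in for a stripe's position in an ORC file."""
    offset: int  # byte offset where the stripe starts
    length: int  # total stripe length in bytes, footer included


def stripes_owned_by_split(stripes: List[Stripe], split_start: int,
                           split_end: int) -> List[Stripe]:
    """Assumed rule: a split owns exactly the stripes that start inside it."""
    return [s for s in stripes if split_start <= s.offset < split_end]


def stripes_touched_as_reported(stripes: List[Stripe],
                                split_start: int) -> List[Stripe]:
    """Model of the reported symptom: footers are read for every stripe
    from the split start to the end of the file."""
    return [s for s in stripes if s.offset >= split_start]


if __name__ == "__main__":
    # Four 100-byte stripes; the split ends in the middle of the third one.
    stripes = [Stripe(offset=o, length=100) for o in (0, 100, 200, 300)]
    owned = stripes_owned_by_split(stripes, split_start=100, split_end=250)
    touched = stripes_touched_as_reported(stripes, split_start=100)
    print([s.offset for s in owned])    # [100, 200]
    print([s.offset for s in touched])  # [100, 200, 300]
```

In this toy layout the task needs footers only for the stripes at offsets 100 and 200, reading at most up to byte 300 (one stripe past the split end at 250). The footer read for the stripe at offset 300 is the kind of unnecessary filesystem call the issue describes.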