[ https://issues.apache.org/jira/browse/HIVE-23158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17079336#comment-17079336 ]
Hive QA commented on HIVE-23158:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12999358/HIVE-23158.01.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 18208 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/21530/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/21530/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-21530/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12999358 - PreCommit-HIVE-Build

> Optimize S3A recordReader policy for Random IO formats
> ------------------------------------------------------
>
>                 Key: HIVE-23158
>                 URL: https://issues.apache.org/jira/browse/HIVE-23158
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Panagiotis Garefalakis
>            Assignee: Panagiotis Garefalakis
>            Priority: Trivial
>              Labels: pull-request-available
>         Attachments: HIVE-23158.01.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The S3A filesystem client (inherited from Hadoop) supports the notion of input
> policies.
> These policies tune the behaviour of the HTTP requests used for reading
> different file types such as TEXT or ORC.
> For formats such as ORC and Parquet that do a lot of seek operations, there
> is an optimized RANDOM mode that reads files only partially instead of fully
> (default).
> I suggest adding some extra logic as part of HiveInputFormat to make
> sure we optimize RecordReader requests for random IO when data is stored on
> S3A using formats such as ORC or Parquet.
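
For readers skimming the thread, the idea in the description can be sketched roughly as follows. This is not the code from HIVE-23158.01.patch: the class and method names are hypothetical, a plain Map stands in for the Hadoop job configuration so the sketch has no dependencies, and the format check is a simple class-name lookup. The property name fs.s3a.experimental.input.fadvise and its "random" value are taken from the Hadoop S3A documentation; whether the patch uses exactly this key is an assumption.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; hypothetical names, not the HIVE-23158 patch. */
final class S3aReadPolicySketch {

    // Hadoop S3A input-policy property and its random-IO value.
    static final String FADVISE_KEY = "fs.s3a.experimental.input.fadvise";
    static final String RANDOM_POLICY = "random";

    // Input formats assumed (for this sketch) to be seek-heavy.
    static final Set<String> RANDOM_IO_FORMATS = Set.of(
        "org.apache.hadoop.hive.ql.io.orc.OrcInputFormat",
        "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat");

    /** True when the path uses the s3a:// scheme. */
    static boolean isS3a(String path) {
        return "s3a".equalsIgnoreCase(URI.create(path).getScheme());
    }

    /**
     * Returns a copy of the configuration. The copy requests the random
     * input policy only when the data lives on S3A and the input format is
     * one of the seek-heavy formats; otherwise it is left unchanged.
     */
    static Map<String, String> withReadPolicy(
            Map<String, String> jobConf, String path, String inputFormatClass) {
        Map<String, String> copy = new HashMap<>(jobConf);
        if (isS3a(path) && RANDOM_IO_FORMATS.contains(inputFormatClass)) {
            copy.put(FADVISE_KEY, RANDOM_POLICY);
        }
        return copy;
    }
}
```

Working on a copy keeps the policy scoped to the reader being created, so text files and data on other filesystems in the same job keep whatever policy was already configured.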
-- This message was sent by Atlassian Jira (v8.3.4#803005)