[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1

ASF GitHub Bot (Jira) Thu, 22 Jul 2021 08:08:06 -0700


     [ 
https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=626731&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626731
 ]


ASF GitHub Bot logged work on HIVE-24484:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Jul/21 15:07
            Start Date: 22/Jul/21 15:07
    Worklog Time Spent: 10m 
      Work Description: belugabehr commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r674896581



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/io/RecordReaderWrapper.java
##########
@@ -69,7 +70,14 @@ static RecordReader create(InputFormat inputFormat, HiveInputFormat.HiveInputSpl
       JobConf jobConf, Reporter reporter) throws IOException {
     int headerCount = Utilities.getHeaderCount(tableDesc);
     int footerCount = Utilities.getFooterCount(tableDesc, jobConf);
-    RecordReader innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter);
+
+    RecordReader innerReader = null;
+    try {
+     innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter);
+    } catch (InterruptedIOException iioe) {
+      // If reading from the underlying record reader is interrupted, return a no-op record reader
+      return new ZeroRowsInputFormat().getRecordReader(split.getInputSplit(), jobConf, reporter);

Review comment:
       Hey.
   
   So, in my experimentation, this is the least-bad option.  I did this to 
preserve the previous behavior.  The Hive code is not set up to handle this 
error condition.  As things currently stand in `master`, if the calling thread 
is interrupted, it finishes fetching the rows regardless and then simply 
ignores them later (throws them away).  The calling code does not handle a 
`null` return value, and it does not handle this exception.  As currently 
implemented in Hive `master`, if it gets an exception it simply exits execution 
with an error message; without implementing a lot more code, there is no way to 
ignore/skip this one specific error type.  So, the cleanest thing to do is to 
return `ZeroRows`, since the rows are going to be thrown away later anyway.
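
   The fallback described above can be sketched in isolation.  This is a 
minimal illustration, not the Hive code: `RowReader`, `ReaderFactory`, 
`ZERO_ROWS`, and `countRows` are hypothetical stand-ins for Hadoop's 
`RecordReader`, the `inputFormat.getRecordReader(...)` call, and the reader 
returned by `ZeroRowsInputFormat`, so that the example runs without any Hadoop 
or Hive dependency.  It assumes only that the interruption surfaces as an 
`InterruptedIOException` when the reader is opened.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.Iterator;
import java.util.List;

public class InterruptedReaderSketch {

  /** Hypothetical stand-in for a record reader: returns null when exhausted. */
  public interface RowReader {
    String next() throws IOException;
  }

  /** Hypothetical stand-in for the inputFormat.getRecordReader(...) call. */
  public interface ReaderFactory {
    RowReader open() throws IOException;
  }

  /** Plays the role of the ZeroRowsInputFormat reader: never yields a row. */
  public static final RowReader ZERO_ROWS = () -> null;

  /**
   * Opens the underlying reader. If opening is interrupted, returns an empty
   * reader instead of null or a propagated exception, because (per the review
   * comment) the caller handles neither and would discard the rows anyway.
   * Any other IOException still propagates unchanged.
   */
  public static RowReader create(ReaderFactory factory) throws IOException {
    try {
      return factory.open();
    } catch (InterruptedIOException iioe) {
      return ZERO_ROWS;
    }
  }

  /** Drains a reader and reports how many rows it produced. */
  public static int countRows(RowReader reader) throws IOException {
    int count = 0;
    while (reader.next() != null) {
      count++;
    }
    return count;
  }

  public static void main(String[] args) throws IOException {
    // Opening is interrupted: the caller sees a valid reader with no rows.
    RowReader interrupted = create(() -> {
      throw new InterruptedIOException("interrupted while opening split");
    });
    System.out.println(countRows(interrupted)); // prints 0

    // Normal case: the underlying reader is passed through untouched.
    Iterator<String> rows = List.of("a", "b").iterator();
    RowReader normal = create(() -> () -> rows.hasNext() ? rows.next() : null);
    System.out.println(countRows(normal)); // prints 2
  }
}
```

   Catching only `InterruptedIOException` keeps the existing fail-fast 
behavior for every other I/O error; only the interrupted case is turned into 
an empty result.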




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 626731)
    Time Spent: 4h 43m  (was: 4.55h)

> Upgrade Hadoop to 3.3.1
> -----------------------
>
>                 Key: HIVE-24484
>                 URL: https://issues.apache.org/jira/browse/HIVE-24484
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: David Mollitor
>            Assignee: David Mollitor
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 4h 43m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
