[ https://issues.apache.org/jira/browse/HADOOP-17812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17386993#comment-17386993 ]
Bobby Wang commented on HADOOP-17812:
-------------------------------------

Hi [~ste...@apache.org], thanks for your comments on the PR and the JIRA. I modified the unit test to include the repro steps described in this JIRA, and the unit tests all passed. I then followed the steps to run the integration tests against our internal S3 storage, and some tests failed. Since I was told that some failing tests are expected, I uploaded the failsafe-report.html in the attachment named s3a-test.tar.gz. Could you help check whether the failed test cases are among the expected ones?

BTW, I only configured the items below in auth-keys.xml:
{quote}
<configuration>
  <property>
    <name>test.fs.s3a.name</name>
    <value>s3a://testawss3a/</value>
  </property>
  <property>
    <name>fs.contract.test.fs.s3a</name>
    <value>${test.fs.s3a.name}</value>
  </property>
  <property>
    <name>fs.s3a.access.key</name>
    <value>XXX</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>XXXXXX</value>
  </property>
  <property>
    <name>fs.s3a.endpoint</name>
    <value>XXXXX</value>
  </property>
  <property>
    <name>fs.s3a.path.style.access</name>
    <value>true</value>
  </property>
</configuration>
{quote}

> NPE in S3AInputStream read() after failure to reconnect to store
> ----------------------------------------------------------------
>
>                 Key: HADOOP-17812
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17812
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 3.2.2, 3.3.1
>            Reporter: Bobby Wang
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: s3a-test.tar.gz
>
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> While [reading from S3A storage|https://github.com/apache/hadoop/blob/rel/release-3.2.0/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java#L450], an SSLException (which extends IOException) can occur, triggering [onReadFailure|https://github.com/apache/hadoop/blob/rel/release-3.2.0/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java#L458].
>
> onReadFailure calls "reopen", which first closes the original *wrappedStream* and sets *wrappedStream = null*, then tries to [re-acquire *wrappedStream*|https://github.com/apache/hadoop/blob/rel/release-3.2.0/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java#L184]. But if the preceding call that [obtains the S3Object|https://github.com/apache/hadoop/blob/rel/release-3.2.0/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java#L183] throws an exception, *wrappedStream* is left null. The [retry|https://github.com/apache/hadoop/blob/rel/release-3.2.0/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java#L446] mechanism may then re-execute [wrappedStream.read|https://github.com/apache/hadoop/blob/rel/release-3.2.0/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java#L450] and cause an NPE.
>
> For more details, please refer to [https://github.com/NVIDIA/spark-rapids/issues/2915]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
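The failure path described above can be sketched as a minimal, self-contained simulation. All class and method names here (ReopenSketch, StreamSource) are illustrative stand-ins, not Hadoop's actual S3AInputStream or S3 client API, and the null check in read() is one possible defensive fix, not necessarily the one taken in the PR:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReopenSketch {

    /** Stand-in for the S3 client call that fetches a fresh object stream. */
    interface StreamSource {
        InputStream open() throws IOException;
    }

    private final StreamSource source;
    private InputStream wrappedStream;

    public ReopenSketch(StreamSource source) throws IOException {
        this.source = source;
        this.wrappedStream = source.open();
    }

    /**
     * Mirrors the pattern described in the issue: close and null the
     * wrapped stream first, then re-acquire it. If open() throws,
     * wrappedStream is left null behind.
     */
    private void reopen() throws IOException {
        if (wrappedStream != null) {
            wrappedStream.close();
            wrappedStream = null;
        }
        wrappedStream = source.open(); // may throw, leaving null behind
    }

    /** read() with a single retry, standing in for the retry policy. */
    public int read() throws IOException {
        for (int attempt = 0; ; attempt++) {
            try {
                // Defensive fix: if an earlier failed reopen left the
                // stream null, re-establish it instead of dereferencing.
                if (wrappedStream == null) {
                    reopen();
                }
                return wrappedStream.read();
            } catch (IOException e) {
                if (attempt >= 1) {
                    throw e;
                }
                try {
                    reopen();
                } catch (IOException reopenFailure) {
                    // Swallowed so the retry loop runs again; without the
                    // null check above, the retried attempt would call
                    // wrappedStream.read() on null and throw the NPE.
                }
            }
        }
    }
}
```

With the null check in place, a read that fails, followed by a failed reopen, recovers on the retry instead of crashing with a NullPointerException.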