ivandika3 opened a new pull request, #6435: URL: https://github.com/apache/ozone/pull/6435
## What changes were proposed in this pull request?

Currently, the MessageDigest instance is a thread-local variable (one per S3G Jetty thread). A MessageDigest must be reset, by calling either MessageDigest#digest or MessageDigest#reset, before it can be reused. In the normal ObjectEndpoint#put flow, MessageDigest#digest is called after the data has been written to the datanodes and before the key is committed.

However, if an IOException happens (e.g. an EOFException because the client cancelled during the write), the digest is not reset and remains in an inconsistent state. This affects the subsequent request that uses the same thread: the generated ETag will be completely different from the MD5 hash of the object, causing the AWS S3 SDK to detect an inconsistent hash when downloading the object.

The issue can be replicated using an S3G with a single thread and doing three put-object operations for the same key and the same payload. You can set `hadoop.http.max.threads` in `ozone-site.xml` to a small value (e.g. 4) to increase the chance of the same thread handling the requests.

- 1st put-object: cancel the operation before it can finish, and ensure the EOFException is thrown in the S3Gateway logs.
- 2nd put-object: let the put-object finish. The resulting ETag will not be the same as the MD5 digest of the payload (you might need to do this a few times, since the S3G thread might not be the same as in the previous call).
- 3rd put-object: also let the put-object finish. Since the previous put-object reset the digest, the resulting ETag will be correct.

## What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-10587

## How was this patch tested?

Manual test from the Ozone IntelliJ IDE setup, as shown in the description.

Ref: https://cwiki.apache.org/confluence/display/OZONE/Run+Ozone+cluster+from+IDE
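
The stale-digest behaviour described in the first section can be illustrated outside of Ozone with plain `java.security.MessageDigest`. The following is a minimal sketch, not the actual patch: `DigestReuseDemo`, `failedPut`, `successfulPut`, and `freshMd5` are hypothetical names, the `EOFException` is thrown by hand to simulate a cancelled client, and resetting the digest in a `finally` block is only one plausible way to address the problem (the description above does not spell out how the fix is implemented).

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo class; it does not exist in the Ozone code base.
class DigestReuseDemo {

  private static MessageDigest newMd5() {
    try {
      return MessageDigest.getInstance("MD5");
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }

  // One MessageDigest per thread, mirroring the thread-local pattern described above.
  private static final ThreadLocal<MessageDigest> MD5 =
      ThreadLocal.withInitial(DigestReuseDemo::newMd5);

  static String hex(byte[] bytes) {
    StringBuilder sb = new StringBuilder();
    for (byte b : bytes) {
      sb.append(String.format("%02x", b));
    }
    return sb.toString();
  }

  // Reference MD5 computed with a fresh instance that has no prior state.
  static String freshMd5(byte[] payload) {
    return hex(newMd5().digest(payload));
  }

  // Simulates a put that fails mid-write: bytes reach the digest, but digest() is never called.
  static void failedPut(byte[] partial, boolean resetOnFailure) {
    MessageDigest md = MD5.get();
    try {
      md.update(partial);
      throw new EOFException("simulated client cancel");
    } catch (IOException e) {
      // Swallowed for the demo; a real endpoint would map this to an error response.
    } finally {
      if (resetOnFailure) {
        md.reset(); // assumed mitigation: always leave the shared digest clean
      }
    }
  }

  // Simulates a put that completes; digest() returns the hash and resets the instance.
  static String successfulPut(byte[] payload) {
    MessageDigest md = MD5.get();
    md.update(payload);
    return hex(md.digest());
  }

  public static void main(String[] args) {
    byte[] payload = "same payload".getBytes(StandardCharsets.UTF_8);
    String expected = freshMd5(payload);

    failedPut(new byte[] {1, 2, 3}, false);                      // 1st put: cancelled
    System.out.println(successfulPut(payload).equals(expected)); // 2nd put: false
    System.out.println(successfulPut(payload).equals(expected)); // 3rd put: true

    failedPut(new byte[] {1, 2, 3}, true);                       // cancelled, digest reset
    System.out.println(successfulPut(payload).equals(expected)); // true
  }
}
```

All calls in `main` run on one thread, which corresponds to the case where the same S3G thread serves consecutive requests. The second put hashes the leftover bytes from the failed put together with the payload, which is why its result differs from the payload's MD5.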
