snvijaya commented on code in PR #6069: URL: https://github.com/apache/hadoop/pull/6069#discussion_r1332534729
########## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/ConfigurationKeys.java: ########## @@ -241,6 +241,9 @@ public final class ConfigurationKeys { /** Add extra resilience to rename failures, at the expense of performance. */ public static final String FS_AZURE_ABFS_RENAME_RESILIENCE = "fs.azure.enable.rename.resilience"; + /** Add extra layer of verification of the integrity of the request content during transport. */ + public static final String FS_AZURE_ABFS_ENABLE_CHECKSUM_VALIDATION = "fs.azure.enable.checksum.validation"; Review Comment: Add documenation for the config in https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-azure/src/site/markdown/abfs.md Also highlight that this will have perf impact due to client and server md5 recomputations. ########## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/contracts/exceptions/InvalidChecksumException.java: ########## @@ -0,0 +1,44 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * <p> + * http://www.apache.org/licenses/LICENSE-2.0 + * <p> + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + + +package org.apache.hadoop.fs.azurebfs.contracts.exceptions; + +import org.apache.hadoop.classification.InterfaceAudience; +import org.apache.hadoop.classification.InterfaceStability; +import org.apache.hadoop.fs.azurebfs.contracts.services.AzureServiceErrorCode; + +/** + * Exception to wrap invalid checksum verification on client side. + */ +@InterfaceAudience.Public +@InterfaceStability.Evolving +public class InvalidChecksumException extends AbfsRestOperationException { Review Comment: Rename to AbfsInvalidChecksumException ########## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java: ########## @@ -761,6 +764,11 @@ public AbfsRestOperation append(final String path, final byte[] buffer, requestHeaders.add(new AbfsHttpHeader(USER_AGENT, userAgentRetry)); } + // Add MD5 Hash of request content as request header if feature is enabled + if (isChecksumValidationEnabled()) { Review Comment: In case of appends, as per REST API doc, server will fail: "If the two hashes do not match, the operation will fail with error code 400 (Bad Request)." Are there indications in server error code response header to determine its due to MD5 mismatch and can get converted to AbfsInvalidChecksumException too ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org