[ 
https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17850285#comment-17850285
 ] 

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1618363938


##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsHttpOperation.java:
##########
@@ -20,370 +20,515 @@
 
 import java.io.IOException;
 import java.io.InputStream;
-import java.io.OutputStream;
 import java.net.HttpURLConnection;
-import java.net.ProtocolException;
 import java.net.URL;
+import java.util.ArrayList;
 import java.util.List;
 import java.util.Map;
 
-import javax.net.ssl.HttpsURLConnection;
-import javax.net.ssl.SSLSocketFactory;
-
-import org.apache.hadoop.classification.VisibleForTesting;
-import org.apache.hadoop.security.ssl.DelegatingSSLSocketFactory;
-
+import com.fasterxml.jackson.core.JsonFactory;
+import com.fasterxml.jackson.core.JsonParser;
+import com.fasterxml.jackson.core.JsonToken;
+import com.fasterxml.jackson.databind.ObjectMapper;
 import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
 
 import org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants;
 import org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations;
-
-import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.EXPECT_100_JDK_ERROR;
-import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HUNDRED_CONTINUE;
-import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.JDK_FALLBACK;
-import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.JDK_IMPL;
-import static org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations.EXPECT;
+import org.apache.hadoop.fs.azurebfs.contracts.services.AbfsPerfLoggable;
+import org.apache.hadoop.fs.azurebfs.contracts.services.ListResultSchema;
+import org.apache.hadoop.fs.azurebfs.utils.UriUtils;
 
 /**
- * Implementation of {@link HttpOperation} for orchestrating calls using JDK's HttpURLConnection.
+ * Base Http operation class for orchestrating server IO calls. Child classes would
+ * define the certain orchestration implementation on the basis of network library used.
+ * <p>
+ * For JDK netlib usage, the child class would be {@link AbfsJdkHttpOperation}. <br>
+ * For ApacheHttpClient netlib usage, the child class would be {@link AbfsAHCHttpOperation}.
+ * </p>
  */
-public class AbfsHttpOperation extends HttpOperation {
+public abstract class AbfsHttpOperation implements AbfsPerfLoggable {
+
+  private final Logger log;
+
+  private static final int CLEAN_UP_BUFFER_SIZE = 64 * 1024;
+
+  private static final int ONE_THOUSAND = 1000;
+
+  private static final int ONE_MILLION = ONE_THOUSAND * ONE_THOUSAND;
+
+  private final String method;
+  private final URL url;
+  private String maskedUrl;
+  private String maskedEncodedUrl;
+  private int statusCode;
+  private String statusDescription;
+  private String storageErrorCode = "";
+  private String storageErrorMessage = "";
+  private String requestId = "";
+  private String expectedAppendPos = "";
+  private ListResultSchema listResultSchema = null;
+
+  // metrics
+  private int bytesSent;
+  private int expectedBytesToBeSent;
+  private long bytesReceived;
 
-  private static final Logger LOG = LoggerFactory.getLogger(
-      AbfsHttpOperation.class);
+  private long connectionTimeMs;
+  private long sendRequestTimeMs;
+  private long recvResponseTimeMs;
+  private boolean shouldMask = false;
 
-  private HttpURLConnection connection;
+  private final List<AbfsHttpHeader> requestHeaders;
 
-  private boolean connectionDisconnectedOnError = false;
+  private final int connectionTimeout, readTimeout;
 
-  public static AbfsHttpOperation getAbfsHttpOperationWithFixedResult(
+  public AbfsHttpOperation(Logger logger,
       final URL url,
       final String method,
       final int httpStatus) {
-    AbfsHttpOperationWithFixedResult httpOp
-        = new AbfsHttpOperationWithFixedResult(url, method, httpStatus);
-    return httpOp;
+    this.log = logger;
+    this.url = url;
+    this.method = method;
+    this.statusCode = httpStatus;

Review Comment:
   This is no longer required. It was needed only because the value was 
hard-set in the child classes; now it is set directly in the parent class. 
The changes to this block have been reverted to match trunk.





> [ABFS]: ApacheHttpClient adaptation as network library
> ------------------------------------------------------
>
>                 Key: HADOOP-19120
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19120
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.5.0
>            Reporter: Pranav Saxena
>            Assignee: Pranav Saxena
>            Priority: Major
>              Labels: pull-request-available
>
> Apache HttpClient is more feature-rich and flexible, and gives applications 
> more granular control over networking parameters.
> ABFS currently relies on the JDK-net library. This library is managed by 
> OpenJDK and has no performance problems. However, it limits the application's 
> control over networking, and it exposes very few APIs and hooks that the 
> application can use to get metrics or to choose which connection is reused 
> and when. ApacheHttpClient provides hooks to fetch important metrics and 
> control networking parameters.
> A custom implementation of the connection pool is used. The implementation is 
> adapted from the JDK8 connection pooling. Reasons for doing so:
> 1. The PoolingHttpClientConnectionManager heuristic caches all the reusable 
> connections it has created. JDK's implementation caches only a limited number 
> of connections. The limit is given by the JVM system property 
> "http.maxConnections"; if the property is not set, it defaults to 5. 
> Connection-establishment latency increased when all the connections were 
> cached. Hence the pooling heuristic of the JDK netlib was adapted.
> 2. PoolingHttpClientConnectionManager expects the application to 
> provide `setMaxPerRoute` and `setMaxTotal`, which the implementation uses as 
> the total number of connections it can create. For applications using ABFS, 
> it is not feasible to provide a value at the initialisation of the 
> connectionManager. JDK's implementation has no cap on the number of 
> connections it can have open at any moment. Hence the pooling heuristic of 
> the JDK netlib was adapted.
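
For illustration only, the pooling heuristic described above could be sketched roughly as follows. This is not the class from the PR: the name BoundedKeepAliveCache and its methods are hypothetical, and the sketch only assumes what the description states, namely that at most "http.maxConnections" idle connections (default 5) are kept for reuse while the number of connections the caller may open is not capped.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical sketch, not the ABFS implementation: keeps at most a fixed
 * number of idle connections for reuse and never limits how many connections
 * the caller opens.
 */
final class BoundedKeepAliveCache<C extends AutoCloseable> {

  /** JVM system property named in the issue description. */
  static final String MAX_CONN_PROPERTY = "http.maxConnections";

  /** Default stated in the issue description when the property is unset. */
  static final int DEFAULT_MAX_CONN = 5;

  private final int maxIdle;
  private final Deque<C> idle = new ArrayDeque<>();

  /** Reads the limit from the system property, falling back to the default. */
  BoundedKeepAliveCache() {
    this(Integer.getInteger(MAX_CONN_PROPERTY, DEFAULT_MAX_CONN));
  }

  BoundedKeepAliveCache(int maxIdle) {
    if (maxIdle < 0) {
      throw new IllegalArgumentException("maxIdle must be >= 0");
    }
    this.maxIdle = maxIdle;
  }

  /**
   * Returns the most recently cached idle connection, or null when the cache
   * is empty. On null the caller opens a new connection itself, so the cache
   * imposes no cap on open connections.
   */
  synchronized C get() {
    return idle.pollFirst();
  }

  /**
   * Offers a connection for reuse. If the cache already holds maxIdle
   * connections, the offered connection is closed instead of being cached.
   *
   * @return true if the connection was cached, false if it was closed.
   */
  boolean put(C connection) {
    synchronized (this) {
      if (idle.size() < maxIdle) {
        idle.addFirst(connection);
        return true;
      }
    }
    try {
      connection.close(); // best-effort close outside the lock
    } catch (Exception e) {
      // Ignored in this sketch; real code would at least log the failure.
    }
    return false;
  }

  /** Number of idle connections currently cached. */
  synchronized int size() {
    return idle.size();
  }
}
```

With a limit of 2, the first two connections returned via put() are retained and a third is closed immediately; get() hands back the most recently cached connection first, and returns null once the cache is empty so that the caller creates a fresh connection.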



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
