[ 
https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17836595#comment-17836595
 ] 

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

anmolanmol1234 commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1562502863


##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsHttpOperation.java:
##########
@@ -20,370 +20,515 @@
 
 import java.io.IOException;
 import java.io.InputStream;
-import java.io.OutputStream;
 import java.net.HttpURLConnection;
-import java.net.ProtocolException;
 import java.net.URL;
+import java.util.ArrayList;
 import java.util.List;
 import java.util.Map;
 
-import javax.net.ssl.HttpsURLConnection;
-import javax.net.ssl.SSLSocketFactory;
-
-import org.apache.hadoop.classification.VisibleForTesting;
-import org.apache.hadoop.security.ssl.DelegatingSSLSocketFactory;
-
+import com.fasterxml.jackson.core.JsonFactory;
+import com.fasterxml.jackson.core.JsonParser;
+import com.fasterxml.jackson.core.JsonToken;
+import com.fasterxml.jackson.databind.ObjectMapper;
 import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
 
 import org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants;
 import org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations;
-
-import static 
org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.EXPECT_100_JDK_ERROR;
-import static 
org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HUNDRED_CONTINUE;
-import static 
org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.JDK_FALLBACK;
-import static 
org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.JDK_IMPL;
-import static 
org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations.EXPECT;
+import org.apache.hadoop.fs.azurebfs.contracts.services.AbfsPerfLoggable;
+import org.apache.hadoop.fs.azurebfs.contracts.services.ListResultSchema;
+import org.apache.hadoop.fs.azurebfs.utils.UriUtils;
 
 /**
- * Implementation of {@link HttpOperation} for orchestrating calls using JDK's 
HttpURLConnection.
+ * Base Http operation class for orchestrating server IO calls. Child classes 
would
+ * define the certain orchestration implementation on the basis of network 
library used.
+ * <p>
+ * For JDK netlib usage, the child class would be {@link 
AbfsJdkHttpOperation}. <br>
+ * For ApacheHttpClient netlib usage, the child class would be {@link 
AbfsAHCHttpOperation}.
+ * </p>
  */
-public class AbfsHttpOperation extends HttpOperation {
+public abstract class AbfsHttpOperation implements AbfsPerfLoggable {
+
+  private final Logger log;
+
+  private static final int CLEAN_UP_BUFFER_SIZE = 64 * 1024;
+
+  private static final int ONE_THOUSAND = 1000;
+
+  private static final int ONE_MILLION = ONE_THOUSAND * ONE_THOUSAND;
+
+  private final String method;
+  private final URL url;
+  private String maskedUrl;
+  private String maskedEncodedUrl;
+  private int statusCode;
+  private String statusDescription;
+  private String storageErrorCode = "";
+  private String storageErrorMessage = "";
+  private String requestId = "";
+  private String expectedAppendPos = "";
+  private ListResultSchema listResultSchema = null;
+
+  // metrics
+  private int bytesSent;
+  private int expectedBytesToBeSent;
+  private long bytesReceived;
 
-  private static final Logger LOG = LoggerFactory.getLogger(
-      AbfsHttpOperation.class);
+  private long connectionTimeMs;
+  private long sendRequestTimeMs;
+  private long recvResponseTimeMs;
+  private boolean shouldMask = false;
 
-  private HttpURLConnection connection;
+  private final List<AbfsHttpHeader> requestHeaders;
 
-  private boolean connectionDisconnectedOnError = false;
+  private final int connectionTimeout, readTimeout;
 
-  public static AbfsHttpOperation getAbfsHttpOperationWithFixedResult(
+  public AbfsHttpOperation(Logger logger,
       final URL url,
       final String method,
       final int httpStatus) {
-    AbfsHttpOperationWithFixedResult httpOp
-        = new AbfsHttpOperationWithFixedResult(url, method, httpStatus);
-    return httpOp;
+    this.log = logger;
+    this.url = url;
+    this.method = method;
+    this.statusCode = httpStatus;

Review Comment:
   shouldn't the status code come from connection response ?





> [ABFS]: ApacheHttpClient adaptation as network library
> ------------------------------------------------------
>
>                 Key: HADOOP-19120
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19120
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.5.0
>            Reporter: Pranav Saxena
>            Assignee: Pranav Saxena
>            Priority: Major
>              Labels: pull-request-available
>
> Apache HttpClient is more feature-rich and flexible and gives application 
> more granular control over networking parameter.
> ABFS currently relies on the JDK-net library. This library is managed by 
> OpenJDK and has no performance problem. However, it limits the application's 
> control over networking, and there are very few APIs and hooks exposed that 
> the application can use to get metrics, choose which and when a connection 
> should be reused. ApacheHttpClient will give important hooks to fetch 
> important metrics and control networking parameters.
> A custom implementation of connection-pool is used. The implementation is 
> adapted from the JDK8 connection pooling. Reasons for doing it:
> 1. PoolingHttpClientConnectionManager heuristic caches all the reusable 
> connections it has created. JDK's implementation only caches limited number 
> of connections. The limit is given by JVM system property 
> "http.maxConnections". If there is no system-property, it defaults to 5. 
> Connection-establishment latency increased with all the connections were 
> cached. Hence, adapting the pooling heuristic of JDK netlib,
> 2. In PoolingHttpClientConnectionManager, it expects the application to 
> provide `setMaxPerRoute` and `setMaxTotal`, which the implementation uses as 
> the total number of connections it can create. For application using ABFS, it 
> is not feasible to provide a value in the initialisation of the 
> connectionManager. JDK's implementation has no cap on the number of 
> connections it can have opened on a moment. Hence, adapting the pooling 
> heuristic of JDK netlib,



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to