rishabhdaim commented on code in PR #2621:
URL: https://github.com/apache/jackrabbit-oak/pull/2621#discussion_r2554596800
##########
oak-blob-cloud/src/main/java/org/apache/jackrabbit/oak/blob/cloud/s3/Utils.java:
##########
@@ -372,16 +408,24 @@ static String calculateMD5(final String customerKey) {
}
private static ClientOverrideConfiguration
getClientConfiguration(Properties prop) {
+ final boolean isS3 = Objects.equals(RemoteStorageMode.S3,
prop.get(S3Constants.MODE));
+
int maxErrorRetry =
Integer.parseInt(prop.getProperty(S3Constants.S3_MAX_ERR_RETRY));
int connectionTimeOut =
Integer.parseInt(prop.getProperty(S3Constants.S3_CONN_TIMEOUT));
String encryptionType = prop.getProperty(S3Constants.S3_ENCRYPTION);
+ // API timeout should be much longer than connection timeout for large
file uploads
+ // Use at least 5 minutes, or 10x connection timeout, whichever is
larger
+ int apiTimeout = Math.max(connectionTimeOut * 10, 300000); // At least
5 minutes
Review Comment:
These values have been suggested by the cursor and confirmed the same with
perplexity.
##########
oak-blob-cloud/src/main/java/org/apache/jackrabbit/oak/blob/cloud/s3/S3Backend.java:
##########
@@ -514,11 +514,13 @@ public void addMetadataRecord(final InputStream input,
final String name) throws
try {
Thread.currentThread().setContextClassLoader(getClass().getClassLoader());
// Specify `null` for the content length when you don't know the
content length.
- final AsyncRequestBody body =
AsyncRequestBody.fromInputStream(input, null, executor);
+ byte[] bytes = input.readAllBytes();
Review Comment:
I have tried with files up to 200+ MB, and it worked fine with them.
The real reason for doing it this way is to allow GCP to work, which doesn't
work when the length is not known.
##########
oak-blob-cloud/src/main/java/org/apache/jackrabbit/oak/blob/cloud/s3/Utils.java:
##########
@@ -410,12 +473,39 @@ private static SdkHttpClient getSdkHttpClient(Properties
prop) {
private static SdkAsyncHttpClient getSdkAsyncHttpClient(Properties prop) {
HttpClientConfig config = new HttpClientConfig(prop);
+ final boolean isGCP = Objects.equals(RemoteStorageMode.GCP,
prop.get(S3Constants.MODE));
final NettyNioAsyncHttpClient.Builder builder =
NettyNioAsyncHttpClient.builder();
- builder.connectionTimeout(Duration.ofMillis(config.connectionTimeout))
+ // Calculate connection lifecycle based on socket timeout (all in
SECONDS)
+ long socketTimeoutSeconds = config.socketTimeout / 1000;
+
+ // Idle time: 2x socket timeout (min 30s, max 120s)
+ long idleTimeSeconds = Math.min(120, Math.max(30, socketTimeoutSeconds
* 2));
+
+ // TTL: 5x socket timeout (min 60s, max 600s = 10min)
+ long ttlSeconds = Math.min(600, Math.max(60, socketTimeoutSeconds *
5));
+
+ // GCP needs higher concurrency (no HTTP/2, so more connections needed)
+ final int concurrency = isGCP ? Math.max(100, config.maxConnections) :
Math.max(50, config.maxConnections);
+
+ // More threads for GCP
+ final int threads = isGCP ? Math.max(16,
Runtime.getRuntime().availableProcessors() * 2)
+ : Math.max(4, Runtime.getRuntime().availableProcessors());
+
+
+ builder.connectionTimeout(Duration.ofMillis(config.connectionTimeout))
// Connection timeouts
.readTimeout(Duration.ofMillis(config.socketTimeout))
.writeTimeout(Duration.ofMillis(config.socketTimeout))
- .maxConcurrency(config.maxConnections);
+ .maxConcurrency(concurrency) // Connection pool - increased
for better concurrency
+ .connectionMaxIdleTime(Duration.ofSeconds(idleTimeSeconds))
+ .connectionTimeToLive(Duration.ofSeconds(ttlSeconds))
+ .useIdleConnectionReaper(true)
+ .connectionAcquisitionTimeout(Duration.ofSeconds(10)) // Don't
wait too long for a connection from pool
Review Comment:
This is an leftover change, would remove it.
##########
oak-blob-cloud/src/main/java/org/apache/jackrabbit/oak/blob/cloud/s3/Utils.java:
##########
@@ -482,6 +574,17 @@ private static void configureBuilder(final
S3BaseClientBuilder builder, final Pr
builder.endpointOverride(getEndPointUri(prop, accReq, region));
builder.crossRegionAccessEnabled(Boolean.parseBoolean(prop.getProperty(S3Constants.S3_CROSS_REGION_ACCESS)));
+
+ // Disable checksums (replaces deprecated checksumValidationEnabled)
+
builder.requestChecksumCalculation(RequestChecksumCalculation.WHEN_REQUIRED);
+
builder.responseChecksumValidation(ResponseChecksumValidation.WHEN_REQUIRED);
+
Review Comment:
Good catch, these needs to be disabled only for GCP.
##########
oak-blob-cloud/src/main/java/org/apache/jackrabbit/oak/blob/cloud/s3/S3Backend.java:
##########
@@ -528,15 +530,6 @@ public void addMetadataRecord(final InputStream input,
final String name) throws
if (contextClassLoader != null) {
Thread.currentThread().setContextClassLoader(contextClassLoader);
}
- executor.shutdown();
Review Comment:
Actually, the creation of Executor hasn't been removed. Missed that, would
remove it.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]