[ https://issues.apache.org/jira/browse/HADOOP-19664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18016496#comment-18016496 ]
Ahmar Suhail commented on HADOOP-19664:
---------------------------------------

Sync. It's using async currently, but the readVectored + AAL use case is not ideal for the async client. We already have our own thread pool, and each thread is responsible for making a single S3 request and then immediately reading from that input stream to fill the internal buffers.

With the async client, this means you need to join() immediately, and at higher concurrency things get stuck in the Netty thread pool and in the AsyncResponseTransformer.toBlockingInputStream() of s3AsyncClient.getObject(builder.build(), AsyncResponseTransformer.toBlockingInputStream()).

The S3 async client works well (I think) when you have high concurrency but don't need to join on the data immediately, so the Netty I/O pool is sufficient to satisfy those requests.

> S3A Analytics-Accelerator: Move AAL to use Java sync client
> -----------------------------------------------------------
>
>                 Key: HADOOP-19664
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19664
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.5.0
>            Reporter: Ahmar Suhail
>            Priority: Major
>
> Java sync client is giving the best performance for our use case, especially
> for readVectored() where a large number of requests can be made concurrently.
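
For illustration only, the thread-per-request pattern described in the comment can be sketched with the Java standard library alone. This is not the AAL or S3A implementation and does not use the AWS SDK: openRange() is a hypothetical stand-in for a blocking ranged GET, and the class name, method names, and in-memory "object" are assumptions made for the example. Each pool thread opens one range and drains it straight away, so there is no separate I/O pool to join on.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class VectoredReadSketch {

    /** Hypothetical stand-in for a blocking ranged GET; not an AWS SDK call. */
    static InputStream openRange(byte[] object, int offset, int length) {
        return new ByteArrayInputStream(object, offset, length);
    }

    /** Each range is {offset, length}; one pool task issues and drains one request. */
    static List<byte[]> readRanges(byte[] object, int[][] ranges, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<byte[]>> pending = new ArrayList<>();
            for (int[] r : ranges) {
                Callable<byte[]> task = () -> {
                    // The worker thread that opened the stream reads it immediately.
                    try (InputStream in = openRange(object, r[0], r[1])) {
                        return in.readAllBytes();
                    }
                };
                pending.add(pool.submit(task));
            }
            // Collect the filled buffers in the same order as the requested ranges.
            List<byte[]> buffers = new ArrayList<>();
            for (Future<byte[]> f : pending) {
                buffers.add(f.get());
            }
            return buffers;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] object = "0123456789abcdefghij".getBytes(StandardCharsets.US_ASCII);
        int[][] ranges = { {0, 4}, {10, 5}, {16, 4} };
        for (byte[] b : readRanges(object, ranges, 3)) {
            System.out.println(new String(b, StandardCharsets.US_ASCII));
        }
    }
}
```

In a real client the blocking read would sit on a network stream, which is where the comment's point applies: with a sync call the caller's own thread does the read, whereas an async call joined immediately would also occupy a thread in the async client's I/O pool.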