[ 
https://issues.apache.org/jira/browse/HADOOP-19664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18016496#comment-18016496
 ] 

Ahmar Suhail commented on HADOOP-19664:
---------------------------------------

Sync. 

It's using the async client currently, but the readVectored() + AAL use case is
not a good fit for it. We already have our own thread pool, where each thread 
is responsible for making a single S3 request and then immediately starts 
reading data from that input stream to fill the internal buffers. 
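
A minimal sketch of that thread-per-request pattern, using only the JDK. The 
names here (SyncRangeReads, Range, openRange, fetch) are hypothetical and not 
AAL or S3A code, and openRange() is a stand-in that reads from an in-memory 
byte array where the real code would issue a blocking ranged GET with the sync 
client: 

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SyncRangeReads {
    /** A byte range within the object (hypothetical helper type). */
    static final class Range {
        final int offset;
        final int length;

        Range(int offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    // Stand-in for a blocking ranged GET: assumption, not a real S3 call.
    static InputStream openRange(byte[] object, Range r) {
        return new ByteArrayInputStream(object, r.offset, r.length);
    }

    // One request per call: open the stream, then read it right away
    // on the same thread to fill the buffer for this range.
    static byte[] fetch(byte[] object, Range r) {
        try (InputStream in = openRange(object, r)) {
            return in.readNBytes(r.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Submit one task per range to our own pool; results keep range order.
    static List<byte[]> readVectored(byte[] object, List<Range> ranges, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<byte[]>> futures = new ArrayList<>();
            for (Range r : ranges) {
                Callable<byte[]> task = () -> fetch(object, r);
                futures.add(pool.submit(task));
            }
            List<byte[]> buffers = new ArrayList<>();
            for (Future<byte[]> f : futures) {
                buffers.add(f.get());
            }
            return buffers;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] object = "0123456789abcdefghij".getBytes(StandardCharsets.US_ASCII);
        List<Range> ranges = List.of(new Range(0, 4), new Range(10, 5));
        for (byte[] buf : readVectored(object, ranges, 2)) {
            System.out.println(new String(buf, StandardCharsets.US_ASCII));
        }
    }
}
```

Each worker blocks only on its own stream, so nothing waits on a shared I/O 
event loop between issuing the request and reading the bytes. 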

With the async client, this means you need to join() immediately, and at 
higher concurrency things get stuck in the Netty thread pool and in the 
AsyncResponseTransformer.toBlockingInputStream() of

s3AsyncClient
.getObject(builder.build(), AsyncResponseTransformer.toBlockingInputStream()).
 
The S3 async client works well (I think) when you have high concurrency but 
don't need to join on the data immediately, so the Netty I/O pool is sufficient 
to satisfy those requests. 

> S3A Analytics-Accelerator: Move AAL to use Java sync client
> -----------------------------------------------------------
>
>                 Key: HADOOP-19664
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19664
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.5.0
>            Reporter: Ahmar Suhail
>            Priority: Major
>
> Java sync client is giving the best performance for our use case, especially 
> for readVectored() where a large number of requests can be made concurrently. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

