[ 
https://issues.apache.org/jira/browse/HADOOP-19641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18012494#comment-18012494
 ] 

Anuj Modi commented on HADOOP-19641:
------------------------------------

Thanks for the feedback, Steve. We will definitely incorporate that.
I will hold this PR and make the change with the user-suggested read policy 
taken into consideration.

We will work through all the read policies and make sure reads happen in the 
way that is optimal for each of them.

> ABFS: [ReadAheadV2] First Read should bypass ReadBufferManager
> --------------------------------------------------------------
>
>                 Key: HADOOP-19641
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19641
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.4.1
>            Reporter: Anuj Modi
>            Assignee: Anuj Modi
>            Priority: Major
>              Labels: Performance
>
> We have observed across multiple workload runs that when we start reading 
> data from an input stream, the first read that reaches the stream has to be 
> served synchronously even if we trigger a prefetch request for that 
> particular offset. Most of the time we end up doing the extra work of 
> checking whether the prefetch was triggered, removing it from the pending 
> queue, and then doing a direct remote read in the workload thread itself.
> To avoid this overhead, we will always bypass read-ahead for the very 
> first read of each input stream and trigger read-aheads from the second read 
> onwards.
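
For readers following the thread, the proposed behaviour can be sketched 
roughly as below. This is only an illustration and not the ABFS code: the 
class FirstReadBypassSketch, its fields, and the readAheadDepth and blockSize 
parameters are all hypothetical, and the remote read and prefetch queue are 
replaced by a simple action log.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch (not the ABFS implementation): the first read on a
 * stream is served directly, and prefetches are queued only from the
 * second read onwards.
 */
class FirstReadBypassSketch {
    private final int readAheadDepth; // assumed number of blocks to prefetch
    private final int blockSize;      // assumed block size in bytes
    private boolean firstReadDone = false;

    /** Records the simulated actions so the behaviour can be inspected. */
    final List<String> log = new ArrayList<>();

    FirstReadBypassSketch(int readAheadDepth, int blockSize) {
        this.readAheadDepth = readAheadDepth;
        this.blockSize = blockSize;
    }

    /**
     * Simulates one read at the given offset.
     * Returns the offsets for which a prefetch was queued by this call.
     */
    List<Long> read(long offset) {
        List<Long> prefetched = new ArrayList<>();
        if (!firstReadDone) {
            // First read: skip the read-ahead machinery entirely and do a
            // direct (synchronous) read in the calling thread.
            firstReadDone = true;
            log.add("direct@" + offset);
            return prefetched;
        }
        // Second read onwards: queue prefetches starting at this offset,
        // then serve the read through the buffered path.
        for (int i = 0; i < readAheadDepth; i++) {
            long prefetchOffset = offset + (long) i * blockSize;
            prefetched.add(prefetchOffset);
            log.add("prefetch@" + prefetchOffset);
        }
        log.add("buffered@" + offset);
        return prefetched;
    }
}
```

In this sketch the first call to read() queues nothing, so there is no 
pending prefetch to check for or cancel; every later call queues 
readAheadDepth prefetches before it is served.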




