[ 
https://issues.apache.org/jira/browse/KAFKA-15776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17809116#comment-17809116
 ] 

Kamal Chandraprakash commented on KAFKA-15776:
----------------------------------------------

[~fvisconte] 

We are cancelling the currently executing fetch 
[task|https://sourcegraph.com/github.com/apache/kafka@92a67e8571500a53cc864ba6df4cb9cfdac6a763/-/blob/core/src/main/scala/kafka/server/DelayedRemoteFetch.scala?L86]
 when the timeout happens. If the remote storage degrades, the consumer may 
not be able to make progress. I'll open a discussion thread on this.

One approach is not to cancel the currently executing remote fetch task and 
cache the result on the storage manager, so that the subsequent consumer FETCH 
request (for the same fetch-offset) can be served from the cache.
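
A minimal sketch of that idea, assuming a hypothetical in-memory cache keyed by partition and fetch offset. The class and method names below are illustrative only and are not part of Kafka's storage manager API. When a request times out, the remote read is left running, and the next request for the same key waits on the same task instead of starting a new one.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical sketch; not the actual Kafka implementation.
class RemoteFetchCacheSketch {
    // In-flight or completed remote reads, keyed by "<partition>@<fetchOffset>".
    private final Map<String, CompletableFuture<byte[]>> reads = new ConcurrentHashMap<>();

    Optional<byte[]> fetch(String topicPartition, long fetchOffset,
                           Supplier<byte[]> remoteRead, long maxWaitMs) {
        String key = topicPartition + "@" + fetchOffset;
        // Reuse the task started by an earlier request for the same key, if any.
        CompletableFuture<byte[]> future =
            reads.computeIfAbsent(key, k -> CompletableFuture.supplyAsync(remoteRead));
        try {
            byte[] data = future.get(maxWaitMs, TimeUnit.MILLISECONDS);
            reads.remove(key, future); // served, so evict the entry
            return Optional.of(data);
        } catch (TimeoutException e) {
            // Do not cancel: the task keeps running and its result stays cached
            // for the next FETCH request with the same fetch offset.
            return Optional.empty();
        } catch (ExecutionException e) {
            reads.remove(key, future); // failed read, let the next request retry
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Optional.empty();
        }
    }
}
```

A real implementation would also need to bound the cache size and expire entries that no consumer comes back for.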

> Update delay timeout for DelayedRemoteFetch request
> ---------------------------------------------------
>
>                 Key: KAFKA-15776
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15776
>             Project: Kafka
>          Issue Type: Task
>            Reporter: Kamal Chandraprakash
>            Assignee: Kamal Chandraprakash
>            Priority: Major
>
> We are reusing the {{fetch.max.wait.ms}} config as the delay timeout for 
> DelayedRemoteFetchPurgatory. The purpose of {{fetch.max.wait.ms}} is to wait 
> for the given amount of time when there is no data available to serve the 
> FETCH request.
> {code:java}
> The maximum amount of time the server will block before answering the fetch 
> request if there isn't sufficient data to immediately satisfy the requirement 
> given by fetch.min.bytes.
> {code}
> [https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/DelayedRemoteFetch.scala#L41]
> Using the same timeout in the DelayedRemoteFetchPurgatory can confuse users 
> about how to configure an optimal value for each purpose. Moreover, the 
> config is of *LOW* importance, so most users won't configure it and will use 
> the default value of 500 ms.
> A delay timeout of 500 ms in DelayedRemoteFetchPurgatory can lead to a 
> higher number of expired delayed remote fetch requests when the remote 
> storage is degraded.
> We should either introduce a {{fetch.remote.max.wait.ms}} config (preferably 
> a server config) to define the delay timeout for DelayedRemoteFetch requests, 
> or take the timeout from the client, similar to {{request.timeout.ms}}.



