[jira] [Commented] (KAFKA-15776) Update delay timeout for DelayedRemoteFetch request

Jorge Esteban Quilcate Otoya (Jira) Thu, 25 Jan 2024 17:18:03 -0800


    [ 
https://issues.apache.org/jira/browse/KAFKA-15776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17811071#comment-17811071
 ]


Jorge Esteban Quilcate Otoya commented on KAFKA-15776:
------------------------------------------------------

Thanks [~showuon]!

Sure, I can prepare a KIP if there's an initial agreement on the path to 
follow. Will prepare something for next week.

On not interrupting the thread:

My understanding is that, on a consumer remote fetch today, the request is 
submitted to the thread pool and cancelled on timeout, and only then retried. 
This means at most one task is submitted per consumer-partition remote fetch at 
any time.

If we opt for not interrupting the tasks, the future would be cancelled but the 
thread would keep running until the read completes. On timeout, the consumer 
will retry the fetch, allocating yet another task on the thread pool. 
Potentially, we would have more than one task submitted per consumer-partition 
remote fetch, holding more resources than needed to serve a single 
consumer-partition read from remote storage.
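
As a rough illustration of the concern above, here is a minimal, self-contained 
Java sketch. It is not Kafka code: the class name, the method name, the 50 ms 
timeout and the 10 s sleep standing in for a slow remote read are all 
assumptions made up for the example. It only relies on the standard 
java.util.concurrent behaviour that Future.cancel(false) marks the future as 
cancelled without stopping a task that is already running, while 
Future.cancel(true) also interrupts the worker thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; none of these names come from the Kafka code base.
class RemoteFetchCancelSketch {

    // Simulates `attempts` fetches of one consumer-partition, each of which times out,
    // and returns how many simulated remote reads are still running afterwards.
    static int readsStillRunning(boolean interruptOnTimeout, int attempts) {
        ExecutorService pool = Executors.newFixedThreadPool(attempts);
        AtomicInteger running = new AtomicInteger();
        try {
            for (int i = 0; i < attempts; i++) {
                CountDownLatch started = new CountDownLatch(1);
                Future<?> read = pool.submit(() -> {
                    running.incrementAndGet();
                    started.countDown();
                    try {
                        Thread.sleep(10_000); // stand-in for a slow remote storage read
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt(); // read abandoned on interrupt
                    } finally {
                        running.decrementAndGet();
                    }
                });
                started.await(); // make sure the read occupies a pool thread
                try {
                    read.get(50, TimeUnit.MILLISECONDS); // stand-in for the delay timeout
                } catch (TimeoutException e) {
                    // cancel(false) only marks the future as cancelled;
                    // cancel(true) additionally interrupts the worker thread.
                    read.cancel(interruptOnTimeout);
                }
            }
            Thread.sleep(200); // grace period so interrupted reads can unwind
            return running.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow(); // interrupt whatever is left so the JVM can exit
        }
    }

    public static void main(String[] args) {
        System.out.println("interrupt on timeout: " + readsStillRunning(true, 3));
        System.out.println("no interrupt:         " + readsStillRunning(false, 3));
    }
}
```

With three timed-out attempts, the interrupting variant should leave no reads 
running, while the non-interrupting variant should still have all three 
occupying pool threads, which is the resource build-up described above.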

Let me know if this makes sense. This is mostly speculation, so I can dive 
further if some of my reasoning is incorrect.

> Update delay timeout for DelayedRemoteFetch request
> ---------------------------------------------------
>
>                 Key: KAFKA-15776
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15776
>             Project: Kafka
>          Issue Type: Task
>            Reporter: Kamal Chandraprakash
>            Assignee: Kamal Chandraprakash
>            Priority: Major
>
> We are reusing the {{fetch.max.wait.ms}} config as a delay timeout for 
> DelayedRemoteFetchPurgatory. The purpose of {{fetch.max.wait.ms}} is to wait 
> for the given amount of time when there is no data available to serve the 
> FETCH request.
> {code:java}
> The maximum amount of time the server will block before answering the fetch 
> request if there isn't sufficient data to immediately satisfy the requirement 
> given by fetch.min.bytes.
> {code}
> [https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/DelayedRemoteFetch.scala#L41]
> Using the same timeout in the DelayedRemoteFetchPurgatory can confuse the 
> user on how to configure an optimal value for each purpose. Moreover, the 
> config is of *LOW* importance, so most users won't configure it and will use 
> the default value of 500 ms.
> Having a delay timeout of 500 ms in DelayedRemoteFetchPurgatory can lead to a 
> higher number of expired delayed remote fetch requests when the remote 
> storage has any degradation.
> We should introduce one {{fetch.remote.max.wait.ms}} config (preferably 
> server config) to define the delay timeout for DelayedRemoteFetch requests 
> (or) take it from client similar to {{request.timeout.ms}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
