kamalcph commented on PR #14778:
URL: https://github.com/apache/kafka/pull/14778#issuecomment-1818291232

   > Could you please help me understand how this change works with 
fetch.max.wait.ms from a user perspective i.e. what happens when we are 
retrieving data from both local & remote in a single fetch call?
   
   The `fetch.max.wait.ms` timeout applies only when there is not enough data 
(`fetch.min.bytes`) to respond to the client. Reading from both local and 
remote storage is a special case: the FETCH request has to wait for the tail 
latency, which is the combined latency of reading from both local and remote 
storage. 
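
   As a rough illustration of the rule above, here is a toy Python model of 
how long the broker may hold a FETCH request. It is not the broker code: the 
function name, its parameters, and the simplification that a remote read is 
bounded by a single 30-second timeout are all assumptions made for this sketch.

```python
def max_hold_ms(local_bytes, min_bytes, max_wait_ms,
                remote_read_ms=None, remote_timeout_ms=30_000):
    """Upper bound (ms) on how long a FETCH is held before a response.

    Hypothetical helper for illustration only.
    remote_read_ms is None when no remote read is needed.
    """
    if remote_read_ms is not None:
        # A remote read is in flight: the response waits for it, bounded
        # by the (assumed) server-side request timeout, not fetch.max.wait.ms.
        return min(remote_read_ms, remote_timeout_ms)
    if local_bytes >= min_bytes:
        # Enough local data (fetch.min.bytes satisfied): respond right away.
        return 0
    # Not enough data: wait at most fetch.max.wait.ms for more to arrive.
    return max_wait_ms
```

   For example, `max_hold_ms(0, 1024, 500)` gives the `fetch.max.wait.ms` 
bound, while `max_hold_ms(0, 1024, 500, remote_read_ms=2000)` is governed by 
the remote read latency instead.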
   
   Note that we always read from only one remote partition, up to 
`max.partition.fetch.bytes`, even though there is bandwidth available in the 
FETCH response (`fetch.max.bytes`). The client rotates the partition order in 
the next FETCH request so that the next partitions are served.
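
   The rotation can be pictured with a small Python simulation. This is only a 
sketch: `simulate_remote_fetches` is a hypothetical helper, and it assumes the 
client moves the partition that just returned data to the end of its list, 
while the broker reads remote data for the first remote partition in request 
order.

```python
def simulate_remote_fetches(partitions, remote_partitions, rounds):
    """Return which remote partition is served in each FETCH round.

    Hypothetical model: one remote partition is read per FETCH, and the
    served partition is moved to the back of the client's request order.
    """
    order = list(partitions)
    served = []
    for _ in range(rounds):
        # Broker side: pick the first partition that needs a remote read.
        pick = next((p for p in order if p in remote_partitions), None)
        served.append(pick)
        if pick is not None:
            # Client side: rotate so other partitions go first next time.
            order.remove(pick)
            order.append(pick)
    return served
```

   With three remote partitions, each FETCH serves a different one in turn, so 
no partition is starved even though only one remote read happens per request.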
   
   > Also, wouldn't this change user clients? Asking because prior to this 
change users were expecting a guaranteed response within fetch.max.wait.ms = 
500ms but now they might not receive a response until 40s request.timeout.ms. 
If the user has configured their application timeouts according to 
fetch.max.wait.ms, this change will break my application.
   
   `fetch.max.wait.ms` doesn't guarantee a response within this timeout. The 
client expires the request only when it exceeds `request.timeout.ms`, which is 
30 seconds by default. The time taken to serve the FETCH request can be higher 
than `fetch.max.wait.ms` due to a slow hard disk, disk sector errors, and so on.
   
   The 
[FetchRequest.json](https://sourcegraph.com/github.com/apache/kafka/-/blob/clients/src/main/resources/common/message/FetchRequest.json)
 doesn't expose the client-configured request timeout, so we are using the 
default server request timeout of 30 seconds. Alternatively, we can introduce 
one more config, `fetch.remote.max.wait.ms`, to define the delay timeout for 
DelayedRemoteFetch requests. We need to decide whether to keep this config on 
the client or the server, since the server operator may need to tune it if the 
remote storage degrades and the latency to serve FETCH requests is high.

