kamalcph commented on PR #14778: URL: https://github.com/apache/kafka/pull/14778#issuecomment-1818291232
> Could you please help me understand how this change works with fetch.max.wait.ms from a user perspective i.e. what happens when we are retrieving data from both local & remote in a single fetch call?

The `fetch.max.wait.ms` timeout applies only when there is not enough data (`fetch.min.bytes`) to respond to the client. This is a special case: when data is read from both local and remote storage, the FETCH request has to wait for the tail latency, which is the combined latency of reading from both. Note that we always read from only one remote partition, up to `max.partition.fetch.bytes`, even though there may be spare capacity in the FETCH response (`fetch.max.bytes`). The client rotates the partition order in the next FETCH request so that the remaining partitions are served.

> Also, wouldn't this change user clients? Asking because prior to this change users were expecting a guaranteed response within fetch.max.wait.ms = 500ms but now they might not receive a response until 40s request.timeout.ms. If the user has configured their application timeouts according to fetch.max.wait.ms, this change will break my application.

`fetch.max.wait.ms` doesn't guarantee a response within this timeout. The client expires the request only when it exceeds `request.timeout.ms`, which is 30 seconds by default. The time taken to serve a FETCH request can already exceed `fetch.max.wait.ms` because of a slow hard disk, disk sector errors, and so on.

[FetchRequest.json](https://sourcegraph.com/github.com/apache/kafka/-/blob/clients/src/main/resources/common/message/FetchRequest.json) doesn't expose the client-configured request timeout, so we are using the default server request timeout of 30 seconds. Otherwise, we can introduce one more config, `fetch.remote.max.wait.ms`, to define the delay timeout for DelayedRemoteFetch requests.
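
For reference, here is a minimal sketch of the consumer settings discussed above, set to their documented default values. It uses plain string keys in a `java.util.Properties` object so it runs without the Kafka client library on the classpath. The class name `FetchTimeoutSketch` and the helper `clientSideBoundMs` are hypothetical names used only for illustration; they are not part of Kafka or of this PR.

```java
import java.util.Properties;

public class FetchTimeoutSketch {

    // Consumer settings named in the discussion, with their documented defaults.
    static Properties consumerFetchDefaults() {
        Properties p = new Properties();
        p.setProperty("fetch.min.bytes", "1");                 // respond as soon as any data is available
        p.setProperty("fetch.max.wait.ms", "500");             // max broker wait when fetch.min.bytes is not met
        p.setProperty("request.timeout.ms", "30000");          // client expires the request after this
        p.setProperty("max.partition.fetch.bytes", "1048576"); // per-partition cap (1 MiB)
        p.setProperty("fetch.max.bytes", "52428800");          // per-response cap (50 MiB)
        return p;
    }

    // Hypothetical helper: the only client-side bound on a FETCH response is
    // request.timeout.ms; fetch.max.wait.ms is not a response-time guarantee.
    static long clientSideBoundMs(Properties p) {
        return Long.parseLong(p.getProperty("request.timeout.ms"));
    }

    public static void main(String[] args) {
        Properties p = consumerFetchDefaults();
        System.out.println("fetch.max.wait.ms = " + p.getProperty("fetch.max.wait.ms"));
        System.out.println("client-side bound = " + clientSideBoundMs(p) + " ms");
    }
}
```

An application that sizes its own timeouts from these settings would, under this reading, base them on `request.timeout.ms` rather than on `fetch.max.wait.ms`.
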
We need to decide whether such a config (`fetch.remote.max.wait.ms`) belongs on the client or on the server, since the server operator may need to tune it if the remote storage degrades and the latency to serve FETCH requests is high.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org