Hello Apache Kafka community,

In the Consumer, Producer, AdminClient and Broker configuration
documentation there's a common config property, request.timeout.ms, whose
descriptions all share this part:
"The configuration controls the maximum amount of time the client will wait
for the response of a request. If the response is not received before the
timeout elapses the client will resend the request if necessary or fail the
request if retries are exhausted."

If I'm not mistaken, the term "client" in all the different
request.timeout.ms config property descriptions actually refers to
NetworkClient, which is a somewhat leaky internal abstraction. There seems
to be no mention of NetworkClient on the Kafka documentation page.
According to its javadoc, NetworkClient is:
"A network client for asynchronous request/response network i/o. This is an
internal class used to implement the user-facing producer and consumer
clients."
Since it's considered to be an internal class, maybe it could be moved into
an "internal" package like other internal classes.
More importantly, the NetworkClient javadoc (second sentence) is not
entirely true, since NetworkClient is used on the broker side too, e.g. to
exchange controlled shutdown requests/responses, which IMO has nothing to do
with "user-facing producer and consumer clients". Because the NetworkClient
abstraction is used on the broker side, there's a request.timeout.ms config
property not only for the producer/consumer but also in the broker
configuration.

Can somebody please verify if my understanding of the current situation is
correct?

The Kafka documentation doesn't mention which requests are affected by
tuning each of the request.timeout.ms config properties, or how, if at all,
the different request timeouts are related.

Specifically, I'd like to lower the producer/consumer request timeout so
that user-facing client requests like Produce/Fetch/Metadata are affected,
but e.g. controlled shutdown requests on the broker side are not. I'm not
sure whether the broker-side request timeout can be left unchanged, or
whether there is a combination/chain of client-side and broker-side
requests/responses that are related, so that the request timeout settings
have to be kept in sync.

I guess a client-side Produce request and the broker-side replica Fetch
form a kind of chain/dependency: depending on acks, a Produce request
cannot finish successfully until enough replicas have got the message. The
Producer's "request.timeout.ms" description explains the relationship with
the Broker's "replica.lag.time.max.ms" (potential produce failures or
duplicated messages due to retries being the negative side effects), but
the relationship with the Broker's "request.timeout.ms" is not covered.

Similarly, a Consumer's Fetch request to the leader broker can, it seems,
only retrieve messages that have been replicated to the rest of the ISR
set, so there is again a kind of dependency on replica Fetch. This time the
dependency has a less negative side effect: it seems there could be more
empty reads if the Consumer's request timeout is lower than the Broker's,
which is a tradeoff between lower latency of individual requests and lower
load / number of requests.
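
For concreteness, this is roughly the client-side change I have in mind. It
is only a minimal sketch: the class and method names and the 15 second
value are illustrative assumptions, and the property names are passed as
plain strings so the snippet does not depend on any particular client
version. The broker's own request.timeout.ms in server.properties would be
left untouched.

```java
import java.util.Properties;

// Hypothetical helper, not part of Kafka: builds client configs that lower
// only the client-side request.timeout.ms.
class ClientTimeoutSketch {
    // Illustrative value only, chosen to be below the 30 sec default
    // discussed in this mail.
    static final int LOWERED_REQUEST_TIMEOUT_MS = 15_000;

    static Properties producerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("acks", "all");
        props.put("request.timeout.ms", String.valueOf(LOWERED_REQUEST_TIMEOUT_MS));
        return props;
    }

    static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("group.id", groupId);
        // Assumption to verify per version: some consumer versions appear to
        // require request.timeout.ms to be greater than session.timeout.ms
        // and fetch.max.wait.ms.
        props.put("request.timeout.ms", String.valueOf(LOWERED_REQUEST_TIMEOUT_MS));
        return props;
    }
}
```

These Properties objects would then be handed to the KafkaProducer and
KafkaConsumer constructors, together with the usual serializer and
deserializer settings.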

Is there a reason that the producer/consumer request timeouts are set to
the same default value as the request timeout default on the broker side?

Additionally, is there a reason why there are no per-request-type timeout
config properties? Does a 30 sec timeout really make sense for all kinds of
requests, both user-facing and internal coordination requests? The new
AdminClient seems to be the only one with a different default request
timeout, 120 sec (the old Scala AdminClient seems to have a 5 sec default
request timeout).

On a related note, even after looking at SocketServerTest I'm not sure what
happens with request handling on the broker side when a producer/consumer
side timeout occurs. How will the broker treat this: will it stop
processing the request and release any resources used, or will it continue
to process the request and consume resources until the request is fully
processed, so that when the producer/consumer retries its timed-out
request, the broker-side load just multiplies with the number of retries?
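
To illustrate the concern, here is a toy sketch of a generic
timeout-and-retry loop. It is not Kafka code: every name in it is
hypothetical, and the "server" is just a callback that counts requests and
never responds. It only shows that, from the sender's side, a timeout by
itself carries no cancellation signal, so a slow receiver could see up to
retries + 1 copies of the same request.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Toy model of "resend on timeout, fail when retries are exhausted".
class RetryAmplificationSketch {

    static <T> T sendWithRetries(Supplier<CompletableFuture<T>> send,
                                 long requestTimeoutMs, int retries)
            throws InterruptedException, ExecutionException, TimeoutException {
        if (retries < 0) {
            throw new IllegalArgumentException("retries must be >= 0");
        }
        TimeoutException lastTimeout = null;
        for (int attempt = 0; attempt <= retries; attempt++) {
            CompletableFuture<T> inFlight = send.get(); // one more request goes out
            try {
                return inFlight.get(requestTimeoutMs, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // Only the caller gives up here; nothing in this loop tells
                // the receiving side to stop working on the old request.
                lastTimeout = e;
            }
        }
        throw lastTimeout; // retries exhausted
    }

    // Simulates a receiver that counts incoming requests and never answers.
    static int requestsSeenByServer(long requestTimeoutMs, int retries) {
        AtomicInteger received = new AtomicInteger();
        Supplier<CompletableFuture<String>> neverAnswers = () -> {
            received.incrementAndGet();
            return new CompletableFuture<>();
        };
        try {
            sendWithRetries(neverAnswers, requestTimeoutMs, retries);
        } catch (TimeoutException expected) {
            // every attempt timed out, which is the scenario being modelled
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
        return received.get();
    }
}
```

With retries = 2, requestsSeenByServer reports 3 requests. Whether the
broker actually keeps working on the first two is exactly what I'm asking.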

Kind regards,
Stevo Slavic.
