We're using Kafka 2.4.0 and trying to understand the Java consumer's behavior with respect to transient network problems, such as being temporarily disconnected from a broker due to a passing broker or network issue.
The following consumer config settings imply that, in the above scenario, all consumer client API calls such as poll() and commit(), as well as the internal group-coordination and heartbeat calls, would be automatically retried up to the configured maximum time. Is that correct?

*reconnect.backoff.ms*: The base amount of time to wait before attempting to reconnect to a given host. This avoids repeatedly connecting to a host in a tight loop. This backoff applies to all connection attempts by the client to a broker.

*reconnect.backoff.max.ms*: The maximum amount of time in milliseconds to wait when reconnecting to a broker that has repeatedly failed to connect. If provided, the backoff per host will increase exponentially for each consecutive connection failure, up to this maximum. After calculating the backoff increase, 20% random jitter is added to avoid connection storms.

For an "external" consumer call like poll() or commit() that fails after the maximum backoff, if the connection to the broker is later reestablished, subsequent calls to those APIs should complete successfully, correct?

What about the internal calls for group coordination, heartbeating, etc.? Are those client functions permanently disabled after failing for reconnect.backoff.max.ms -- in which case the client is effectively dead in the water and should be restarted -- or are those NOT subject to that maximum?

thanks,
Chris
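P.S. To make sure I'm reading the two settings correctly, here is a minimal sketch of the backoff schedule they seem to describe (doubling per consecutive failure, capped at the max, then up to 20% random jitter). The class name and the loop are mine; the 50 ms / 1000 ms values are just what I believe the defaults to be, and the jitter formula is my guess at how "20% random jitter is added" works, not the actual client code:

```java
import java.util.Properties;
import java.util.Random;

public class ReconnectBackoffSketch {
    public static void main(String[] args) {
        // Real Kafka config key names; values here are what I understand the defaults to be.
        Properties props = new Properties();
        props.put("reconnect.backoff.ms", "50");        // base backoff per host
        props.put("reconnect.backoff.max.ms", "1000");  // cap on per-host backoff

        long base = Long.parseLong(props.getProperty("reconnect.backoff.ms"));
        long max = Long.parseLong(props.getProperty("reconnect.backoff.max.ms"));
        Random rng = new Random();

        // Per the quoted docs: the backoff doubles for each consecutive connection
        // failure, is capped at reconnect.backoff.max.ms, and then jitter is added
        // to avoid connection storms. Jitter modeled here as +0..20% (my assumption).
        for (int failures = 0; failures < 8; failures++) {
            long backoff = Math.min(base * (1L << failures), max);
            double jittered = backoff * (1.0 + 0.2 * rng.nextDouble());
            System.out.printf("failure %d: base backoff %d ms, with jitter ~%.0f ms%n",
                    failures, backoff, jittered);
        }
    }
}
```

If that reading is right, these settings bound the delay *between* reconnect attempts per host, which is what prompts my question about whether anything ever gives up permanently.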