Hi all,

We're upgrading a Kafka Streams application from 0.10.2.1 to 0.11.0.1. The
application runs against a Kafka cluster on version 0.10.2.1.

When we first attempted to upgrade our application to Kafka 0.11.0.1, we
observed that deleting the PVCs for the service and restarting exposed an
unintended side effect of the new Kafka Streams task stickiness behavior
(https://issues.apache.org/jira/browse/KAFKA-4677). Specifically, in
scenarios where PVC recreation is necessary, it appears that the new sticky
assignment semantics caused the task assignor to prematurely decide that
there was only ever going to be one instance of our app in the consumer
group, and therefore all tasks were assigned to the stream threads on the
first instance. Later, when the second and third instances of our app were
launched, the task assignor did not take these new instances into account,
and none of the partitions were assigned to them.
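
For context, each instance of the app is configured roughly like the sketch
below (the topic names, state directory, and thread count are placeholders
rather than our real values, and the actual topology is more involved):

    import java.util.Properties;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStreamBuilder;

    public class OurApp {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "our-app");        // shared group id across all pods
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka:9092");  // 0.10.2.1 brokers
            props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 4);            // stream threads per instance
            props.put(StreamsConfig.STATE_DIR_CONFIG, "/data/streams-state"); // RocksDB state on the PVC mount

            KStreamBuilder builder = new KStreamBuilder();                    // 0.11.0.x DSL entry point
            builder.stream("input-topic").to("output-topic");                 // placeholder topology

            KafkaStreams streams = new KafkaStreams(builder, props);
            streams.start();
            Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        }
    }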

We worked around this by deleting the PVCs and, once pod one was running
(running as defined by k8s), repeatedly running kubectl delete on that pod
until instances 2 and 3 were successfully running as well. Only then would
we allow the reprocessing to take place. This caused the task assignor to
take all pod instances into account before making the assignments that
'stuck'.


I saw that the 0.11.0.0 release introduced a new broker config,
"group.initial.rebalance.delay.ms"; however, from what I understand, using
it would require upgrading our Kafka cluster to 0.11.0.0, which is not
something we will be able to do in the near future.



*"The group.initial.rebalance.delay.ms
<http://group.initial.rebalance.delay.ms/> config specifies the amount of
time, in milliseconds, that the GroupCoordinator will delay the initial
consumer rebalance. The rebalance is further delayed by the value
of group.initial.rebalance.delay.ms
<http://group.initial.rebalance.delay.ms/> as each new member joins the
consumer group, up to a maximum of the value set by max.poll.interval.ms
<http://max.poll.interval.ms/>. The net benefit is that this can reduce the
overall startup time for Kafka Streams applications that have more than one
thread. The default value for group.initial.rebalance.delay.ms
<http://group.initial.rebalance.delay.ms/> is 3000 milliseconds.In practice
this means that if you are starting up your Kafka Streams app from a cold
start, when the first member joins the group there will be at least a 3
second delay before it is assigned any tasks. If any other members join the
group within the initial 3 seconds, then there will be a further 3 second
delay. Once no new members have joined the group within the 3 second delay,
or max.poll.interval.ms <http://max.poll.interval.ms/> is reached, then the
group rebalance can complete and all current members will be assigned
tasks. The benefit of this approach, particularly for Kafka Streams
applications, is that we can now delay the assignment and re-assignment of
potentially expensive tasks as new members join. So we can avoid the
situation where one instance is assigned all tasks, begins
restoring/processing, only to shortly after be rebalanced, and then have to
start again with half of the tasks and so on."*
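
For concreteness, my understanding is that this is purely a broker-side
property, so if/when we do upgrade the brokers it would just be the line
group.initial.rebalance.delay.ms=3000 in server.properties; sketched here
as Java properties (3000 ms being the documented default):

    Properties brokerProps = new Properties();
    // Broker-side setting; has no effect unless the brokers are on 0.11.0.0+.
    brokerProps.put("group.initial.rebalance.delay.ms", "3000");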

Similar to group.initial.rebalance.delay.ms, are there any configuration
values we can tweak to defer sticky task assignment until all instances
are running?
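
For what it's worth, the only client-side idea we have come up with so far
is not a config at all: hold off calling streams.start() until we know all
instances are up, since an instance only joins the consumer group once its
threads start polling. A rough sketch (allInstancesUp() is a hypothetical
readiness check we would have to implement ourselves, e.g. by counting
Ready pods via the k8s API):

    import java.util.Properties;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.kstream.KStreamBuilder;

    public class DeferredStart {
        public static void main(String[] args) throws InterruptedException {
            Properties props = new Properties();            // same StreamsConfig settings as above
            KStreamBuilder builder = new KStreamBuilder();  // same topology as above
            KafkaStreams streams = new KafkaStreams(builder, props);

            // Wait until every expected instance reports ready before joining the
            // group, so the sticky assignor sees all instances when it makes its
            // first assignment.
            while (!allInstancesUp()) {
                Thread.sleep(1000L);
            }

            streams.start();  // all instances join the group at roughly the same time
        }

        private static boolean allInstancesUp() {
            return true;      // placeholder for the hypothetical readiness check
        }
    }

If there is a configuration-only way to get the same effect, that would be
much preferable.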

Thanks,
Laven
