Rohit Bobade created KAFKA-15520:
------------------------------------

             Summary: Kafka Streams Stateful Aggregation Rebalancing causing 
processing to pause on all partitions
                 Key: KAFKA-15520
                 URL: https://issues.apache.org/jira/browse/KAFKA-15520
             Project: Kafka
          Issue Type: Bug
          Components: streams
    Affects Versions: 2.6.2
            Reporter: Rohit Bobade


Kafka broker version: 2.8.0
Kafka Streams client version: 2.6.2

I am running Kafka Streams stateful aggregations on a Kubernetes StatefulSet 
with a persistent volume attached to each pod. I have also set

props.put(ConsumerConfig.GROUP_INSTANCE_ID_CONFIG, podName);

which enables static membership, so that each pod gets a sticky partition 
assignment.

I enabled one standby replica with

props.put(StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG, 1);

and set

props.put(StreamsConfig.ACCEPTABLE_RECOVERY_LAG_CONFIG, "0");
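
For reference, the three settings above can be collected in one place. This is 
only a minimal sketch: it uses the literal config keys behind the constants 
(group.instance.id, num.standby.replicas, acceptable.recovery.lag) so that it 
compiles without Kafka on the classpath. The class name StreamsPropsSketch, 
the build method and the pod name "my-app-0" are hypothetical.

```java
import java.util.Properties;

public class StreamsPropsSketch {

    // Builds the settings described above; podName is supplied by the caller
    // (for example from the pod's hostname).
    static Properties build(String podName) {
        Properties props = new Properties();
        // ConsumerConfig.GROUP_INSTANCE_ID_CONFIG: static membership per pod
        props.put("group.instance.id", podName);
        // StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG: one standby per state store
        props.put("num.standby.replicas", 1);
        // StreamsConfig.ACCEPTABLE_RECOVERY_LAG_CONFIG: value used in this report
        props.put("acceptable.recovery.lag", "0");
        return props;
    }

    public static void main(String[] args) {
        // "my-app-0" is a placeholder pod name, not taken from the real deployment
        build("my-app-0").forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```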

However, when pods restart, a rebalance is triggered and processing pauses on 
all pods until the rebalance and state restoration have completed.

My understanding is that, even when a rebalance occurs, only the partitions 
that need to move are reassigned and restored cooperatively, without pausing 
processing on the others. Also, in this case the task should fail over to the 
standby replica, avoiding state restoration on other pods.

I have increased the session timeout to 480 seconds and the max poll interval 
to 15 minutes to minimize rebalances.
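
Expressed as consumer settings, those two timeouts would look roughly like the 
sketch below. It assumes the literal keys session.timeout.ms and 
max.poll.interval.ms, both of which take integer millisecond values, so the 
durations are converted explicitly. The class name TimeoutPropsSketch and the 
build method are hypothetical.

```java
import java.time.Duration;
import java.util.Properties;

public class TimeoutPropsSketch {

    // Converts the timeouts quoted above to the integer milliseconds
    // that the consumer settings expect.
    static Properties build() {
        Properties props = new Properties();
        // 480 seconds, as an int number of milliseconds
        props.put("session.timeout.ms",
                Math.toIntExact(Duration.ofSeconds(480).toMillis()));
        // 15 minutes, as an int number of milliseconds
        props.put("max.poll.interval.ms",
                Math.toIntExact(Duration.ofMinutes(15).toMillis()));
        return props;
    }

    public static void main(String[] args) {
        build().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```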

I also added

props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG, 
    CooperativeStickyAssignor.class.getName());

to enable the CooperativeStickyAssignor.




Could someone please point out whether I am missing something?

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
