hudeqi created KAFKA-12478:
------------------------------

             Summary: Consumer group may lose data in newly added 
partitions after a topic is expanded if the group is configured to consume 
from the latest offset
                 Key: KAFKA-12478
                 URL: https://issues.apache.org/jira/browse/KAFKA-12478
             Project: Kafka
          Issue Type: Improvement
          Components: clients
    Affects Versions: 2.7.0
            Reporter: hudeqi


  This problem surfaced in our production environment, where a topic is used 
to carry monitoring data. *After the topic's partitions were expanded, the 
consuming side of the business reported that data had been lost.* 

  A preliminary investigation showed that all of the lost data was in the 
newly added partitions. The cause is as follows: when the topic is expanded on 
the server, the producer perceives the new partitions first and writes some 
data to them. The consumer group perceives the expansion later. Once the 
resulting rebalance completes, the group has no committed offset for the new 
partitions, so if it is set to consume from the latest (auto.offset.reset = 
latest), it starts reading them at the log end. Everything written to the new 
partitions before that point is skipped and is therefore lost to the consumer.
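
  The sequence above can be illustrated with a small stand-alone model. This 
is only a sketch that uses no Kafka classes: PartitionLog, initialPosition and 
skippedRecords are hypothetical names made up for the illustration, and the 
log is assumed to start at offset 0 with nothing deleted by retention.

```java
import java.util.ArrayList;
import java.util.List;

public class LatestResetLossDemo {
    /** Hypothetical model of one partition: an append-only list, index == offset. */
    static final class PartitionLog {
        private final List<String> records = new ArrayList<>();

        long append(String value) {
            records.add(value);
            return records.size() - 1; // offset of the record just written
        }

        long logStartOffset() { return 0L; }             // assumption: no retention cleanup
        long logEndOffset()   { return records.size(); } // offset of the next record
    }

    /** Starting position of a group that has no committed offset for the partition. */
    static long initialPosition(PartitionLog log, String resetPolicy) {
        return "earliest".equals(resetPolicy) ? log.logStartOffset() : log.logEndOffset();
    }

    /** Records located before the starting position are never delivered. */
    static long skippedRecords(PartitionLog log, long initialPosition) {
        return initialPosition - log.logStartOffset();
    }

    public static void main(String[] args) {
        PartitionLog newPartition = new PartitionLog();

        // 1. The producer notices the new partition first and writes to it.
        for (int i = 0; i < 3; i++) {
            newPartition.append("metric-" + i);
        }

        // 2. The consumer group notices the partition later, after the rebalance.
        long latest = initialPosition(newPartition, "latest");
        long earliest = initialPosition(newPartition, "earliest");

        System.out.println("latest skips " + skippedRecords(newPartition, latest));
        System.out.println("earliest skips " + skippedRecords(newPartition, earliest));
    }
}
```

  In this model the three records written in step 1 are exactly the ones a 
"latest" group never sees, while an "earliest" group sees all of them.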

  Setting the group to consume from the earliest is not a practical 
workaround for a topic with a huge data flow: when the group first starts up, 
it would aggressively consume all of the historical data from the brokers, 
which affects broker performance to a certain extent.
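
  The trade-off can be made concrete with a per-partition decision. The 
following is a hypothetical sketch only: it is not an existing Kafka API and 
not a fix proposed in this report. The class, the method startOffset and the 
flag addedAfterGroupStart are assumptions made for the illustration, and 
tracking which partitions appeared after the group started is left out.

```java
public class NewPartitionStartPolicy {
    /** Sentinel meaning "the group has no committed offset for this partition". */
    static final long NO_COMMITTED_OFFSET = -1L;

    /**
     * Chooses where to start reading one partition.
     * - A committed offset always wins.
     * - A partition added after the group started is read from the log start,
     *   so data written before the rebalance is not skipped.
     * - Any other partition is read from the log end, so the first start of
     *   the group does not replay the whole history.
     */
    static long startOffset(long committedOffset,
                            boolean addedAfterGroupStart,
                            long logStartOffset,
                            long logEndOffset) {
        if (committedOffset != NO_COMMITTED_OFFSET) {
            return committedOffset;
        }
        return addedAfterGroupStart ? logStartOffset : logEndOffset;
    }

    public static void main(String[] args) {
        // Existing partition with a large backlog, first start of the group.
        System.out.println(startOffset(NO_COMMITTED_OFFSET, false, 0L, 1_000_000L));
        // Partition added while the group was running; 3 records already written.
        System.out.println(startOffset(NO_COMMITTED_OFFSET, true, 0L, 3L));
        // Partition with a committed offset: resume from it.
        System.out.println(startOffset(42L, true, 0L, 100L));
    }
}
```

  With a single group-wide setting, only one of the first two cases can be 
handled well, which is the dilemma described above.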



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
