Chia-Ping Tsai created KAFKA-20035:
--------------------------------------

             Summary: Prevent data loss during partition expansion by enforcing 
"earliest" offset reset for dynamically added partitions
                 Key: KAFKA-20035
                 URL: https://issues.apache.org/jira/browse/KAFKA-20035
             Project: Kafka
          Issue Type: Bug
            Reporter: Chia-Ping Tsai
            Assignee: Chia-Ping Tsai


Currently, when a consumer group is configured with {{auto.offset.reset = 
latest}}, dynamically adding new partitions to a subscribed topic can lead to 
data loss due to a race condition.

The scenario is as follows:
 # A group subscribes to a topic with {{auto.offset.reset = latest}}.

 # The topic is expanded (e.g., from 3 to 4 partitions).

 # Producers immediately start writing data to the new partition (Partition 3).

 # The Group Coordinator detects the change and assigns Partition 3 to a member.

 # The member initializes the partition. Since there is no committed offset, it 
applies the {{latest}} policy.

 # *Result:* Any messages written to Partition 3 between step 3 and step 5 are 
skipped and lost.
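
The steps above can be replayed with a small in-memory model. This is only a sketch under stated assumptions: {{PartitionLog}}, {{initial_position}} and {{replay_scenario}} are hypothetical names invented for illustration, not Kafka client or coordinator APIs, and "earliest" is modeled as offset 0, whereas a real broker resolves it to the log start offset.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PartitionLog:
    """Toy stand-in for one partition's log; the list index is the offset."""
    records: List[str] = field(default_factory=list)

    def append(self, value: str) -> int:
        """Append a record and return the offset it was written at."""
        self.records.append(value)
        return len(self.records) - 1

    @property
    def log_end_offset(self) -> int:
        """Offset that the next appended record will receive."""
        return len(self.records)


def initial_position(log: PartitionLog, committed: Optional[int], reset_policy: str) -> int:
    """Pick the first offset to fetch for a newly assigned partition."""
    if committed is not None:
        return committed           # a committed offset always wins
    if reset_policy == "earliest":
        return 0                   # assumption: the log starts at offset 0
    if reset_policy == "latest":
        return log.log_end_offset  # skips everything already in the log
    raise ValueError(f"unknown reset policy: {reset_policy}")


def replay_scenario(reset_policy: str) -> List[str]:
    """Replay steps 2-5 and return the records the member ends up consuming."""
    new_partition = PartitionLog()      # step 2: the topic gains Partition 3
    for i in range(3):                  # step 3: producers write immediately
        new_partition.append(f"m{i}")
    # Steps 4-5: the member is assigned the partition; no committed offset exists.
    position = initial_position(new_partition, None, reset_policy)
    new_partition.append("m3")          # a record written after initialization
    return new_partition.records[position:]


print(replay_scenario("latest"))
print(replay_scenario("earliest"))
```

In this model the {{latest}} run consumes only the record written after initialization, while the {{earliest}} run also sees the three records produced between step 3 and step 5.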

From a user's perspective, {{latest}} should mean "start consuming from the 
point of subscription," not "skip data from newly created infrastructure."
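
One way to read the rule in the summary is as a per-partition override of the configured policy. The sketch below is an assumption, not the proposed implementation: {{effective_reset_policy}} and the {{added_after_subscription}} flag are hypothetical, and this issue does not say how a consumer or the Group Coordinator would learn that a partition was added after the group subscribed.

```python
def effective_reset_policy(configured: str, added_after_subscription: bool) -> str:
    """Return the reset policy to apply when a partition has no committed offset.

    Hypothetical rule: a partition that appeared after the group subscribed is
    read from the beginning, so records written before the member initialized
    it are not skipped. Every other case keeps the configured policy.
    """
    if configured == "latest" and added_after_subscription:
        return "earliest"
    return configured


# Partitions that existed at subscription time keep the configured behavior.
print(effective_reset_policy("latest", added_after_subscription=False))
# A dynamically added partition is read from its first record.
print(effective_reset_policy("latest", added_after_subscription=True))
```

Under this reading, {{latest}} still skips the pre-existing backlog on partitions that were present when the group subscribed; only partitions added afterwards are read from their beginning.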



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
