[
https://issues.apache.org/jira/browse/KAFKA-20035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18060729#comment-18060729
]
David Jacot commented on KAFKA-20035:
-------------------------------------
`by_duration` is definitely not perfect either, but it may be a good short-term
mitigation until we figure out how to address it. If you set it to, say, 10
minutes, it will catch new partitions without reprocessing all the records.
The part that is not clear to me is why some would accept using
`auto.offset.reset=latest`, knowing that it means losing records on retention
or truncation, while worrying about losing records for new partitions. It
seems contradictory to me. Is it because they are worried about having to
reprocess everything?
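
For reference, a minimal sketch of what the 10-minute setting could look like
on the consumer side. It assumes the `by_duration:<ISO-8601 duration>` value
syntax as I understand it from KIP-1106, and a client version that supports
it. The class name, broker address and group id are hypothetical placeholders.
The sketch only builds the configuration and does not connect to a broker.

```java
import java.time.Duration;
import java.util.Properties;

public class ByDurationResetSketch {

    // Builds the auto.offset.reset value in the assumed "by_duration:<ISO-8601>" form.
    // Duration.toString() yields an ISO-8601 string such as PT10M.
    static String byDuration(Duration lookback) {
        return "by_duration:" + lookback;
    }

    // Assembles consumer properties; the broker address and group id are placeholders.
    static Properties consumerProps(Duration lookback) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumption
        props.setProperty("group.id", "example-group");           // hypothetical
        props.setProperty("auto.offset.reset", byDuration(lookback));
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerProps(Duration.ofMinutes(10));
        // Print the value that would be handed to the consumer.
        System.out.println(props.getProperty("auto.offset.reset"));
    }
}
```

With a 10-minute window, a partition that has no committed offset would be
reset to the offset matching "now minus 10 minutes" rather than to the log
end, so records written shortly before the assignment are still consumed.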
> Prevent data loss during partition expansion by enforcing "earliest" offset
> reset for dynamically added partitions
> ------------------------------------------------------------------------------------------------------------------
>
> Key: KAFKA-20035
> URL: https://issues.apache.org/jira/browse/KAFKA-20035
> Project: Kafka
> Issue Type: Bug
> Components: clients, consumer, core, group-coordinator
> Reporter: Chia-Ping Tsai
> Assignee: Ken Huang
> Priority: Critical
> Labels: kip
>
> Currently, when a consumer group is configured with {{auto.offset.reset =
> latest}}, dynamically adding new partitions to a subscribed topic can lead
> to data loss due to a race condition.
> The scenario is as follows:
> # A group subscribes to a topic with {{auto.offset.reset = latest}}.
> # The topic is expanded (e.g., from 3 to 4 partitions).
> # Producers immediately start writing data to the new partition (Partition
> 3).
> # The Group Coordinator detects the change and assigns Partition 3 to a
> member.
> # The member initializes the partition. Since there is no committed offset,
> it applies the
> # *Result: Any messages written to Partition 3 between step 3 and step 5 are
> skipped and lost.*
> From a user's perspective, {{latest}} should mean "start consuming from the
> point of subscription," not "skip data from newly created infrastructure."
> KIP-1282:
> [https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=406619800]
--
This message was sent by Atlassian Jira
(v8.20.10#820010)