Hi James,

This looks like the known issue KAFKA-13636 <https://issues.apache.org/jira/browse/KAFKA-13636>, which should be fixed in newer versions.
Thank you.
Luke

On Mon, Apr 11, 2022 at 9:18 AM James Olsen <ja...@inaseq.com> wrote:

> I recently observed the following series of events for a particular
> partition (MyTopic-6):
>
> 2022-03-18 03:18:28,562 INFO
> [org.apache.kafka.clients.consumer.internals.ConsumerCoordinator]
> 'executor-thread-2' [Consumer clientId=consumer-MyTopicService-group-3,
> groupId=MyTopicService-group] Setting offset for partition MyTopic-6 to the
> committed offset FetchPosition{offset=438, offsetEpoch=Optional.empty,
> currentLeader=LeaderAndEpoch{leader=Optional[b-2.redacted.kafka.us-east-1.amazonaws.com:9094
> (id: 2 rack: use1-az4)], epoch=64}}
>
> -- RESTART (bring up new consumer node)
>
> 2022-04-01 15:17:47,943 INFO
> [org.apache.kafka.clients.consumer.internals.ConsumerCoordinator]
> 'executor-thread-6' [Consumer clientId=consumer-MyTopicService-group-7,
> groupId=MyTopicService-group] Setting offset for partition MyTopic-6 to the
> committed offset FetchPosition{offset=449, offsetEpoch=Optional.empty,
> currentLeader=LeaderAndEpoch{leader=Optional[b-2.redacted.kafka.us-east-1.amazonaws.com:9094
> (id: 2 rack: use1-az4)], epoch=64}}
>
> -- REBALANCE (drop old consumer node)
>
> 2022-04-01 15:18:24,414 INFO
> [org.apache.kafka.clients.consumer.internals.ConsumerCoordinator]
> 'executor-thread-2' [Consumer clientId=consumer-MyTopicService-group-3,
> groupId=MyTopicService-group] Found no committed offset for partition
> MyTopic-6
> 2022-04-01 15:18:24,474 INFO
> [org.apache.kafka.clients.consumer.internals.SubscriptionState]
> 'executor-thread-2' [Consumer clientId=consumer-MyTopicService-group-3,
> groupId=MyTopicService-group] Resetting offset for partition MyTopic-6 to
> position FetchPosition{offset=411, offsetEpoch=Optional.empty,
> currentLeader=LeaderAndEpoch{leader=Optional[b-2.redacted.kafka.us-east-1.amazonaws.com:9094
> (id: 2 rack: use1-az4)], epoch=64}}.
>
> Seems odd that no offsets were found at 2022-04-01 15:18:24,414 when they
> were clearly present 36 seconds earlier at 2022-04-01 15:17:47,943.
>
> This resulted in message replay from offset 411-449. This was in a test
> system only and we have duplicate detection in place but I'd still like to
> avoid similar occurrences in production if we can.
>
> There has clearly been a low volume of traffic but there have been active
> consumers all the time. We have log.retention.ms=1814400000
> (3 weeks) which I believe explains why it resumed from 411 as messages
> prior to that will have been deleted.
>
> There may not have been any new traffic in the last 7 days (we have the
> default offset retention) so I'm wondering if there is a chance the offsets
> were deleted during the rebalance when I presume there's a brief moment
> when there is no active consumer. My understanding is that they shouldn't
> be deleted until there has been no consumer for 7 days (
> https://kafka.apache.org/27/documentation.html#brokerconfigs_offsets.retention.minutes
> - not using static assignment). Is it possible the logic is actually
> checking for no consumer now and no offsets for 7 days instead?
>
> Server and Client are 2.7.2. Sorry I don't have any more detailed
> server-side logs.
>
> Regards, James.
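
For reference, the figures quoted above can be sanity-checked with a few lines of Python. This is only a sketch of the arithmetic, not anything Kafka runs: the variable names are made up for illustration, and the 10080-minute value is an assumption standing in for the "default offset retention" of 7 days mentioned in the message.

```python
from datetime import datetime, timedelta

# Value quoted in the thread for log.retention.ms.
LOG_RETENTION_MS = 1_814_400_000
# Assumption: offsets.retention.minutes left at a 7-day default (10080 minutes).
OFFSETS_RETENTION_MINUTES = 10_080

log_retention = timedelta(milliseconds=LOG_RETENTION_MS)
offsets_retention = timedelta(minutes=OFFSETS_RETENTION_MINUTES)

# Timestamps of the "Setting offset ... committed offset" and
# "Found no committed offset" log entries on 2022-04-01.
fmt = "%Y-%m-%d %H:%M:%S,%f"
found = datetime.strptime("2022-04-01 15:17:47,943", fmt)
missing = datetime.strptime("2022-04-01 15:18:24,414", fmt)
gap = missing - found

print(log_retention.days)      # 21 (3 weeks)
print(offsets_retention.days)  # 7
print(gap.total_seconds())     # 36.471
```

So the log retention is indeed 3 weeks, and the committed offset went from present to missing in roughly 36 seconds, far less than the 7-day offset retention window.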