Hi James,

This looks like the known issue KAFKA-13636
<https://issues.apache.org/jira/browse/KAFKA-13636>, which should be fixed
in newer versions.

Thank you.
Luke

On Mon, Apr 11, 2022 at 9:18 AM James Olsen <ja...@inaseq.com> wrote:

> I recently observed the following series of events for a particular
> partition (MyTopic-6):
>
> 2022-03-18 03:18:28,562 INFO
> [org.apache.kafka.clients.consumer.internals.ConsumerCoordinator]
> 'executor-thread-2' [Consumer clientId=consumer-MyTopicService-group-3,
> groupId=MyTopicService-group] Setting offset for partition MyTopic-6 to the
> committed offset FetchPosition{offset=438, offsetEpoch=Optional.empty,
> currentLeader=LeaderAndEpoch{leader=Optional[b-2.redacted.kafka.us-east-1.amazonaws.com:9094
> (id: 2 rack: use1-az4)], epoch=64}}
>
> -- RESTART (bring up new consumer node)
>
> 2022-04-01 15:17:47,943 INFO
> [org.apache.kafka.clients.consumer.internals.ConsumerCoordinator]
> 'executor-thread-6' [Consumer clientId=consumer-MyTopicService-group-7,
> groupId=MyTopicService-group] Setting offset for partition MyTopic-6 to the
> committed offset FetchPosition{offset=449, offsetEpoch=Optional.empty,
> currentLeader=LeaderAndEpoch{leader=Optional[b-2.redacted.kafka.us-east-1.amazonaws.com:9094
> (id: 2 rack: use1-az4)], epoch=64}}
>
> -- REBALANCE (drop old consumer node)
>
> 2022-04-01 15:18:24,414 INFO
> [org.apache.kafka.clients.consumer.internals.ConsumerCoordinator]
> 'executor-thread-2' [Consumer clientId=consumer-MyTopicService-group-3,
> groupId=MyTopicService-group] Found no committed offset for partition
> MyTopic-6
> 2022-04-01 15:18:24,474 INFO
> [org.apache.kafka.clients.consumer.internals.SubscriptionState]
> 'executor-thread-2' [Consumer clientId=consumer-MyTopicService-group-3,
> groupId=MyTopicService-group] Resetting offset for partition MyTopic-6 to
> position FetchPosition{offset=411, offsetEpoch=Optional.empty,
> currentLeader=LeaderAndEpoch{leader=Optional[b-2.redacted.kafka.us-east-1.amazonaws.com:9094
> (id: 2 rack: use1-az4)], epoch=64}}.
>
> Seems odd that no offsets were found at 2022-04-01 15:18:24,414 when they
> were clearly present 36 seconds earlier at 2022-04-01 15:17:47,943.
>
> This resulted in message replay from offset 411-449.  This was in a test
> system only and we have duplicate detection in place but I'd still like to
> avoid similar occurrences in production if we can.
>
> There has clearly been a low volume of traffic but there have been active
> consumers all the time.  We have log.retention.ms=1814400000
> (3 weeks) which I believe explains why it resumed from 411 as messages
> prior to that will have been deleted.
>
> There may not have been any new traffic in the last 7 days (we have the
> default offset retention) so I'm wondering if there is a chance the offsets
> were deleted during the rebalance when I presume there's a brief moment
> when there is no active consumer.  My understanding is that they shouldn't
> be deleted until there has been no consumer for 7 days (
> https://kafka.apache.org/27/documentation.html#brokerconfigs_offsets.retention.minutes
> - not using static assignment).  Is it possible the logic is actually
> checking for no consumer now and no offsets for 7 days instead?
>
> Server and Client are 2.7.2.  Sorry I don't have any more detailed
> server-side logs.
>
> Regards, James.
>
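
For reference, the figures quoted above can be sanity-checked with a few
lines of Python. This is only a rough sketch of the arithmetic, not anything
taken from Kafka itself: the 10080-minute value is assumed to be the default
offsets.retention.minutes behind the "7 days" mentioned above, and the replay
count assumes the committed offset 449 is the next offset to read, so offsets
411 through 448 were re-delivered.

```python
from datetime import datetime, timedelta

# log.retention.ms as quoted in the message
log_retention = timedelta(milliseconds=1_814_400_000)

# Assumption: default offsets.retention.minutes (the "7 days" in the message)
offsets_retention = timedelta(minutes=10_080)

# Timestamps copied from the two log lines around the rebalance
fmt = "%Y-%m-%d %H:%M:%S,%f"
offsets_found = datetime.strptime("2022-04-01 15:17:47,943", fmt)
offsets_missing = datetime.strptime("2022-04-01 15:18:24,414", fmt)
gap = offsets_missing - offsets_found

# Assumption: committed offset 449 is the next offset to consume,
# so the replayed range is 411..448 inclusive.
replayed = 449 - 411

print("log retention (days):", log_retention.days)
print("offsets retention (days):", offsets_retention.days)
print("gap between log lines (s):", gap.total_seconds())
print("replayed records:", replayed)
```

The point of the sketch is only that log retention (3 weeks) is much longer
than the offsets retention window, so an idle partition can still hold
records well after its committed offset has become eligible for expiry.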
