[
https://issues.apache.org/jira/browse/KAFKA-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17165892#comment-17165892
]
Jason Gustafson edited comment on KAFKA-10284 at 7/27/20, 6:13 PM:
-------------------------------------------------------------------
[~bchen225242] Hmm.. Not totally sure I buy this:
> A static member X joins the group and updates member.id to M1, then gets stuck
> Another static member Y with the same instance.id joins and updates member.id
> to M2, while starts working and commit offsets
> The group coordinator migrates, and the member.id for the same static member
> rewinds to M1
> The static member X goes back online, and validated. It would try to fetch
> from Y's committed offset
Why would member X fetch Y's committed offset? If it doesn't know it had been
fenced temporarily, it might just commit its latest offsets. This does seem
like a correctness problem to me.
was (Author: hachikuji):
Hmm.. Not totally sure I buy this:
> A static member X joins the group and updates member.id to M1, then gets stuck
> Another static member Y with the same instance.id joins and updates member.id
> to M2, while starts working and commit offsets
> The group coordinator migrates, and the member.id for the same static member
> rewinds to M1
> The static member X goes back online, and validated. It would try to fetch
> from Y's committed offset
Why would member X fetch Y's committed offset? If it doesn't know it had been
fenced temporarily, it might just commit its latest offsets. This does seem
like a correctness problem to me.
> Group membership update due to static member rejoin should be persisted
> -----------------------------------------------------------------------
>
> Key: KAFKA-10284
> URL: https://issues.apache.org/jira/browse/KAFKA-10284
> Project: Kafka
> Issue Type: Bug
> Components: consumer
> Affects Versions: 2.3.0, 2.4.0, 2.5.0, 2.6.0
> Reporter: Boyang Chen
> Assignee: Boyang Chen
> Priority: Major
> Fix For: 2.6.1
>
>
> For known static members rejoin, we would update its corresponding member.id
> without triggering a new rebalance. This serves the purpose for avoiding
> unnecessary rebalance for static membership, as well as fencing purpose if
> some still uses the old member.id.
> The bug is that we don't actually persist the membership update, so if no
> upcoming rebalance gets triggered, this new member.id information will get
> lost during group coordinator immigration, thus bringing up the zombie member
> identity.
> The bug find credit goes to [~hachikuji]
--
This message was sent by Atlassian Jira
(v8.3.4#803005)