divijvaidya commented on PR #13111: URL: https://github.com/apache/kafka/pull/13111#issuecomment-1387158089
Thanks for the comments, folks. I would like to break the conversation into multiple FAQs, which will hopefully address the questions and points raised above.

**What is the motivation? Is it to make the latest versions compatible with pre-2.8 clients (for the scope of this bug), or is it to protect the server when older clients are used?**

It's the latter. Currently, the bug manifests as availability loss for the impacted topic, since replication stops for that topic. This is recoverable by deleting the metadata file, after which the broker recreates it from Zk. However, once KIP-405 (Tiered Storage) is merged in, the bug will begin to impact data integrity, because the metadata for a segment uses the topic Id as a key. When the same segment for the same topic is uploaded with different topic Ids, it leads to an unrecoverable situation. I would be happy to discuss a different solution from the one proposed in this PR, as long as it protects the server against the above two cases.

**Does this bug impact client/server in the same major version as well?**

Yes. A <2.8 client with a >=2.8 server will hit this bug.

**Can the users migrate to the newer versions?**

That would be ideal. But practically, there are many cases where users rely on third-party libraries which haven't updated their client version. We have been observing multiple cases where customers are facing this bug. As a community, we can push the problem back to the users and request that they upgrade their software, or we can empathise with their situation and try to find a path forward that has no side effects and doesn't burden the newer clients/servers. In some cases the former is the right thing to do, but I would argue that in this particular case we have a simple and safe fix that prevents the majority of cases. Hence, we can strive to improve the experience of the users and go with the latter option.
**Why is this PR safe to merge?**

The change in this PR breaks the premise that Zk is the source of truth, since it updates Zk with a value that is stored locally in the controller. This is not ideal, but it is a safe change to make, primarily because topic Ids are immutable and the controller context is either empty or consistent with the latest state of the system. More specifically:

1. We update Zk *only when* it doesn't have a topic Id during alter partition, which is not possible (since create topic would have allocated a topic Id) unless it hits this bug. Hence, we won't encounter a scenario where we "overwrite" an existing topic Id.
2. Topic Ids are immutable. They only change for a topic when it is deleted and re-created, and in that case the controller context removes the topic Id from its local cache on deletion. Hence, the topic Id in the local cache of a controller is always the one that should correctly be associated with a particular topic.
3. `zkClient.setTopicIds()` ensures that Zk is only updated from the latest controller (by verifying the controller epoch), eliminating the possibility of a stale controller updating Zk with a stale topic Id.

**What are the alternative ways to protect the state of the server against this bug?**

1. As Colin suggested, we could potentially start storing topic Ids in a different place in Zk so that they don't get overwritten by older clients. I believe that is a more intrusive change (though more holistic, covering 100% of bug scenarios) than what I suggested above.
2. If a topic Id mismatch is detected, consider the partition a "bad partition" and perform the recovery steps listed in https://issues.apache.org/jira/browse/KAFKA-14190 manually. Stop archival to remote storage as soon as a topic Id mismatch is detected. We should probably make this change in addition to the change in this PR.

Any other suggestions?

-- This is an automated message from the Apache Git Service.
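To make the safety argument concrete, here is a minimal Python model of the two guards described above: write the cached topic Id only when Zk has none, and reject writes from a stale controller epoch. This is an illustration only, not actual Kafka code; the names (`ZkStore`, `maybe_repair_topic_id`) are invented, and the real implementation lives in the controller and `zkClient.setTopicIds()`.

```python
class ZkStore:
    """Hypothetical stand-in for the topic znodes in ZooKeeper."""

    def __init__(self, controller_epoch):
        self.topic_ids = {}  # topic name -> topic Id
        self.controller_epoch = controller_epoch

    def set_topic_id(self, topic, topic_id, caller_epoch):
        # Models the epoch check in zkClient.setTopicIds(): a stale
        # controller (older epoch) must not write.
        if caller_epoch < self.controller_epoch:
            raise RuntimeError("stale controller epoch, write rejected")
        self.topic_ids[topic] = topic_id


def maybe_repair_topic_id(zk, topic, cached_topic_id, controller_epoch):
    """Write the controller's cached topic Id to Zk only if Zk has none.

    Models the invariant from point 1 above: an existing topic Id in Zk
    is never overwritten. Returns True if a repair write happened.
    """
    if topic in zk.topic_ids:
        return False  # Id already present in Zk; leave it alone
    zk.set_topic_id(topic, cached_topic_id, controller_epoch)
    return True
```

For example, a second call for the same topic is a no-op (`maybe_repair_topic_id` returns `False`), and a caller with an older epoch than the store's is rejected, so a stale controller can never plant a stale Id.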
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org