[ https://issues.apache.org/jira/browse/KAFKA-905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sriram Subramanian updated KAFKA-905:
-------------------------------------
    Attachment: KAFKA-905-v2.patch

- made the logging changes
- added the missing file

> Logs can have same offsets causing recovery failure
> ---------------------------------------------------
>
>                 Key: KAFKA-905
>                 URL: https://issues.apache.org/jira/browse/KAFKA-905
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Sriram Subramanian
>            Assignee: Sriram Subramanian
>             Fix For: 0.8
>
>         Attachments: KAFKA-905.patch, KAFKA-905.rtf, KAFKA-905-v2.patch
>
>
> Consider the following scenario -
> L                 F
> 1  m1,m2          1  m1,m2
> 3  m3,m4          3  m3,m4
> 5  m5,m6          5  m5,m6
> HW = 6            HW = 4
> Follower goes down and comes back up. Truncates its log to HW
> L                 F
> 1  m1,m2          1  m1,m2
> 3  m3,m4          3  m3,m4
> 5  m5,m6
> HW = 6            HW = 4
> Before follower catches up with the leader, leader goes down and follower
> becomes the leader. It then gets new messages
> F                 L
> 1  m1,m2          1  m1,m2
> 3  m3,m4          3  m3,m4
> 5  m5,m6          10 m5-m10
> HW=6              HW=4
> The follower fetches from offset 7. Since offset 7 is within the compressed
> message 10 in the leader, the whole message chunk is sent to the follower
> F                 L
> 1  m1,m2          1  m1,m2
> 3  m3,m4          3  m3,m4
> 5  m5,m6          10 m5-m10
> 10 m5-m10
> HW=4              HW=10
> The follower logs now contain the same offsets. On recovery, re-indexing will
> fail due to repeated offsets.
> Possible ways to fix this -
> 1. The fetcher thread can do deep iteration instead of shallow iteration and
> drop the offsets that are less than the log end offset. This would, however,
> incur a performance hit.
> 2. To optimize step 1, we could do the deep iteration till the logical offset
> of the fetched message set is greater than the log end offset of the follower
> log and then switch to shallow iteration.
> 3. On recovery we just truncate the active segment and refetch the data.
> All the above 3 steps are hacky. The right fix is to ensure we never corrupt
> the logs. We can incur data loss but should not compromise consistency.
> For 0.8, the easiest and simplest fix would be 3.
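
The duplicate-offset scenario and fix option 1 from the description can be sketched in a few lines. This is only a toy model, not Kafka code: `Batch`, `shallow_append`, `deep_append` and the other names are hypothetical, a "batch" stands in for a compressed message set, and the offsets (1-2, 3-4, 5-6 on the follower, 5-10 in the leader's compressed set) are an assumed reading of the tables above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Batch:
    """Hypothetical stand-in for a compressed message set: (offset, payload) pairs."""
    messages: Tuple[Tuple[int, str], ...]

    @property
    def last_offset(self) -> int:
        return self.messages[-1][0]


def log_end_offset(log: List[Batch]) -> int:
    """Next offset to be written: one past the last stored message (0 if empty)."""
    return log[-1].last_offset + 1 if log else 0


def shallow_append(log: List[Batch], fetched: List[Batch]) -> List[Batch]:
    """Append fetched batches whole, without looking inside them."""
    return log + fetched


def deep_append(log: List[Batch], fetched: List[Batch]) -> List[Batch]:
    """Option 1: iterate inside each batch and drop offsets below the log end offset."""
    result = list(log)
    for batch in fetched:
        leo = log_end_offset(result)
        kept = tuple(m for m in batch.messages if m[0] >= leo)
        if kept:
            result.append(Batch(kept))
    return result


def has_repeated_offsets(log: List[Batch]) -> bool:
    """True if any message offset occurs more than once in the log."""
    offsets = [off for batch in log for off, _ in batch.messages]
    return len(offsets) != len(set(offsets))


# Follower log ends at offset 6, so it fetches from offset 7.
follower = [
    Batch(((1, "m1"), (2, "m2"))),
    Batch(((3, "m3"), (4, "m4"))),
    Batch(((5, "m5"), (6, "m6"))),
]
# The leader can only return the whole compressed set that contains offset 7.
fetched = [Batch(tuple((i, f"m{i}") for i in range(5, 11)))]

print(has_repeated_offsets(shallow_append(follower, fetched)))  # True
print(has_repeated_offsets(deep_append(follower, fetched)))     # False
```

The sketch ignores the cost the description warns about: in a real broker, looking inside each set means decompressing it, which is why option 2 limits deep iteration to the sets that overlap the follower's log end offset.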