[ https://issues.apache.org/jira/browse/KAFKA-905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian updated KAFKA-905:
-------------------------------------

    Attachment: KAFKA-905-v2.patch

- made the logging changes
- added the missing file
                
> Logs can have same offsets causing recovery failure
> ---------------------------------------------------
>
>                 Key: KAFKA-905
>                 URL: https://issues.apache.org/jira/browse/KAFKA-905
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Sriram Subramanian
>            Assignee: Sriram Subramanian
>             Fix For: 0.8
>
>         Attachments: KAFKA-905.patch, KAFKA-905.rtf, KAFKA-905-v2.patch
>
>
> Consider the following scenario:
> Leader (L)           Follower (F)
> 1  m1,m2             1  m1,m2
> 3  m3,m4             3  m3,m4
> 5  m5,m6             5  m5,m6
> HW = 6               HW = 4
> The follower goes down and comes back up, and truncates its log to its HW:
> Leader (L)           Follower (F)
> 1  m1,m2             1  m1,m2
> 3  m3,m4             3  m3,m4
> 5  m5,m6
> HW = 6               HW = 4
> Before the follower catches up with the leader, the leader goes down and the
> follower becomes the new leader. It then receives new messages:
> Follower (F)         Leader (L)
> 1  m1,m2             1  m1,m2
> 3  m3,m4             3  m3,m4
> 5  m5,m6             10 m5-m10
> HW = 6               HW = 4
> The follower fetches from offset 7. Since offset 7 falls inside the compressed
> message at offset 10 on the leader, the whole compressed chunk is sent to the
> follower:
> Follower (F)         Leader (L)
> 1  m1,m2             1  m1,m2
> 3  m3,m4             3  m3,m4
> 5  m5,m6             10 m5-m10
> 10 m5-m10
> HW = 4               HW = 10
> The follower's log now contains duplicate offsets: offsets 5 and 6 appear both
> as standalone messages and inside the compressed chunk at offset 10. On
> recovery, re-indexing will fail because of the repeated offsets.
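>
> To make the failure concrete, here is a minimal Scala sketch (illustrative
> only, not the actual Kafka recovery code) of why re-indexing breaks: index
> rebuilding requires strictly increasing offsets, and the duplicated 5,6 range
> above violates that invariant.
> {code:scala}
> object RecoverySketch {
>   // Re-indexing walks every message offset and requires them to be
>   // strictly increasing; a repeated offset means the log is corrupt.
>   def rebuildIndex(offsets: Seq[Long]): Unit = {
>     var prev = -1L
>     for (offset <- offsets) {
>       if (offset <= prev)
>         throw new IllegalStateException(
>           s"Corrupt log: offset $offset is not larger than previous offset $prev")
>       prev = offset
>     }
>   }
>
>   def main(args: Array[String]): Unit = {
>     // Deep-iterated follower log from the scenario above: m1-m6 at
>     // offsets 1..6, then the fetched chunk m5-m10 at offsets 5..10.
>     rebuildIndex(Seq(1, 2, 3, 4, 5, 6, 5, 6, 7, 8, 9, 10)) // throws at the second 5
>   }
> }
> {code}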
> Possible ways to fix this:
> 1. The fetcher thread can do deep iteration instead of shallow iteration and
> drop the messages whose offsets are at or below the follower's log end offset.
> This would, however, incur a performance hit.
> 2. To optimize option 1, we could do the deep iteration only until the logical
> offset of the fetched message set exceeds the log end offset of the follower's
> log, and then switch back to shallow iteration (see the sketch after this list).
> 3. On recovery, we simply truncate the active segment and refetch the data.
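>
> A rough Scala sketch of options 1 and 2 (names and types here are
> illustrative, not Kafka's actual fetcher API): deep-iterate the fetched
> message set and drop any message at or below the follower's log end offset
> before appending.
> {code:scala}
> case class Message(offset: Long, payload: String)
>
> object FetcherSketch {
>   // logEndOffset: the last offset already present in the follower's log.
>   // Deep iteration keeps only offsets the follower does not yet have; once
>   // the offsets exceed logEndOffset, shallow appends are safe again.
>   def filterFetched(fetched: Seq[Message], logEndOffset: Long): Seq[Message] =
>     fetched.filter(_.offset > logEndOffset)
>
>   def main(args: Array[String]): Unit = {
>     // The compressed chunk m5-m10 arrives with offsets 5..10, but the
>     // follower's log already ends at offset 6.
>     val chunk = (5L to 10L).map(o => Message(o, s"m$o"))
>     println(filterFetched(chunk, logEndOffset = 6).map(_.offset)) // Vector(7, 8, 9, 10)
>   }
> }
> {code}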
> All three of the above approaches are hacky. The right fix is to ensure we
> never corrupt the logs: we can tolerate some data loss, but we must never
> compromise consistency. For 0.8, option 3 would be the easiest and simplest
> fix.
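>
> For completeness, a hypothetical sketch of option 3 (types are illustrative):
> on unclean recovery, drop the active segment entirely instead of re-indexing
> it, and let the replica fetcher refill from the new log end offset. We may
> lose locally appended data, but we never expose overlapping offsets.
> {code:scala}
> case class Segment(baseOffset: Long, lastOffset: Long)
>
> object TruncateSketch {
>   // Keep all sealed segments; discard the active (last) one. Returns the
>   // surviving segments and the offset from which to refetch.
>   def recover(segments: List[Segment]): (List[Segment], Long) = {
>     val kept = segments.dropRight(1)
>     val refetchFrom = kept.lastOption.map(_.lastOffset + 1).getOrElse(0L)
>     (kept, refetchFrom)
>   }
>
>   def main(args: Array[String]): Unit = {
>     val log = List(Segment(1, 4), Segment(5, 10)) // active segment holds 5..10
>     val (kept, from) = recover(log)
>     println(s"kept=$kept, refetch from offset $from")
>   }
> }
> {code}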
