[ 
https://issues.apache.org/jira/browse/KAFKA-905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671553#comment-13671553
 ] 

Jun Rao commented on KAFKA-905:
-------------------------------

Thanks for the patch. Looks good. Some minor comments:

1. Log: 
1.1 logSegments.get(logSegments.size - 1) is called twice when handling 
InvalidOffsetException. Could we just call it once and reuse the result?
1.2 The info logging should probably be at warning level. Also, it would be 
useful to log the dir name so that we know the topic/partition.

2. OffsetIndex: It would be useful to include the full file name in the message 
of the exception so that we know the topic/partition.

3. The patch doesn't compile since it's missing the new file 
InvalidOffsetException.
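
To make comments 1 and 2 concrete, here is a minimal Python sketch (not the Kafka
Scala code and not the patch itself). All names in it (InvalidOffsetError,
OffsetIndex, Segment, recover_active_segment) are hypothetical stand-ins. It
assumes recovery re-indexes the last segment and truncates it when a repeated
offset is found; the last segment is looked up once, the message is logged at
warning level with the dir name, and the exception text carries the index file
name.

```python
import logging
from dataclasses import dataclass, field
from typing import List, Tuple

log = logging.getLogger("recovery-sketch")


class InvalidOffsetError(Exception):
    """Hypothetical stand-in for the exception class the patch adds."""


@dataclass
class OffsetIndex:
    file_name: str  # full path, which identifies the topic/partition
    entries: List[Tuple[int, int]] = field(default_factory=list)

    def append(self, offset: int, position: int) -> None:
        # Offsets in an index must be strictly increasing.
        if self.entries and offset <= self.entries[-1][0]:
            raise InvalidOffsetError(
                f"Cannot append offset {offset} to index {self.file_name}: "
                f"last indexed offset is {self.entries[-1][0]}"
            )
        self.entries.append((offset, position))


@dataclass
class Segment:
    index: OffsetIndex
    offsets: List[int]  # offsets of the messages stored in this segment

    def rebuild_index(self) -> None:
        self.index.entries.clear()
        for position, offset in enumerate(self.offsets):
            self.index.append(offset, position)

    def truncate(self) -> None:
        self.offsets.clear()
        self.index.entries.clear()


def recover_active_segment(dir_name: str, segments: List[Segment]) -> bool:
    """Re-index the last segment; on a repeated offset, truncate it.

    Returns True if the segment was clean, False if it had to be truncated.
    """
    active = segments[-1]  # looked up once and reused below
    try:
        active.rebuild_index()
        return True
    except InvalidOffsetError as e:
        log.warning("Invalid offset in log %s (%s); truncating segment %s",
                    dir_name, e, active.index.file_name)
        active.truncate()
        return False
```

Holding the last segment in one local variable avoids the repeated lookup, and
because the warning includes both the dir name and the index file name, the
affected topic/partition can be read straight from the log line.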
                
> Logs can have same offsets causing recovery failure
> ---------------------------------------------------
>
>                 Key: KAFKA-905
>                 URL: https://issues.apache.org/jira/browse/KAFKA-905
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Sriram Subramanian
>            Assignee: Sriram Subramanian
>             Fix For: 0.8
>
>         Attachments: KAFKA-905.patch, KAFKA-905.rtf
>
>
> Consider the following scenario - 
> L               F
> 1  m1,m2        1  m1,m2
> 3  m3,m4        3  m3,m4
> 5  m5,m6        5  m5,m6
> HW = 6          HW = 4
> The follower goes down and comes back up, then truncates its log to HW.
> L               F
> 1  m1,m2        1  m1,m2
> 3  m3,m4        3  m3,m4
> 5  m5,m6
> HW = 6          HW = 4
> Before follower catches up with the leader, leader goes down and follower 
> becomes the leader. It then gets new messages
> F               L
> 1  m1,m2        1  m1,m2
> 3  m3,m4        3  m3,m4
> 5  m5,m6        10 m5-m10
> HW = 6          HW = 4
> The follower fetches from offset 7. Since offset 7 falls within the compressed 
> message at offset 10 on the leader, the whole message chunk is sent to the 
> follower.
> F               L
> 1  m1,m2        1  m1,m2
> 3  m3,m4        3  m3,m4
> 5  m5,m6        10 m5-m10
> 10 m5-m10
> HW = 4          HW = 10
> The follower's log now contains repeated offsets. On recovery, re-indexing 
> will fail because of the repeated offsets.
> Possible ways to fix this - 
> 1. The fetcher thread can do deep iteration instead of shallow iteration and 
> drop the offsets that are less than the log end offset. This would, however, 
> incur a performance hit.
> 2. To optimize option 1, we could do the deep iteration until the logical 
> offset of the fetched message set is greater than the log end offset of the 
> follower log and then switch to shallow iteration.
> 3. On recovery, we just truncate the active segment and refetch the data.
> All three options above are hacky. The right fix is to ensure we never corrupt 
> the logs. We can incur data loss but should not compromise consistency. For 
> 0.8, the easiest and simplest fix would be option 3.
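
The scenario and option 1 from the description can be illustrated with a small
Python model. This is a simplified sketch under stated assumptions, not Kafka
code: the log is modeled as a flat list of messages, the names Message,
CompressedSet, append_shallow, append_deep and has_repeated_offsets are
hypothetical, and a compressed set is assumed to carry the offset of its last
inner message (10 for m5-m10 in the diagrams).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    offset: int
    payload: str


@dataclass
class CompressedSet:
    """A wrapper message; its offset is that of its last inner message."""
    inner: List[Message]

    @property
    def offset(self) -> int:
        return self.inner[-1].offset


def append_shallow(log: List[Message], fetched: List[CompressedSet]) -> None:
    # Shallow iteration: each wrapper is appended whole, so inner messages
    # below the follower's log end offset get written a second time.
    for chunk in fetched:
        log.extend(chunk.inner)


def append_deep(log: List[Message], fetched: List[CompressedSet]) -> None:
    # Deep iteration (option 1): look inside each wrapper and keep only the
    # messages at or beyond the follower's log end offset.
    for chunk in fetched:
        log_end_offset = log[-1].offset + 1 if log else 0
        log.extend([m for m in chunk.inner if m.offset >= log_end_offset])


def has_repeated_offsets(log: List[Message]) -> bool:
    # Re-indexing needs strictly increasing offsets; detect any violation.
    offsets = [m.offset for m in log]
    return any(b <= a for a, b in zip(offsets, offsets[1:]))
```

With a follower holding offsets 1 to 6 and a fetched set covering offsets 5 to
10, append_shallow leaves offsets 5 and 6 in the log twice, while append_deep
yields offsets 1 to 10 exactly once. The per-message filtering is where the
extra cost of option 1 comes from.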

