[jira] [Commented] (KAFKA-1112) broker can not start itself after kafka is killed with -9
[ https://issues.apache.org/jira/browse/KAFKA-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823822#comment-13823822 ]

Jay Kreps commented on KAFKA-1112:
----------------------------------

David, this should not be happening in 0.8. If it is, I suspect it is a different problem that causes the same bad outcome. Are you seeing this on 0.8? If so, how reproducible is it?

> broker can not start itself after kafka is killed with -9
> ---------------------------------------------------------
>
>                 Key: KAFKA-1112
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1112
>             Project: Kafka
>          Issue Type: Bug
>          Components: log
>    Affects Versions: 0.8, 0.8.1
>            Reporter: Kane Kim
>            Assignee: Jay Kreps
>            Priority: Critical
>         Attachments: KAFKA-1112-v1.patch, KAFKA-1112-v2.patch, KAFKA-1112.out
>
>
> When I kill kafka with -9, broker cannot start itself because of corrupted
> index logs. I think kafka should try to delete/rebuild indexes itself without
> manual intervention.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Commented] (KAFKA-1112) broker can not start itself after kafka is killed with -9
[ https://issues.apache.org/jira/browse/KAFKA-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823829#comment-13823829 ]

Jay Kreps commented on KAFKA-1112:
----------------------------------

Jun, this is true. However, if you think about it, recovery of the log has the same problem. We read a message and then compare it to its CRC. The CRC is a 32-bit number, so it could certainly match the message by chance. In this case we compare to a 64-bit number, so a chance match should be less likely. But in reality there are many rare events that all have to happen here: (1) we hard crash, (2) the hard crash leads to corruption, (3) the corrupted index points to a location that exactly matches the recovery offset.

In general I think people's concern with this approach is that it is just kind of hacky. I agree with this complaint and am sort of disappointed with this set of changes overall.

I will post a slightly more paranoid version of the check, and then let's discuss that.
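To make the CRC argument above concrete, here is a minimal sketch of a checksum-verified record and of the chance-match odds for a 32-bit versus a 64-bit comparison. It is not Kafka's actual message format or recovery code: the `frame`, `is_valid`, and `chance_match_probability` names and the 4-byte-CRC-prefix layout are assumptions made for illustration, and the probability figure assumes the corrupted value is uniformly random.

```python
import struct
import zlib


def frame(payload: bytes) -> bytes:
    """Prefix a payload with its 4-byte big-endian CRC32 (illustrative layout only)."""
    return struct.pack(">I", zlib.crc32(payload)) + payload


def is_valid(record: bytes) -> bool:
    """Recompute the CRC32 of the payload and compare it with the stored value."""
    if len(record) < 4:
        return False
    (stored,) = struct.unpack(">I", record[:4])
    return stored == zlib.crc32(record[4:])


def chance_match_probability(bits: int) -> float:
    """Odds that a uniformly random `bits`-wide value equals one fixed expected value."""
    return 2.0 ** -bits


good = frame(b"hello kafka")
# Flip one bit in the last payload byte to simulate on-disk corruption.
corrupt = good[:-1] + bytes([good[-1] ^ 0x01])

print(is_valid(good))     # True
print(is_valid(corrupt))  # False
print(chance_match_probability(32))  # about 2.3e-10
print(chance_match_probability(64))  # about 5.4e-20
```

Under that uniform-randomness assumption, a 64-bit comparison is 2^32 times less likely to match by accident than a 32-bit CRC, before even accounting for the other rare events listed in the comment.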
[jira] [Updated] (KAFKA-1112) broker can not start itself after kafka is killed with -9
[ https://issues.apache.org/jira/browse/KAFKA-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Kreps updated KAFKA-1112:
-----------------------------

    Attachment: KAFKA-1112-v3.patch

Okay, here is a maximally paranoid patch.
[jira] [Commented] (KAFKA-1112) broker can not start itself after kafka is killed with -9
[ https://issues.apache.org/jira/browse/KAFKA-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823883#comment-13823883 ]

Guozhang Wang commented on KAFKA-1112:
--------------------------------------

How about we revert to using the clean shutdown file to decide whether recovery is needed, and, if it is, use the recovery point to reduce the recovery overhead?
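The proposal above can be sketched roughly as follows. This is only an illustration of the idea, not the broker's code or any of the attached patches: the marker file name, the `plan_recovery` function, and the offsets used are all hypothetical.

```python
import os
import tempfile

# Hypothetical marker name; the file the broker actually writes may differ.
CLEAN_SHUTDOWN_MARKER = ".clean_shutdown"


def plan_recovery(log_dir, recovery_point, log_end_offset):
    """Return the (start, end) offset range to re-verify, or None if none is needed.

    A marker file left by an orderly shutdown means the log can be trusted as is.
    Without it, only offsets at or after the recovery point are re-checked,
    which bounds the cost of recovery after a hard kill.
    """
    marker = os.path.join(log_dir, CLEAN_SHUTDOWN_MARKER)
    if os.path.exists(marker):
        # Consume the marker so a later crash is not mistaken for a clean stop.
        os.remove(marker)
        return None
    start = min(max(recovery_point, 0), log_end_offset)
    return (start, log_end_offset)


with tempfile.TemporaryDirectory() as log_dir:
    # No marker present: behave as after kill -9 and recover from the recovery point.
    print(plan_recovery(log_dir, 100, 250))  # (100, 250)
    # Simulate an orderly shutdown by writing the marker, then "restart".
    open(os.path.join(log_dir, CLEAN_SHUTDOWN_MARKER), "w").close()
    print(plan_recovery(log_dir, 100, 250))  # None
```

In this sketch the marker decides whether to recover at all, and the recovery point only limits how much of the log is re-read, which matches the split of responsibilities suggested in the comment.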
[jira] [Commented] (KAFKA-1112) broker can not start itself after kafka is killed with -9
[ https://issues.apache.org/jira/browse/KAFKA-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13824068#comment-13824068 ]

Jay Kreps commented on KAFKA-1112:
----------------------------------

Yeah, I would not be opposed to that as an alternative. Both are really a hack. I guess the question is: what should the end state be?