[jira] [Commented] (KAFKA-1112) broker can not start itself after kafka is killed with -9

2013-11-15 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823822#comment-13823822
 ] 

Jay Kreps commented on KAFKA-1112:
--

David, this should not be happening in 0.8. If it is, I suspect it is a 
different problem that causes the same bad outcome. Are you seeing this on 0.8? 
If so, how reproducible is it?

> broker can not start itself after kafka is killed with -9
> -
>
> Key: KAFKA-1112
> URL: https://issues.apache.org/jira/browse/KAFKA-1112
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.8, 0.8.1
>Reporter: Kane Kim
>Assignee: Jay Kreps
>Priority: Critical
> Attachments: KAFKA-1112-v1.patch, KAFKA-1112-v2.patch, KAFKA-1112.out
>
>
> When I kill kafka with -9, broker cannot start itself because of corrupted 
> index logs. I think kafka should try to delete/rebuild indexes itself without 
> manual intervention. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (KAFKA-1112) broker can not start itself after kafka is killed with -9

2013-11-15 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823829#comment-13823829
 ] 

Jay Kreps commented on KAFKA-1112:
--

Jun, this is true.

However, if you think about it, recovery of the log has the same problem. We 
read a message and then compare it to its CRC. The CRC is a 32-bit number. The 
CRC could certainly match a corrupted message by chance.
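
To make the CRC comparison concrete, here is a minimal sketch of that kind of 
check using the JDK's java.util.zip.CRC32. The class and method names 
(CrcCheck, crcOf, isValid) are hypothetical and are not taken from the Kafka 
code or from any of the attached patches; the sketch only illustrates that the 
check is a comparison against a 32-bit value, so a corrupted payload can in 
principle still pass it.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical helper, not Kafka's actual message validation code.
final class CrcCheck {
    // Compute the CRC-32 of a payload; the unsigned 32-bit result is held in a long.
    static long crcOf(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    // A payload is accepted only if its recomputed CRC equals the stored one.
    // A different payload with the same CRC would also be accepted.
    static boolean isValid(long storedCrc, byte[] payload) {
        return storedCrc == crcOf(payload);
    }

    // Small demonstration: flip one bit and re-check.
    static boolean detectsSingleBitFlip() {
        byte[] msg = "some message".getBytes(StandardCharsets.UTF_8);
        long stored = crcOf(msg);
        msg[0] ^= 0x01; // simulate corruption of a single bit
        return !isValid(stored, msg);
    }
}
```

A single flipped bit is always caught by CRC-32; the concern above is about 
larger corruption that happens to produce the same 32-bit value.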

In this case we compare to a 64-bit number, so this should be less likely. But 
in reality there are several rare events that all have to happen here: (1) we 
hard crash, (2) the hard crash leads to corruption, (3) the corruption of the 
index points to a location that exactly matches the recovery offset.

In general I think people's concern with this approach is that it is just kind 
of hacky. I agree with this complaint and am sort of disappointed with this set 
of changes overall.

I will post a slightly more paranoid version of the check, and then let's 
discuss that.



[jira] [Updated] (KAFKA-1112) broker can not start itself after kafka is killed with -9

2013-11-15 Thread Jay Kreps (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Kreps updated KAFKA-1112:
-

Attachment: KAFKA-1112-v3.patch

Okay here is a maximally paranoid patch.



[jira] [Commented] (KAFKA-1112) broker can not start itself after kafka is killed with -9

2013-11-15 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823883#comment-13823883
 ] 

Guozhang Wang commented on KAFKA-1112:
--

How about we go back to using the clean shutdown file to decide whether 
recovery is needed, and if recovery is needed, we can use the recovery point 
to reduce the recovery overhead.
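
A rough sketch of that decision, to make the proposal concrete. Everything 
here is an assumption for illustration: the marker file name, the class name 
RecoveryPlan, and the use of -1 to mean "no recovery needed" are hypothetical 
and do not come from the Kafka code or the attached patches.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of "clean shutdown file first, recovery point second".
final class RecoveryPlan {
    // Assumed marker name; written on clean shutdown, removed on startup.
    static final String CLEAN_SHUTDOWN_MARKER = ".clean_shutdown";

    // Sentinel meaning that no recovery pass is required.
    static final long NO_RECOVERY = -1L;

    // Decide where log recovery should start for a log directory.
    // If the marker file exists, the previous shutdown was clean and nothing
    // needs to be re-validated. Otherwise, only data at or beyond the last
    // known recovery point has to be scanned, not the whole log.
    static long recoveryStart(Path logDir, long recoveryPoint) {
        if (Files.exists(logDir.resolve(CLEAN_SHUTDOWN_MARKER))) {
            return NO_RECOVERY;
        }
        return recoveryPoint;
    }
}
```

With this shape, the marker file alone decides whether recovery runs, and the 
recovery point only bounds how much of the log is scanned when it does.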



[jira] [Commented] (KAFKA-1112) broker can not start itself after kafka is killed with -9

2013-11-15 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13824068#comment-13824068
 ] 

Jay Kreps commented on KAFKA-1112:
--

Yeah, I would not be opposed to that as an alternative. Both are really hacks.

I guess the question is: what should the end state be?
