The 0.8 version I use was built from trunk last December. Since then, this error has happened three times. Each time we had to remove all the ZK and Kafka log data and restart the services.
I will try newer versions with more recent patches and keep monitoring it.

thanks!

Jason

On Wed, Mar 20, 2013 at 10:39 AM, Jun Rao <jun...@gmail.com> wrote:

> Ok, so you are using the same broker id. What the error is saying is that
> broker 1 doesn't seem to be up.
>
> Not sure what revision of 0.8 you are using. Could you try the latest
> revision in 0.8 and see if the problem still exists? You may have to wipe
> out all ZK and Kafka data first since some ZK data structures were
> renamed a few weeks ago.
>
> Thanks,
>
> Jun
>
> On Wed, Mar 20, 2013 at 6:57 AM, Jason Huang <jason.hu...@icare.com> wrote:
>
>> I restarted the zookeeper server first, then the broker. It's the same
>> instance of kafka 0.8 and I am using the same config file. In
>> server.properties I have: brokerid=1
>>
>> Is that sufficient to ensure the broker gets restarted with the same
>> broker id as before?
>>
>> thanks,
>>
>> Jason
>>
>> On Wed, Mar 20, 2013 at 12:30 AM, Jun Rao <jun...@gmail.com> wrote:
>> > Did the broker get restarted with the same broker id?
>> >
>> > Thanks,
>> >
>> > Jun
>> >
>> > On Tue, Mar 19, 2013 at 1:34 PM, Jason Huang <jason.hu...@icare.com> wrote:
>> >
>> >> Hello,
>> >>
>> >> My kafka (0.8) server went down today for an unknown reason and when I
>> >> restarted both zookeeper and kafka server I got the following error in
>> >> the kafka server log:
>> >>
>> >> [2013-03-19 13:39:16,131] INFO [Partition state machine on Controller 1]: Invoking state change to OnlinePartition for partitions (kafka.controller.PartitionStateMachine)
>> >> [2013-03-19 13:39:16,262] INFO [Partition state machine on Controller 1]: Electing leader for partition [topic_a937ac27-1883-4ca0-95bc-c9a740d08947, 0] (kafka.controller.PartitionStateMachine)
>> >> [2013-03-19 13:39:16,451] ERROR [Partition state machine on Controller 1]: State change for partition [topic_a937ac27-1883-4ca0-95bc-c9a740d08947, 0] from OfflinePartition to OnlinePartition failed (kafka.controller.PartitionStateMachine)
>> >> kafka.common.PartitionOfflineException: All replicas for partition [topic_a937ac27-1883-4ca0-95bc-c9a740d08947, 0] are dead. Marking this partition offline
>> >>     at kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:300)
>> >> .....
>> >> Caused by: kafka.common.PartitionOfflineException: No replica for partition ([topic_a937ac27-1883-4ca0-95bc-c9a740d08947, 0]) is alive. Live brokers are: [Set()], Assigned replicas are: [List(1)]
>> >> .......
>> >>
>> >> I am using one single server to host kafka and zookeeper. Replication
>> >> factor is set to 1.
>> >>
>> >> This happened for all the existing topics. Not sure how this happened
>> >> but it appears to be a bug. I did some searching and the only possible
>> >> fix for this bug seems to be KAFKA-708.
>> >>
>> >> Any comments on this? Thanks!
>> >>
>> >> Jason
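
The condition reported in the log (assigned replicas are List(1), but the set of live brokers is empty) can be illustrated with a small sketch. This is a simplified model in Python, not Kafka's actual Scala code: the names `elect_leader` and `PartitionOfflineError` are hypothetical, and details of the real election logic such as the in-sync replica set are left out.

```python
class PartitionOfflineError(Exception):
    """Hypothetical stand-in for kafka.common.PartitionOfflineException."""


def elect_leader(assigned_replicas, live_brokers):
    """Return the first assigned replica that is currently alive.

    Simplified illustration only; it models just the check that at
    least one assigned replica appears among the live brokers.
    """
    for broker_id in assigned_replicas:
        if broker_id in live_brokers:
            return broker_id
    raise PartitionOfflineError(
        "No replica is alive. Live brokers are: %s, Assigned replicas are: %s"
        % (sorted(live_brokers), list(assigned_replicas))
    )


# Broker 1 is registered, so the single replica can become leader.
print(elect_leader([1], {1}))

# The situation in the log: replica list is [1] but no broker is live.
try:
    elect_leader([1], set())
except PartitionOfflineError as err:
    print("offline:", err)
```

Under this model, a replication factor of 1 means the partition has exactly one candidate leader, so it stays offline until that broker id is seen as live again.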