Due to KAFKA-1393, the server probably never ended up completely creating the replicas. Let us know how 0.8.1.1 goes.
Thanks,
Neha

On Tue, Aug 12, 2014 at 10:12 AM, Ryan Williams <rwilli...@gmail.com> wrote:

> Using version 0.8.1.
>
> Looking to update to 0.8.1.1 now probably.
>
> On Tue, Aug 12, 2014 at 9:25 AM, Guozhang Wang <wangg...@gmail.com> wrote:
>
> > The "0" there in the kafka-topics output is the broker id.
> >
> > From the broker log I think you are hitting KAFKA-1393
> > <https://issues.apache.org/jira/browse/KAFKA-1393>, which Kafka version
> > are you using?
> >
> > Guozhang
> >
> > On Mon, Aug 11, 2014 at 10:37 PM, Ryan Williams <rwilli...@gmail.com> wrote:
> >
> > > Thanks for the heads up on attachments, here's a gist:
> > >
> > > https://gist.githubusercontent.com/ryanwi/84deb8774a6922ff3704/raw/75c33ad71d0d41301533cbc645fa9846736d5eb0/gistfile1.txt
> > >
> > > This seems to mostly happen in my development environment, when running a
> > > single broker. I don't see any broker failure in the controller log.
> > > Anything else to look for with the topics reporting 0 replicas?
> > >
> > > On Mon, Aug 11, 2014 at 9:31 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> > >
> > > > Ryan,
> > > >
> > > > Apache mailing list does not allow attachments exceeding a certain size
> > > > limit, so the server logs is blocked.
> > > >
> > > > From the controller log it seems this only broker has failed and hence no
> > > > partitions will be available. This could be a soft failure (e.g. long GC),
> > > > or the ZK server side issues. You may want to take a look at your
> > > > controller log to see if there is any entries like "broker failure" before
> > > > the offline leader selection process.
> > > >
> > > > Guozhang
> > > >
> > > > On Mon, Aug 11, 2014 at 5:08 PM, Ryan Williams <rwilli...@gmail.com> wrote:
> > > >
> > > > > The broker appears to be running
> > > > >
> > > > > $ telnet kafka-server 9092
> > > > > Trying...
> > > > > Connected to kafka-server
> > > > > Escape character is '^]'.
> > > > >
> > > > > I've attached today's server.log. There was a manual restart of kafka,
> > > > > which you'll notice, but that didn't fix the issue.
> > > > >
> > > > > Thanks for looking!
> > > > >
> > > > > On Mon, Aug 11, 2014 at 4:30 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> > > > >
> > > > > > Hi Ryan,
> > > > > >
> > > > > > Could you check if all of your brokers are still live and running? Also
> > > > > > could you check the server log in addition to the producer / state-change /
> > > > > > controller logs?
> > > > > >
> > > > > > Guozhang
> > > > > >
> > > > > > On Mon, Aug 11, 2014 at 12:45 PM, Ryan Williams <rwilli...@gmail.com> wrote:
> > > > > >
> > > > > > > I have a single broker test Kafka instance that was running fine on Friday
> > > > > > > (basically out of the box configuration with 2 partitions), now I come back
> > > > > > > on Monday and producers are unable to send messages.
> > > > > > >
> > > > > > > What else can i look at to debug, and prevent?
> > > > > > >
> > > > > > > I know how to recover by removing data directories for kafka and zookeeper
> > > > > > > to start fresh. But, this isn't the first time this has happened, so I
> > > > > > > would like to understand it better to feel more comfortable with kafka.
> > > > > > >
> > > > > > > ===================
> > > > > > > Producer error (from console produce)
> > > > > > > ===================
> > > > > > > [2014-08-11 19:32:49,781] WARN Error while fetching metadata [{TopicMetadata for topic mytopic -> No partition metadata for topic mytopic due to kafka.common.LeaderNotAvailableException}] for topic [mytopic]: class kafka.common.LeaderNotAvailableException (kafka.producer.BrokerPartitionInfo)
> > > > > > > [2014-08-11 19:32:49,782] ERROR Failed to collate messages by topic, partition due to: Failed to fetch topic metadata for topic: mytopic (kafka.producer.async.DefaultEventHandler)
> > > > > > >
> > > > > > > ===============
> > > > > > > state-change.log
> > > > > > > ===============
> > > > > > > [2014-08-11 19:12:45,312] TRACE Controller 0 epoch 3 started leader election for partition [mytopic,0] (state.change.logger)
> > > > > > > [2014-08-11 19:12:45,321] ERROR Controller 0 epoch 3 initiated state change for partition [mytopic,0] from OfflinePartition to OnlinePartition failed (state.change.logger)
> > > > > > > kafka.common.NoReplicaOnlineException: No replica for partition [mytopic,0] is alive. Live brokers are: [Set()], Assigned replicas are: [List(0)]
> > > > > > >         at kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:61)
> > > > > > > [2014-08-11 19:12:45,312] TRACE Controller 0 epoch 3 started leader election for partition [mytopic,1] (state.change.logger)
> > > > > > > [2014-08-11 19:12:45,321] ERROR Controller 0 epoch 3 initiated state change for partition [mytopic,1] from OfflinePartition to OnlinePartition failed (state.change.logger)
> > > > > > > kafka.common.NoReplicaOnlineException: No replica for partition [mytopic,1] is alive. Live brokers are: [Set()], Assigned replicas are: [List(0)]
> > > > > > >         at kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:61)
> > > > > > >
> > > > > > > ===============
> > > > > > > controller.log
> > > > > > > ===============
> > > > > > > [2014-08-11 19:12:45,308] DEBUG [OfflinePartitionLeaderSelector]: No broker in ISR is alive for [mytopic,1]. Pick the leader from the alive assigned replicas: (kafka.controller.OfflinePartitionLeaderSelector)
> > > > > > > [2014-08-11 19:12:45,321] DEBUG [OfflinePartitionLeaderSelector]: No broker in ISR is alive for [mytopic,0]. Pick the leader from the alive assigned replicas: (kafka.controller.OfflinePartitionLeaderSelector)
> > > > > >
> > > > > > --
> > > > > > -- Guozhang
> > > >
> > > > --
> > > > -- Guozhang
> >
> > --
> > -- Guozhang
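
For anyone hitting the same symptom: the state-change.log entries quoted in this thread report `Live brokers are: [Set()]` next to `Assigned replicas are: [List(0)]`, i.e. the controller sees no live broker at all even though broker 0 holds the replicas. Below is a minimal sketch, not part of Kafka, that pulls those two lists out of such a message so the mismatch is easy to spot when scanning a long log. The function name `parse_no_replica_error` and the regular expression are assumptions based only on the message format shown in this thread; other Kafka versions may word the message differently.

```python
import re

# Assumed format, taken from the NoReplicaOnlineException text quoted in this
# thread. \s+ is used between phrases because the message may be line-wrapped.
PATTERN = re.compile(
    r"No replica for partition \[(?P<topic>[^,\]]+),(?P<partition>\d+)\]\s+"
    r"is alive\.\s+"
    r"Live brokers are: \[Set\((?P<live>[^)]*)\)\],\s+"
    r"Assigned replicas are:\s+\[List\((?P<assigned>[^)]*)\)\]"
)


def _ids(text):
    """Turn a comma-separated id list such as '0, 1' into [0, 1]."""
    return [int(part) for part in text.split(",") if part.strip()]


def parse_no_replica_error(message):
    """Hypothetical helper: return topic, partition, live brokers and
    assigned replicas from one error message, or None if it does not match."""
    match = PATTERN.search(message)
    if match is None:
        return None
    return {
        "topic": match.group("topic"),
        "partition": int(match.group("partition")),
        "live_brokers": _ids(match.group("live")),
        "assigned_replicas": _ids(match.group("assigned")),
    }


if __name__ == "__main__":
    line = (
        "kafka.common.NoReplicaOnlineException: No replica for partition "
        "[mytopic,0] is alive. Live brokers are: [Set()], "
        "Assigned replicas are: [List(0)]"
    )
    print(parse_no_replica_error(line))
```

An empty `live_brokers` list alongside a non-empty `assigned_replicas` list is the pattern reported in this thread; it only tells you what the controller believed at that moment, not why the broker was missing from its view.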