[ https://issues.apache.org/jira/browse/KAFKA-1460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205580#comment-14205580 ]
Noah Yetter commented on KAFKA-1460:
------------------------------------

For what it's worth there's nothing in the documentation that recommends against running Kafka and ZK on the same physical nodes. One can infer that it's less than optimal, but for a small deployment we reasoned it should be sufficient. In fact I would say that running ZK and Kafka on the same hardware is a no-brainer for a toe-in-the-water deployment, so if it's a bad idea, something somewhere should say so.

In general, the "Hardware and OS" documentation (http://kafka.apache.org/documentation.html#hwandos) is near useless to anyone not operating at a LinkedIn scale. We are accepting about 200 messages per second, not 2 million, so the idea of running Kafka on 8-core machines with 24GB RAM and 8 spindles is absurd, as is the idea that ZK would need 3-5GB of heap to manage the metadata and offsets for a dozen topics with no more than 4 partitions each.

> NoReplicaOnlineException: No replica for partition
> --------------------------------------------------
>
>                 Key: KAFKA-1460
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1460
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8.1.1
>            Reporter: Artur Denysenko
>            Priority: Critical
>         Attachments: state-change.log
>
>
> We have a standalone kafka server.
> After several days of running we get:
> {noformat}
> kafka.common.NoReplicaOnlineException: No replica for partition [gk.q.module,1] is alive. Live brokers are: [Set()], Assigned replicas are: [List(0)]
>       at kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:61)
>       at kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:336)
>       at kafka.controller.PartitionStateMachine.kafka$controller$PartitionStateMachine$$handleStateChange(PartitionStateMachine.scala:185)
>       at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:99)
>       at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:96)
>       at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:743)
>       at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:95)
>       at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:95)
>       at scala.collection.Iterator$class.foreach(Iterator.scala:772)
>       at scala.collection.mutable.HashTable$$anon$1.foreach(HashTable.scala:157)
>       at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:190)
>       at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:45)
>       at scala.collection.mutable.HashMap.foreach(HashMap.scala:95)
>       at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:742)
>       at kafka.controller.PartitionStateMachine.triggerOnlinePartitionStateChange(PartitionStateMachine.scala:96)
>       at kafka.controller.PartitionStateMachine.startup(PartitionStateMachine.scala:68)
>       at kafka.controller.KafkaController.onControllerFailover(KafkaController.scala:312)
>       at kafka.controller.KafkaController$$anonfun$1.apply$mcV$sp(KafkaController.scala:162)
>       at kafka.server.ZookeeperLeaderElector.elect(ZookeeperLeaderElector.scala:63)
>       at kafka.controller.KafkaController$SessionExpirationListener$$anonfun$handleNewSession$1.apply$mcZ$sp(KafkaController.scala:1068)
>       at kafka.controller.KafkaController$SessionExpirationListener$$anonfun$handleNewSession$1.apply(KafkaController.scala:1066)
>       at kafka.controller.KafkaController$SessionExpirationListener$$anonfun$handleNewSession$1.apply(KafkaController.scala:1066)
>       at kafka.utils.Utils$.inLock(Utils.scala:538)
>       at kafka.controller.KafkaController$SessionExpirationListener.handleNewSession(KafkaController.scala:1066)
>       at org.I0Itec.zkclient.ZkClient$4.run(ZkClient.java:472)
>       at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
> {noformat}
> Please see attached [state-change.log]
> You can find all server logs (450mb) here: http://46.4.114.35:9999/deploy/kafka-logs.2014-05-14-16.tgz
> On client we get:
> {noformat}
> 16:28:36,843 [ool-12-thread-2] WARN ZookeeperConsumerConnector - [dev_dev-1400257716132-e7b8240c], no brokers found when trying to rebalance.
> {noformat}
> If we try to send message using 'kafka-console-producer.sh':
> {noformat}
> [root@dev kafka]# /srv/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
> message
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
> [2014-05-16 19:45:30,950] WARN Fetching topic metadata with correlation id 0 for topics [Set(test)] from broker [id:0,host:localhost,port:9092] failed (kafka.client.ClientUtils$)
> java.net.SocketTimeoutException
>       at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
>       at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
>       at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
>       at kafka.utils.Utils$.read(Utils.scala:375)
>       at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
>       at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
>       at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
>       at kafka.network.BlockingChannel.receive(BlockingChannel.scala:100)
>       at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:74)
>       at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:71)
>       at kafka.producer.SyncProducer.send(SyncProducer.scala:112)
>       at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:53)
>       at kafka.producer.BrokerPartitionInfo.updateInfo(BrokerPartitionInfo.scala:82)
>       at kafka.producer.async.DefaultEventHandler$$anonfun$handle$1.apply$mcV$sp(DefaultEventHandler.scala:67)
>       at kafka.utils.Utils$.swallow(Utils.scala:167)
>       at kafka.utils.Logging$class.swallowError(Logging.scala:106)
>       at kafka.utils.Utils$.swallowError(Utils.scala:46)
>       at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:67)
>       at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:104)
>       at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:87)
>       at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:67)
>       at scala.collection.immutable.Stream.foreach(Stream.scala:526)
>       at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:66)
>       at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:44)
> {noformat}
> If we try to receive message using 'kafka-console-consumer.sh':
> {noformat}
> [root@dev kafka]# /srv/kafka/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
> [2014-05-16 19:46:23,029] WARN [console-consumer-69449_dev-1400262382648-1c9bfcd3], no brokers found when trying to rebalance. (kafka.consumer.ZookeeperConsumerConnector)
> {noformat}
> Port 9092 is open:
> {noformat}
> [root@dev kafka]# telnet localhost 9092
> Trying 127.0.0.1...
> Connected to localhost.
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)