Thanks, Cody. Yes, we did see that writeup from Jay; it seems to refer only to the 6 partitions used in his test. I've been looking for more of a recipe for what the possible max is vs. what the optimal value may be, and haven't found one.

KAFKA-899 appears related, but it was fixed in Kafka 0.8.2.0 and we're running 0.8.2.1. I'm more curious about another error message from the logs:

*fetching topic metadata for topics [Set(my-topic-1)] from broker [ArrayBuffer(id:0,host:data2.acme.com,port:9092, id:1,host:data3.acme.com,port:9092)] failed*

I know that data2 should have a broker ID of 1 and data3 should have a broker ID of 2, so there's a disconnect somewhere as to what these IDs are. In ZooKeeper, "ls /brokers/ids" lists: [1, 2]. So where could the [0, 1] be stuck?
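For reference, here is roughly how the two views of the broker IDs can be cross-checked (a sketch; the ZooKeeper connect string and the Kafka install path are assumptions based on the setup described above):

# On each broker host (data2, data3): what the broker thinks its own ID is.
# In 0.8.x this is set manually in server.properties.
grep broker.id /path/to/kafka/config/server.properties

# Against ZooKeeper: what is actually registered, and on which host/port.
bin/zookeeper-shell.sh localhost:2181
  ls /brokers/ids
  get /brokers/ids/1
  get /brokers/ids/2

If data2's server.properties says broker.id=0, that would line up with the [0, 1] in the producer's metadata error even though ZooKeeper currently shows [1, 2], e.g. if an ID was changed between restarts.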
On Tue, Sep 29, 2015 at 9:39 AM, Cody Koeninger <c...@koeninger.org> wrote:

> Try writing and reading to the topics in question using the kafka command
> line tools, to eliminate your code as a variable.
>
> That number of partitions is probably more than sufficient:
>
> https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
>
> Obviously if you ask for more replicas than you have brokers you're going
> to have a problem, but that doesn't seem to be the case.
>
> Also, depending on what version of kafka you're using on the broker, you
> may want to look through the kafka jira, e.g.
>
> https://issues.apache.org/jira/browse/KAFKA-899
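As a concrete version of that smoke test, something along these lines should take our producer code out of the picture (flags are from the 0.8.x command line tools; the topic name and broker list are the ones from this thread):

# Terminal 1: tail the topic from the beginning.
bin/kafka-console-consumer.sh --zookeeper localhost:2181 \
    --topic topic1 --from-beginning

# Terminal 2: type a few lines and confirm they show up in terminal 1.
bin/kafka-console-producer.sh \
    --broker-list data2.acme.com:9092,data3.acme.com:9092 \
    --topic topic1

If the console producer logs the same LeaderNotAvailableException warnings, the problem is in the cluster state rather than in the client code.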
> On Tue, Sep 29, 2015 at 8:05 AM, Dmitry Goldenberg <dgoldenberg...@gmail.com> wrote:
>
>> "more partitions and replicas than available brokers" -- what would be a
>> good ratio?
>>
>> We've been trying to set up 3 topics with 64 partitions each. I'm including
>> the output of "bin/kafka-topics.sh --zookeeper localhost:2181 --describe
>> --topic topic1" below.
>>
>> I think it's symptomatic and confirms your theory, Adrian, that we've got
>> too many partitions. In fact, for topic2, only 12 partitions appear to
>> have been created despite the requested 64. Does Kafka have a limit of
>> 140 partitions total within a cluster?
>>
>> The docs don't appear to have any prescriptions as to how to go about
>> calculating an optimal number of partitions.
>>
>> We'll definitely try with fewer; I'm just looking for a good formula to
>> calculate how many. And no, Adrian, this hasn't worked yet, so we'll start
>> with something like 12 partitions. It'd be good to know how high we can go
>> with that...
>>
>> Topic:topic1  PartitionCount:64  ReplicationFactor:1  Configs:
>>   Topic: topic1  Partition: 0   Leader: 1  Replicas: 1  Isr: 1
>>   Topic: topic1  Partition: 1   Leader: 2  Replicas: 2  Isr: 2
>>   ...
>>   Topic: topic1  Partition: 63  Leader: 2  Replicas: 2  Isr: 2
>>
>> Topic:topic2  PartitionCount:12  ReplicationFactor:1  Configs:
>>   Topic: topic2  Partition: 0   Leader: 2  Replicas: 2  Isr: 2
>>   Topic: topic2  Partition: 1   Leader: 1  Replicas: 1  Isr: 1
>>   ...
>>   Topic: topic2  Partition: 11  Leader: 1  Replicas: 1  Isr: 1
>>
>> Topic:topic3  PartitionCount:64  ReplicationFactor:1  Configs:
>>   Topic: topic3  Partition: 0   Leader: 2  Replicas: 2  Isr: 2
>>   Topic: topic3  Partition: 1   Leader: 1  Replicas: 1  Isr: 1
>>   ...
>>   Topic: topic3  Partition: 63  Leader: 1  Replicas: 1  Isr: 1
>>
>> On Tue, Sep 29, 2015 at 8:47 AM, Adrian Tanase <atan...@adobe.com> wrote:
>>
>>> The error message is very explicit (the partition is under-replicated); I
>>> don't think it's related to networking issues.
>>>
>>> Try to run "/home/kafka/bin/kafka-topics.sh --zookeeper localhost/kafka
>>> --describe --topic topic_name" and see which brokers are missing from the
>>> replica assignment. *(replace home, zk-quorum, etc. with your own set-up)*
>>>
>>> Lastly, has this ever worked? Maybe you've accidentally created the
>>> topic with more partitions and replicas than available brokers... try to
>>> recreate it with fewer partitions/replicas and see if it works.
>>>
>>> -adrian
>>>
>>> From: Dmitry Goldenberg
>>> Date: Tuesday, September 29, 2015 at 3:37 PM
>>> To: Adrian Tanase
>>> Cc: "user@spark.apache.org"
>>> Subject: Re: Kafka error "partitions don't have a leader" /
>>> LeaderNotAvailableException
>>>
>>> Adrian,
>>>
>>> Thanks for your response. I just looked at both machines we're testing
>>> on, and on both the Kafka server process looks OK. Anything specific I can
>>> check otherwise?
>>>
>>> From googling around, I see some posts where folks suggest checking the
>>> DNS settings (those appear fine) and setting advertised.host.name in
>>> Kafka's server.properties. Yay/nay?
>>>
>>> Thanks again.
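For what it's worth, on the advertised.host.name question above: in 0.8.x each broker registers in ZooKeeper, and reports in metadata responses, whatever host/port its server.properties advertises. A sketch of the relevant settings on data2 (the values are assumptions based on the host names in this thread):

# config/server.properties on data2.acme.com
broker.id=1
port=9092
advertised.host.name=data2.acme.com

data3 would get broker.id=2 and advertised.host.name=data3.acme.com; the brokers need a restart for changes to take effect.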
>>> On Tue, Sep 29, 2015 at 8:31 AM, Adrian Tanase <atan...@adobe.com> wrote:
>>>
>>>> I believe some of the brokers in your cluster died and there are a
>>>> number of partitions that nobody is currently managing.
>>>>
>>>> -adrian
>>>>
>>>> From: Dmitry Goldenberg
>>>> Date: Tuesday, September 29, 2015 at 3:26 PM
>>>> To: "user@spark.apache.org"
>>>> Subject: Kafka error "partitions don't have a leader" /
>>>> LeaderNotAvailableException
>>>>
>>>> I apologize for posting this Kafka-related issue to the Spark list. I've
>>>> gotten no responses on the Kafka list and was hoping someone on this
>>>> list could shed some light on the below.
>>>>
>>>> ---------------------------------------------------------------------------------------
>>>>
>>>> We're running into this issue in a clustered environment where we're
>>>> trying to send messages to Kafka and are getting the errors below.
>>>>
>>>> Can someone explain what might be causing them, and what the error
>>>> message (Failed to send data since partitions [<topic-name>,8] don't
>>>> have a leader) means?
>>>>
>>>> ---------------------------------------------------------------------------------------
>>>>
>>>> WARN kafka.producer.BrokerPartitionInfo: Error while fetching
>>>> metadata partition 10 leader: none replicas: isr: isUnderReplicated: false
>>>> for topic partition [<topic-name>,10]: [class
>>>> kafka.common.LeaderNotAvailableException]
>>>>
>>>> ERROR kafka.producer.async.DefaultEventHandler: Failed to send requests
>>>> for topics <topic-name> with correlation ids in [2398792,2398801]
>>>>
>>>> ERROR com.acme.core.messaging.kafka.KafkaMessageProducer: Error while
>>>> sending a message to the message store.
>>>> kafka.common.FailedToSendMessageException: Failed to send messages
>>>> after 3 tries.
>>>>     at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:90) ~[kafka_2.10-0.8.2.0.jar:?]
>>>>     at kafka.producer.Producer.send(Producer.scala:77) ~[kafka_2.10-0.8.2.0.jar:?]
>>>>     at kafka.javaapi.producer.Producer.send(Producer.scala:33) ~[kafka_2.10-0.8.2.0.jar:?]
>>>>
>>>> WARN kafka.producer.async.DefaultEventHandler: Failed to send data
>>>> since partitions [<topic-name>,8] don't have a leader
>>>>
>>>> What do these errors and warnings mean, and how do we get around them?
>>>>
>>>> ---------------------------------------------------------------------------------------
>>>>
>>>> The code for sending messages is basically as follows:
>>>>
>>>> public class KafkaMessageProducer {
>>>>
>>>>     // Producer instance held by this class; initialization elided.
>>>>     private Producer<String, String> producer;
>>>>
>>>>     .....................
>>>>
>>>>     public void sendMessage(String topic, String key, String message)
>>>>             throws IOException, MessagingException {
>>>>         KeyedMessage<String, String> data =
>>>>             new KeyedMessage<String, String>(topic, key, message);
>>>>         try {
>>>>             producer.send(data);
>>>>         } catch (Exception ex) {
>>>>             throw new MessagingException(
>>>>                 "Error while sending a message to the message store.", ex);
>>>>         }
>>>>     }
>>>> }
>>>>
>>>> Is it possible that the producer gets "stale" and needs to be
>>>> re-initialized? Do we want to re-create the producer on every message,
>>>> or is it OK to hold on to one indefinitely?
>>>>
>>>> ---------------------------------------------------------------------------------------
>>>>
>>>> The following are the producer properties that are being set on the
>>>> producer:
>>>>
>>>> batch.num.messages => 200
>>>> client.id => Acme
>>>> compression.codec => none
>>>> key.serializer.class => kafka.serializer.StringEncoder
>>>> message.send.max.retries => 3
>>>> metadata.broker.list => data2.acme.com:9092,data3.acme.com:9092
>>>> partitioner.class => kafka.producer.DefaultPartitioner
>>>> producer.type => sync
>>>> queue.buffering.max.messages => 10000
>>>> queue.buffering.max.ms => 5000
>>>> queue.enqueue.timeout.ms => -1
>>>> request.required.acks => 1
>>>> request.timeout.ms => 10000
>>>> retry.backoff.ms => 1000
>>>> send.buffer.bytes => 102400
>>>> serializer.class => kafka.serializer.StringEncoder
>>>> topic.metadata.refresh.interval.ms => 600000
>>>>
>>>> Thanks.
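On the "stale producer" question: with the 0.8.x producer API shown in the stack trace, the usual pattern is one long-lived producer per process rather than one per message, since each instance holds broker connections and periodically refreshes metadata on its own. A minimal sketch of how the class above might be wired up (the class name and the subset of properties are illustrative, not the actual Acme code):

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class KafkaMessageProducerExample {

    // One long-lived producer for the whole process; the old producer
    // refreshes topic metadata itself (topic.metadata.refresh.interval.ms).
    private final Producer<String, String> producer;

    public KafkaMessageProducerExample() {
        Properties props = new Properties();
        // A subset of the property dump from this thread.
        props.put("metadata.broker.list", "data2.acme.com:9092,data3.acme.com:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("key.serializer.class", "kafka.serializer.StringEncoder");
        props.put("request.required.acks", "1");
        props.put("producer.type", "sync");
        props.put("message.send.max.retries", "3");
        props.put("retry.backoff.ms", "1000");
        producer = new Producer<String, String>(new ProducerConfig(props));
    }

    public void sendMessage(String topic, String key, String message) {
        producer.send(new KeyedMessage<String, String>(topic, key, message));
    }

    // Call once on shutdown; recreating the producer per message is
    // expensive and should not be necessary.
    public void close() {
        producer.close();
    }
}

Note that send() will still throw FailedToSendMessageException after the configured retries when no partition leader is available, so the broker-side leader problem has to be fixed first; holding one producer vs. recreating it won't change that.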