Re: [Kafka-users] Producer not distributing across all partitions

2014-10-22 Thread Mungeol Heo
Hi, First of all, thank you for replying. I am using 0.8.1.1. I expect the new producer will solve this kind of problem. Thanks, Mungeol On Wed, Oct 22, 2014 at 9:51 AM, Jun Rao wrote: > Yes, what you did is correct. See details in > > https://cwiki.apache.org/confluence/display/KA

Re: [Kafka-users] Producer not distributing across all partitions

2014-10-21 Thread Jun Rao
Yes, what you did is correct. See details in https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyisdatanotevenlydistributedamongpartitionswhenapartitioningkeyisnotspecified ? It seems that it doesn't work all the time. What version of Kafka are you using? Thanks, Jun On Mon, Oct 20, 20

[Kafka-users] Producer not distributing across all partitions

2014-10-20 Thread Mungeol Heo
Hi, I have a question about the 'topic.metadata.refresh.interval.ms' configuration. As far as I know, its default value is 10 minutes. Does that mean the producer will switch to another partition every 10 minutes? What I am seeing is that the producer does not switch to another partition every 10 minutes.

Re: [Kafka-users] Producer not distributing across all partitions

2014-10-16 Thread Neha Narkhede
A topic.metadata.refresh.interval.ms of 10 mins means that the producer will take 10 mins to detect new partitions. So newly added or reassigned partitions might not get data for 10 mins. In general, if you're still at prototyping stages, I'd recommend using the new producer available on kafka trun

[Kafka-users] Producer not distributing across all partitions

2014-10-16 Thread Mungeol Heo
Hi, I have a question about the 'topic.metadata.refresh.interval.ms' configuration. As far as I know, its default value is 10 minutes. Does that mean the producer will switch to another partition every 10 minutes? What I am seeing is that the producer does not switch to another partition every 10 minutes.

Re: Producer not distributing across all partitions

2013-09-15 Thread Swapnil Ghike
I meant to say that messages were appended to two different partitions: out of, say, 10 messages produced, one partition received 5 and the other received 5. No messages were duplicated across partitions. Swapnil On 9/14/13 11:03 PM, "chetan conikee" wrote: >Swapnil > >

Re: Producer not distributing across all partitions

2013-09-14 Thread chetan conikee
Swapnil, what do you mean by "I did a local test today that showed that choosing DefaultPartitioner with null key in the messages appended data to multiple partitions"? Are messages being duplicated across partitions? -Chetan On Sat, Sep 14, 2013 at 9:02 PM, Swapnil Ghike wrote: > Hi Joe, Dre

Re: Producer not distributing across all partitions

2013-09-14 Thread Swapnil Ghike
Hi Joe, Drew, In 0.8 HEAD, if the key is null, the DefaultEventHandler randomly chooses an available partition and never calls the partitioner.partition(key, numPartitions) method. This is done in lines 204 to 212 of the github commit Drew pointed to, though that piece of code is slightly differen
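The keyless path described in this message can be pictured with a small toy model. This is only a sketch of the behavior as described above, not Kafka's DefaultEventHandler code: the function name `select_partition`, the `available` list, and the `partitioner` callable are hypothetical names introduced for illustration.

```python
import random


def select_partition(key, num_partitions, available, partitioner, rng=random):
    """Pick a partition for one message (toy model, not Kafka code).

    key            -- message key, or None for a keyless message
    num_partitions -- total number of partitions in the topic
    available      -- partition ids assumed to be currently available
    partitioner    -- callable (key, num_partitions) -> partition id
    rng            -- source of randomness (hypothetical parameter for testing)
    """
    if key is None:
        # Keyless path: a random available partition is chosen and the
        # partitioner is never consulted.
        return rng.choice(available)
    # Keyed path: the configured partitioner decides.
    return partitioner(key, num_partitions)
```

In this model, a custom partitioner has no effect on keyless messages, which matches the observation that partitioner.partition(key, numPartitions) is not called when the key is null.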

Re: Producer not distributing across all partitions

2013-09-14 Thread chetan conikee
Prashant, I recall you mentioning that you are on the 0.8 branch. If so, can you check your producer to verify whether you are using DefaultPartitioner, SimplePartitioner or null (which defaults to RandomPartitioner)? kafkaProps.put("partitioner.class", "kafka.producer.DefaultPartitioner") Also,

Re: Producer not distributing across all partitions

2013-09-14 Thread Swapnil Ghike
Hi Prashant, I tried a local test using a very short topic.metadata.refresh.interval.ms on the producer. The server had two partitions and both of them appended data. Could you check if you have set the topic.metadata.refresh.interval.ms on your producer to a very high value? Swapnil On 9/13/13

Re: Producer not distributing across all partitions

2013-09-13 Thread Jun Rao
Without fixing KAFKA-1017, the issue is that the producer will maintain a socket connection per min(#partitions, #brokers). If you have lots of producers, the open file handles on the broker could be an issue. So, what KAFKA-1017 fixes is to pick a random partition and stick to it for a configura
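A rough back-of-the-envelope version of the connection-count concern is sketched below. The min(#partitions, #brokers) figure comes from the message above; the assumption that a producer sticking to one partition holds roughly one produce connection at a time (single topic, metadata traffic ignored) is mine, and the function name is hypothetical.

```python
def total_producer_connections(num_producers, num_partitions, num_brokers, sticky=False):
    """Estimate produce connections across the cluster (toy arithmetic).

    sticky=False -- each producer keeps min(partitions, brokers) connections,
                    as described for the behavior before KAFKA-1017.
    sticky=True  -- ASSUMPTION: each producer writes to one partition at a
                    time, hence about one connection per producer.
    """
    per_producer = 1 if sticky else min(num_partitions, num_brokers)
    return num_producers * per_producer


# Example: 1000 producers, a topic with 8 partitions, 4 brokers.
spread_out = total_producer_connections(1000, 8, 4)            # 4000
sticky = total_producer_connections(1000, 8, 4, sticky=True)   # 1000
```

Under these assumptions, the sticky behavior trades even per-message distribution for fewer open sockets on the brokers.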

Re: Producer not distributing across all partitions

2013-09-13 Thread prashant amar
Hi Guozhang, Joe, Drew, In our case we have been running for the past 3 weeks and it has been consistently writing only to the first partition. The rest of the partitions have empty index files. Not sure if I am hitting any issue here. I am using offset checker as my barometer. Also introspec

Re: Producer not distributing across all partitions

2013-09-13 Thread Guozhang Wang
Hello Joe, The reasons we make producers write to a fixed partition for each metadata-refresh interval are the following: https://issues.apache.org/jira/browse/KAFKA-1017 https://issues.apache.org/jira/browse/KAFKA-959 So, in short, the randomness is still preserved but within one metada

Re: Producer not distributing across all partitions

2013-09-13 Thread prashant amar
I am using kafka 08 version ... On Thu, Sep 12, 2013 at 8:44 PM, Jun Rao wrote: > Which revision of 0.8 are you using? In a recent change, a producer will > stick to a partition for topic.metadata.refresh.interval.ms (defaults to > 10 > mins) time before picking another partition at random. > T

Re: Producer not distributing across all partitions

2013-09-13 Thread Joe Stein
Isn't this a bug? I don't see why we would want users to have to write code to generate random partition keys just to distribute the data randomly across partitions; that is Kafka's job, isn't it? Or, if a null value is supplied, tell the user this is not supported (throw an exception) in KeyedMessage like we do for

Re: Producer not distributing across all partitions

2013-09-13 Thread Drew Goya
I ran into this problem as well, Prashant. The default partition key was recently changed: https://github.com/apache/kafka/commit/b71e6dc352770f22daec0c9a3682138666f032be It no longer assigns a random partition to data with a null partition key. I had to change my code to generate random partiti
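The workaround mentioned here, generating a random partition key per message, can be illustrated with a small simulation. This is a sketch under stated assumptions, not the poster's code: `hash_partition` is a stand-in for a hash-modulo partitioner (it uses CRC32 rather than whatever hash Kafka applies), and `random_key` and `spread` are hypothetical helper names.

```python
import uuid
import zlib


def hash_partition(key: bytes, num_partitions: int) -> int:
    """Stand-in hash partitioner: stable hash of the key modulo partition count."""
    return zlib.crc32(key) % num_partitions


def random_key() -> bytes:
    """Generate a fresh random key for a message that has no natural key."""
    return uuid.uuid4().bytes


def spread(num_messages: int, num_partitions: int) -> list:
    """Count how many randomly keyed messages land on each partition."""
    counts = [0] * num_partitions
    for _ in range(num_messages):
        counts[hash_partition(random_key(), num_partitions)] += 1
    return counts
```

Calling `spread(4000, 4)` should give counts close to 1000 per partition. The cost of this approach is that every message carries a key with no meaning, which is the objection raised elsewhere in the thread.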

Re: Producer not distributing across all partitions

2013-09-13 Thread prashant amar
Thanks, Neha. I will try applying this property and circle back. Also, I have been attempting to execute kafka-producer-perf-test.sh and I receive the following error: "Error: Could not find or load main class kafka.perf.ProducerPerformance". I am running against 0.8.0-beta1. Seems like perf i

Re: Producer not distributing across all partitions

2013-09-13 Thread chetan conikee
I am using Kafka 0.8.0-beta1. It seems that messages are being delivered to only one partition (since installation). Should I upgrade or apply a patch to mitigate this issue? Please advise. On Thu, Sep 12, 2013 at 8:44 PM, Jun Rao wrote: > Which revision of 0.8 are you using? In a recent change

Re: Producer not distributing across all partitions

2013-09-13 Thread Neha Narkhede
As Jun suggested, one reason could be that the topic.metadata.refresh.interval.ms is too high. Did you observe whether the distribution improves after topic.metadata.refresh.interval.ms has passed? Thanks, Neha On Fri, Sep 13, 2013 at 4:47 AM, prashant amar wrote: > I am using kafka 08 version ...

Re: Producer not distributing across all partitions

2013-09-13 Thread Jun Rao
Which revision of 0.8 are you using? In a recent change, a producer will stick to a partition for topic.metadata.refresh.interval.ms (defaults to 10 mins) time before picking another partition at random. Thanks, Jun On Thu, Sep 12, 2013 at 1:56 PM, prashant amar wrote: > I created a topic with
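The sticky behavior described in this message can be modeled in a few lines. This is a toy simulation under stated assumptions, not Kafka's producer: the class `StickyProducerModel`, its methods, and the `seed` parameter are hypothetical, and time is passed in explicitly instead of being read from a clock.

```python
import random


class StickyProducerModel:
    """Toy model: keep one random partition per refresh interval."""

    def __init__(self, num_partitions, refresh_interval_ms, seed=None):
        self.num_partitions = num_partitions
        self.refresh_interval_ms = refresh_interval_ms
        self._rng = random.Random(seed)
        self._current = None       # partition chosen for the current interval
        self._chosen_at_ms = None  # time at which that partition was chosen

    def partition_for(self, now_ms):
        """Return the partition a keyless message sent at now_ms would go to."""
        expired = (
            self._chosen_at_ms is None
            or now_ms - self._chosen_at_ms >= self.refresh_interval_ms
        )
        if expired:
            # A new interval has started: pick another partition at random.
            self._current = self._rng.randrange(self.num_partitions)
            self._chosen_at_ms = now_ms
        return self._current


def partitions_used(producer, send_times_ms):
    """Collect the set of partitions hit by messages sent at the given times."""
    return {producer.partition_for(t) for t in send_times_ms}
```

With a 10-minute interval, every message sent within one interval lands on the same partition, so a short observation window shows only one partition receiving data. Shortening the interval in the model spreads messages across more partitions, which is consistent with the suggestion elsewhere in the thread to check whether topic.metadata.refresh.interval.ms is set too high.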

Re: Producer not distributing across all partitions

2013-09-13 Thread Neha Narkhede
Are you using Kafka 07 or 08? On Thu, Sep 12, 2013 at 1:56 PM, prashant amar wrote: > I created a topic with 4 partitions and for some reason the producer is > pushing only to one partition. > > This is consistently happening across all topics that I created ... > > Is there a specific config

Producer not distributing across all partitions

2013-09-12 Thread prashant amar
I created a topic with 4 partitions and for some reason the producer is pushing only to one partition. This is consistently happening across all topics that I created ... Is there a specific configuration that I need to apply to ensure that load is evenly distributed across all partitions? Grou