I meant to say that the messages were appended to two different partitions: say, out of 10 messages produced, one partition received 5 and the other received the other 5. No messages were duplicated across partitions.

Swapnil
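A rough sketch (not from the thread itself) of the kind of local test described below, using the 0.8 Java producer API: messages are sent with a null key and a low topic.metadata.refresh.interval.ms so the randomly chosen partition can change between refreshes. The broker address, topic name, class name, and sleep interval are illustrative placeholders.

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class NullKeyPartitionTest {
    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put("metadata.broker.list", "localhost:9092");             // placeholder broker
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        // Refresh topic metadata every second so the producer can pick a new
        // random partition between refreshes instead of sticking to one for
        // the default 10 minutes.
        props.put("topic.metadata.refresh.interval.ms", "1000");

        Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));

        for (int i = 0; i < 10; i++) {
            // Null key: the DefaultEventHandler picks a random available
            // partition and never calls the partitioner.
            producer.send(new KeyedMessage<String, String>("perfpayload1", null, "message-" + i));
            Thread.sleep(1500);  // spread the sends across several refresh intervals
        }
        producer.close();
    }
}

With the default 10-minute refresh interval, all of these sends would normally land in a single partition, which matches the single-partition behavior reported further down the thread.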
On 9/14/13 11:03 PM, "chetan conikee" <coni...@gmail.com> wrote:

> Swapnil,
>
> What do you mean by "I did a local test today that showed that choosing
> DefaultPartitioner with null key in the messages appended data to multiple
> partitions"?
>
> Are messages being duplicated across partitions?
>
> -Chetan
>
> On Sat, Sep 14, 2013 at 9:02 PM, Swapnil Ghike <sgh...@linkedin.com> wrote:
>
>> Hi Joe, Drew,
>>
>> In 0.8 HEAD, if the key is null, the DefaultEventHandler randomly chooses
>> an available partition and never calls the partitioner.partition(key,
>> numPartitions) method. This is done in lines 204 to 212 of the GitHub
>> commit Drew pointed to, though that piece of code is slightly different
>> now because of KAFKA-1017 and KAFKA-959.
>>
>> I did a local test today that showed that choosing DefaultPartitioner
>> with a null key in the messages appended data to multiple partitions. For
>> this test, I set topic.metadata.refresh.interval.ms to 1 second, because
>> 0.8 HEAD sticks to a partition for a given
>> topic.metadata.refresh.interval.ms (as is being discussed in the other
>> e-mail thread on dev@kafka).
>>
>> Please let me know if you see different results.
>>
>> Thanks,
>> Swapnil
>>
>> On 9/13/13 1:48 PM, "Joe Stein" <crypt...@gmail.com> wrote:
>>
>>> Isn't this a bug?
>>>
>>> I don't see why we would want users to have to write code that generates
>>> random partition keys to randomly distribute the data across partitions;
>>> that is Kafka's job, isn't it?
>>>
>>> Or, if supplying a null value is not supported, tell the user (throw an
>>> exception) in KeyedMessage as we do for topic, and don't treat null as a
>>> key to hash.
>>>
>>> My preference is to put those three lines back in and let the key be
>>> null to give folks randomness, unless it's not a bug and there is a good
>>> reason for it.
>>>
>>> Is there something about https://issues.apache.org/jira/browse/KAFKA-691
>>> that requires those lines to be taken out? I haven't had a chance to
>>> look through it yet.
>>>
>>> My thought is that a new person coming in would expect to see the
>>> partitions filling up in a round-robin fashion, as our docs say, unless
>>> we force them in the API to know they have to do this, or give them the
>>> ability for this to happen when passing nothing in.
>>>
>>> /*******************************************
>>>  Joe Stein
>>>  Founder, Principal Consultant
>>>  Big Data Open Source Security LLC
>>>  http://www.stealth.ly
>>>  Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
>>> ********************************************/
>>>
>>> On Fri, Sep 13, 2013 at 4:17 PM, Drew Goya <d...@gradientx.com> wrote:
>>>
>>>> I ran into this problem as well, Prashant. The default partitioning
>>>> behavior was recently changed:
>>>>
>>>> https://github.com/apache/kafka/commit/b71e6dc352770f22daec0c9a3682138666f032be
>>>>
>>>> It no longer assigns a random partition to data with a null partition
>>>> key. I had to change my code to generate random partition keys to get
>>>> the randomly distributed behavior the producer used to have.
>>>>
>>>> On Fri, Sep 13, 2013 at 11:42 AM, prashant amar <amasin...@gmail.com>
>>>> wrote:
>>>>
>>>>> Thanks Neha
>>>>>
>>>>> I will try applying this property and circle back.
>>>>> Also, I have been attempting to execute kafka-producer-perf-test.sh
>>>>> and I receive the following error:
>>>>>
>>>>>     Error: Could not find or load main class kafka.perf.ProducerPerformance
>>>>>
>>>>> I am running against 0.8.0-beta1.
>>>>>
>>>>> It seems like perf is a separate project in the workspace. Does sbt
>>>>> package-assembly bundle the perf jar as well?
>>>>>
>>>>> Neither producer-perf-test nor consumer-test is working with this build.
>>>>>
>>>>> On Fri, Sep 13, 2013 at 9:56 AM, Neha Narkhede <neha.narkh...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> As Jun suggested, one reason could be that
>>>>>> topic.metadata.refresh.interval.ms is too high. Did you observe
>>>>>> whether the distribution improves after
>>>>>> topic.metadata.refresh.interval.ms has passed?
>>>>>>
>>>>>> Thanks
>>>>>> Neha
>>>>>>
>>>>>> On Fri, Sep 13, 2013 at 4:47 AM, prashant amar <amasin...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I am using the Kafka 0.8 version ...
>>>>>>>
>>>>>>> On Thu, Sep 12, 2013 at 8:44 PM, Jun Rao <jun...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Which revision of 0.8 are you using? In a recent change, a producer
>>>>>>>> will stick to a partition for topic.metadata.refresh.interval.ms
>>>>>>>> (defaults to 10 mins) before picking another partition at random.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Jun
>>>>>>>>
>>>>>>>> On Thu, Sep 12, 2013 at 1:56 PM, prashant amar <amasin...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I created a topic with 4 partitions and for some reason the
>>>>>>>>> producer is pushing only to one partition.
>>>>>>>>>
>>>>>>>>> This is consistently happening across all topics that I created ...
>>>>>>>>>
>>>>>>>>> Is there a specific configuration that I need to apply to ensure
>>>>>>>>> that load is evenly distributed across all partitions?
>>>>>>>>>
>>>>>>>>> Group       Topic         Pid  Offset  logSize  Lag  Owner
>>>>>>>>> perfgroup1  perfpayload1  0    10965   11220    255  perfgroup1_XXXX-0
>>>>>>>>> perfgroup1  perfpayload1  1    0       0        0    perfgroup1_XXXX-1
>>>>>>>>> perfgroup1  perfpayload1  2    0       0        0    perfgroup1_XXXXX-2
>>>>>>>>> perfgroup1  perfpayload1  3    0       0        0    perfgroup1_XXXXX-3
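For reference, a sketch of the workaround Drew describes above: supplying a random key per message so that the DefaultPartitioner's hash spreads the data across partitions. The broker address, topic name, and class name are placeholders, not values taken from the thread.

import java.util.Properties;
import java.util.Random;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class RandomKeyProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "localhost:9092");             // placeholder broker
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        // DefaultPartitioner hashes the key modulo the partition count.
        props.put("partitioner.class", "kafka.producer.DefaultPartitioner");

        Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));
        Random random = new Random();

        for (int i = 0; i < 10; i++) {
            // A random key per message, so the hash (and therefore the target
            // partition) varies from message to message.
            String key = Integer.toString(random.nextInt(Integer.MAX_VALUE));
            producer.send(new KeyedMessage<String, String>("perfpayload1", key, "message-" + i));
        }
        producer.close();
    }
}

The trade-off is that random keys give up any key-based semantics; anything that relies on messages with the same key landing in the same partition loses that guarantee.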