Guozhang, can you point me to the code that implements the "periodic/sticky" random partitioner? I'd actually like to try it out in our env, even though I assume it is NOT ported to the 0.8.2 Java producer.
Thanks,
Steven

On Mon, Dec 8, 2014 at 1:43 PM, Guozhang Wang <wangg...@gmail.com> wrote:

> Hi Yury,
>
> Originally the producer behavior under a null key was "random" random, but
> it was later changed to this "periodic" random to reduce the number of
> sockets on the server side: imagine you have n brokers and m producers
> where m >>> n; with random random distribution each server will need to
> maintain a socket with each of the m producers.
>
> We realized that this change IS misleading and we have changed back to
> random random in the new producer released in 0.8.2.
>
> Guozhang
>
> On Fri, Dec 5, 2014 at 10:43 AM, Andrew Jorgensen <
> ajorgen...@twitter.com.invalid> wrote:
>
> > If you look under Producer configs you see the following key
> > ‘topic.metadata.refresh.interval.ms’ with a default of 600 * 1000 (10
> > minutes). It is not entirely clear, but this controls how often a producer
> > with a null key partitioner will switch the partition that it is writing
> > to. In my production app I set this down to 1 minute and haven’t seen any
> > ill effects, but it is good to note that a shorter interval *could* cause
> > some issues and extra overhead. I agree this could probably be a little
> > more clear in the documentation.
> >
> > -
> > Andrew Jorgensen
> > @ajorgensen
> >
> > On December 5, 2014 at 1:34:00 PM, Yury Ruchin (yuri.ruc...@gmail.com)
> > wrote:
> >
> > Hello,
> >
> > I've come across a (seemingly) strange situation where my Kafka producer
> > gave a very uneven distribution across partitions. I found that I used a
> > null key to produce messages, guided by the following clause in the
> > documentation: "If the key is null, then a random broker partition is
> > picked." However, after looking at the code, I found that the broker
> > partition is not truly random for every message. Instead, the randomly
> > picked partition number sticks and only refreshes after
> > topic.metadata.refresh.interval.ms expires, which is 10 minutes by
> > default. So, with a null key the producer keeps writing to the same
> > partition for 10 minutes.
> >
> > Is my understanding of partitioning with a null key correct? If yes,
> > shouldn't the documentation be fixed to explicitly describe the sticky
> > pseudo-random partition assignment?
> >
> > Thanks,
> > Yury
>
> --
> -- Guozhang
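
For anyone following the thread, here is a rough sketch of the "periodic/sticky" random behavior described above: a partition is picked at random, cached, and only re-picked once a refresh interval has elapsed. This is NOT the Kafka source. The class name StickyRandomPartitioner, its constructor arguments, and the injected clock are all hypothetical, chosen only to illustrate the idea; the 600 * 1000 ms value mirrors the default of topic.metadata.refresh.interval.ms mentioned above.

```java
import java.util.Random;
import java.util.function.LongSupplier;

/**
 * Illustrative sketch only; a hypothetical class, not taken from Kafka.
 * Picks a random partition and keeps returning it until the refresh
 * interval has elapsed, then picks again.
 */
class StickyRandomPartitioner {
    private final long refreshIntervalMs;
    private final Random random;
    private final LongSupplier clock;   // injected so the behavior can be tested
    private int cachedPartition = -1;   // -1 means "nothing picked yet"
    private long lastPickMs;

    StickyRandomPartitioner(long refreshIntervalMs, Random random, LongSupplier clock) {
        this.refreshIntervalMs = refreshIntervalMs;
        this.random = random;
        this.clock = clock;
    }

    /** Returns the partition to use for a null-key message. */
    synchronized int partition(int numPartitions) {
        long now = clock.getAsLong();
        boolean expired = now - lastPickMs >= refreshIntervalMs;
        boolean invalid = cachedPartition < 0 || cachedPartition >= numPartitions;
        if (invalid || expired) {
            // Re-pick only on first use, after expiry, or if the cached
            // partition no longer exists.
            cachedPartition = random.nextInt(numPartitions);
            lastPickMs = now;
        }
        return cachedPartition;
    }

    public static void main(String[] args) {
        // Assumed values: 8 partitions, 10-minute refresh interval.
        StickyRandomPartitioner p = new StickyRandomPartitioner(
                600 * 1000L, new Random(), System::currentTimeMillis);
        for (int i = 0; i < 5; i++) {
            // All five sends land on the same partition within the interval.
            System.out.println("message " + i + " -> partition " + p.partition(8));
        }
    }
}
```

With m producers each holding one sticky partition, a broker only needs connections from the producers currently pointed at it, which is the socket saving Guozhang describes; the cost is the uneven distribution Yury observed whenever the interval is long relative to the measurement window.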