If you look under the Producer configs you will see the following key,
‘topic.metadata.refresh.interval.ms’, with a default of 600 * 1000 (10 minutes).
It is not entirely clear, but this controls how often a producer with a null key
partitioner will switch the partition that it is writing to. In my production app
I set this down to 1 minute and haven’t seen any ill effects, but it is good to
note that a shorter interval *could* cause some issues and extra overhead. I
agree this could probably be a little clearer in the documentation.
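
For reference, here is a rough sketch of how that setting might be lowered to 1 minute. It only builds a plain java.util.Properties object, so it runs without Kafka on the classpath. The class name, helper name, and broker address are placeholders I made up for illustration; the property keys are the ones listed under the old producer configs.

```java
import java.util.Properties;

public class ProducerConfigSketch {

    // Hypothetical helper: collects producer settings in a Properties object.
    // In a real app these properties would be handed to the producer's config.
    public static Properties buildProducerProps(String brokerList, long refreshIntervalMs) {
        Properties props = new Properties();
        props.put("metadata.broker.list", brokerList);
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        // Default is 600 * 1000 (10 minutes); a lower value makes a null-key
        // producer switch partitions more often, at the cost of more metadata requests.
        props.put("topic.metadata.refresh.interval.ms", Long.toString(refreshIntervalMs));
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder broker address.
        Properties props = buildProducerProps("localhost:9092", 60 * 1000L);
        System.out.println(props.getProperty("topic.metadata.refresh.interval.ms")); // prints 60000
    }
}
```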
- 
Andrew Jorgensen
@ajorgensen

On December 5, 2014 at 1:34:00 PM, Yury Ruchin (yuri.ruc...@gmail.com) wrote:

Hello,  

I've come across a (seemingly) strange situation where my Kafka producer
gave a very uneven distribution across partitions. I found that I had used a null key
to produce messages, guided by the following clause in the documentation:
"If the key is null, then a random broker partition is picked." However,  
after looking at the code, I found that the broker partition is not truly  
random for every message - instead, the randomly picked partition number  
sticks and only refreshes after topic.metadata.refresh.interval.ms expires,
which is 10 minutes by default. So, with null key the producer keeps  
writing to the same partition for 10 minutes.  
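
To make the behaviour described above concrete, here is a simplified model of it. This is not Kafka's actual code: the class, its fields, and the explicit clock argument are all hypothetical, and the only thing it illustrates is "pick a random partition, then stick to it until the refresh interval has passed".

```java
import java.util.Arrays;
import java.util.Random;

public class StickyPartitionSketch {
    private final int numPartitions;
    private final long refreshIntervalMs;
    private final Random random;
    private int currentPartition = -1; // -1 means no partition picked yet
    private long lastPickMs;

    public StickyPartitionSketch(int numPartitions, long refreshIntervalMs, long seed) {
        this.numPartitions = numPartitions;
        this.refreshIntervalMs = refreshIntervalMs;
        this.random = new Random(seed);
    }

    // Returns the partition for a null-key message sent at time nowMs.
    // A new random partition is chosen only once the interval has elapsed.
    public int partitionFor(long nowMs) {
        if (currentPartition < 0 || nowMs - lastPickMs >= refreshIntervalMs) {
            currentPartition = random.nextInt(numPartitions);
            lastPickMs = nowMs;
        }
        return currentPartition;
    }

    public static void main(String[] args) {
        // 8 partitions, 10-minute interval, one message per second for just under 10 minutes.
        StickyPartitionSketch partitioner = new StickyPartitionSketch(8, 600 * 1000L, 42L);
        int[] counts = new int[8];
        for (long t = 0; t < 600 * 1000L; t += 1000) {
            counts[partitioner.partitionFor(t)]++;
        }
        // All 600 messages land in a single partition.
        System.out.println(Arrays.toString(counts));
    }
}
```

With a shorter interval the same loop spreads messages over more partitions, which matches the uneven distribution seen with the default.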

Is my understanding of partitioning with null key correct? If yes,  
shouldn't the documentation be fixed then to explicitly describe the sticky  
pseudo-random partition assignment?  

Thanks,  
Yury  
