Hey guys! We recently deployed our kafka data pipeline application over the 
weekend and it is working out quite well once we ironed out all the issues. 
There is one behavior that we've noticed that is mildly troubling, though not a 
deal breaker. We're using a single topic with many partitions (1200 total) to 
load balance our 300 consumers, but what seems to happen is that some 
partitions end up more backed up than others. This is probably due more to the 
specifics of the application since some messages take much longer than others 
to process. 

I'm thinking that the random partitioning in the producer is unsuited to our 
specific needs. One option I was considering was to write an alternate 
partitioner that looks at the consumer offsets from zookeeper (as in the 
ConsumerOffsetChecker) and probabilistically weights the partitions by their 
lag. Does this sound like a good idea to anyone else? Is there a better or 
preferably already built solution? If anyone has any ideas or feedback I'd 
sincerely appreciate it.

Thanks so much in advance.

P.S. thanks especially to everyone who's answered my dumb questions on this 
mailing list over the past few months, we couldn't have done it without you! 

-- 
Ian Friedman

Reply via email to