What if you don't know ahead of time how long a message will take to consume?
-- 
Ian Friedman

On Sunday, August 25, 2013 at 10:45 AM, Neha Narkhede wrote:

> Making producer side partitioning depend on consumer behavior might not be
> such a good idea. If consumption is a bottleneck, changing producer side
> partitioning may not help. To relieve consumption bottleneck, you may need
> to increase the number of partitions for those topics and increase the
> number of consumer instances.
>
> You mentioned that the consumers take longer to process certain kinds of
> messages. What you can do is place the messages that require slower
> processing in separate topics, so that you can scale the number of
> partitions and number of consumer instances, for those messages
> independently.
>
> Thanks,
> Neha
>
>
> On Sat, Aug 24, 2013 at 9:57 AM, Ian Friedman <i...@flurry.com
> (mailto:i...@flurry.com)> wrote:
>
> > Hey guys! We recently deployed our kafka data pipeline application over
> > the weekend and it is working out quite well once we ironed out all the
> > issues. There is one behavior that we've noticed that is mildly troubling,
> > though not a deal breaker. We're using a single topic with many partitions
> > (1200 total) to load balance our 300 consumers, but what seems to happen is
> > that some partitions end up more backed up than others. This is probably
> > due more to the specifics of the application since some messages take much
> > longer than others to process.
> >
> > I'm thinking that the random partitioning in the producer is unsuited to
> > our specific needs. One option I was considering was to write an alternate
> > partitioner that looks at the consumer offsets from zookeeper (as in the
> > ConsumerOffsetChecker) and probabilistically weights the partitions by
> > their lag. Does this sound like a good idea to anyone else? Is there a
> > better or preferably already built solution? If anyone has any ideas or
> > feedback I'd sincerely appreciate it.
> >
> > Thanks so much in advance.
> >
> > P.S. thanks especially to everyone who's answered my dumb questions on
> > this mailing list over the past few months, we couldn't have done it
> > without you!
> >
> > --
> > Ian Friedman
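
For anyone reading this thread later, here is a rough sketch of the "probabilistically weight the partitions by their lag" idea from the original message. It is not a Kafka partitioner and uses no Kafka API. The `lags` dict (partition id to number of unconsumed messages) is assumed to come from somewhere else, such as the offsets the ConsumerOffsetChecker reads. The function names and the 1 / (1 + lag) weighting are illustrative assumptions, not something anyone in the thread proposed.

```python
import random


def lag_weights(lags):
    """Return {partition: weight}; a larger lag gives a smaller weight.

    `lags` is a hypothetical mapping of partition id -> consumer lag
    (messages produced but not yet consumed). The 1 / (1 + lag) formula
    is an assumed, illustrative choice; any decreasing function would do.
    """
    return {p: 1.0 / (1.0 + max(lag, 0)) for p, lag in lags.items()}


def choose_partition(lags, rng=random):
    """Pick one partition at random, favouring partitions with less lag."""
    weights = lag_weights(lags)
    partitions = list(weights)
    return rng.choices(
        partitions, weights=[weights[p] for p in partitions], k=1
    )[0]


if __name__ == "__main__":
    # Partition 2 is badly backed up, so it should rarely be chosen.
    example_lags = {0: 0, 1: 9, 2: 99}
    print(choose_partition(example_lags))
```

One property of this approach is that it only looks at observed lag, so it does not need to know in advance how long a given message will take to consume. Neha's caveat still applies: if consumption as a whole is the bottleneck, redistributing messages across partitions will not fix it.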