Hi Jeff,

It depends on the size of the messages as well as settings like `fetch.max.bytes` and `max.partition.fetch.bytes`. Assuming that the fetch returned all messages for all 4 topic partitions, the consumer would return 500 messages from one partition, then 500 from another partition and so on.

Ismael

On Wed, Dec 14, 2016 at 2:04 PM, Jeff Widman <j...@netskope.com> wrote:

> Scenario:
> Four topics, each with one partition having 1000 messages. A single
> consumer group subscribed to all four topics. Only two consumer processes
> within the consumer group.
>
> Using the default strategy, each individual consumer will be subscribed to
> two topic_partitions.
>
> Will a single call to poll() return 250 messages from each topic/partition?
> Or will the first call return 500 messages from the first topic_partition,
> and the next call return 500 messages from the second topic_partition? Or
> is it totally random?
>
> Our underlying problem is that our consumers do some processing, then
> insert the results into different databases depending on the topic name.
> For efficiency, we're trying to do bulk writes to the destination DB, and
> unsure if this is already built-in to the kafka consumer, or if we need to
> add an additional queueing step.
>
> Cheers,
> Jeff
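
Since the mix of partitions in any one poll() result depends on message sizes and fetch settings, one option for the bulk-write case in the quoted question is a small per-topic buffer on the application side. The following is only a rough sketch, not part of the Kafka client: `TopicBatcher`, `write_bulk`, and `batch_size` are hypothetical names, and records are modelled as plain `(topic, value)` pairs rather than real consumer records. The figure of 500 in the reply presumably reflects `max.poll.records`, whose default is 500 in client versions from 0.10.1 onward.

```python
from collections import defaultdict


class TopicBatcher:
    """Buffer records per topic and flush a topic once its batch is full.

    Hypothetical helper for illustration; not a Kafka API.
    """

    def __init__(self, batch_size, write_bulk):
        self.batch_size = batch_size
        # Assumed callback with signature write_bulk(topic, list_of_values),
        # e.g. a function doing one bulk insert into that topic's database.
        self.write_bulk = write_bulk
        self.buffers = defaultdict(list)

    def add(self, records):
        """Accept one poll's worth of (topic, value) pairs, in any order."""
        for topic, value in records:
            buf = self.buffers[topic]
            buf.append(value)
            if len(buf) >= self.batch_size:
                self.write_bulk(topic, buf)
                # Start a fresh list so the flushed batch is not mutated.
                self.buffers[topic] = []

    def flush_all(self):
        """Write out whatever is still buffered, e.g. on shutdown or a timer."""
        for topic, buf in self.buffers.items():
            if buf:
                self.write_bulk(topic, buf)
        self.buffers.clear()
```

With this shape it does not matter whether a poll returns 500 records from one partition or a mix from several: each topic's batch fills at its own pace. In a real consumer, offsets would normally be committed only after the corresponding flush succeeds, so that buffered records are not lost on a crash; that part is left out here.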