Scenario: four topics, each with a single partition holding 1000 messages. A single consumer group is subscribed to all four topics, and the group contains only two consumer processes.
Using the default assignment strategy, we expect each consumer to be assigned two topic_partitions. Will a single call to poll() return 250 messages from each topic_partition? Or will the first call return 500 messages from the first topic_partition, and the next call return 500 messages from the second? Or is the split effectively random?

Our underlying problem is that our consumers do some processing and then insert the results into different databases depending on the topic name. For efficiency, we are trying to do bulk writes to the destination DB, and we are unsure whether this kind of per-topic batching is already built into the Kafka consumer or whether we need to add an additional queueing step.

Cheers,
Jeff
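For concreteness, the "additional queueing step" mentioned above might look roughly like the sketch below. It is only an illustration under stated assumptions: it assumes poll() hands back a mapping from topic_partition to a list of records (the shape kafka-python's KafkaConsumer.poll() returns), and the names TopicBatcher, flush_size and write_bulk are hypothetical, not part of any Kafka client. TopicPartition here is a local stand-in so the example runs without a broker.

```python
from collections import defaultdict, namedtuple

# Local stand-in for the client's topic/partition key type (assumption:
# the real key exposes .topic and .partition in the same way).
TopicPartition = namedtuple("TopicPartition", ["topic", "partition"])


class TopicBatcher:
    """Buffers records per topic and bulk-writes once a threshold is hit.

    Hypothetical helper: write_bulk(topic, records) is whatever function
    performs the bulk insert into the database for that topic.
    """

    def __init__(self, flush_size, write_bulk):
        self.flush_size = flush_size
        self.write_bulk = write_bulk
        self.buffers = defaultdict(list)

    def add_poll_result(self, batch):
        # batch: {TopicPartition: [record, ...]}, i.e. one poll() result.
        for tp, records in batch.items():
            self.buffers[tp.topic].extend(records)
            if len(self.buffers[tp.topic]) >= self.flush_size:
                self.flush(tp.topic)

    def flush(self, topic):
        # Write everything buffered for this topic in one bulk call.
        records = self.buffers.pop(topic, [])
        if records:
            self.write_bulk(topic, records)

    def flush_all(self):
        # Drain every topic, e.g. on shutdown or on a timer.
        for topic in list(self.buffers):
            self.flush(topic)
```

In a consumer loop this would be fed with each poll() result, for example batcher.add_poll_result(consumer.poll(timeout_ms=1000)). A flush writes the whole buffer for that topic, so a bulk write can be larger than flush_size. With a scheme like this, offsets would presumably have to be committed only after the corresponding flush succeeds, otherwise buffered records could be lost on a crash.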