Hello,

If a KafkaConsumer is subscribed to more than one topic, or is assigned more than one partition of the same topic, how does KafkaConsumer.poll() behave?
In our use case, we would like to use, for example, the "user id" as the record key for our topics. Naturally, we receive far more records for some users than for others, so different partitions of the same topic will see different record rates, some substantially higher than others.

So my questions are:

* Will KafkaConsumer.poll() return the same number of records for each topic+partition combination? For example, if max records (max.poll.records) is set to 500 and the consumer is assigned 5 partitions from 5 topics (1 partition per topic), will poll() return 100 records for each topic+partition?

* What happens if partitions differ in the rate and size of incoming records? I suspect that if the Kafka brokers return the same number of records for each partition assigned to the consumer instance, partitions with a high incoming rate may start to fall behind. Or do the brokers take the lag of each partition into account when returning records for poll()?

* What happens if the partitions assigned to a consumer have different brokers as leaders? How does poll() behave then? For example, if a consumer has 3 partitions assigned, spread across 3 different brokers, and max records for poll() is set to 300, will the consumer request at most 100 records from each broker?

Thanks,
M
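
P.S. To make the skew concern concrete, here is a minimal, self-contained Java sketch of how keying by user id can leave partitions with very different record counts. Everything in it is an assumption for illustration: the user ids, the per-user record counts, the partition count, and the partitionFor helper, which uses String.hashCode() as a stand-in rather than the hash that Kafka's own partitioner applies to the serialized key. It does not model poll() at all; it only shows the kind of uneven input that poll() would have to cope with.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class PartitionSkewSketch {

    // Stand-in for key-based partitioning: the same key always maps to the
    // same partition. ASSUMPTION: String.hashCode() is used only for
    // illustration; Kafka's partitioner uses a different hash function.
    public static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Totals the records each partition would receive, given how many
    // records each user (key) produces.
    public static Map<Integer, Integer> recordsPerPartition(
            Map<String, Integer> recordsPerUser, int numPartitions) {
        Map<Integer, Integer> totals = new TreeMap<>();
        for (int p = 0; p < numPartitions; p++) {
            totals.put(p, 0);
        }
        for (Map.Entry<String, Integer> e : recordsPerUser.entrySet()) {
            int partition = partitionFor(e.getKey(), numPartitions);
            totals.merge(partition, e.getValue(), Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Hypothetical workload: one very active user, several quiet ones.
        Map<String, Integer> recordsPerUser = new LinkedHashMap<>();
        recordsPerUser.put("user-1", 10_000);
        recordsPerUser.put("user-2", 50);
        recordsPerUser.put("user-3", 20);
        recordsPerUser.put("user-4", 5);

        // Prints a map of partition -> total records; whichever partition
        // receives user-1 dominates the others.
        System.out.println(recordsPerPartition(recordsPerUser, 3));
    }
}
```

Whichever partition the heavy user hashes to ends up with almost all of the records, which is the situation the second question above is about.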