Hi everyone,

I have a use-case where I need to use many topics (about 500). Each topic has a different number of partitions, some as low as 10 and some as many as 20K. The reason for this requirement is data sensitivity: we cannot put data from different sources in the same topic-partition.
My Kafka cluster is version 0.8.2.1, and I am using the high-level consumer with a TopicFilter to target multiple topics, like this:

Consumer-Group-A = (consumers a1, a2, a3) => topicFilter = "A1|A2|A3"
Consumer-Group-B = (consumers b1, b2, b3) => topicFilter = "B1|...|B10"
...
Consumer-Group-Z = (consumers z1, z2, z3) => topicFilter = "Z1|...|Z50"

Two observations:

*Issue1*: If topic A1 was created after I started the consumers in Consumer-Group-A, it is not picked up; I do not receive messages from A1 until I restart the consumers in that group.

*Issue2*: This arrangement works fine for topics with fewer than 5K partitions, but for all my topics with 5K+ partitions the consumer is unstable: sometimes it cannot connect to the cluster, communication with ZooKeeper takes a long time, and I see many instances of ArrayIndexOutOfBoundsException, as in KAFKA-2174.

Has anyone with a similar design run into these issues?

Thank you,
Neil
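For context, each group's consumers use a configuration along these lines. This is a sketch with placeholder values (host names and timeouts are illustrative, not my real settings); the ZooKeeper and rebalance settings are the standard 0.8.x high-level consumer knobs I would expect to matter at 5K+ partitions:

```properties
# 0.8.x high-level consumer properties (placeholder values)
group.id=Consumer-Group-A
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181

# Knobs that may matter with very large partition counts:
# longer ZK session/connect timeouts and more rebalance headroom
zookeeper.session.timeout.ms=30000
zookeeper.connection.timeout.ms=30000
rebalance.max.retries=10
rebalance.backoff.ms=5000
```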