Hi Josh,

1. I don't know for sure (I haven't seen the code that does it), but it's probably the most "even" split possible for the given number of brokers and partitions. So for 8 partitions and 3 brokers it would be [3, 3, 2].
2. See "num.partitions" in the broker config. BTW, only a producer can create a topic dynamically, not a consumer.
3. See 2. The value has to be non-zero, so it's always specified.
4. Based on the ProducerRecord (message) key: a hash of the key picks the partition, and records without a key are spread round-robin. See: https://kafka.apache.org/0110/javadoc/index.html?org/apache/kafka/clients/producer/KafkaProducer.html
5. Yes - you need to create multiple consumers with the same group.id.
6. Yes, there'll be at most one consumer (within a consumer group) handling a given partition at a given time.
7. Yes, it's a process called "rebalancing" - it reassigns partitions to consumers when the number of consumers changes.
8. Your consumer will commit the last processed offset to a special Kafka topic (or Zookeeper, but that's not the default) every so often (periodically or "on demand", when you tell it to), so for each partition and consumer group you know what was and wasn't processed yet. The new consumer will pick up from the place where the dead one left off.
9. If I understand your question correctly - no, Kafka is pull-based and not push-based by design.
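
To make points 6-8 a bit more concrete, here is a toy model in Python. It is only a sketch and not Kafka's actual assignment code: the function name assign, the consumer names c1/c2/c3 and the offset value 42 are all made up for illustration. It uses the same "most even split" arithmetic as the [3, 3, 2] example in point 1, applied to consumers in a group, and keeps committed offsets per partition so they outlive the consumer that wrote them.

```python
def assign(num_partitions, consumers):
    """Spread partitions 0..num_partitions-1 over consumers as evenly as possible.

    Every partition goes to exactly one consumer, so two members of the
    same group never handle the same partition at the same time.
    """
    members = sorted(consumers)
    if not members:
        return {}
    base, extra = divmod(num_partitions, len(members))
    assignment = {}
    start = 0
    for i, member in enumerate(members):
        # The first `extra` members take one partition more than the rest.
        count = base + (1 if i < extra else 0)
        assignment[member] = list(range(start, start + count))
        start += count
    return assignment


# Committed offsets are tracked per partition for the whole group,
# not per consumer, so they survive a consumer exiting.
committed = {partition: 0 for partition in range(8)}

before = assign(8, ["c1", "c2", "c3"])
committed[7] = 42  # c3 commits offset 42 for partition 7, then exits

# "Rebalance": rerun the assignment with the remaining consumers.
after = assign(8, ["c1", "c2"])
new_owner = next(m for m, parts in after.items() if 7 in parts)
resume_from = committed[7]  # the new owner continues from here
```

With 8 partitions and three consumers the group sizes come out as [3, 3, 2]. After c3 leaves, partition 7 moves to c2, which resumes from the committed offset 42 rather than starting over.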

Kind regards,
Michał

On 5 October 2017 at 09:37, Josh Maidana <joshmaid...@gmail.com> wrote:

> Hello
>
> I am quite new to KAFKA and come from a JMS/messaging background. Reading
> through the documentation, I gather using partitions and consumer groups,
> KAFKA achieves both P2P and pub/sub. I have a few questions on partitions,
> though, I was wondering if someone could kindly please point me in the right
> direction.
>
> 1. In a multi-server scenario, how does KAFKA decide how many partitions of
> a given topic are assigned to a given node?
> 2. When a topic is created dynamically by a consumer or a producer, how is
> the number of partitions specified?
> 3. If it is not or can't be specified, how does KAFKA decide the number of
> partitions to create?
> 4. If a producer doesn't specify a partition, how does KAFKA decide to
> which partition the message is allocated?
> 5. On consumption, do I need to explicitly create multiple consumers to
> attain parallelism?
> 6. If yes, would KAFKA allocate different partitions to different consumers
> who are part of the same consumer group?
> 7. If one of those consumers exits, would KAFKA reallocate the partitions to
> remaining consumers?
> 8. How are the offsets propagated from an exited consumer to the new
> consumer to which the partition is reallocated?
> 9. Is there a listener based API for consumption instead of a blocking
> poll?
>
> Kind regards
> Josh