Hey Josh,

Consumption from a non-existent topic will end up with "LEADER_NOT_AVAILABLE".
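(For context, a bare-bones consumer along these lines is what I mean by "consuming from a non-existent topic" - just a minimal sketch, with the broker address, topic name and group.id as placeholders, assuming a local 0.11 broker:)

import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ConsumeFromMissingTopic {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // placeholder broker address
        props.put("group.id", "missing-topic-test");        // placeholder group id
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // "no-such-topic" has never been created on the broker.
            consumer.subscribe(Collections.singletonList("no-such-topic"));

            // Polling triggers metadata requests for the subscribed topic; while the
            // topic is missing you see LEADER_NOT_AVAILABLE warnings in the logs.
            // With auto.create.topics.enable=true the topic seems to get auto-created
            // as a side effect (see below).
            for (int i = 0; i < 5; i++) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                System.out.printf("poll %d returned %d record(s)%n", i, records.count());
            }
        }
    }
}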
However (!) I just tested it locally (Kafka 0.11) and it seems like consuming from a topic that doesn't exist with auto.create.topics.enable set to true *will create it* as well (I'm checking it in Zookeeper's /brokers/topics path). I'm a bit surprised this works.

The documentation states: "You have the option of either adding topics manually *or having them be created automatically when data is first published to a non-existent topic*."

This (pretty old) email thread confirms that that's intentional:
http://grokbase.com/t/kafka/users/14a2rgj2h2/auto-topic-creation-not-working-for-attempts-to-consume-non-existing-topic
(Jun Rao: "In general, *only writers should trigger auto topic creation, but not the readers*. So, a topic can be auto created by the producer, but not the consumer.")

So I'm not sure now if it's a regression or a change made later that isn't reflected in the docs, but it looks like you *can* currently create topics using a consumer. I wouldn't rely on this "feature" though - to me, personally, it seems wrong and I'm guessing it might be a bug. Please correct me if I'm wrong / missing something :-)

Michał

On 6 October 2017 at 04:37, Josh Maidana <joshmaid...@gmail.com> wrote:
> Michal,
>
> You mentioned topics are only dynamically created with producers. Does that
> mean if a consumer starts on a non-existent topic, it throws an error?
>
> Kind regards
> Meeraj
>
> On Thu, Oct 5, 2017 at 9:20 PM, Josh Maidana <joshmaid...@gmail.com> wrote:
>
> > Thank you, Michal.
> >
> > That answers all my questions, many thanks.
> >
> > Josh
> >
> > On Thu, Oct 5, 2017 at 1:21 PM, Michal Michalski <
> > michal.michal...@zalando.ie> wrote:
> >
> >> Hi Josh,
> >>
> >> 1. I don't know for sure (haven't seen the code that does it), but it's
> >> probably the most "even" split possible for the given number of brokers
> >> and partitions. So for 8 partitions and 3 brokers it would be [3, 3, 2].
> >> 2. See "num.partitions" in the broker config. BTW, only a producer can
> >> create a topic dynamically, not a consumer.
> >> 3. See 2. The value has to be non-zero, so it's always specified.
> >> 4. Based on the ProducerRecord (message) key. See:
> >> https://kafka.apache.org/0110/javadoc/index.html?org/apache/kafka/clients/producer/KafkaProducer.html
> >> 5. Yes - you need to create multiple consumers with the same group.id.
> >> 6. Yes, there'll be at most one consumer (within a consumer group)
> >> handling a given partition at a given time.
> >> 7. Yes, it's a process called "rebalancing" - it reassigns partitions to
> >> consumers when the number of consumers changes.
> >> 8. Your consumer will commit the last processed offset to a special Kafka
> >> topic (or Zookeeper, but that's not the default) every so often
> >> (periodically or "on demand", when you tell it to), so for each partition
> >> and consumer group you know what was and wasn't processed yet. The new
> >> consumer will pick up from where the dead one left off.
> >> 9. If I understand your question correctly - no, Kafka is pull-based and
> >> not push-based by design.
> >>
> >> Kind regards,
> >> Michał
> >>
> >> On 5 October 2017 at 09:37, Josh Maidana <joshmaid...@gmail.com> wrote:
> >>
> >> > Hello
> >> >
> >> > I am quite new to KAFKA and come from a JMS/messaging background.
> >> > Reading through the documentation, I gather using partitions and
> >> > consumer groups, KAFKA achieves both P2P and pub/sub.
> >> > I have a few questions on partitions, though, and I was wondering if
> >> > someone could kindly point me in the right direction.
> >> >
> >> > 1. In a multi-server scenario, how does KAFKA decide how many
> >> > partitions of a given topic are assigned to a given node?
> >> > 2. When a topic is created dynamically by a consumer or a producer,
> >> > how is the number of partitions specified?
> >> > 3. If it is not or can't be specified, how does KAFKA decide the
> >> > number of partitions to create?
> >> > 4. If a producer doesn't specify a partition, how does KAFKA decide
> >> > to which partition the message is allocated?
> >> > 5. On consumption, do I need to explicitly create multiple consumers
> >> > to attain parallelism?
> >> > 6. If yes, would KAFKA allocate different partitions to different
> >> > consumers who are part of the same consumer group?
> >> > 7. If one of those consumers exits, would KAFKA reallocate the
> >> > partitions to the remaining consumers?
> >> > 8. How are the offsets propagated from an exited consumer to the new
> >> > consumer to which the partition is reallocated?
> >> > 9. Is there a listener-based API for consumption instead of a
> >> > blocking poll?
> >> >
> >> > Kind regards
> >> > Josh
> >> >
> >>
> >
>
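P.S. Since you're getting started with the consumer API, here's roughly what points 5-8 from my earlier reply look like in code. It's only a minimal sketch (broker address, topic name and group.id are placeholders) - the point is that partition assignment, rebalancing and offset tracking are all driven by group.id and the commit settings, not by anything you manage yourself. Run several instances of this with the same group.id to consume in parallel:

import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class GroupedConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // placeholder broker address
        // Every instance started with this group.id gets a disjoint subset of the
        // topic's partitions (points 5 & 6); starting or stopping instances triggers
        // a rebalance and the partitions are reassigned (point 7).
        props.put("group.id", "my-consumer-group");         // placeholder group id
        // Offsets are committed periodically to the internal __consumer_offsets
        // topic, so a replacement consumer resumes where the previous one stopped
        // (point 8).
        props.put("enable.auto.commit", "true");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic"));  // placeholder topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
            }
        }
    }
}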