Hi Roger,

I am going to briefly add to what others have already stated.

The recommendations made by Sunil and Luke are based on the fundamentals of
how Kafka stores and organizes events as well as the retrieval mechanism of
consumer groups.

Without more detail about the objectives of your implementation, it is hard
to make recommendations that address all of your concerns. Given the limited
description provided, though, I believe Luke's recommendation should cover
what you need for the time being.

I am assuming that you set a key on every event you publish to the Kafka
topic. This is essential: with the default partitioner and an unchanged
partition count, messages/events with the same key are always routed to the
same partition.
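To make the key-to-partition idea concrete, here is a minimal sketch in
Python. It is a simulation only: the function name partition_for and the
sample keys are hypothetical, and zlib.crc32 is a stand-in hash. Kafka's
default partitioner uses murmur2, so the partition numbers on a real cluster
will differ, but the principle (same key, same partition) is the same.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index deterministically.

    Assumption: zlib.crc32 is used as a stand-in hash. Kafka's default
    partitioner uses murmur2, so real partition numbers will differ.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


NUM_PARTITIONS = 4  # matches the topic described in this thread

# Hypothetical events as (key, value) pairs, in publish order.
events = [
    ("order-17", "created"),
    ("order-42", "created"),
    ("order-17", "paid"),
    ("order-17", "shipped"),
]

# Group events by the partition their key maps to, preserving publish order.
routed = {}
for key, value in events:
    routed.setdefault(partition_for(key, NUM_PARTITIONS), []).append((key, value))

for partition in sorted(routed):
    print(partition, routed[partition])
```

Every "order-17" event lands in one partition, in publish order. Note that
changing NUM_PARTITIONS changes the mapping, which is why the guarantee
assumes a fixed partition count.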

Fundamentally, events stored in a Kafka partition are assigned sequential
offsets within that partition, in the order in which they are received from
the producers. This guarantee is at the partition level and not at the topic
level.

If you have multiple partitions within a topic, the ordering is not
guaranteed at the topic level but only at the partition level. If the topic
has only one partition then you can essentially guarantee ordering at the
topic level globally.

Consequently, if you want to guarantee ordering at the topic level, you
cannot have more than one partition in your topic. If your requirements
allow you to limit the ordering guarantee to a specific event/message key,
then ordering at the partition level should be sufficient regardless of
partition count, and you can have as many partitions as you want. As long as
each message has a key, messages with the same key will always go to the
same partition. Each consumer within the group has its own subset of
partitions, and the same consumer will process all the messages from a given
partition in the exact order received from the producer.

So going back to your scenario description, if you care about specific
sequential ordering at the topic level, then you cannot have more than one
partition (you have 4 at the moment).

If your ordering constraints are at the key level (messages with the same
key), then the number of partitions should not matter, as you should be able
to guarantee that the same consumer will always handle messages from the
same key and partition. Each consumer group has an identifier, and the
partitions are divided as evenly as possible among the members of the
consumer group, up to the number of available partitions. If you have more
consumers than partitions, then the extra consumers will not get any
partition assignments and will be idle.
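The way partitions are spread over a group can be sketched as follows. This
is a simplified round-robin illustration, not one of Kafka's actual
assignors; the function name assign_partitions and the consumer names are
hypothetical.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over consumers as evenly as possible.

    Simplified round-robin sketch; Kafka's real assignors differ in detail.
    Consumers beyond the partition count end up with an empty list (idle).
    """
    ordered = sorted(consumers)
    assignment = {consumer: [] for consumer in ordered}
    if not ordered:
        return assignment
    for index, partition in enumerate(sorted(partitions)):
        assignment[ordered[index % len(ordered)]].append(partition)
    return assignment


# Four partitions, as in the topic described in this thread.
two_consumers = assign_partitions(range(4), ["c1", "c2"])
six_consumers = assign_partitions(range(4), ["c1", "c2", "c3", "c4", "c5", "c6"])

print(two_consumers)
print(six_consumers)
```

With two consumers each one gets two partitions. With six consumers, two of
them receive no partitions and sit idle.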

Make sure you set auto.offset.reset to earliest in your consumer
configuration. Because you are not committing offsets, the consumer should
then start from the beginning of the topic on every restart.

https://kafka.apache.org/documentation/#consumerconfigs_auto.offset.reset
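As a rough sketch, the relevant settings could be collected like this. The
property names are standard Kafka consumer configs; the helper name
replay_consumer_config, the broker address, and the group name are
hypothetical. I am assuming a Python client that accepts Kafka-style
property names as a dict (for example confluent-kafka's Consumer); adapt it
to whichever client you use.

```python
def replay_consumer_config(bootstrap_servers: str, group_id: str) -> dict:
    """Build consumer settings for re-reading a topic from the start.

    auto.offset.reset only applies when the group has no committed offset,
    so auto-commit is disabled to keep it that way between restarts.
    """
    return {
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,
        "auto.offset.reset": "earliest",  # start at the oldest available record
        "enable.auto.commit": False,      # never store a position for this group
    }


# Hypothetical broker address and group name, for illustration only.
config = replay_consumer_config("localhost:9092", "replay-group")
print(config)
```

If offsets were ever committed for that group, auto.offset.reset would no
longer take effect and the consumer would resume from the committed
position instead.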

*Question:* *Can anything be said about the order of the messages consumed
by my consumer? Is there a way to enforce the same order of messages for
every restart of my consumer?*

Now, to answer your question, what can be said about the order of messages
consumed by the consumer is that messages with the same key will always end
up in the same partition and consumer and all the messages from that
partition will be picked up in the order they were published by the
producer. If you have only one partition, then all the messages in that
topic will go to the same partition, and globally all messages in the topic
are organized by the order received from the producer. If you have more
than one partition, there is no way to guarantee the offsets and ordering
globally across the topic because ordering is only guaranteed at partition
level and not topic level.
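The difference between the two guarantees can be illustrated with a small
simulation. This is not how the Kafka client fetches records: the random
choice below is only a stand-in for the fact that the interleaving of
partitions is not defined, and the names consume, per_partition, and the
sample records are hypothetical.

```python
import random


def consume(partitions, seed):
    """Simulate one consumer run over several partitions.

    Each step reads the next record from a randomly chosen partition that
    still has unread records. Returns (partition, offset, value) tuples.
    """
    rng = random.Random(seed)
    cursors = {p: 0 for p in partitions}
    seen = []
    while True:
        ready = [p for p in sorted(partitions) if cursors[p] < len(partitions[p])]
        if not ready:
            return seen
        p = rng.choice(ready)
        seen.append((p, cursors[p], partitions[p][cursors[p]]))
        cursors[p] += 1


def per_partition(seen, partition):
    """Extract the values read from one partition, in the order they were read."""
    return [value for part, _, value in seen if part == partition]


# Hypothetical topic with 4 partitions; each list is in producer order.
topic = {0: ["a1", "a2", "a3"], 1: ["b1", "b2"], 2: ["c1", "c2", "c3"], 3: ["d1"]}

run_1 = consume(topic, seed=1)
run_2 = consume(topic, seed=2)

print([value for _, _, value in run_1])
print([value for _, _, value in run_2])
```

The overall sequence may differ between the two runs, but the records of
each individual partition always appear in their original order.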

I hope this sheds more light on the basis of each recommendation you have
received from the community.

Israel Ekpo
Lead Instructor, IzzyAcademy.com
https://www.youtube.com/c/izzyacademy
https://izzyacademy.com/


On Thu, Jan 6, 2022 at 8:57 AM Roger Kasinsky <roger.kasin...@gmail.com>
wrote:

> Hi Luke,
>
> > The solution I can think of is to create only one partition for the
> topic.
>
> That would work, but then I lose the benefits of the partitions.
>
> > Or you can create 4 consumers in one group, to consume from 4 partitions.
> That works, too.
>
> That does not work, because I need only one consumer receiving all the
> messages in the same order on every run.
>
> Hi Suni,
>
> > Why dont you provide new name to consumer group each time you restart
> your consumer? This new consumer group will not conflict with the earlier
> one and it will be treated as new consumer thread next time to get all
> messages again.
>
> That does not solve my problem, which is to have the same consumer getting
> all the topic messages in the same order. I'm not worried about conflicts.
> In fact, I want the exact same consumer to run twice in a row. Renaming the
> consumer group does not help with anything related to message order.
>
> Thanks!
>
> -R
>
>
>
> On Thu, Jan 6, 2022 at 12:21 AM sunil chaudhari <
> sunilmchaudhar...@gmail.com>
> wrote:
>
> > hi,
> > Why dont you provide new name to consumer group each time you restart
> your
> > consumer?
> > This new consumer group will not conflict with the earlier one and it
> will
> > be treated as new consumer thread next time to get all messages again.
> >
> >
> > Regards,
> > Sunil.
> >
> > On Wed, 5 Jan 2022 at 10:45 PM, Roger Kasinsky <roger.kasin...@gmail.com
> >
> > wrote:
> >
> > > Hi,
> > >
> > > I have a topic divided into 4 partitions. I have a consumer that needs
> to
> > > consume all messages from the topic (all messages from all 4
> partitions).
> > > So to do that I have this consumer sitting by itself in its own
> consumer
> > > group. I'm not committing any offsets, because I want to read all
> > messages
> > > again on every restart of the consumer.
> > >
> > > *Question:* *Can anything be said about the order of the messages
> > consumed
> > > by my consumer? Is there a way to enforce the same order of messages
> for
> > > every restart of my consumer?*
> > >
> > > Thanks!
> > >
> > > -R
> > >
> >
>
