Hi Roger, I am going to briefly add to what others have already stated.

The recommendations made by Sunil and Luke are based on the fundamentals of how Kafka stores and organizes events, as well as on how consumer groups retrieve them. Without more detail about the objectives of your implementation it is hard to make recommendations that fully address your concerns, but given the limited description so far, I believe Luke's recommendation should solve what you need for the time being.

I am assuming that you set a key on every event you publish to the Kafka topic. This is essential to guaranteeing that messages/events with the same key are always routed to the same partition (with the default partitioner and an unchanged partition count).

Fundamentally, events written to a Kafka partition are appended at increasing offsets within that partition, in the order in which they are received from the producers. This guarantee is at the partition level and not at the topic level. If a topic has multiple partitions, ordering is guaranteed only within each partition, not across the topic. If the topic has only one partition, then ordering is effectively guaranteed for the whole topic. Consequently, if you want to guarantee ordering at the topic level, you cannot have more than one partition in your topic.

If your requirements allow you to limit the ordering guarantee to a specific event/message key, then ordering at the partition level should be sufficient regardless of partition count, and you can have as many partitions as you want. As long as each message has a key, the same key will always go to the same partition, each consumer within the group will have its own subset of partitions, and the same consumer will process all the messages from a given partition in the exact order received from the producer.
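
To make the key-to-partition routing described above concrete, here is a minimal sketch in plain Python. It does not use a Kafka client; the names partition_for and publish are hypothetical, and zlib.crc32 is only a stand-in for the hash (Kafka's default partitioner uses murmur2), so the actual partition numbers will differ from a real cluster. What it is meant to show is that all events for one key land in one partition, in the order they were sent.

```python
import zlib

NUM_PARTITIONS = 4  # matches the topic described in this thread


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # Stand-in hash for illustration only; Kafka's default partitioner
    # uses murmur2 on the serialized key, not CRC32.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def publish(events, num_partitions: int = NUM_PARTITIONS):
    """Append (key, value) events to per-partition logs in arrival order."""
    logs = {p: [] for p in range(num_partitions)}
    for key, value in events:
        logs[partition_for(key, num_partitions)].append((key, value))
    return logs


# Hypothetical events: two keys interleaved by the producer.
events = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "paid"),
    ("order-2", "cancelled"),
    ("order-1", "shipped"),
]
logs = publish(events)

# Every event for "order-1" sits in one partition, in send order.
p = partition_for("order-1")
order_1_values = [value for key, value in logs[p] if key == "order-1"]
```

Because the partition is derived from the key alone, the relative order of events that share a key is preserved no matter how many partitions the topic has.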

So, going back to your scenario description: if you care about strict sequential ordering at the topic level, then you cannot have more than one partition (you have 4 at the moment). If your ordering constraints are at the key level (messages with the same key), then the number of partitions should not matter, because the same consumer will always handle all messages for a given key and partition.

Each consumer group has an identifier, and the partitions are divided as evenly as possible among the members of the group, up to the number of available partitions. If you have more consumers than partitions, the extra consumers will not get any partition assignments and will be idle.

Make sure you set auto.offset.reset to earliest in your consumer configuration, and your consumer should start from the beginning of the topic. This setting only takes effect when the group has no committed offsets, which matches your setup since you are not committing any.

https://kafka.apache.org/documentation/#consumerconfigs_auto.offset.reset

*Question:* *Can anything be said about the order of the messages consumed by my consumer? Is there a way to enforce the same order of messages for every restart of my consumer?*

Now, to answer your question: what can be said about the order of messages consumed by the consumer is that messages with the same key will always end up in the same partition and consumer, and all the messages from that partition will be picked up in the order in which they were published by the producer.

If you have only one partition, then all the messages in the topic go to that partition, and all messages in the topic are ordered globally in the order received from the producer.

If you have more than one partition, there is no way to guarantee ordering globally across the topic, because offsets and ordering are per partition and not per topic.

I hope this sheds more light on the basis of each recommendation you have received from the community.
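
P.S. As a rough illustration of the answer above, the following plain-Python sketch simulates one consumer draining 4 partitions on two separate runs. It does not use a Kafka client: consume_all and per_partition_view are hypothetical helpers, and the seeded random choice of which partition to read next is only an assumption standing in for fetch timing. The interleaving across partitions may differ between runs, while the order within each partition stays the same.

```python
import random


def consume_all(partitions, seed):
    """Drain every partition, choosing the next partition to read at random.

    Each partition is read strictly front to back; only the choice of
    which partition to read from next varies between runs.
    """
    rng = random.Random(seed)
    positions = {p: 0 for p in partitions}
    consumed = []
    while True:
        pending = [p for p in sorted(partitions)
                   if positions[p] < len(partitions[p])]
        if not pending:
            return consumed
        p = rng.choice(pending)
        consumed.append((p, partitions[p][positions[p]]))
        positions[p] += 1


def per_partition_view(consumed):
    """Group consumed messages by partition, keeping consumption order."""
    view = {}
    for p, message in consumed:
        view.setdefault(p, []).append(message)
    return view


# Hypothetical topic contents: 4 partitions with a few messages each.
partitions = {
    0: ["a0", "a1", "a2"],
    1: ["b0", "b1"],
    2: ["c0", "c1", "c2"],
    3: ["d0"],
}

run_1 = consume_all(partitions, seed=1)
run_2 = consume_all(partitions, seed=2)

# The global sequences run_1 and run_2 may differ, but the
# per-partition sequences are identical on both runs.
same_per_partition = per_partition_view(run_1) == per_partition_view(run_2)
```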

Israel Ekpo
Lead Instructor, IzzyAcademy.com
https://www.youtube.com/c/izzyacademy
https://izzyacademy.com/

On Thu, Jan 6, 2022 at 8:57 AM Roger Kasinsky <roger.kasin...@gmail.com> wrote:

> Hi Luke,
>
> > The solution I can think of is to create only one partition for the
> > topic.
>
> That would work, but then I lose the benefits of the partitions.
>
> > Or you can create 4 consumers in one group, to consume from 4 partitions.
> > That works, too.
>
> That does not work, because I need only one consumer receiving all the
> messages in the same order on every run.
>
> Hi Suni,
>
> > Why dont you provide new name to consumer group each time you restart
> > your consumer? This new consumer group will not conflict with the earlier
> > one and it will be treated as new consumer thread next time to get all
> > messages again.
>
> That does not solve my problem, which is to have the same consumer getting
> all the topic messages in the same order. I'm not worried about conflicts.
> In fact, I want the exact same consumer to run twice in a row. Renaming the
> consumer group does not help with anything related to message order.
>
> Thanks!
>
> -R
>
> On Thu, Jan 6, 2022 at 12:21 AM sunil chaudhari <sunilmchaudhar...@gmail.com>
> wrote:
>
> > hi,
> > Why dont you provide new name to consumer group each time you restart
> > your consumer?
> > This new consumer group will not conflict with the earlier one and it
> > will be treated as new consumer thread next time to get all messages
> > again.
> >
> > Regards,
> > Sunil.
> >
> > On Wed, 5 Jan 2022 at 10:45 PM, Roger Kasinsky <roger.kasin...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > I have a topic divided into 4 partitions. I have a consumer that needs
> > > to consume all messages from the topic (all messages from all 4
> > > partitions). So to do that I have this consumer sitting by itself in
> > > its own consumer group. I'm not committing any offsets, because I want
> > > to read all messages again on every restart of the consumer.
> > >
> > > *Question:* *Can anything be said about the order of the messages
> > > consumed by my consumer? Is there a way to enforce the same order of
> > > messages for every restart of my consumer?*
> > >
> > > Thanks!
> > >
> > > -R