Hey Apurva, I am including the batch_id inside the messages.

Could you give me an example of what you mean by custom control messages with a control topic please?

On Sat, Dec 3, 2016 at 12:35 AM, Apurva Mehta <apu...@confluent.io> wrote:

> That should work, though it sounds like you may be interested in:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging
>
> If you can include the 'batch_id' inside your messages, and define custom
> control messages with a control topic, then you would not need one topic
> per batch, and you would be very close to the essence of the above
> proposal.
>
> Thanks,
> Apurva
>
> On Fri, Dec 2, 2016 at 5:02 AM, Ali Akhtar <ali.rac...@gmail.com> wrote:
>
> > Heya,
> >
> > I need to send a group of messages, which are all related, and then process
> > those messages, only when all of them have arrived.
> >
> > Here is how I'm planning to do this. Is this the right way, and can any
> > improvements be made to this?
> >
> > 1) Send a message to a topic called batch_start, with a batch id (which
> > will be a UUID)
> >
> > 2) Post the messages to a topic called batch_msgs_<batch_id>. Here batch_id
> > will be the batch id sent in batch_start.
> >
> > The number of messages sent will be recorded by the producer.
> >
> > 3) Send a message to batch_end with the batch id and the number of sent
> > messages.
> >
> > 4) On the consumer side, using Kafka Streaming, I would listen to
> > batch_end.
> >
> > 5) When the message there arrives, I will start another instance of Kafka
> > Streaming, which will process the messages in batch_msgs_<batch_id>
> >
> > 6) Perhaps to be extra safe, whenever batch_end arrives, I will start a
> > throwaway consumer which will just count the number of messages in
> > batch_msgs_<batch_id>. If these don't match the # of messages specified in
> > the batch_end message, then it will assume that the batch hasn't yet
> > finished arriving, and it will wait for some time before retrying. Once the
> > correct # of messages have arrived, THEN it will trigger step 5 above.
> >
> > Will the above method work, or should I make any changes to it?
> >
> > Is step 6 necessary?
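
For reference, here is a rough sketch of the count check from step 6, assuming each message carries its batch_id so that one topic per batch is not needed. It is in-memory Python only and does not touch Kafka: BatchTracker, on_message and on_batch_end are hypothetical names, and in a real setup they would be called from whatever consumer reads the message topic and batch_end. A batch is released only once batch_end has been seen and the number of buffered messages has reached the count it reported.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class BatchState:
    """Messages buffered so far for one batch (hypothetical helper type)."""
    messages: List[str] = field(default_factory=list)
    expected: Optional[int] = None  # unknown until batch_end is seen


class BatchTracker:
    """Buffers messages per batch_id and releases a batch once it is complete."""

    def __init__(self) -> None:
        self._batches: Dict[str, BatchState] = {}

    def on_message(self, batch_id: str, payload: str) -> Optional[List[str]]:
        # Called for every data message; the batch_id travels inside the message.
        state = self._batches.setdefault(batch_id, BatchState())
        state.messages.append(payload)
        return self._release_if_complete(batch_id)

    def on_batch_end(self, batch_id: str, expected: int) -> Optional[List[str]]:
        # Called for the batch_end message, which carries the producer's count.
        state = self._batches.setdefault(batch_id, BatchState())
        state.expected = expected
        return self._release_if_complete(batch_id)

    def _release_if_complete(self, batch_id: str) -> Optional[List[str]]:
        state = self._batches[batch_id]
        # Assumption: no duplicate deliveries; deduplication is out of scope here.
        if state.expected is not None and len(state.messages) >= state.expected:
            del self._batches[batch_id]
            return state.messages
        return None


tracker = BatchTracker()
tracker.on_message("batch-1", "a")          # returns None, batch_end not seen yet
tracker.on_batch_end("batch-1", 2)          # returns None, only 1 of 2 arrived
done = tracker.on_message("batch-1", "b")   # returns the full batch
```

Because the check runs on every arrival, it does not matter whether batch_end shows up before or after the last data message, so the "wait and retry" part of step 6 is not needed in this sketch.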