I want to use Kafka for notifications of changes to data in a data service/database. For each object that changes, a Kafka message is sent. This part is easy and we already have it working.
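In case it matters, the producing side is just a plain Java producer sending one message per changed object, roughly like this (broker address, topic name, and the String payload are simplified placeholders):

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.Properties;

public class ChangeNotifier {

    private final KafkaProducer<String, String> producer;

    public ChangeNotifier() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder broker address
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        this.producer = new KafkaProducer<>(props);
    }

    // Called by the data service whenever an object changes.
    public void notifyChange(String changePayload) {
        producer.send(new ProducerRecord<>("change-notifications", changePayload));
    }
}
```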
Here is my use case: I want to be able to fire up a process that will

1) determine the current position in the Kafka topic (right now we use 2 partitions, so that would be the offset for each partition),
2) run a long-running process that copies data from the database, and
3) once that process is over, seek a Kafka consumer back to the saved position and start processing notifications in sequence.

This isn't very hard either, but there is a problem we'd face if partitions are added to the topic during step (2) (say, by our operations team). I know we can set up a ConsumerRebalanceListener, but I don't think that will help, because we'd need to go back to a time when we had our original number of partitions, and then we'd need to know exactly when to start reading from the new partition(s).

For example, start: 2 partitions (0, 1) at offsets p0,100 and p1,100.

1) We store the offsets and partitions: p0,100 and p1,100.
2) We run the DB ingest.
3) Messages are posted to p0 and p1.
4) The OPS team adds p2, and our ConsumerRebalanceListener is notified.
5) We are done, and we set our consumer to p0,100 and p1,100 (and p2,0 thanks to the ConsumerRebalanceListener).

How would we guarantee the order of messages received from our consumer across all 3 partitions?
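For context, here is a minimal sketch of the capture-and-seek flow I have in mind for steps 1-3, using the plain Java consumer with manual assignment. The broker address, topic name, and `runDatabaseCopy()` are placeholders, and I'm reading "current location" as the end offset of each partition at the time the process starts:

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.stream.Collectors;

public class SnapshotThenCatchUp {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder broker address
        props.put("enable.auto.commit", "false");
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");

        String topic = "change-notifications";               // placeholder topic name

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Step 1: record the current end offset of every partition that exists right now
            // (with 2 partitions this is the p0/p1 offsets from the example).
            List<TopicPartition> partitions = consumer.partitionsFor(topic).stream()
                    .map(pi -> new TopicPartition(topic, pi.partition()))
                    .collect(Collectors.toList());
            Map<TopicPartition, Long> savedOffsets = consumer.endOffsets(partitions);

            // Step 2: the long-running database copy (placeholder).
            runDatabaseCopy();

            // Step 3: manually assign the partitions seen in step 1, seek to the saved
            // offsets, and consume the notifications that accumulated during the copy.
            consumer.assign(savedOffsets.keySet());
            savedOffsets.forEach(consumer::seek);

            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("p%d@%d: %s%n",
                            record.partition(), record.offset(), record.value());
                }
            }
        }
    }

    private static void runDatabaseCopy() {
        // placeholder for the long-running ingest
    }
}
```

This sketch only covers the partitions that existed at the start, which is exactly the gap I'm asking about: it doesn't pick up a p2 added during step 2, and I don't see how it could consume p2 in the right order relative to p0 and p1.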