Hi, I am wondering if there is something I am missing about my setup to facilitate long-running jobs.
For my purposes `At most once` message delivery is acceptable, so I do not need to think carefully about committing offsets (or at least it is fine to commit each message's offset as soon as it is received). To achieve the competing consumer pattern I have:

* A topic
* X consumers in the same group
* P partitions in the topic (where P >= X always)

My problem is that messages can take ~15 minutes to process (and this may fluctuate by up to 50%, let's say). To avoid consumers having their partition assignments revoked, I have increased the value of `max.poll.interval.ms` to reflect this. However, this comes with some negative consequences:

* If processing a message exceeds this length of time then, in the worst case, the consumer processing it will have to wait up to the value of `max.poll.interval.ms` for a rebalance.
* If I need to scale out and add consumers based on load, any new consumers might also have to wait up to `max.poll.interval.ms` for a rebalance before they can process any messages.

As it stands, the only way forward I can see is:

* Set `max.poll.interval.ms` to a small value and accept that every consumer, on every message, will time out, have its assignments revoked, and wait a short time for a rebalance.

However I do not like this, and I am considering alternative technology for my message queue, as I do not see any obvious way around it. Admittedly I am new to Kafka, and it is just a gut feeling that the above is not desirable. I have used RabbitMQ for these scenarios in the past, but we need Kafka in our architecture for other purposes at the moment, and it would be nice not to have to introduce another technology if Kafka can achieve this.

I appreciate any advice that anybody can offer on this subject. (A simplified sketch of my current consumer is included below, after my sign-off.)

Regards,
Ger
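
P.S. For concreteness, this is roughly what each consumer looks like. It is a simplified sketch rather than my real code: the broker address, topic name, group id, timeout values, and the `max.poll.records` setting are illustrative placeholders, and `processJob` stands in for the ~15 minute job.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class LongRunningJobConsumer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // placeholder broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "job-workers");               // X consumers share this group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");            // offsets committed manually below
        // Raised well above the ~15 minute processing time (plus the ~50% fluctuation)
        // so the group coordinator does not revoke the assignment mid-job.
        props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, String.valueOf(30 * 60 * 1000));
        // Illustrative: one record per poll keeps the gap between polls bounded by
        // a single job rather than a whole batch of them.
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "1");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("jobs"));              // placeholder topic name
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // At-most-once: commit the offset as soon as the message is received,
                    // so a crash mid-job drops the message rather than redelivering it.
                    consumer.commitSync();
                    processJob(record.value());                                 // the long-running (~15 minute) work
                }
            }
        }
    }

    private static void processJob(String payload) {
        // Long-running processing goes here.
    }
}
```

The part I am unsure about is the single `poll()` loop: while `processJob` runs, no further `poll()` calls happen, which is exactly why `max.poll.interval.ms` has to be so large.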