Thanks for reporting this. What is your `offsets.topic.replication.factor`?
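
In case it helps to answer that, here is a minimal sketch for reading the value out of a broker's `server.properties` text. The helper names and the sample config are hypothetical, and the fallback defaults (3 and 1) are assumed from the broker documentation, so check them against your version. Note that the file value only applies when `__consumer_offsets` is first created, so an existing topic may have a different replica count than the file suggests.

```python
# Hypothetical helper, not part of Kafka: extracts replication-related
# settings from the text of a broker's server.properties file.
# The fallback values are assumed broker defaults; verify for your version.
ASSUMED_DEFAULTS = {
    "offsets.topic.replication.factor": 3,
    "min.insync.replicas": 1,
}


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def replication_settings(text):
    """Return the configured values, falling back to the assumed defaults."""
    props = parse_properties(text)
    return {key: int(props.get(key, default))
            for key, default in ASSUMED_DEFAULTS.items()}


# Made-up sample config, for illustration only.
sample = """
# sample broker config
broker.id=0
offsets.topic.replication.factor=1
"""

settings = replication_settings(sample)
print(settings)
```

Running this against each broker's config file would show whether the brokers in a cluster agree on the setting.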
-Matthias

On 12/19/17 8:32 AM, Adam Gurson wrote:
> I am running two Kafka 0.11 clusters. Cluster A has two 0.11.0.0 brokers
> with 3 zookeepers. Cluster B has 4 0.11.0.1 brokers with 5 zookeepers.
>
> We have recently updated from running 0.8.2 clients and brokers to 0.11. In
> addition, we added two Kafka Streams group.ids that process data from one of
> the topics that all of the old code processes from.
>
> Most of the time, scaling the streams clients up or down works as
> expected. The streams clients go into a rebalance and come up with all
> consumer offsets correct for the topic.
>
> However, I have found two cases where a severe loss of offsets is occurring:
>
> On Cluster A (min.insync.replicas=1), I do a normal "cycle" of the brokers,
> to stop/start them one at a time, giving time for the brokers to handshake
> and exchange leadership as necessary. Twice now I have done this, and both
> Kafka Streams consumers have rebalanced only to come up with totally messed
> up offsets. The offsets for one group.id were set to 5,000,000 for all
> partitions, and the other group.id's offsets were set to a number just short
> of 7,000,000.
>
> On Cluster B (min.insync.replicas=2), I am running the exact same streams
> code. I have seen cases where if I scale up or down too quickly (i.e. add
> or remove too many streams clients at once) before a rebalance has
> finished, the offsets for the group.ids are completely lost. This causes
> the streams consumers to reset according to "auto.offset.reset".
>
> In both cases, streams is calculating real-time metrics for data flowing
> through our brokers. These are serious issues because they cause the
> counts to come out completely wrong, either double counting or skipping data
> altogether. I have scoured the web and have been unable to find anyone else
> having this issue with streams.
>
> I should also mention that all of our old 0.8.2 consumer code (which is
> updated to the 0.11 client library) never has any problems with offsets. My
> guess is that this is because they are still using zookeeper to store their
> offsets.
>
> This implies to me that the __consumer_offsets topic isn't being utilized
> by streams clients correctly.
>
> I'm at a total loss at this point and would greatly appreciate any advice.
> Thank you.