I would upgrade to Kafka 2.1.1 (should be announced tomorrow). It includes a fix for a deadlock in 2.1.0 that could be the reason for the behaviour you are seeing.
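On the broker side that's a straight rolling restart onto the 2.1.1 binaries, since it's a bug-fix release within the same minor version. If you want to bring the Streams app up to the same client version, and assuming it's built with Maven (I'm guessing at your build tool here), the change would be roughly:

    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka-streams</artifactId>
      <!-- was 2.0.0 in your setup; 2.1.1 carries the deadlock fix -->
      <version>2.1.1</version>
    </dependency>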
Ismael

On Sat, Feb 16, 2019 at 7:28 PM Ankur Rana <ankur.r...@getfareye.com> wrote:

Hi Ismael,

Thank you for replying.

We are using Kafka version 2.1.0 and Kafka Streams version 2.0.0.

Just to let you know, I was able to fix the problem by changing the processing guarantee config from exactly-once to at-least-once, but now I'll need to verify whether the results are affected.
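In case it helps anyone else, the change was a single Streams config entry. A minimal sketch of what that looks like (the application id and bootstrap servers are placeholders, not our real values):

    import java.util.Properties;

    import org.apache.kafka.streams.StreamsConfig;

    public class ProcessingGuaranteeExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Placeholder values, not our real application id / broker list.
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "example-streams-app");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092,broker2:9092");

            // Before: props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG,
            //                   StreamsConfig.EXACTLY_ONCE);
            // After: at-least-once, which is also the Streams default.
            props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG,
                      StreamsConfig.AT_LEAST_ONCE);

            System.out.println(props.getProperty(StreamsConfig.PROCESSING_GUARANTEE_CONFIG));
        }
    }

At-least-once can replay records after a failure, which is why I still need to verify the results. As far as I understand, exactly-once also relies on producer transactions and the broker-side transaction coordinator, so a coordinator problem on one broker can stall every instance of the app.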
On Sat, Feb 16, 2019 at 10:32 PM Ismael Juma <ism...@juma.me.uk> wrote:

Hi,

What version of Kafka are you using?

Ismael

On Fri, Feb 15, 2019 at 8:32 PM Ankur Rana <ankur.r...@getfareye.com> wrote:

Any comments, anyone?

On Fri, Feb 15, 2019 at 6:08 PM Ankur Rana <ankur.r...@getfareye.com> wrote:

Hi everyone,

We have a Kafka cluster with 5 brokers, and every topic has a replication factor of at least 2. We have multiple Kafka consumer applications running against this cluster. Most of these consumers are built using the consumer APIs, and quite recently we have started using Streams applications.

We are facing a really weird issue. Sometimes our Kafka cluster breaks down: consumers and producers start throwing disconnection exceptions, and all of them just stop.

We use the Debezium connector to push Postgres events to Kafka topics. Debezium throws the error below:
[image: image.png]

The Kafka broker throws the error below (COORDINATOR_NOT_AVAILABLE):
[image: image.png]

Error on the consumer side:
[image: image.png]

To fix it, I stop the disconnected broker and everything fixes itself: Debezium starts flushing messages and all consumers start working normally. I then bring the disconnected broker back up and everything works as before without any problem.

I don't understand a few things here:

1. What could be the reason behind this disconnection exception? Even if one of the brokers was somehow disconnected, isn't Kafka supposed to handle that in a cluster where every topic has a replication factor of 2?
2. The malfunctioning broker appeared to be in a state where it was neither fully disconnected from nor connected to the cluster. I could still see it in Kafka Manager, with zero Bytes In, while it was disconnected from all the producers and consumers.
3. Weirdly, this situation seems to occur when I start multiple consumers of the Streams application. I'm not sure about this, since the error has only occurred a few times, but it happened twice today, and both times I had just started 3 consumers of the same Streams application.

Can anyone help me debug this problem? I don't know where to look for possible issues with our cluster or Streams application. I am attaching the Streams config and the Streams application code for your reference. Please feel free to ask for any more details.

Streams config:
[image: image.png]

Streams application code: https://codeshare.io/Gq6pLB

--
Thanks,

Ankur Rana
Software Developer
FarEye