Hi Eno,
I am afraid I played around with the configuration too much for this to be
a productive investigation :(

This is a QA environment in AWS with 2 Kafka brokers and 3 ZooKeeper
instances. There are only 3 partitions for this topic.
The Kafka brokers and the Kafka Streams app are both version 0.10.1.1.
Our Kafka Streams app runs in Docker on Kubernetes.
I played around with 1 to 3 Kafka Streams processes, but I got the
same results. It is too easy to scale with Kubernetes :)
Since there are only 3 partitions, I didn't start more than 3 instances.
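
(For context, below is roughly how the app is configured. This is a simplified
sketch; the broker addresses are placeholders and the real topology is omitted.)

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStreamBuilder;

public class SaApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        // "sa" is the application id, which is also the consumer group name
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "sa");
        // placeholder addresses for the two QA brokers
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker-1:9092,broker-2:9092");
        props.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
        props.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
        // one stream thread per process; with 3 partitions, at most 3 instances do work
        props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 1);

        KStreamBuilder builder = new KStreamBuilder();
        builder.stream("sa-events"); // real topology omitted

        KafkaStreams streams = new KafkaStreams(builder, props);
        streams.start();
    }
}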

I was too quick to upgrade only the Kafka Streams app to 0.10.2.1, hoping
that it would solve the problem. It didn't.
The logs I sent before are from that version.

I did notice an "unknown" offset for the main topic with Kafka Streams version
0.10.2.1:
$ ./bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group sa
GROUP  TOPIC      PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG      OWNER
sa     sa-events  0          842199          842199          0        sa-4557bf2d-ba79-42a6-aa05-5b4c9013c022-StreamThread-1-consumer_/10.0.10.9
sa     sa-events  1          1078428         1078428         0        sa-4557bf2d-ba79-42a6-aa05-5b4c9013c022-StreamThread-1-consumer_/10.0.10.9
sa     sa-events  2          unknown         26093910        unknown  sa-4557bf2d-ba79-42a6-aa05-5b4c9013c022-StreamThread-1-consumer_/10.0.10.9
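
(If it helps, the same check can be done from code. Below is a minimal sketch;
partition 2 and the byte-array deserializers are just for illustration. As far
as I understand, committed() returns null when the group has no committed
offset for a partition, which the tool prints as "unknown".)

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class CheckCommittedOffset {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // same broker as the CLI call
        props.put("group.id", "sa");                      // the Streams application id / group
        props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("sa-events", 2);
            // committed() returns null when the group has no committed offset for
            // the partition, which kafka-consumer-groups.sh shows as "unknown"
            OffsetAndMetadata committed = consumer.committed(tp);
            long end = consumer.endOffsets(Collections.singletonList(tp)).get(tp);
            System.out.println("committed=" + (committed == null ? "unknown" : committed.offset())
                    + " logEnd=" + end);
        }
    }
}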

After that I downgraded the Kafka Streams app back to version 0.10.1.1.
After a LONG startup time (more than an hour), during which the status of the
group was rebalancing, all 3 processes started processing messages again.

This whole thing started after we hit a bug in our code (an NPE) that crashed
the stream processing thread.
So now, after 4 days, everything is back to normal.
This worries me, since it can happen again.
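
(To at least notice the next time a stream thread dies like that, I am thinking
of registering an uncaught exception handler on the KafkaStreams instance,
roughly as sketched below. Exiting so that Kubernetes restarts the pod is just
my assumption of a reasonable reaction, not something from the docs.)

import java.util.Properties;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStreamBuilder;

public class SaAppWithCrashHandler {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "sa");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker-1:9092"); // placeholder

        KStreamBuilder builder = new KStreamBuilder();
        builder.stream("sa-events"); // real topology omitted

        KafkaStreams streams = new KafkaStreams(builder, props);
        // register before start(); called when a stream thread dies with an
        // uncaught exception such as our NPE
        streams.setUncaughtExceptionHandler((thread, throwable) -> {
            System.err.println("Stream thread " + thread.getName() + " died: " + throwable);
            // close from a separate thread (close() can block if called from the
            // dying thread itself), then exit so Kubernetes restarts the pod
            new Thread(() -> {
                streams.close();
                System.exit(1);
            }).start();
        });
        streams.start();
    }
}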


On Mon, May 1, 2017 at 11:45 AM, Eno Thereska <eno.there...@gmail.com>
wrote:

> Hi Shimi,
>
> Could you provide more info on your setup? How many Kafka Streams
> processes do you have, and how many partitions are they consuming from?
> If you have more processes than partitions, some of the processes will be
> idle and won’t do anything.
>
> Eno
> > On Apr 30, 2017, at 5:58 PM, Shimi Kiviti <shim...@gmail.com> wrote:
> >
> > Hi Everyone,
> >
> > I have a problem and I hope one of you can help me figure it out.
> > One of our kafka-streams processes stopped processing messages
> >
> > When I turn on debug log I see lots of these messages:
> >
> > 2017-04-30 15:42:20,228 [StreamThread-1] DEBUG o.a.k.c.c.i.Fetcher: Sending fetch for partitions [devlast-changelog-2] to broker ip-x-x-x-x.ec2.internal:9092 (id: 1 rack: null)
> > 2017-04-30 15:42:20,696 [StreamThread-1] DEBUG o.a.k.c.c.i.Fetcher: Ignoring fetched records for devlast-changelog-2 at offset 2962649 since the current position is 2963379
> >
> > After a LONG time, the only messages in the log are these:
> >
> > 2017-04-30 16:46:33,324 [kafka-coordinator-heartbeat-thread | sa] DEBUG o.a.k.c.c.i.AbstractCoordinator: Sending Heartbeat request for group sa to coordinator ip-x-x-x-x.ec2.internal:9092 (id: 2147483646 rack: null)
> > 2017-04-30 16:46:33,425 [kafka-coordinator-heartbeat-thread | sa] DEBUG o.a.k.c.c.i.AbstractCoordinator: Received successful Heartbeat response for group sa
> >
> > Any idea?
> >
> > Thanks,
> > Shimi
>
>
