Is this one looking suspicious?

https://gist.github.com/dvsekhvalnov/146ba41c8e78316941098997c9d2f18a#file-thread-dump

On Fri, Jun 30, 2017 at 6:10 PM, Damian Guy <damian....@gmail.com> wrote:

> It looks like Task [4_2] is stuck restoring state, though it doesn't look
> like there is much state to restore.
> It might be helpful if you take some thread dumps to see where it is
> blocked.
>
> Thanks,
> Damian
>
> On Fri, 30 Jun 2017 at 16:04 Dmitriy Vsekhvalnov <dvsekhval...@gmail.com>
> wrote:
>
> > Set  org.apache.kafka.streams to DEBUG.
> >
> > Here is gist:
> >
> > https://gist.github.com/dvsekhvalnov/b84b72349837f6c6394f1adfe18cdb
> 61#file-debug-logs
> >
> > On Fri, Jun 30, 2017 at 12:37 PM, Dmitriy Vsekhvalnov <
> > dvsekhval...@gmail.com> wrote:
> >
> > > Sure, how to enable debug logs? Just adjust logback logger?
> > >
> > >
> > > On Fri, Jun 30, 2017 at 11:24 Eno Thereska <eno.there...@gmail.com>
> > wrote:
> > >
> > >> It’s hard to tell, the logs do not contain much, I agree. It could be
> a
> > >> number of things.
> > >>
> > >> If it’s happening as you say on restart as well (so it’s
> reproducible),
> > >> any chance you could start streaming with DEBUG logs on and collect
> > those
> > >> logs? I’m hoping something shows up there.
> > >>
> > >> Thanks,
> > >> Eno
> > >>
> > >>
> > >> > On Jun 28, 2017, at 5:30 PM, Dmitriy Vsekhvalnov <
> > >> dvsekhval...@gmail.com> wrote:
> > >> >
> > >> > Nothing for stat-change.log for giving time window. Last line
> logged 4
> > >> > hours before app stopped.
> > >> >
> > >> > Any ideas so far? Personally i don't see anything relevant in logs.
> > >> >
> > >> > On Wed, Jun 28, 2017 at 6:33 PM, Bill Bejeck <b...@confluent.io>
> > wrote:
> > >> >
> > >> >> Sure, couldn't hurt.
> > >> >>
> > >> >> Thanks,
> > >> >> Bill
> > >> >>
> > >> >> On Wed, Jun 28, 2017 at 9:51 AM, Dmitriy Vsekhvalnov <
> > >> >> dvsekhval...@gmail.com
> > >> >>> wrote:
> > >> >>
> > >> >>> Here are logs:
> > >> >>>
> > >> >>> app:
> > >> >>> https://gist.github.com/dvsekhvalnov/
> f98afc3463f0c63b1722417e3710a8
> > >> >>> e7#file-kafka-streams-log
> > >> >>> brokers:
> > >> >>> https://gist.github.com/dvsekhvalnov/
> 8e870f7347394e8d004c282880ef38
> > >> >>> 5a#file-kafka-broker-1-2-3-log
> > >> >>>
> > >> >>> All broker logs are same, so single gist.
> > >> >>>
> > >> >>> There are also state-change.log files, do you want to take a look
> at
> > >> >> those
> > >> >>> as well?
> > >> >>>
> > >> >>> On Wed, Jun 28, 2017 at 4:31 PM, Bill Bejeck <b...@confluent.io>
> > >> wrote:
> > >> >>>
> > >> >>>> Hi Dmitry,
> > >> >>>>
> > >> >>>> At the moment I don't have anything specific to look for, just
> > trying
> > >> >> to
> > >> >>>> get more context around the issue.
> > >> >>>>
> > >> >>>> As for the logs maybe broker and streams logs for the last 30
> > minutes
> > >> >> up
> > >> >>> to
> > >> >>>> the time the application stopped processing records.
> > >> >>>>
> > >> >>>> Thanks,
> > >> >>>> Bill
> > >> >>>>
> > >> >>>> On Wed, Jun 28, 2017 at 9:04 AM, Dmitriy Vsekhvalnov <
> > >> >>>> dvsekhval...@gmail.com
> > >> >>>>> wrote:
> > >> >>>>
> > >> >>>>> Hi Bill,
> > >> >>>>>
> > >> >>>>> 1. sure, can extract some logs, what exactly to look for? There
> > are
> > >> >> 11
> > >> >>>>> hours of logs and most of them looks like:
> > >> >>>>>
> > >> >>>>> [2017-06-27 03:30:50,553] [] [INFO ] [StreamThread-1]
> > >> >>>>> [org.apache.kafka.streams.processor.internals.StreamThread]
> > >> >>>> [stream-thread
> > >> >>>>> [StreamThread-1] Committing all tasks because the commit
> interval
> > >> >>> 5000ms
> > >> >>>>> has elapsed]
> > >> >>>>>
> > >> >>>>> [2017-06-27 03:30:50,553] [] [INFO ] [StreamThread-1]
> > >> >>>>> [org.apache.kafka.streams.processor.internals.StreamThread]
> > >> >>>> [stream-thread
> > >> >>>>> [StreamThread-1] Committing task StreamTask 0_0]
> > >> >>>>>
> > >> >>>>> [2017-06-27 03:30:50,554] [] [INFO ] [StreamThread-1]
> > >> >>>>> [org.apache.kafka.streams.processor.internals.StreamThread]
> > >> >>>> [stream-thread
> > >> >>>>> [StreamThread-1] Committing task StreamTask 2_0]
> > >> >>>>>
> > >> >>>>> Something specific to search for?
> > >> >>>>>
> > >> >>>>> 2. Yes, there are more messages coming to topic.
> > >> >>>>>
> > >> >>>>> On Wed, Jun 28, 2017 at 3:43 PM, Bill Bejeck <b...@confluent.io
> >
> > >> >>> wrote:
> > >> >>>>>
> > >> >>>>>> Hi Dimitry,
> > >> >>>>>>
> > >> >>>>>> I'm happy to help, but I could use more information.  Can you
> > share
> > >> >>> the
> > >> >>>>>> streams logs and broker logs?
> > >> >>>>>>
> > >> >>>>>> Have you confirmed messages are still being delivered to topics
> > >> >> (via
> > >> >>>>>> console consumer)?
> > >> >>>>>>
> > >> >>>>>> Thanks,
> > >> >>>>>> Bill
> > >> >>>>>>
> > >> >>>>>> On Wed, Jun 28, 2017 at 8:24 AM, Dmitriy Vsekhvalnov <
> > >> >>>>>> dvsekhval...@gmail.com
> > >> >>>>>>> wrote:
> > >> >>>>>>
> > >> >>>>>>> Hi all,
> > >> >>>>>>>
> > >> >>>>>>> looking for some assistance in debugging kafka-streams
> > >> >> application.
> > >> >>>>>>>
> > >> >>>>>>> Kafka broker  0.10.2.1  - x3 Node cluster
> > >> >>>>>>> kafka-streams 0.10.2.1 -  x2 application nodes x 1 stream
> thread
> > >> >>>> each.
> > >> >>>>>>>
> > >> >>>>>>> In streams configuration only:
> > >> >>>>>>> - SSL transport
> > >> >>>>>>> - kafka.streams.commitIntervalMs set to 5000 (instead of
> default
> > >> >>>> 30s).
> > >> >>>>>>>
> > >> >>>>>>> We running simple aggregation app with several grouping
> streams.
> > >> >>>>> Running
> > >> >>>>>> 2
> > >> >>>>>>> instances of an app for redundancy. Both instances were
> working
> > >> >>>> pretty
> > >> >>>>>> fine
> > >> >>>>>>> for 11 hours 15 minutes then stopped consuming new events from
> > >> >>> topic.
> > >> >>>>>>>
> > >> >>>>>>> Hosting JVM processes were working fine, just streams stopped
> > >> >>>> reacting
> > >> >>>>> to
> > >> >>>>>>> new data. No exceptions, errors, e.t.c. in logs.  After
> restart
> > >> >>>> streams
> > >> >>>>>>> still not consuming new messages.
> > >> >>>>>>>
> > >> >>>>>>> Below is 2 last entries from kafka-streams logs from both
> hosts:
> > >> >>>>>>>
> > >> >>>>>>> [2017-06-27 14:45:09,663] [] [INFO ] [StreamThread-1]
> > >> >>>>>>> [org.apache.kafka.streams.processor.internals.StreamThread]
> > >> >>>>>> [stream-thread
> > >> >>>>>>> [StreamThread-1] Committing task StreamTask 4_2]
> > >> >>>>>>>
> > >> >>>>>>> [2017-06-27 14:45:09,723] [] [INFO ] [StreamThread-1]
> > >> >>>>>>> [org.apache.kafka.streams.processor.internals.StreamThread]
> > >> >>>>>> [stream-thread
> > >> >>>>>>> [StreamThread-1] Committing task StreamTask 2_1]
> > >> >>>>>>>
> > >> >>>>>>> Pretty puzzling why they stopped exactly same moment (with
> > >> >> respect
> > >> >>> to
> > >> >>>>>>> millis).
> > >> >>>>>>>
> > >> >>>>>>> Really appreciate any ideas where to dig to.
> > >> >>>>>>>
> > >> >>>>>>> Thank you.
> > >> >>>>>>>
> > >> >>>>>>
> > >> >>>>>
> > >> >>>>
> > >> >>>
> > >> >>
> > >>
> > >>
> >
>

Reply via email to