Yep, if you take another thread dump is it in the same spot? Which version of streams are you running? Are you using docker?
Thanks, Damian On Fri, 30 Jun 2017 at 16:22 Dmitriy Vsekhvalnov <dvsekhval...@gmail.com> wrote: > Is this one looking suspicious? > > > https://gist.github.com/dvsekhvalnov/146ba41c8e78316941098997c9d2f18a#file-thread-dump > > On Fri, Jun 30, 2017 at 6:10 PM, Damian Guy <damian....@gmail.com> wrote: > > > It looks like Task [4_2] is stuck restoring state, though it doesn't look > > like there is much state to restore. > > It might be helpful if you take some thread dumps to see where it is > > blocked. > > > > Thanks, > > Damian > > > > On Fri, 30 Jun 2017 at 16:04 Dmitriy Vsekhvalnov <dvsekhval...@gmail.com > > > > wrote: > > > > > Set org.apache.kafka.streams to DEBUG. > > > > > > Here is gist: > > > > > > https://gist.github.com/dvsekhvalnov/b84b72349837f6c6394f1adfe18cdb > > 61#file-debug-logs > > > > > > On Fri, Jun 30, 2017 at 12:37 PM, Dmitriy Vsekhvalnov < > > > dvsekhval...@gmail.com> wrote: > > > > > > > Sure, how to enable debug logs? Just adjust logback logger? > > > > > > > > > > > > On Fri, Jun 30, 2017 at 11:24 Eno Thereska <eno.there...@gmail.com> > > > wrote: > > > > > > > >> It’s hard to tell, the logs do not contain much, I agree. It could > be > > a > > > >> number of things. > > > >> > > > >> If it’s happening as you say on restart as well (so it’s > > reproducible), > > > >> any chance you could start streaming with DEBUG logs on and collect > > > those > > > >> logs? I’m hoping something shows up there. > > > >> > > > >> Thanks, > > > >> Eno > > > >> > > > >> > > > >> > On Jun 28, 2017, at 5:30 PM, Dmitriy Vsekhvalnov < > > > >> dvsekhval...@gmail.com> wrote: > > > >> > > > > >> > Nothing for stat-change.log for giving time window. Last line > > logged 4 > > > >> > hours before app stopped. > > > >> > > > > >> > Any ideas so far? Personally i don't see anything relevant in > logs. > > > >> > > > > >> > On Wed, Jun 28, 2017 at 6:33 PM, Bill Bejeck <b...@confluent.io> > > > wrote: > > > >> > > > > >> >> Sure, couldn't hurt. > > > >> >> > > > >> >> Thanks, > > > >> >> Bill > > > >> >> > > > >> >> On Wed, Jun 28, 2017 at 9:51 AM, Dmitriy Vsekhvalnov < > > > >> >> dvsekhval...@gmail.com > > > >> >>> wrote: > > > >> >> > > > >> >>> Here are logs: > > > >> >>> > > > >> >>> app: > > > >> >>> https://gist.github.com/dvsekhvalnov/ > > f98afc3463f0c63b1722417e3710a8 > > > >> >>> e7#file-kafka-streams-log > > > >> >>> brokers: > > > >> >>> https://gist.github.com/dvsekhvalnov/ > > 8e870f7347394e8d004c282880ef38 > > > >> >>> 5a#file-kafka-broker-1-2-3-log > > > >> >>> > > > >> >>> All broker logs are same, so single gist. > > > >> >>> > > > >> >>> There are also state-change.log files, do you want to take a > look > > at > > > >> >> those > > > >> >>> as well? > > > >> >>> > > > >> >>> On Wed, Jun 28, 2017 at 4:31 PM, Bill Bejeck <b...@confluent.io > > > > > >> wrote: > > > >> >>> > > > >> >>>> Hi Dmitry, > > > >> >>>> > > > >> >>>> At the moment I don't have anything specific to look for, just > > > trying > > > >> >> to > > > >> >>>> get more context around the issue. > > > >> >>>> > > > >> >>>> As for the logs maybe broker and streams logs for the last 30 > > > minutes > > > >> >> up > > > >> >>> to > > > >> >>>> the time the application stopped processing records. > > > >> >>>> > > > >> >>>> Thanks, > > > >> >>>> Bill > > > >> >>>> > > > >> >>>> On Wed, Jun 28, 2017 at 9:04 AM, Dmitriy Vsekhvalnov < > > > >> >>>> dvsekhval...@gmail.com > > > >> >>>>> wrote: > > > >> >>>> > > > >> >>>>> Hi Bill, > > > >> >>>>> > > > >> >>>>> 1. sure, can extract some logs, what exactly to look for? > There > > > are > > > >> >> 11 > > > >> >>>>> hours of logs and most of them looks like: > > > >> >>>>> > > > >> >>>>> [2017-06-27 03:30:50,553] [] [INFO ] [StreamThread-1] > > > >> >>>>> [org.apache.kafka.streams.processor.internals.StreamThread] > > > >> >>>> [stream-thread > > > >> >>>>> [StreamThread-1] Committing all tasks because the commit > > interval > > > >> >>> 5000ms > > > >> >>>>> has elapsed] > > > >> >>>>> > > > >> >>>>> [2017-06-27 03:30:50,553] [] [INFO ] [StreamThread-1] > > > >> >>>>> [org.apache.kafka.streams.processor.internals.StreamThread] > > > >> >>>> [stream-thread > > > >> >>>>> [StreamThread-1] Committing task StreamTask 0_0] > > > >> >>>>> > > > >> >>>>> [2017-06-27 03:30:50,554] [] [INFO ] [StreamThread-1] > > > >> >>>>> [org.apache.kafka.streams.processor.internals.StreamThread] > > > >> >>>> [stream-thread > > > >> >>>>> [StreamThread-1] Committing task StreamTask 2_0] > > > >> >>>>> > > > >> >>>>> Something specific to search for? > > > >> >>>>> > > > >> >>>>> 2. Yes, there are more messages coming to topic. > > > >> >>>>> > > > >> >>>>> On Wed, Jun 28, 2017 at 3:43 PM, Bill Bejeck < > b...@confluent.io > > > > > > >> >>> wrote: > > > >> >>>>> > > > >> >>>>>> Hi Dimitry, > > > >> >>>>>> > > > >> >>>>>> I'm happy to help, but I could use more information. Can you > > > share > > > >> >>> the > > > >> >>>>>> streams logs and broker logs? > > > >> >>>>>> > > > >> >>>>>> Have you confirmed messages are still being delivered to > topics > > > >> >> (via > > > >> >>>>>> console consumer)? > > > >> >>>>>> > > > >> >>>>>> Thanks, > > > >> >>>>>> Bill > > > >> >>>>>> > > > >> >>>>>> On Wed, Jun 28, 2017 at 8:24 AM, Dmitriy Vsekhvalnov < > > > >> >>>>>> dvsekhval...@gmail.com > > > >> >>>>>>> wrote: > > > >> >>>>>> > > > >> >>>>>>> Hi all, > > > >> >>>>>>> > > > >> >>>>>>> looking for some assistance in debugging kafka-streams > > > >> >> application. > > > >> >>>>>>> > > > >> >>>>>>> Kafka broker 0.10.2.1 - x3 Node cluster > > > >> >>>>>>> kafka-streams 0.10.2.1 - x2 application nodes x 1 stream > > thread > > > >> >>>> each. > > > >> >>>>>>> > > > >> >>>>>>> In streams configuration only: > > > >> >>>>>>> - SSL transport > > > >> >>>>>>> - kafka.streams.commitIntervalMs set to 5000 (instead of > > default > > > >> >>>> 30s). > > > >> >>>>>>> > > > >> >>>>>>> We running simple aggregation app with several grouping > > streams. > > > >> >>>>> Running > > > >> >>>>>> 2 > > > >> >>>>>>> instances of an app for redundancy. Both instances were > > working > > > >> >>>> pretty > > > >> >>>>>> fine > > > >> >>>>>>> for 11 hours 15 minutes then stopped consuming new events > from > > > >> >>> topic. > > > >> >>>>>>> > > > >> >>>>>>> Hosting JVM processes were working fine, just streams > stopped > > > >> >>>> reacting > > > >> >>>>> to > > > >> >>>>>>> new data. No exceptions, errors, e.t.c. in logs. After > > restart > > > >> >>>> streams > > > >> >>>>>>> still not consuming new messages. > > > >> >>>>>>> > > > >> >>>>>>> Below is 2 last entries from kafka-streams logs from both > > hosts: > > > >> >>>>>>> > > > >> >>>>>>> [2017-06-27 14:45:09,663] [] [INFO ] [StreamThread-1] > > > >> >>>>>>> [org.apache.kafka.streams.processor.internals.StreamThread] > > > >> >>>>>> [stream-thread > > > >> >>>>>>> [StreamThread-1] Committing task StreamTask 4_2] > > > >> >>>>>>> > > > >> >>>>>>> [2017-06-27 14:45:09,723] [] [INFO ] [StreamThread-1] > > > >> >>>>>>> [org.apache.kafka.streams.processor.internals.StreamThread] > > > >> >>>>>> [stream-thread > > > >> >>>>>>> [StreamThread-1] Committing task StreamTask 2_1] > > > >> >>>>>>> > > > >> >>>>>>> Pretty puzzling why they stopped exactly same moment (with > > > >> >> respect > > > >> >>> to > > > >> >>>>>>> millis). > > > >> >>>>>>> > > > >> >>>>>>> Really appreciate any ideas where to dig to. > > > >> >>>>>>> > > > >> >>>>>>> Thank you. > > > >> >>>>>>> > > > >> >>>>>> > > > >> >>>>> > > > >> >>>> > > > >> >>> > > > >> >> > > > >> > > > >> > > > > > >