I will reply on that thread.

On Tue, May 16, 2017 at 2:37 AM, Sameer Kumar <sam.kum.w...@gmail.com>
wrote:

> Hi Guozhang,
>
> The errors have gone away after migrating both my brokers and api to
> 10.2.1.
> But, regarding the error, the specific threads moved from the running to
> the not-running state.
>
> -Sameer.
>
> On Tue, May 9, 2017 at 12:16 AM, Guozhang Wang <wangg...@gmail.com> wrote:
>
> > Hi Sameer,
> >
> > I looked at the logs, and there is only one suspicious entry:
> >
> > ```
> > 2017-05-03 14:26:54 WARN  StreamThread:1184 - Could not create task 0_21. Will retry.
> > org.apache.kafka.streams.errors.LockException: task [0_21] Failed to lock the state directory: /data/streampoc/LIC2-4/0_21
> > ```
> >
> > It repeats three times and then does not show up again, but I cannot tell
> > for sure since it is towards the end of the log file. This WARN entry is
> > not expected to be a fatal error; it should go away after some time and
> > should not hinder the app. So my questions are: 1) did you see this WARN
> > repeating forever, and 2) how long have you observed that the app is
> > stuck, and while it is stuck, does the above entry never go away?
> >
> >
> > Guozhang
> >
> >
> > On Wed, May 3, 2017 at 10:50 PM, Sameer Kumar <sam.kum.w...@gmail.com>
> > wrote:
> >
> > > My brokers are on version 10.1.0 and my clients are on version 10.2.0.
> > > Also, please reply to all; I am currently not subscribed to the mailing
> > > list.
> > >
> > > -Sameer.
> > >
> > > On Wed, May 3, 2017 at 5:27 PM, Sameer Kumar <sam.kum.w...@gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > >
> > > >
> > > > I want to report an issue where adding a server at runtime to my
> > > > streams compute cluster caused errors and subsequently halted the
> > > > cluster completely. I am not sure if this is the actual cause, but it
> > > > was the one thing I did differently after an 18-hour smooth run of the
> > > > streams app.
> > > >
> > > >
> > > >
> > > > Initially, I had one machine working on my Kafka topic, which contains
> > > > impressions and clicks. The job ran overnight; in the morning I added
> > > > another machine to the cluster, and since then the app gets stuck every
> > > > time after working fine for a while.
> > > >
> > > >
> > > >
> > > > Please find the kafka_log_snippet and poc_log_snippet attached.
> > > >
> > > >
> > > >
> > > > After these nodes failed, I tried restarting just one machine in my
> > > > compute cluster to see if it could initialize itself.
> > > >
> > > > Please see the logs attached for that as well. The following are the
> > > > log entries I saw quite often.
> > > >
> > > >
> > > >
> > > > 2017-05-03 14:15:53 DEBUG Fetcher:526 - Ignoring fetched records for LIC2-4-licountci-4-changelog-38 at offset 556717 since the current position is 557065
> > > >
> > > > 2017-05-03 14:15:53 DEBUG Fetcher:180 - Sending fetch for partitions [LIC2-4-licountci-4-changelog-38] to broker 172.29.65.190:9092 (id: 0 rack: null)
> > > >
> > > > 2017-05-03 14:15:53 DEBUG Fetcher:526 - Ignoring fetched records for LIC2-4-licountci-4-changelog-48 at offset 607657 since the current position is 607880
> > > >
> > > > 2017-05-03 14:15:53 DEBUG Fetcher:180 - Sending fetch for partitions [LIC2-4-licountci-4-changelog-48] to broker 172.29.65.192:9092 (id: 2 rack: null)
> > > >
> > > > 2017-05-03 14:15:53 DEBUG Fetcher:526 - Ignoring fetched records for LIC2-4-licountci-4-changelog-31 at offset 282265 since the current position is 282327
> > > >
> > > > 2017-05-03 14:15:53 DEBUG Fetcher:180 - Sending fetch for partitions [LIC2-4-licountci-4-changelog-31] to broker 172.29.65.191:9092 (id: 1 rack: null)
> > > >
> > > > 2017-05-03 14:15:53 DEBUG Fetcher:526 - Ignoring fetched records for LIC2-4-licountci-4-changelog-3 at offset 499952 since the current position is 500324
> > > >
> > > > 2017-05-03 14:15:53 DEBUG Fetcher:180 - Sending fetch for partitions [LIC2-4-licountci-4-changelog-3] to broker 172.29.65.192:9092 (id: 2 rack: null)
> > > >
> > > > 2017-05-03 14:15:53 DEBUG Fetcher:526 - Ignoring fetched records for LIC2-4-licountci-4-changelog-21 at offset 587018 since the current position is 587227
> > > >
> > > > 2017-05-03 14:15:53 DEBUG Fetcher:180 - Sending fetch for partitions [LIC2-4-licountci-4-changelog-21] to broker 172.29.65.192:9092 (id: 2 rack: null)
> > > >
> > > > 2017-05-03 14:15:53 DEBUG Fetcher:526 - Ignoring fetched records for LIC2-4-licountci-4-changelog-49 at offset 276209 since the current position is 276271
> > > >
> > > > 2017-05-03 14:15:53 DEBUG Fetcher:180 - Sending fetch for partitions [LIC2-4-licountci-4-changelog-49] to broker 172.29.65.191:9092 (id: 1 rack: null)
> > > >
> > > > 2017-05-03 14:15:53 DEBUG Fetcher:526 - Ignoring fetched records for LIC2-4-licountci-4-changelog-16 at offset 592727 since the current position is 592896
> > > >
> > > > 2017-05-03 14:15:53 DEBUG Fetcher:180 - Sending fetch for partitions [LIC2-4-licountci-4-changelog-16] to broker 172.29.65.191:9092 (id: 1 rack: null)
> > > >
> > > > 2017-05-03 14:15:53 DEBUG Fetcher:526 - Ignoring fetched records for LIC2-4-licountci-4-changelog-37 at offset 458224 since the current position is 458343
> > > >
> > > > 2017-05-03 14:15:53 DEBUG Fetcher:180 - Sending fetch for partitions [LIC2-4-licountci-4-changelog-37] to broker 172.29.65.191:9092 (id: 1 rack: null)
> > > >
> > > > 2017-05-03 14:15:53 DEBUG Fetcher:526 - Ignoring fetched records for LIC2-4-licountci-4-changelog-59 at offset 495722 since the current position is 496113
> > > >
> > > > 2017-05-03 14:15:53 DEBUG Fetcher:180 - Sending fetch for partitions [LIC2-4-licountci-4-changelog-59] to broker 172.29.65.190:9092 (id: 0 rack: null)
> > > >
> > > > 2017-05-03 14:15:53 DEBUG Fetcher:526 - Ignoring fetched records for LIC2-4-licountci-4-changelog-35 at offset 230310 since the current position is 231236
> > > >
> > > > 2017-05-03 14:15:53 DEBUG Fetcher:180 - Sending fetch for partitions [LIC2-4-licountci-4-changelog-35] to broker 172.29.65.190:9092 (id: 0 rack: null)
> > > >
> > > >
> > > >
> > > > Regards,
> > > >
> > > > -Sameer.
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>



-- 
-- Guozhang
