Thanks Matthias,

I think it's related to
https://issues.apache.org/jira/browse/KAFKA-8802. Once I disabled the
cache, bytes out  goes up a lot. Before the bytes out are kept at
14MB/S no matter bytes in is high or low.
I'm trying to download the 2.3.1-rc2 build and try it again.

On Thu, Oct 24, 2019 at 1:27 AM Matthias J. Sax <matth...@confluent.io> wrote:
>
> >> 1) The app has a huge consuming lag on both source topic and internal
> >> repartition topics , ~5 M messages and keeps growing. Will the lag
> >> lead to this timeout exception? My understanding is the app polls too
> >> many messages before it could send out even the lag indicates it still
> >> polls too slow?
>
> polling and writing messages are independent. Also, an increasing
> consumer lag itself, would not lead to a TimeoutException.
>
> It's hard to say in general, but it might be a network issue? Did you
> try to increase `retries` config as suggested by the error message?
> Did you also check the health of the brokers? If the network is stable
> but brokers are unhealthy, it may also lead to timeout exceptions.
>
> For the increasing consumer lag, scaling out the application should
> actually help.
>
>
> -Matthias
>
>
> On 9/27/19 2:03 PM, Xiyuan Hu wrote:
> > Hi,
> >
> > I'm running Kafka Streams v2.1.0 with windowing function and 3 threads
> > per node. The input traffic is about 120K messages/sec. Once deploy,
> > after couple minutes, some thread will get TimeoutException and goes
> > to DEAD state.
> >
> > 2019-09-27 13:04:34,449 ERROR [client-StreamThread-2]
> > o.a.k.s.p.i.AssignedStreamsTasks stream-thread [client-StreamThread-2]
> > Failed to commit stream task 1_35 due to the following error:
> > org.apache.kafka.streams.errors.StreamsException: task [1_35] Abort
> > sending since an error caught with a previous record (key
> > 174044298638db0 value [B@33423c0 timestamp 1569613689747) to topic
> > XXX-KSTREAM-REDUCE-STATE-STORE-0000000003-changelog due to
> > org.apache.kafka.common.errors.TimeoutException: Expiring 45 record(s)
> > for XXX-KSTREAM-REDUCE-STATE-STORE-0000000003-changelog-35:300102 ms
> > has passed since batch creation
> > You can increase producer parameter `retries` and `retry.backoff.ms`
> > to avoid this error.
> > at 
> > org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:133)
> > Caused by: org.apache.kafka.common.errors.TimeoutException: Expiring
> > 45 record(s) for
> > XXX-KSTREAM-REDUCE-STATE-STORE-0000000003-changelog-35:300102 ms has
> > passed since batch creation
> >
> > I changed linger.ms to 100 and request.timeout.ms to 300000.
> >
> > A couple questions:
> > 1) The app has a huge consuming lag on both source topic and internal
> > repartition topics , ~5 M messages and keeps growing. Will the lag
> > lead to this timeout exception? My understanding is the app polls too
> > many messages before it could send out even the lag indicates it still
> > polls too slow?
> >
> > 2) I checked the thread status from streams.localThreadsMetadata() and
> > found that, a lot threads are switching between RUNNING and
> > PARTITIONS_REVOKED. Eventually, stuck at PARTITIONS_REVOKED. I didn't
> > see any error message from the log related to this. What might be a
> > good approach to trace the cause?
> >
> > Thanks a lot! Any help is appreciated!
> >
>

Reply via email to