[
https://issues.apache.org/jira/browse/KAFKA-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16587902#comment-16587902
]
Guozhang Wang commented on KAFKA-7214:
--------------------------------------
[~habdank] The issue reported in the original description and the issue
reported in your comment above are quite different. The former is an
exception thrown from {{StreamTask.process}}, indicating that something went
wrong while processing a specific record (it may be an issue in the Streams
library, an ill-formatted record, or an edge case in the user code). The
latter is an exception thrown from {{StreamTask.commitOffsets}}, which throws a
{{CommitFailedException}} indicating that a rebalance has happened. I'll assume
your request is about troubleshooting the second issue, not the first one.
Since your {{max.poll.interval.ms}} config is already very large, I think it is
not the consumer caller thread that had a long pause. Rather, the underlying
heartbeat thread may have been stalled by a GC pause, so it could not send the
heartbeat request in time and the consumer got kicked out of the group as a
result. As [~mjsax] mentioned, such a {{CommitFailedException}} will be caught
as a {{TaskMigratedException}} and handled gracefully: an ERROR is logged, but
the stream thread does not actually die.
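If a stalled heartbeat thread is indeed the cause, one thing that could be tried is giving the embedded consumer more slack before the coordinator evicts it. The snippet below is only a minimal sketch under that assumption, not a confirmed fix for this ticket: the class name, the helper {{consumerOverrides}} and the 30-second value are hypothetical, while {{session.timeout.ms}}, {{heartbeat.interval.ms}} and the {{consumer.}} prefix are the standard Kafka config names. Required Streams settings such as {{application.id}} and {{bootstrap.servers}} are left out.

```java
import java.util.Properties;

public class HeartbeatTolerantConfig {

    // Kafka Streams forwards settings carrying this prefix to its embedded consumer.
    static final String CONSUMER_PREFIX = "consumer.";

    // Hypothetical helper: builds consumer overrides that tolerate longer pauses
    // of the heartbeat thread (for example, a long GC) before a rebalance kicks in.
    static Properties consumerOverrides(int sessionTimeoutMs) {
        if (sessionTimeoutMs <= 0) {
            throw new IllegalArgumentException("sessionTimeoutMs must be positive");
        }
        Properties props = new Properties();
        // How long the coordinator waits without a heartbeat before evicting the member.
        props.setProperty(CONSUMER_PREFIX + "session.timeout.ms",
                String.valueOf(sessionTimeoutMs));
        // Kafka's documentation suggests keeping this at no more than 1/3 of the session timeout.
        props.setProperty(CONSUMER_PREFIX + "heartbeat.interval.ms",
                String.valueOf(sessionTimeoutMs / 3));
        return props;
    }

    public static void main(String[] args) {
        // Assumed illustrative value: 30 seconds.
        Properties props = consumerOverrides(30_000);
        props.stringPropertyNames().stream()
                .sorted()
                .forEach(k -> System.out.println(k + "=" + props.getProperty(k)));
    }
}
```

These properties would be merged into the ones passed to {{KafkaStreams}}. Note that the broker rejects session timeouts outside its {{group.min.session.timeout.ms}} / {{group.max.session.timeout.ms}} range, and that a larger timeout also delays detection of instances that are really dead.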
> Mystic FATAL error
> ------------------
>
> Key: KAFKA-7214
> URL: https://issues.apache.org/jira/browse/KAFKA-7214
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 0.11.0.3, 1.1.1
> Reporter: Seweryn Habdank-Wojewodzki
> Priority: Critical
>
> Dears,
> Very often at startup of the streaming application I get the following exception:
> {code}
> Exception caught in process. taskId=0_1, processor=KSTREAM-SOURCE-0000000000,
> topic=my_instance_medium_topic, partition=1, offset=198900203;
> [org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:212),
> org.apache.kafka.streams.processor.internals.AssignedTasks$2.apply(AssignedTasks.java:347),
> org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:420),
> org.apache.kafka.streams.processor.internals.AssignedTasks.process(AssignedTasks.java:339),
> org.apache.kafka.streams.processor.internals.StreamThread.processAndPunctuate(StreamThread.java:648),
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:513),
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:482),
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:459)]
> in thread
> my_application-my_instance-my_instance_medium-72ee1819-edeb-4d85-9d65-f67f7c321618-StreamThread-62
> {code}
> and then (without shutdown request from my side):
> {code}
> 2018-07-30 07:45:02 [ar313] [INFO ] StreamThread:912 - stream-thread
> [my_application-my_instance-my_instance-72ee1819-edeb-4d85-9d65-f67f7c321618-StreamThread-62]
> State transition from PENDING_SHUTDOWN to DEAD.
> {code}
> What is this?
> How can I handle it correctly?
> Thanks in advance for help.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)