[ 
https://issues.apache.org/jira/browse/KAFKA-9270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16989410#comment-16989410
 ] 

Rohan Kulkarni edited comment on KAFKA-9270 at 12/6/19 5:41 AM:
----------------------------------------------------------------

Thanks [~vvcephei]. Appreciate your help on this.

Just to clarify: as I understand it, KAFKA-8803 targets the TimeoutException 
in the initTxn and producer flow. Will that issue also cover the consumer 
commitOffset flow?

Otherwise, it would be better to keep the current ticket open for independent 
tracking.


was (Author: rohan26may):
Thanks [~vvcephei]. Appreciate your help on this.

> KafkaStream crash on offset commit failure
> ------------------------------------------
>
>                 Key: KAFKA-9270
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9270
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 2.0.1
>            Reporter: Rohan Kulkarni
>            Priority: Critical
>
> On our production server we intermittently observe Kafka Streams crashing 
> with a TimeoutException while committing offsets. The only workaround seems 
> to be restarting the application, which is not a desirable solution for a 
> production environment.
>  
> We have already implemented a ProductionExceptionHandler, but it does not 
> seem to address this.
>  
> Please provide a fix for this or a viable workaround.
>  
> +Application side logs:+
> 2019-11-17 08:28:48.055 +0000 
> [AggregateJob-614fe688-c9a4-4dad-a881-71488030918b-StreamThread-1] [ERROR] - 
> org.apache.kafka.streams.processor.internals.AssignedStreamsTasks 
> [org.apache.kafka.streams.processor.internals.AssignedTasks:applyToRunningTasks:373]
>  - stream-thread 
> [AggregateJob-614fe688-c9a4-4dad-a881-71488030918b-StreamThread-1] *Failed to 
> commit stream task 0_1 due to the following error:*
>  *org.apache.kafka.common.errors.TimeoutException: Timeout of 60000ms expired 
> before successfully committing offsets* 
> {AggregateJob-1=OffsetAndMetadata{offset=176729402, metadata=''}}
>  
> 2019-11-17 08:29:00.891 +0000 
> [AggregateJob-614fe688-c9a4-4dad-a881-71488030918b-StreamThread-1] [ERROR] -  
>   [:lambda$init$2:130] - Stream crashed!!! StreamsThread threadId: 
> AggregateJob-614fe688-c9a4-4dad-a881-71488030918b-StreamThread-1
> 2019-11-17 08:29:00.891 +0000 
> [AggregateJob-614fe688-c9a4-4dad-a881-71488030918b-StreamThread-1] [ERROR] -  
>   [:lambda$init$2:130] - Stream crashed!!! StreamsThread threadId: 
> AggregateJob-614fe688-c9a4-4dad-a881-71488030918b-StreamThread-1
> TaskManager MetadataState: GlobalMetadata: [] GlobalStores: [] My HostInfo: 
> HostInfo{host='unknown', port=-1} Cluster(id = null, nodes = [], partitions 
> = [], controller = null) Active tasks: Running: Suspended: Restoring: New: 
> Standby tasks: Running: Suspended: Restoring: New:
>  org.apache.kafka.common.errors.*TimeoutException: Timeout of 60000ms expired 
> before successfully committing offsets* 
> {AggregateJob-0=OffsetAndMetadata{offset=189808059, metadata=''}}
>  
> +Kafka broker logs:+
> [2019-11-17 13:53:22,774] WARN *Client session timed out, have not heard from 
> server in 6669ms for sessionid 0x10068e4a2944c2f* 
> (org.apache.zookeeper.ClientCnxn)
>  [2019-11-17 13:53:22,809] INFO Client session timed out, have not heard from 
> server in 6669ms for sessionid 0x10068e4a2944c2f, closing socket connection 
> and attempting reconnect (org.apache.zookeeper.ClientCnxn)
>  
> Regards,
> Rohan



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
