[ https://issues.apache.org/jira/browse/KAFKA-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16607436#comment-16607436 ]

John Roesler edited comment on KAFKA-7214 at 9/7/18 6:00 PM:
-------------------------------------------------------------

Since you do not have an OutOfMemoryError in your logs, I can only assume 
your application did not run out of memory. How can we report "out of memory" 
if the application is not actually out of memory?

Your problem seems to be caused by long GC pauses, not by running out of memory, 
but we cannot confirm this, since you have not reported your GC logs. You can 
enable GC logging (the JVM provides options for this) to investigate the 
problem further if you really wish to run the app in a memory-constrained 
environment.
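
For example, here is a minimal sketch of enabling GC logging, assuming a JDK 8 
runtime (typical for these Kafka versions); the heap size, log path, and jar 
name are placeholders:

{code}
# JDK 8: write detailed, timestamped GC events to a log file
java -Xmx512m \
     -verbose:gc \
     -XX:+PrintGCDetails \
     -XX:+PrintGCDateStamps \
     -Xloggc:/var/log/my_application/gc.log \
     -jar my-streams-app.jar

# JDK 9+ uses unified logging instead:
#   java -Xlog:gc*:file=/var/log/my_application/gc.log -jar my-streams-app.jar
{code}

If that log shows long pauses (for example, full GCs comparable to the 
consumer's poll/session timeouts), that would support the GC-pause explanation.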

 

Every aspect of the application's runtime performance, including memory, will 
be dominated by what exactly your application does and what data it's 
processing. There's no fixed amount of "overhead" in Kafka Streams. Two 
different topologies will have different amounts of overhead based on the 
computations they need to do.

Honestly, I think the procedure you have followed to set your heap size is 
perfectly fine. It's very similar to what I would have done. If you really need 
to come up with a formal characterization of the memory usage for _your_ 
topology in terms of throughput, it's something that can only be done by you. 
The approach I'd recommend is to run with a few different configurations and 
analyze heap dumps at a few different points in the lifecycle.

This is the same procedure you would follow to characterize the required heap 
for any Java application, not just Streams.
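
As a rough sketch of that procedure (standard JDK tooling; the heap size, 
paths, and pid are placeholders):

{code}
# Capture a heap dump automatically if the heap is ever exhausted
java -Xmx512m \
     -XX:+HeapDumpOnOutOfMemoryError \
     -XX:HeapDumpPath=/var/log/my_application/ \
     -jar my-streams-app.jar

# Or take a dump on demand at chosen points in the lifecycle
# (startup, after rebalancing, under steady load) and compare them
# in a heap analyzer such as Eclipse MAT or VisualVM:
jmap -dump:live,format=b,file=heap-steady-state.hprof <pid>
{code}

Repeating this for a few heap sizes and throughput levels gives you the 
characterization you are after.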

 

About this:

> Memory consumption or memory model described in Kafka documentation does not 
> fit to reality.

Can you point me to the documentation that's incorrect? We can certainly fix 
anything that's documented wrong.


> Mystic FATAL error
> ------------------
>
>                 Key: KAFKA-7214
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7214
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 0.11.0.3, 1.1.1
>            Reporter: Seweryn Habdank-Wojewodzki
>            Priority: Critical
>
> Dears,
> Very often at startup of the streaming application I get an exception:
> {code}
> Exception caught in process. taskId=0_1, processor=KSTREAM-SOURCE-0000000000, topic=my_instance_medium_topic, partition=1, offset=198900203;
> [org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:212),
>  org.apache.kafka.streams.processor.internals.AssignedTasks$2.apply(AssignedTasks.java:347),
>  org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:420),
>  org.apache.kafka.streams.processor.internals.AssignedTasks.process(AssignedTasks.java:339),
>  org.apache.kafka.streams.processor.internals.StreamThread.processAndPunctuate(StreamThread.java:648),
>  org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:513),
>  org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:482),
>  org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:459)]
> in thread my_application-my_instance-my_instance_medium-72ee1819-edeb-4d85-9d65-f67f7c321618-StreamThread-62
> {code}
> and then (without shutdown request from my side):
> {code}
> 2018-07-30 07:45:02 [ar313] [INFO ] StreamThread:912 - stream-thread [my_application-my_instance-my_instance-72ee1819-edeb-4d85-9d65-f67f7c321618-StreamThread-62] State transition from PENDING_SHUTDOWN to DEAD.
> {code}
> What is this?
> How to correctly handle it?
> Thanks in advance for help.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
