Jacob Sevart created FLINK-14618:
------------------------------------

             Summary: Give more detailed debug information on akka framesize exception
                 Key: FLINK-14618
                 URL: https://issues.apache.org/jira/browse/FLINK-14618
             Project: Flink
          Issue Type: Improvement
          Components: Documentation, Runtime / Network
    Affects Versions: 1.6.3
            Reporter: Jacob Sevart

I'm hitting the akka framesize limit in production with some regularity, often when the job has been running for a long time and we try to deploy or restart. I suspect it's checkpoint-related because clearing the checkpoint enables the job to start up.

The [guidance|https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html] says:
{quote}If Flink fails because messages exceed this limit, then you should increase it.
{quote}
The [error message|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/rpc/akka/AkkaInvocationHandler.java#L270] is not very helpful towards that end. How large does it need to be? How do I know whether increasing the size will fix it, or if the message is unreasonably large due to a bug?

I'd like to modify the exception message to report the value of size.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)