Hi,

There could be many reasons for exceeding the Akka framesize, for example:
1. "inlined" state that is stored in the checkpoint .metadata file (rather than in "data" files, see [1])
2. broadcast state, as you mentioned (though only the metadata is sent unless the data is below the threshold from item 1)
3. too many state handles in case of aggressive downscaling
4. too many upstream/downstream tasks in case of high parallelism (as the task receives the needed information about them)
5. the task reports too many metrics or other details (e.g. an exception stack trace or maybe a flame graph)
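As a rough illustration of the first and third reasons, here is a back-of-the-envelope sketch (not something Flink provides) that bounds how many bytes of inlined state a single RPC could carry. It assumes the default values akka.framesize = 10485760b and state.storage.fs.memory-threshold = 20kb, and that every state handle in the message is inlined at the full threshold size. The function names and the handle counts are hypothetical, and real messages carry additional serialization overhead, so treat the result as an order-of-magnitude check only.

```python
# Rough worst-case estimate; function names and handle counts are illustrative assumptions.
AKKA_FRAMESIZE_BYTES = 10 * 1024 * 1024  # Flink default akka.framesize (10485760b)
MEMORY_THRESHOLD_BYTES = 20 * 1024       # Flink default state.storage.fs.memory-threshold (20kb)


def worst_case_inline_bytes(num_state_handles, threshold_bytes=MEMORY_THRESHOLD_BYTES):
    """Upper bound on inlined state: every handle is assumed to sit at the threshold."""
    return num_state_handles * threshold_bytes


def fits_in_frame(num_state_handles,
                  framesize_bytes=AKKA_FRAMESIZE_BYTES,
                  threshold_bytes=MEMORY_THRESHOLD_BYTES):
    """True if the worst-case inlined state alone stays below the frame size."""
    return worst_case_inline_bytes(num_state_handles, threshold_bytes) < framesize_bytes


if __name__ == "__main__":
    # Hypothetical handle counts, e.g. before and after an aggressive downscale.
    for handles in (100, 600):
        print(handles, worst_case_inline_bytes(handles), fits_in_frame(handles))
```

Under these assumptions, a few hundred small state handles collected into one task after downscaling could be enough to cross the default limit, which might explain a job running fine for months and then failing.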
[1] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/config/#state-storage-fs-memory-threshold

Regards,
Roman

On Tue, Sep 28, 2021 at 3:32 AM Deshpande, Omkar <omkar_deshpa...@intuit.com> wrote:
>
> Hello,
>
> We run a lot of flink applications. Some of them sometimes run into this
> error on Job Manager-
> The rpc invocation size exceeds the maximum akka framesize
>
> After we increase the framesize the application starts working again.
> What factors determine the akka framesize? We sometimes see applications run
> without this issue for months and then run into this error.
> How can we determine the framesize before running into this error?
>
> Thanks,
> Omkar