Hi Yuval

First of all, large savepoint metadata should not require a very large akka 
frame size. The metadata is written to the external file system with a plain 
IO write [1] rather than being sent as an RPC message.
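
To illustrate the idea, here is a minimal sketch (not the actual 
PendingCheckpoint code from [1]; the class and method names below are only 
for illustration) of how already-serialized metadata bytes end up in the 
checkpoint directory through Flink's FileSystem abstraction, so the size is 
bounded by the file system, not by akka.framesize:

    import org.apache.flink.core.fs.FSDataOutputStream;
    import org.apache.flink.core.fs.FileSystem;
    import org.apache.flink.core.fs.Path;

    public class MetadataWriteSketch {
        // Illustrative only: stream serialized metadata bytes to the
        // savepoint/checkpoint directory via a file-system output stream.
        static void writeMetadata(byte[] serializedMetadata, String checkpointDir) throws Exception {
            Path metadataPath = new Path(checkpointDir, "_metadata");
            FileSystem fs = metadataPath.getFileSystem();
            try (FSDataOutputStream out = fs.create(metadataPath, FileSystem.WriteMode.NO_OVERWRITE)) {
                out.write(serializedMetadata); // plain IO write, no akka RPC payload involved
            }
        }
    }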

Secondly, a savepoint does not store any configuration; it only stores the 
checkpointed state.

BTW, why do you have such a large RPC message, over 1GB?

[1] 
https://github.com/apache/flink/blob/f705f0af6ba50f6e68c22484d1daeda842518d27/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/PendingCheckpoint.java#L313

Best
Yun Tang
________________________________
From: Yuval Itzchakov <yuva...@gmail.com>
Sent: Thursday, October 15, 2020 21:22
To: user <user@flink.apache.org>
Subject: akka.framesize configuration does not affect runtime execution

Hi,

Due to a very large savepoint metadata file (3GB+), I've set the required 
akka.framesize to 5GB. I set this via the `akka.framesize` property in 
flink-conf.yaml.
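
For reference, the entry in flink-conf.yaml looks roughly like this (the 
exact value syntax is my assumption; depending on the Flink version the size 
may need to be given in bytes with a trailing "b", like the default 
"10485760b"; 5368709120 bytes is 5 GiB):

    akka.framesize: 5368709120b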

When trying to recover from the savepoint, the JM emits the following error 
message:

"thread":"flink-akka.actor.default-dispatcher-30"
"level":"ERROR"
"loggerName":"akka.remote.EndpointWriter"
"message":"Transient "Discarding oversized payload sent to 
Actor[akka.tcp://flink@XXX:XXX/user/taskmanager_0#369979612]: max allowed size 
1073741824 bytes, actual size of encoded class 
org.apache.flink.runtime.rpc.messages.RemoteRpcInvocation was 1610683118 
"name":"akka.remote.OversizedPayloadException"

As I recall, when the savepoint was taken the maximum framesize was indeed 
set to 1GB.

Could it be that akka.framesize is being recovered from the stored savepoint, 
thus not allowing me to re-configure the maximum size of the payload?

--
Best Regards,
Yuval Itzchakov.
