A potential bug for Kubernetes HA in Flink 1.15.1

2023-05-31 Thread Wei Hou via user
Hello Team, I would like to bring attention to a potential bug regarding Kubernetes HA in Flink 1.15. In our implementation, we use the TRAP command in our entrypoint script to perform cleanup tasks based on the exit code of the JobManager. However, we have observed an issue where, when using

akka.remote.OversizedPayloadException after we upgrade to Flink 1.15

2023-05-08 Thread Wei Hou via user
Hi Team, We hit an issue after upgrading our job from Flink 1.12 to 1.15: there is a consistent akka.remote.OversizedPayloadException after the job restarts: Transient association error (association remains live) akka.remote.OversizedPayloadException: Discarding oversized payload sent to Actor[akka.

Re: Can I setup standby taskmanagers while using reactive mode?

2023-04-27 Thread Wei Hou via user
Reactive mode doesn't support standby taskmanagers. As you said it always uses all available resources in the cluster. I can see it being useful though to not always scale to MAX but (MAX - some_offset). I'd suggest to file a

Can I setup standby taskmanagers while using reactive mode?

2023-04-25 Thread Wei Hou via user
Hi Flink community, We are trying to use Flink's reactive mode with Kubernetes HPA for autoscaling. However, since reactive mode always uses all available resources, it causes a problem when we need standby task managers for fast failure recovery: The job will always use these extra stand