Scaling Flink Jobs without Restarting Job

Talat Uyarer via dev Sun, 23 Jul 2023 00:28:52 -0700

Hi,

We are using Flink with the Adaptive Scheduler (Reactive Mode) on Kubernetes,
in a Standalone deployment in Application mode, for our streaming
infrastructure. Our autoscaler scales our jobs up and down. However,
each scaling action causes a job restart.


Our customers complain about fluctuations in the traffic we send them. Is
there any way to reschedule tasks and recalculate the graph without
restarting the whole job? Or to reduce the restart time?

The job's max parallelism is set to 2x maxWorker, and we use GCS for
checkpoint storage. I know that rescaling stateful jobs requires key groups
to be redistributed. But we also have stateless jobs, such as reading from
Kafka, extracting data, and writing to a sink. If you can provide some entry
points, we can start implementing support for those jobs.
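
To illustrate the key-group redistribution mentioned above, here is a
minimal sketch. It assumes the key-group-to-subtask mapping is
key_group * parallelism / max_parallelism (integer division), which is my
understanding of Flink's KeyGroupRangeAssignment. The function names and
the numbers are only illustrative, not taken from our jobs.

```python
def operator_index_for_key_group(max_parallelism: int, parallelism: int, key_group: int) -> int:
    """Subtask index that owns a key group (assumed formula, see note above)."""
    return key_group * parallelism // max_parallelism


def key_groups_that_move(max_parallelism: int, old_parallelism: int, new_parallelism: int) -> list[int]:
    """Key groups whose owning subtask index changes when parallelism changes."""
    return [
        kg
        for kg in range(max_parallelism)
        if operator_index_for_key_group(max_parallelism, old_parallelism, kg)
        != operator_index_for_key_group(max_parallelism, new_parallelism, kg)
    ]


# Hypothetical numbers: max parallelism 8, scaling from 2 to 4 subtasks.
moved = key_groups_that_move(8, 2, 4)
```

With these hypothetical numbers, 6 of the 8 key groups change owner, so
keyed state has to be handed over. A stateless job has no such state to
move, which is why we hope it can be rescaled without a full restart.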

Thanks
