Hi,

We are using Flink with the Adaptive Scheduler (Reactive Mode) on
Kubernetes, as a Standalone deployment in Application Mode, for our
streaming infrastructure. Our autoscaler scales our jobs up and down.
However, each scaling action causes a job restart.

Our customers complain about fluctuations in the traffic we send them. Is
there any way to reschedule tasks and recalculate the graphs without
restarting the whole job? Or to reduce the restart time?

Each job's max parallelism is set to 2x maxWorker, and we use GCS for
checkpoint storage. I know that rescaling stateful jobs requires key
groups to be redistributed. But we also have stateless jobs, such as ones
that read from Kafka, extract data, and write to a sink. If you can point
us to some entry points, we can start implementing support for those jobs.
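
To make the key-group point concrete, here is a minimal sketch of how I
understand the redistribution. It is an assumption on my side, not Flink's
actual code: the class and method names are hypothetical, and the formula
(key group * parallelism / max parallelism) is my reading of how key groups
are assigned to subtasks. The numbers (4 workers, max parallelism 8) are
illustrative only.

```java
public class KeyGroupSketch {

    // Index of the subtask that owns a key group (assumed formula, using
    // integer division): keyGroup * parallelism / maxParallelism.
    static int operatorIndexForKeyGroup(int keyGroup, int parallelism, int maxParallelism) {
        return keyGroup * parallelism / maxParallelism;
    }

    // Counts the key groups whose owning subtask index changes when the
    // job is rescaled from oldParallelism to newParallelism.
    static int movedKeyGroups(int oldParallelism, int newParallelism, int maxParallelism) {
        int moved = 0;
        for (int kg = 0; kg < maxParallelism; kg++) {
            int before = operatorIndexForKeyGroup(kg, oldParallelism, maxParallelism);
            int after = operatorIndexForKeyGroup(kg, newParallelism, maxParallelism);
            if (before != after) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        // Illustrative values: maxWorker = 4, so max parallelism = 8.
        // Scaling down from 4 to 3 subtasks changes the owner of 5 of 8 key groups.
        System.out.println(movedKeyGroups(4, 3, 8)); // prints 5
    }
}
```

For our stateless jobs there is no keyed state behind these key groups,
which is why we hope a lighter-weight rescale could be possible there.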

Thanks
