Re: Fast restart of a job with a large state

2019-04-24 Thread Sergey Zhemzhitsky
Hi Till, Thanks for the info! It's good to know. Regards, Sergey On Wed, Apr 24, 2019, 13:08 Till Rohrmann wrote: > Hi Sergey, > > at the moment neither local nor incremental savepoints are supported in > Flink afaik. There were some ideas wrt incremental savepoints floating > around in the c

Re: Fast restart of a job with a large state

2019-04-24 Thread Till Rohrmann
Hi Sergey, at the moment neither local nor incremental savepoints are supported in Flink afaik. There were some ideas wrt incremental savepoints floating around in the community but nothing concrete yet. Cheers, Till On Tue, Apr 23, 2019 at 6:58 PM Sergey Zhemzhitsky wrote: > Hi Stefan, Paul,

Re: Fast restart of a job with a large state

2019-04-23 Thread Sergey Zhemzhitsky
Hi Stefan, Paul, Thanks for the tips! So far I have tried neither rescaling from checkpoints nor task-local recovery; both are now on my list to test. If it becomes necessary not just to rescale a job but also to change its DAG, is there a way to have something like, let's call it "local

Re: Fast restart of a job with a large state

2019-04-18 Thread Stefan Richter
Hi, If rescaling is the problem, let me clarify that you can currently rescale from savepoints and all types of checkpoints (including incremental). If that was the only problem, then there is nothing to worry about - the documentation is only a bit conservative about this because we will not c

Re: Fast restart of a job with a large state

2019-04-18 Thread Paul Lam
Hi, Have you tried task-local recovery [1]? [1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/checkpoints.html#retained-checkpoints Best, Paul Lam > On Apr 17, 2019, at 17:46, Sergey Zhemzhitsky wrote: > > Hi Flinkers, > > Operating different flink jobs I've discovered that job rest

Re: Fast restart of a job with a large state

2019-04-18 Thread Paul Lam
The URL in my previous mail is wrong; it should be: https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/large_state_tuning.html#task-local-recovery Best, Paul Lam
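
For readers following the suggestions in this thread, here is a minimal configuration sketch of how incremental checkpoints and task-local recovery might be switched on together. It assumes a Flink 1.8-era setup with the RocksDB state backend; the keys are the ones documented for that release, while the paths are purely illustrative and not taken from the thread. Treat it as a starting point to verify against the linked documentation, not as the posters' actual setup.

```yaml
# flink-conf.yaml (sketch; paths are hypothetical placeholders)

# Incremental checkpoints are only available with the RocksDB backend.
state.backend: rocksdb
state.backend.incremental: true

# Keep a secondary copy of the state on each TaskManager's local disk,
# so a restarted task can restore locally instead of re-downloading.
state.backend.local-recovery: true

# Primary (remote) checkpoint storage.
state.checkpoints.dir: hdfs:///flink/checkpoints

# Optional: where the local state copies are stored.
# taskmanager.state.local.root-dirs: /data/flink/local-state
```

Task-local recovery only helps when a task is rescheduled onto the same TaskManager after a failure; it does not speed up a restart from a savepoint, which is the case the original question describes.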

Fast restart of a job with a large state

2019-04-17 Thread Sergey Zhemzhitsky
Hi Flinkers, While operating various Flink jobs I've discovered that restarting a job with a fairly large state (up to 100GB+ in my case) takes quite a lot of time. For example, to restart a job (e.g. to update it) a savepoint is created, and in the case of savepoints all the state seems to be pushed