Hi Peter, there's no need to worry about transient members as the operator itself is not serialized - only the state itself, depending on the state back-end.
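For reference, the difference between your options (1) and (2) only shows up when a function object goes through Java serialization. Below is a minimal plain-Java sketch of that difference. BufferingFunction and its open() method are hypothetical stand-ins written for this mail, not Flink classes, and the sketch makes no claim about when Flink serializes anything:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class TransientDemo {

    /** Hypothetical stand-in for a Rich function; NOT a Flink class. */
    static class BufferingFunction implements Serializable {
        private static final long serialVersionUID = 1L;

        // Option (1): non-transient, set in the constructor, survives serialization.
        private final int batchSize;

        // Option (2): transient, skipped by serialization, rebuilt in open().
        private transient List<String> buffer;

        BufferingFunction(int batchSize) {
            this.batchSize = batchSize;
        }

        /** Stand-in for an open()-style lifecycle hook. */
        void open() {
            buffer = new ArrayList<>(batchSize);
        }

        boolean isOpen() {
            return buffer != null;
        }

        int batchSize() {
            return batchSize;
        }
    }

    /** Serializes and deserializes the function with plain Java serialization. */
    static BufferingFunction roundTrip(BufferingFunction f)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(f);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (BufferingFunction) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        BufferingFunction original = new BufferingFunction(16);
        original.open();

        BufferingFunction copy = roundTrip(original);
        System.out.println("batchSize after round trip: " + copy.batchSize()); // 16
        System.out.println("buffer before open(): " + copy.isOpen());          // false
        copy.open();
        System.out.println("buffer after open(): " + copy.isOpen());           // true
    }
}
```

The constructor-set field comes back unchanged, while the transient field is null until open() runs again. None of this concerns checkpointed state, which is handled separately.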
If you want your state to be recovered by checkpoints, you should implement the open() method and initialise your state there, as in your point (2) and as described in [1].

If you want to re-scale your job, you have to take a savepoint and may then resume from it with a different parallelism [2]. Be sure to set a maximum parallelism (per job or per operator) and to set UUIDs for operators, as described in [3].

Nico

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/state.html
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.4/setup/savepoints.html
[3] https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/production_ready.html

On Thursday, 3 August 2017 12:11:14 CEST Peter Ertl wrote:
> Hi,
>
> can someone elaborate on when I should set properties transient /
> non-transient within operators (e.g. map / flatMap / reduce) ?
>
> I see these two possibilities:
>
> (1) initialize a non-transient property from the constructor
> (2) initialize a transient property inside a Rich???Function when
> open(ConfigurationParameters) is invoked
>
> on what criteria should I choose (1) or (2) ?
>
> how is this related to checkpointing / rebalancing?
>
> Thanks in advance
> Peter