Hi Seth,

Thank you for the advice. The solution you mentioned is exactly what I did.

I wrote a small tutorial that explains how to repeat that pattern. You can read about my solution at
https://github.com/minmay/flink-patterns/tree/master/bootstrap-keyed-state-into-stream

Regarding the NullPointerException when running locally, thank you for filing a ticket. It would be very nice to get that fixed.

Sincerely,

Marco A. Villalobos

> On Aug 12, 2020, at 9:40 AM, Seth Wiesman <sjwies...@gmail.com> wrote:
>
> Just to summarize the conversation so far:
>
> The State Processor API reads data from a third-party system, such as JDBC
> in this example, and generates a savepoint file that is written out to some
> DFS. This savepoint can then be used when starting a Flink streaming
> application. It is a two-step process: creating the savepoint in one job and
> then starting a streaming application from that savepoint in another.
>
> These jobs do not have to be a single application, and in general, I
> recommend they be developed as two separate jobs. The reason is that
> bootstrapping state is a one-time process, while your streaming application
> runs forever. It will simplify your development and operations in the long
> term if you do not mix concerns.
>
> Concerning the NullPointerException:
>
> The max parallelism must be at least 128. I've opened a ticket to track and
> resolve this issue.
>
> Seth
>
> On Mon, Aug 10, 2020 at 6:38 PM Marco Villalobos <mvillalo...@kineteque.com> wrote:
> I think there is a bug in Flink when running locally without a cluster.
>
> My code worked in a cluster, but failed when run locally.
>
> My code does not save null values in Map State.
>
> > On Aug 9, 2020, at 11:27 PM, Tzu-Li Tai <tzuli...@gmail.com> wrote:
> >
> > Hi,
> >
> > For the NullPointerException, what seems to be happening is that you are
> > setting NULL values in your MapState, which is not allowed by the API.
> >
> > Otherwise, the code that you showed for bootstrapping state seems to be
> > fine.
> >
> >> I have yet to find a working example that shows how to do both
> >> (bootstrapping state and start a streaming application with that state)
> >
> > Not entirely sure what you mean here by "doing both".
> > The savepoint written using the State Processor API (what you are doing in
> > the bootstrap() method) is a savepoint that you can restore from, just as
> > you would with a typical Flink streaming job restore.
> > So, usually the bootstrapping part happens as a batch "offline" job, while
> > you keep your streaming job as a separate job. What are you trying to
> > achieve with having both written within the same job?
> >
> > Cheers,
> > Gordon
> >
> > --
> > Sent from:
> > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
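
To illustrate the two points raised in this thread (no null values in MapState, and a max parallelism of at least 128 for the bootstrap job), here is a minimal plain-Java sketch of guard checks that could run before data is handed to the State Processor API. It uses no Flink classes. The class name `BootstrapGuards`, both method names, and the use of a `java.util.Map` as a stand-in for MapState are assumptions made for illustration, and the lower bound of 128 is taken from Seth's message rather than checked against Flink's source.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Flink.
final class BootstrapGuards {

    // Lower bound quoted in this thread; treated here as an assumption.
    static final int MIN_MAX_PARALLELISM = 128;

    private BootstrapGuards() {
    }

    // Returns the value unchanged if it meets the bound, otherwise fails fast
    // with a descriptive message instead of a later NullPointerException.
    static int requireMaxParallelism(int maxParallelism) {
        if (maxParallelism < MIN_MAX_PARALLELISM) {
            throw new IllegalArgumentException(
                "max parallelism must be at least " + MIN_MAX_PARALLELISM
                    + ", got " + maxParallelism);
        }
        return maxParallelism;
    }

    // Copies the source map, skipping entries whose value is null, so that
    // no null value is later written into keyed map state.
    static <K, V> Map<K, V> withoutNullValues(Map<K, V> source) {
        Map<K, V> cleaned = new HashMap<>();
        for (Map.Entry<K, V> entry : source.entrySet()) {
            if (entry.getValue() != null) {
                cleaned.put(entry.getKey(), entry.getValue());
            }
        }
        return cleaned;
    }
}
```

In a bootstrap job, the checked value would be the max parallelism used when creating the savepoint, and the cleaned map would be what gets copied into MapState. Whether either check addresses the NullPointerException seen when running locally is not settled in this thread.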