Hi Seth,

Thank you for the advice. The solution you mentioned is exactly what I did.

I wrote a small tutorial that explains how to repeat that pattern.

You can read about my solution at
https://github.com/minmay/flink-patterns/tree/master/bootstrap-keyed-state-into-stream
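
For anyone skimming the thread, here is a rough plain-Java model of the two-step pattern discussed here: a one-time bootstrap step writes keyed state out, and a separate long-running step restores it and keeps updating it. It deliberately uses no Flink classes. The names (BootstrapThenStream, bootstrap, restore, stream, demo) and the text file standing in for a savepoint are all hypothetical, chosen only for illustration; the real version uses the State Processor API as in the tutorial linked above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of "bootstrap, then stream". None of this is Flink API.
class BootstrapThenStream {

    // Step 1 (one-time batch job): turn rows from an external source into
    // keyed state and persist it. The file stands in for a savepoint.
    static void bootstrap(Map<String, Long> rows, Path savepoint) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Long> row : rows.entrySet()) {
            if (row.getValue() == null) {
                continue; // never store null values in the keyed state
            }
            lines.add(row.getKey() + "=" + row.getValue());
        }
        try {
            Files.write(savepoint, lines);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Step 2a (separate streaming job): restore the keyed state on startup.
    static Map<String, Long> restore(Path savepoint) {
        Map<String, Long> state = new HashMap<>();
        try {
            for (String line : Files.readAllLines(savepoint)) {
                int eq = line.lastIndexOf('=');
                state.put(line.substring(0, eq), Long.parseLong(line.substring(eq + 1)));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return state;
    }

    // Step 2b: each incoming event increments the counter for its key.
    static Map<String, Long> stream(Map<String, Long> state, List<String> events) {
        for (String key : events) {
            state.merge(key, 1L, Long::sum);
        }
        return state;
    }

    // Runs both steps back to back against a temporary file.
    static Map<String, Long> demo() {
        try {
            Path savepoint = Files.createTempFile("savepoint", ".txt");
            Map<String, Long> rows = new HashMap<>();
            rows.put("a", 10L);
            rows.put("b", 5L);
            rows.put("ignored", null);
            bootstrap(rows, savepoint);
            Map<String, Long> state = stream(restore(savepoint), Arrays.asList("a", "c", "a"));
            Files.deleteIfExists(savepoint);
            return state;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the split is that bootstrap() can be run once and thrown away, while restore() and stream() belong to the application that keeps running.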

Regarding the NullPointerException when running locally, thank you for filing a 
ticket. It would be very nice to get that fixed.

Sincerely, 

Marco A. Villalobos



> On Aug 12, 2020, at 9:40 AM, Seth Wiesman <sjwies...@gmail.com> wrote:
> 
> Just to summarize the conversation so far: 
> 
> The State Processor API reads data from a third-party system (JDBC in 
> this example) and generates a savepoint that is written out to some 
> DFS. This savepoint can then be used when starting a Flink streaming 
> application. It is a two-step process: one job creates the savepoint, and 
> another starts a streaming application from that savepoint. 
> 
> These jobs do not have to be a single application, and in general I 
> recommend developing them as two separate jobs, because bootstrapping 
> state is a one-time process while your streaming application runs 
> forever. Keeping the two concerns apart will simplify your development 
> and operations in the long term. 
> 
> Concerning the NullPointerException:
> 
> The max parallelism must be at least 128. I've opened a ticket to track and 
> resolve this issue. 
> 
> Seth 
> 
> On Mon, Aug 10, 2020 at 6:38 PM Marco Villalobos <mvillalo...@kineteque.com> wrote:
> I think there is a bug in Flink when running locally without a cluster.
> 
> My code worked in a cluster, but failed when run locally.
> 
> My code does not save null values in Map State.
> 
> > On Aug 9, 2020, at 11:27 PM, Tzu-Li Tai <tzuli...@gmail.com> wrote:
> > 
> > Hi,
> > 
> > For the NullPointerException, what seems to be happening is that you are
> > setting null values in your MapState, which is not allowed by the API.
> > 
> > Otherwise, the code that you showed for bootstrapping state seems to be
> > fine.
> > 
> >> I have yet to find a working example that shows how to do both
> >> (bootstrapping state and starting a streaming application with that state)
> > 
> > Not entirely sure what you mean here by "doing both".
> > The savepoint written using the State Processor API (what you are doing in
> > the bootstrap() method) is a savepoint that may be restored from as you
> > would with a typical Flink streaming job restore.
> > So, usually the bootstrapping part happens as a batch "offline" job, while
> > you keep your streaming job as a separate job. What are you trying to
> > achieve with having both written within the same job?
> > 
> > Cheers,
> > Gordon
> > 
> > 
> > 
> > --
> > Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
> 
