Re: Please help, I need to bootstrap keyed state into a stream

Seth Wiesman Wed, 12 Aug 2020 09:41:22 -0700

Just to summarize the conversation so far:

The State Processor API reads data from a third-party system (JDBC in this
example) and generates a savepoint that is written out to some DFS. This
savepoint can then be used when starting a Flink streaming application. It
is a two-step process: one job creates the savepoint, and a second job
starts a streaming application from that savepoint.
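
As a rough illustration of the first step, here is a minimal sketch of a
standalone bootstrap job. It assumes the DataSet-based State Processor API
as it existed around Flink 1.11 (later releases changed it). The class
names, the operator uid "bootstrapped-operator", the state name
"latest-values", the map key "latest", the output path, and the in-memory
input that stands in for the JDBC source are all hypothetical.

```java
import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.state.memory.MemoryStateBackend;
import org.apache.flink.state.api.BootstrapTransformation;
import org.apache.flink.state.api.OperatorTransformation;
import org.apache.flink.state.api.Savepoint;
import org.apache.flink.state.api.functions.KeyedStateBootstrapFunction;

public class BootstrapJob {

    /** Writes each (key, value) record into keyed MapState (names are hypothetical). */
    static class Bootstrapper
            extends KeyedStateBootstrapFunction<String, Tuple2<String, Long>> {

        private transient MapState<String, Long> state;

        @Override
        public void open(Configuration parameters) {
            // The streaming job must register the same descriptor to see this state.
            state = getRuntimeContext().getMapState(
                new MapStateDescriptor<>("latest-values", String.class, Long.class));
        }

        @Override
        public void processElement(Tuple2<String, Long> record, Context ctx)
                throws Exception {
            // Skip nulls: the thread notes MapState does not accept null values.
            if (record.f1 != null) {
                state.put("latest", record.f1);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Stand-in for the JDBC input discussed in the thread.
        DataSet<Tuple2<String, Long>> records = env.fromElements(
            Tuple2.of("sensor-1", 10L),
            Tuple2.of("sensor-2", 20L));

        BootstrapTransformation<Tuple2<String, Long>> transformation =
            OperatorTransformation
                .bootstrapWith(records)
                .keyBy(record -> record.f0)
                .transform(new Bootstrapper());

        // Max parallelism of 128, per the note in this thread.
        Savepoint.create(new MemoryStateBackend(), 128)
            .withOperator("bootstrapped-operator", transformation)
            .write("file:///tmp/bootstrap-savepoint");

        // The savepoint is only written once the batch job executes.
        env.execute("bootstrap savepoint");
    }
}
```

The second step would then be a separate streaming job whose stateful
operator is assigned the same uid and state descriptor, started from the
written path (for example with `flink run -s <savepoint path>`).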


These jobs do not have to be a single application, and in general, I
recommend they be developed as two separate jobs. The reason is that
bootstrapping state is a one-time process, while your streaming application
runs forever. Keeping the two concerns separate will simplify your
development and operations in the long term.

Concerning the NullPointerException:

The max parallelism must be at least 128. I've opened a ticket to track and
resolve this issue.
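
Until that is resolved, one option is to validate the value yourself before
creating the savepoint, so a bad setting fails with a clear message rather
than a NullPointerException. This is only a sketch: the class and method
names are hypothetical, and the only fact taken from this thread is the
lower bound of 128.

```java
public final class MaxParallelismCheck {

    // Lower bound stated in this thread for the State Processor API.
    public static final int MIN_MAX_PARALLELISM = 128;

    private MaxParallelismCheck() {}

    /** Returns the value unchanged if it is acceptable, otherwise throws. */
    public static int requireValidMaxParallelism(int maxParallelism) {
        if (maxParallelism < MIN_MAX_PARALLELISM) {
            throw new IllegalArgumentException(
                "max parallelism must be at least " + MIN_MAX_PARALLELISM
                    + " but was " + maxParallelism);
        }
        return maxParallelism;
    }

    public static void main(String[] args) {
        // Passes the check and prints the validated value.
        System.out.println(requireValidMaxParallelism(128));
    }
}
```

The validated value can then be passed as the max parallelism argument when
the savepoint is created.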

Seth

On Mon, Aug 10, 2020 at 6:38 PM Marco Villalobos <[email protected]>
wrote:

> I think there is a bug in Flink when running locally without a cluster.
>
> My code worked in a cluster, but failed when run locally.
>
> My code does not save null values in Map State.
>
> > On Aug 9, 2020, at 11:27 PM, Tzu-Li Tai <[email protected]> wrote:
> >
> > Hi,
> >
> > For the NullPointerException, what seems to be happening is that you are
> > setting NULL values in your MapState, which is not allowed by the API.
> >
> > Otherwise, the code that you showed for bootstrapping state seems to be
> > fine.
> >
> >> I have yet to find a working example that shows how to do both
> >> (bootstrapping state and start a streaming application with that state)
> >
> > Not entirely sure what you mean here by "doing both".
> > The savepoint written using the State Processor API (what you are doing in
> > the bootstrap() method) is a savepoint that may be restored from as you
> > would with a typical Flink streaming job restore.
> > So, usually the bootstrapping part happens as a batch "offline" job, while
> > you keep your streaming job as a separate job. What are you trying to
> > achieve with having both written within the same job?
> >
> > Cheers,
> > Gordon
