That is doable via the State Processor API, though Arvid's idea does sound
simpler :)

You could read the operator that holds the rules, change the data as
necessary, and then write it back out as a new savepoint from which to
start the job.
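
Roughly along these lines: an untested sketch against the DataSet-based
State Processor API (Flink 1.9+). The operator uid "rules", the Rule POJO,
the String key type, and the savepoint paths are all placeholders for
whatever your job actually uses.

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.state.memory.MemoryStateBackend;
import org.apache.flink.state.api.BootstrapTransformation;
import org.apache.flink.state.api.ExistingSavepoint;
import org.apache.flink.state.api.OperatorTransformation;
import org.apache.flink.state.api.Savepoint;
import org.apache.flink.state.api.functions.KeyedStateBootstrapFunction;
import org.apache.flink.state.api.functions.KeyedStateReaderFunction;
import org.apache.flink.util.Collector;

public class PatchRulesSavepoint {

  public static void main(String[] args) throws Exception {
    ExecutionEnvironment bEnv = ExecutionEnvironment.getExecutionEnvironment();

    ExistingSavepoint savepoint =
        Savepoint.load(bEnv, "hdfs:///savepoints/rules-old", new MemoryStateBackend());

    // 1) Read the rules currently held by the stateful operator.
    DataSet<Rule> rules = savepoint.readKeyedState("rules", new RuleReader());

    // 2) Patch the DataSet here as needed, e.g. join it against an export
    //    from the master DBMS and keep only the corrected rules.

    // 3) Bootstrap a replacement operator and write out a new savepoint.
    BootstrapTransformation<Rule> corrected =
        OperatorTransformation.bootstrapWith(rules)
            .keyBy(rule -> rule.id)
            .transform(new RuleWriter());

    savepoint
        .removeOperator("rules")
        .withOperator("rules", corrected)
        .write("hdfs:///savepoints/rules-new");

    bEnv.execute("patch-rules-savepoint");
  }

  // Emits the Rule stored under each key of the operator's ValueState.
  static class RuleReader extends KeyedStateReaderFunction<String, Rule> {
    ValueState<Rule> state;

    @Override
    public void open(Configuration parameters) {
      state = getRuntimeContext().getState(
          new ValueStateDescriptor<>("rule", Rule.class));
    }

    @Override
    public void readKey(String key, Context ctx, Collector<Rule> out)
        throws Exception {
      out.collect(state.value());
    }
  }

  // Writes each (possibly corrected) Rule back into fresh keyed state.
  static class RuleWriter extends KeyedStateBootstrapFunction<String, Rule> {
    ValueState<Rule> state;

    @Override
    public void open(Configuration parameters) {
      state = getRuntimeContext().getState(
          new ValueStateDescriptor<>("rule", Rule.class));
    }

    @Override
    public void processElement(Rule rule, Context ctx) throws Exception {
      state.update(rule);
    }
  }

  // Stand-in for whatever your actual rule type is.
  public static class Rule {
    public String id;
    public String expression;
    public Rule() {}
  }
}

You would then start the job from the new savepoint path
(flink run -s hdfs:///savepoints/rules-new ...).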


On Thu, Jul 30, 2020 at 5:24 AM Arvid Heise <ar...@ververica.com> wrote:

> Another idea: since your handling in Flink is idempotent, would it make
> sense to also periodically send the whole rule set anew?
>
> Going further (depending on the number of rules, their size, and the
> update frequency): would it be possible to always transfer the complete
> rule set and simply discard the old state on each update (or do the
> reconciliation in Flink)?
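>
> A minimal sketch of the "replace wholesale" variant (untested; Event,
> Rule, and RuleSet are stand-ins for your own types, and I'm assuming each
> control-stream record carries the complete rule set):
>
> import org.apache.flink.api.common.state.BroadcastState;
> import org.apache.flink.api.common.state.MapStateDescriptor;
> import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction;
> import org.apache.flink.util.Collector;
>
> public class ReplaceRulesFunction
>     extends BroadcastProcessFunction<Event, RuleSet, String> {
>
>   static final MapStateDescriptor<String, Rule> RULES_DESC =
>       new MapStateDescriptor<>("rules", String.class, Rule.class);
>
>   @Override
>   public void processBroadcastElement(
>       RuleSet ruleSet, Context ctx, Collector<String> out) throws Exception {
>     BroadcastState<String, Rule> rules = ctx.getBroadcastState(RULES_DESC);
>     // Discard the old state on update...
>     rules.clear();
>     // ...and install the fresh, complete set.
>     for (Rule rule : ruleSet.rules) {
>       rules.put(rule.id, rule);
>     }
>   }
>
>   @Override
>   public void processElement(
>       Event event, ReadOnlyContext ctx, Collector<String> out) throws Exception {
>     // Evaluate the event against ctx.getBroadcastState(RULES_DESC) as today.
>   }
> }
>
> // Stand-ins for your real types:
> class Event {}
> class Rule { public String id; }
> class RuleSet { public java.util.List<Rule> rules; }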
>
> On Wed, Jul 29, 2020 at 2:49 PM Александр Сергеенко <
> aleksandr.sergee...@gmail.com> wrote:
>
>> Hi Kostas
>>
>> Thanks for the possible help!
>>
>> Fri, Jul 24, 2020, 19:08 Kostas Kloudas <kklou...@gmail.com>:
>>
>>> Hi Alex,
>>>
>>> Maybe Seth (cc'ed) may have an opinion on this.
>>>
>>> Cheers,
>>> Kostas
>>>
>>> On Thu, Jul 23, 2020 at 12:08 PM Александр Сергеенко
>>> <aleksandr.sergee...@gmail.com> wrote:
>>> >
>>> > Hi,
>>> >
>>> > We use the so-called "control stream" pattern to deliver settings to
>>> > the Flink job via Apache Kafka topics. The settings are in fact an
>>> > unlimited stream of events originating from the master DBMS, which
>>> > acts as the single source of truth for the rules list.
>>> >
>>> > It may seem odd, since Flink guarantees "exactly once" delivery
>>> > semantics, the service that publishes the rules to Kafka is written
>>> > in Akka Streams and guarantees only "at least once" semantics, and
>>> > the rule handling inside the Flink job is implemented in an
>>> > idempotent manner. Still, we have to manage cases where we need to
>>> > reconcile the current rules stored in the master DBMS with the
>>> > existing Flink state.
>>> >
>>> > We've looked at Flink's tooling and found that the State Processor
>>> > API can possibly solve our problem, so we would basically have to
>>> > implement a periodic process that unloads the state to some external
>>> > file (e.g. a .csv) and then compares that set with the information
>>> > held in the master system; see the rough sketch below.
>>> > Basically this looks like the lambda architecture approach, while
>>> > Flink is supposed to implement the kappa architecture, so in that
>>> > light our reconciliation problem seems a bit far-fetched.
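>>> >
>>> > The dump step we have in mind would be roughly the following
>>> > (an untested sketch: the operator uid "rules", the Rule type with
>>> > its id/expression fields, and a RuleReader KeyedStateReaderFunction
>>> > that emits each stored rule, definition omitted, are placeholders
>>> > for our actual job):
>>> >
>>> > import org.apache.flink.api.java.ExecutionEnvironment;
>>> > import org.apache.flink.core.fs.FileSystem;
>>> > import org.apache.flink.runtime.state.memory.MemoryStateBackend;
>>> > import org.apache.flink.state.api.ExistingSavepoint;
>>> > import org.apache.flink.state.api.Savepoint;
>>> >
>>> > public class DumpRulesState {
>>> >   public static void main(String[] args) throws Exception {
>>> >     ExecutionEnvironment bEnv = ExecutionEnvironment.getExecutionEnvironment();
>>> >
>>> >     ExistingSavepoint savepoint =
>>> >         Savepoint.load(bEnv, "hdfs:///savepoints/latest", new MemoryStateBackend());
>>> >
>>> >     // Flatten every stored rule to a CSV line that an external job
>>> >     // can diff against a dump from the master DBMS.
>>> >     savepoint
>>> >         .readKeyedState("rules", new RuleReader())
>>> >         .map(rule -> rule.id + "," + rule.expression)
>>> >         .returns(String.class)
>>> >         .writeAsText("file:///tmp/flink-rules.csv",
>>> >             FileSystem.WriteMode.OVERWRITE);
>>> >
>>> >     bEnv.execute("dump-rules-state");
>>> >   }
>>> > }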
>>> >
>>> > Are there any best practices or patterns addressing such scenarios
>>> > in Flink?
>>> >
>>> > Many thanks for any assistance and ideas.
>>> >
>>> > -----
>>> > Alex Sergeenko
>>> >
>>>
>>
>
> --
>
> Arvid Heise | Senior Java Developer
>
