That is doable via the State Processor API, though Arvid's idea does sound simpler :)

You could read the operator state holding the rules, change the data as necessary, and then write it back out as a new savepoint to start the job from.
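For what it's worth, a rough sketch of that read-modify-rewrite flow with the (DataSet-based) State Processor API could look like the code below. Everything specific in it is made up for illustration: the operator uid "rules-uid", the state name "rulesState", the Rule POJO, and the savepoint paths would have to match however your job actually registers its rule state; and if the rules live in broadcast or keyed state, readBroadcastState / readKeyedState plus the matching bootstrap function would be the entry points instead of the list-state variants used here.

import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.runtime.state.memory.MemoryStateBackend;
import org.apache.flink.state.api.BootstrapTransformation;
import org.apache.flink.state.api.ExistingSavepoint;
import org.apache.flink.state.api.OperatorTransformation;
import org.apache.flink.state.api.Savepoint;
import org.apache.flink.state.api.functions.StateBootstrapFunction;

public class RuleSavepointPatcher {

    /** Hypothetical POJO mirroring the rule type the job keeps in state. */
    public static class Rule {
        public String id;
        public String expression;
        public Rule() {}
    }

    /** Writes the reconciled rule set back as non-keyed operator list state. */
    public static class RuleBootstrapper extends StateBootstrapFunction<Rule> {
        private ListState<Rule> rules;

        @Override
        public void processElement(Rule value, Context ctx) throws Exception {
            rules.add(value);
        }

        @Override
        public void snapshotState(FunctionSnapshotContext context) {}

        @Override
        public void initializeState(FunctionInitializationContext context) throws Exception {
            // Must use the same state name/descriptor as the streaming job (assumed here).
            rules = context.getOperatorStateStore()
                    .getListState(new ListStateDescriptor<>("rulesState", Rule.class));
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // 1. Read the rules currently held by the operator with uid "rules-uid".
        ExistingSavepoint old = Savepoint.load(
                env, "hdfs://savepoints/savepoint-old", new MemoryStateBackend());
        DataSet<Rule> currentRules =
                old.readListState("rules-uid", "rulesState", Types.POJO(Rule.class));

        // 2. Reconcile against the master DBMS. Placeholder: in practice, compare
        //    rule ids against the rule set fetched from the master system.
        DataSet<Rule> reconciledRules = currentRules.filter(rule -> true);

        // 3. Replace the operator's state and write everything out as a new savepoint.
        BootstrapTransformation<Rule> transformation = OperatorTransformation
                .bootstrapWith(reconciledRules)
                .transform(new RuleBootstrapper());

        old.removeOperator("rules-uid")
           .withOperator("rules-uid", transformation)
           .write("hdfs://savepoints/savepoint-patched");

        // Run the batch job that materialises the new savepoint.
        env.execute("patch-rules-savepoint");
    }
}

Once the batch job has written the new savepoint, the streaming job can be resubmitted with flink run -s <new savepoint path>; the operator uid has to match the one in the running job so the rewritten state maps back onto the right operator.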
On Thu, Jul 30, 2020 at 5:24 AM Arvid Heise <ar...@ververica.com> wrote:

> Another idea: since your handling on Flink is idempotent, would it make
> sense to also periodically send the whole rule set anew?
>
> Going further, depending on the number of rules, their size, and the
> update frequency: would it be possible to always transfer the complete
> rule set and just discard the old state on update (or do the
> reconciliation in Flink)?
>
> On Wed, Jul 29, 2020 at 2:49 PM Александр Сергеенко <
> aleksandr.sergee...@gmail.com> wrote:
>
>> Hi Kostas,
>>
>> Thanks for the possible help!
>>
>> On Fri, Jul 24, 2020 at 19:08 Kostas Kloudas <kklou...@gmail.com> wrote:
>>
>>> Hi Alex,
>>>
>>> Maybe Seth (cc'ed) may have an opinion on this.
>>>
>>> Cheers,
>>> Kostas
>>>
>>> On Thu, Jul 23, 2020 at 12:08 PM Александр Сергеенко
>>> <aleksandr.sergee...@gmail.com> wrote:
>>> >
>>> > Hi,
>>> >
>>> > We use the so-called "control stream" pattern to deliver settings to
>>> > the Flink job via Apache Kafka topics. The settings are in fact an
>>> > unbounded stream of events originating from the master DBMS, which
>>> > acts as the single point of truth concerning the rules list.
>>> >
>>> > It may seem odd, since Flink guarantees "exactly once" delivery
>>> > semantics, the service that publishes the rules to Kafka is written
>>> > with Akka Streams and guarantees "at least once" semantics, and the
>>> > rule handling inside the Flink job is implemented in an idempotent
>>> > manner, but: we still have to handle cases where we need to reconcile
>>> > the current rules stored in the master DBMS with the existing Flink
>>> > state.
>>> >
>>> > We've looked at Flink's tooling and found that the State Processor
>>> > API can possibly solve our problem, so we would basically have to
>>> > implement a periodic process which unloads the state to some external
>>> > file (e.g. .csv) and then compares that set against the information
>>> > in the master system.
>>> > Basically this looks like the lambda architecture approach, while
>>> > Flink is supposed to implement the kappa architecture, so in that
>>> > light our reconciliation problem looks a bit far-fetched.
>>> >
>>> > Are there any best practices or patterns addressing such scenarios
>>> > in Flink?
>>> >
>>> > Many thanks for any possible assistance and ideas.
>>> >
>>> > -----
>>> > Alex Sergeenko
>>
>
> --
> Arvid Heise | Senior Java Developer
> Ververica GmbH <https://www.ververica.com/>