I provided an answer on Stack Overflow, where I said the following:

A few different mechanisms in Flink may be relevant to this use case, depending on your detailed requirements.

*Broadcast State*

Jaya Ananthram <https://stackoverflow.com/users/2936216/jaya-ananthram> has already covered the idea of using broadcast state in his answer <https://stackoverflow.com/a/65848580/2000823>. This makes sense if the rules should be applied globally, for every key, and if you can find a way to collect and broadcast the updates.

Note that the Context passed to the processBroadcastElement() method of a KeyedBroadcastProcessFunction provides the method applyToKeyedState(StateDescriptor<S, VS> stateDescriptor, KeyedStateFunction<KS, S> function). This means you can register a KeyedStateFunction that will be applied to all states of all keys associated with the provided stateDescriptor.

*State Processor API*

If you want to bootstrap state in a Flink savepoint from a database dump, you can do that with this library. You'll find a simple example of using the State Processor API to bootstrap state in this gist <https://gist.github.com/alpinegizmo/ff3d2e748287853c88f21259830b29cf>.

*Change Data Capture*

The Table/SQL API supports Debezium <https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/connectors/formats/debezium.html>, Canal <https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/connectors/formats/canal.html>, and Maxwell <https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/connectors/formats/maxwell.html> CDC streams, as well as Kafka upsert streams <https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/connectors/upsert-kafka.html>. This may be a solution. There's also flink-cdc-connectors <https://github.com/ververica/flink-cdc-connectors>.

*Lookup Joins*

Flink SQL can do temporal lookup joins against a JDBC database, with a configurable cache <https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/connectors/jdbc.html#lookup-cache>. I'm not sure this is relevant.
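
To make the broadcast state option above a bit more concrete, here is a dependency-free Java sketch of the idea. It is only a model of the pattern and deliberately does not use the Flink API: the class and method names (BroadcastRulesSketch, Subtask, onRuleUpdate, onEvent, broadcastRule, process) are hypothetical, rules are plain strings, and routing by hash code merely stands in for keyBy. In a real job the two handlers would correspond roughly to processBroadcastElement() and processElement() of a KeyedBroadcastProcessFunction.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the broadcast state pattern. NOT Flink code:
// all names here are hypothetical and chosen for illustration only.
class BroadcastRulesSketch {

    // One parallel instance of the operator, holding its own copy of the rules.
    static final class Subtask {
        private final Map<String, String> rules = new HashMap<>();

        // Rough analogue of processBroadcastElement(): store the updated rule.
        void onRuleUpdate(String key, String rule) {
            rules.put(key, rule);
        }

        // Rough analogue of processElement(): local lookup, no database call.
        String onEvent(String key, String payload) {
            String rule = rules.getOrDefault(key, "no-rule");
            return key + ":" + payload + " -> " + rule;
        }
    }

    private final List<Subtask> subtasks = new ArrayList<>();

    BroadcastRulesSketch(int parallelism) {
        for (int i = 0; i < parallelism; i++) {
            subtasks.add(new Subtask());
        }
    }

    // A rule update is delivered to every subtask, so all copies stay in sync.
    void broadcastRule(String key, String rule) {
        for (Subtask subtask : subtasks) {
            subtask.onRuleUpdate(key, rule);
        }
    }

    // A stream element goes to exactly one subtask, chosen by its key.
    String process(String key, String payload) {
        int index = Math.floorMod(key.hashCode(), subtasks.size());
        return subtasks.get(index).onEvent(key, payload);
    }

    public static void main(String[] args) {
        BroadcastRulesSketch job = new BroadcastRulesSketch(3);
        job.broadcastRule("user-1", "limit=10");
        System.out.println(job.process("user-1", "click")); // user-1:click -> limit=10
        job.broadcastRule("user-1", "limit=20");             // rule changed in the database
        System.out.println(job.process("user-1", "click")); // user-1:click -> limit=20
        System.out.println(job.process("user-2", "click")); // user-2:click -> no-rule
    }
}
```

The point of the sketch is that every parallel instance receives every rule update, so whichever instance a keyed element is routed to can apply the current rules without a database round trip. What it leaves out is exactly what Flink adds: checkpointing of the rule map and the actual network distribution of the updates.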

On Fri, Jan 22, 2021 at 7:30 PM Jan Oelschlegel <oelschle...@integration-factory.de> wrote:

> But then you need a way to consume a database as a DataStream.
>
> I found this one https://github.com/ververica/flink-cdc-connectors.
>
> I want to implement a similar use case, but I don't know how to parse the
> SourceRecord (which comes from the connector) into a POJO for further
> processing.
>
> Best,
> Jan
>
> *From:* Selvaraj chennappan <selvarajchennap...@gmail.com>
> *Sent:* Friday, January 22, 2021 18:09
> *To:* Kumar Bolar, Harshith <hk...@arity.com>
> *Cc:* user <user@flink.apache.org>
> *Subject:* Re: What is the best way to have a cache of an external
> database in Flink?
>
> Hi,
>
> Perhaps broadcast state is a natural fit for this scenario.
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/stream/state/broadcast_state.html
>
> Thanks,
> Selvaraj C
>
> On Fri, 22 Jan 2021 at 8:45 PM, Kumar Bolar, Harshith <hk...@arity.com>
> wrote:
>
> Hi all,
>
> The external database consists of a set of rules for each key, these rules
> should be applied on each stream element in the Flink job. Because it is
> very expensive to make a DB call for each element and retrieve the rules, I
> want to fetch the rules from the database at initialization and store it in
> a local cache.
>
> When rules are updated in the external database, a status change event is
> published to the Flink job which should be used to fetch the rules and
> refresh this cache.
>
> What is the best way to achieve what I've described? I looked into keyed
> state but initializing all keys and refreshing the keys on update doesn't
> seem possible.
>
> Thanks,
> Harshith
>
> --
> Regards,
> Selvaraj C