Hi all,

I've run into a challenge in a Flink job that I'm currently working
on. The gist of it is that the job listens to a stream of events from a
Kafka topic and eventually sinks those down into Postgres via the
JdbcSink.

A requirement recently came up to filter these events based on some
configurations that are currently stored in another Kafka topic, and I'm
wondering what the best approach to this type of problem might be.

My initial naive approach was:

   - When Flink starts up, use a regular Kafka Consumer and read the
   configuration data from that topic in its entirety.
   - Store the messages from that topic in some kind of thread-safe
   collection that is statically accessible to the downstream operators.
   - Reference that collection within the downstream operators to actually
   perform the filtering (roughly sketched after this list).
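
To make that concrete, here's an untested sketch of what I mean. The
ConfigCache class name, the string key/value types, and the catch-up logic
are all just placeholders for illustration:

import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class ConfigCache {

    // Thread-safe collection, statically accessible by downstream operators.
    public static final Map<String, String> MAPPINGS = new ConcurrentHashMap<>();

    // Runs once at job startup, before the Flink pipeline executes.
    public static void load(String bootstrapServers, String configTopic) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            List<TopicPartition> partitions = consumer.partitionsFor(configTopic)
                    .stream()
                    .map(p -> new TopicPartition(configTopic, p.partition()))
                    .collect(Collectors.toList());
            consumer.assign(partitions);
            consumer.seekToBeginning(partitions);

            // Snapshot the end offsets and read until we've caught up to them,
            // i.e. the topic has been consumed "in its entirety" as of startup.
            Map<TopicPartition, Long> endOffsets = consumer.endOffsets(partitions);
            while (partitions.stream()
                    .anyMatch(tp -> consumer.position(tp) < endOffsets.get(tp))) {
                for (ConsumerRecord<String, String> record
                        : consumer.poll(Duration.ofMillis(500))) {
                    if (record.value() != null) { // skip tombstones
                        MAPPINGS.put(record.key(), record.value());
                    }
                }
            }
        }
    }
}

A downstream operator would then filter with something like
events.filter(e -> ConfigCache.MAPPINGS.containsKey(e.tenantId)), but the
mappings never refresh after startup, and load() would have to run in every
TaskManager JVM.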

This doesn't seem right though. I was reading about BroadcastState, which
seems like it might fit the bill (i.e. keep those mappings in broadcast
state so that all of the downstream operators have access to them, which
I'd imagine would also handle keeping things up to date).
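
Here's a rough, untested sketch of what I'm imagining with that pattern.
The Event and ConfigUpdate POJOs (and the tenantId field) are made-up
stand-ins for my real types, and I'm assuming events and configUpdates are
the two DataStreams coming off the Kafka sources:

import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.streaming.api.datastream.BroadcastStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction;
import org.apache.flink.util.Collector;

// Placeholder POJOs standing in for my real types.
public class Event { public String tenantId; }
public class ConfigUpdate { public String key; public String value; }

// Descriptor for the broadcast state that holds the config mappings.
final MapStateDescriptor<String, String> configDescriptor =
        new MapStateDescriptor<>(
                "config-mappings",
                BasicTypeInfo.STRING_TYPE_INFO,
                BasicTypeInfo.STRING_TYPE_INFO);

// Broadcast the config stream so every parallel instance sees every update.
BroadcastStream<ConfigUpdate> broadcastConfig =
        configUpdates.broadcast(configDescriptor);

DataStream<Event> filtered = events
        .connect(broadcastConfig)
        .process(new BroadcastProcessFunction<Event, ConfigUpdate, Event>() {

            @Override
            public void processElement(Event event, ReadOnlyContext ctx,
                                       Collector<Event> out) throws Exception {
                // The event side only gets a read-only view of the state.
                if (ctx.getBroadcastState(configDescriptor)
                       .contains(event.tenantId)) {
                    out.collect(event);
                }
            }

            @Override
            public void processBroadcastElement(ConfigUpdate update, Context ctx,
                                                Collector<Event> out) throws Exception {
                // Each record on the config topic updates the state on every
                // parallel instance, which keeps the mappings current.
                ctx.getBroadcastState(configDescriptor)
                   .put(update.key, update.value);
            }
        });

Since the event stream isn't keyed on anything config-related, I went with
the non-keyed connect + BroadcastProcessFunction variant here.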

Does Flink have a good pattern / construct to handle this? *Basically, I
have a series of mappings in a Kafka topic that I want to keep relatively
up to date, and I'm consuming from another Kafka topic whose events need to
be filtered against those mappings.*

I'd be happy to share some of the approaches I currently have or elaborate
a bit more if that isn't super clear.

Thanks much,

Rion
