Join DataStream with dimension tables?

Srikanth Wed, 20 Apr 2016 18:06:42 -0700

Hello,

I have a fairly typical streaming use case but am not able to figure out how
best to implement it in Flink.
I want to join records read from a Kafka stream with one (or more) dimension
tables which are saved as flat files.


As per this JIRA <https://issues.apache.org/jira/browse/FLINK-2320>, it's not
possible to join a DataStream with a DataSet.
These tables are too big to do a collect() and join.

It would be good to read these files during startup, do a partitionByHash
and keep the result cached.
On the DataStream, maybe do a keyBy on the same key and join against the cache.
Is something like this possible?
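
To make the idea concrete, here is a rough sketch in plain Java of the
behaviour I have in mind. It does not use the Flink API at all: the class
PartitionedDimensionJoin and its methods are names made up for illustration,
and the "partitions" simply stand in for parallel operator instances. The
point is only that the dimension rows and the stream records are routed with
the same hash function, so each lookup stays local to one partition.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Illustration only (hypothetical names, no Flink API): hash-partitioned dimension cache with keyed lookup. */
public class PartitionedDimensionJoin {
    // One in-memory map per simulated parallel instance.
    private final List<Map<String, String>> partitions = new ArrayList<>();

    public PartitionedDimensionJoin(int parallelism) {
        if (parallelism <= 0) {
            throw new IllegalArgumentException("parallelism must be positive");
        }
        for (int i = 0; i < parallelism; i++) {
            partitions.add(new HashMap<>());
        }
    }

    // The same routing function is used for dimension rows and stream records.
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    // "Startup" phase: cache one dimension row in the partition that owns its key.
    public void loadDimensionRow(String key, String attributes) {
        partitions.get(partitionFor(key)).put(key, attributes);
    }

    // "Streaming" phase: route the record by key and enrich it from the local cache.
    public Optional<String> join(String key, String event) {
        String dim = partitions.get(partitionFor(key)).get(key);
        return dim == null ? Optional.empty() : Optional.of(event + "|" + dim);
    }

    public static void main(String[] args) {
        PartitionedDimensionJoin joiner = new PartitionedDimensionJoin(4);
        joiner.loadDimensionRow("u1", "country=US");
        joiner.loadDimensionRow("u2", "country=DE");
        System.out.println(joiner.join("u1", "click")); // key present in the cache
        System.out.println(joiner.join("u9", "click")); // key missing from the cache
    }
}
```

With this layout no partition holds the full table, which is why I would
prefer it over a collect() on every node.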

Srikanth
