Hi Reminia,

What Hequn said is correct.
However, I would *not* use a regular join but model the problem as a time-versioned table join. A regular join will materialize both inputs, which is probably not what you want to do for a stream. For a time-versioned table join, only the time-versioned table is stored in state (this should be your DataSet) and the stream is just streamed along. You'll find a minimal sketch at the end of this mail, below the quoted thread.

Best, Fabian

On Mon, Apr 15, 2019 at 04:02 Hequn Cheng <chenghe...@gmail.com> wrote:

> Hi Reminia,
>
> Currently, we can't join a DataStream with a DataSet in Flink. However,
> a DataSet is actually a kind of bounded stream. From this point of
> view, you can use a streaming job to achieve your goal. The Flink Table
> API & SQL support different kinds of joins[1]. You can take a closer
> look at them. Probably a regular join[2] is ok for you.
>
> Finally, I think you raised a very good point. It would be better if
> Flink could support this kind of join more directly and efficiently.
>
> Best,
> Hequn
>
> [1] https://ci.apache.org/projects/flink/flink-docs-master/dev/table/tableApi.html#joins
> [2] https://ci.apache.org/projects/flink/flink-docs-master/dev/table/streaming/joins.html#regular-joins
>
> On Thu, Apr 11, 2019 at 5:16 PM Reminia Scarlet <reminia.scar...@gmail.com> wrote:
>
>> Spark Streaming supports a direct join between a streaming DataFrame
>> and a batch DataFrame, so it's easy to implement an enrichment
>> pipeline that joins a stream with a dimension table.
>>
>> I checked the Flink docs; it seems this feature is tracked in a JIRA
>> ticket that hasn't been resolved yet.
>>
>> So how can I implement such a pipeline easily in Flink?
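For reference, here is a minimal sketch of such a time-versioned (temporal) table join in the Scala Table API. It assumes Flink 1.8; all table names, field names, and sample values (Orders, Rates, r_currency, ...) are made up for illustration, so treat it as a sketch rather than a tested program:

    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.table.api.scala._

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val tEnv = StreamTableEnvironment.create(env)

    // The dimension data (what would otherwise be your DataSet), read as
    // a bounded stream, with a processing-time attribute appended.
    val ratesTable = env
      .fromElements(("EUR", 114L), ("USD", 102L))
      .toTable(tEnv, 'r_currency, 'r_rate, 'r_proctime.proctime)

    // Register a temporal table function over the versioned table, keyed
    // on currency and versioned by processing time. Only this side of the
    // join is kept in state.
    tEnv.registerFunction(
      "Rates",
      ratesTable.createTemporalTableFunction('r_proctime, 'r_currency))

    // The unbounded main stream: it is streamed along, not materialized.
    tEnv.registerTable(
      "Orders",
      env.fromElements(("order-1", "EUR", 2L))
        .toTable(tEnv, 'o_id, 'o_currency, 'o_amount, 'o_proctime.proctime))

    // Enrich each order with the rate valid at its processing time.
    val result = tEnv.sqlQuery(
      """
        |SELECT o.o_id, o.o_amount * r.r_rate AS amount
        |FROM Orders AS o,
        |  LATERAL TABLE (Rates(o.o_proctime)) AS r
        |WHERE o.o_currency = r.r_currency
      """.stripMargin)

    result.toAppendStream[(String, Long)].print()
    env.execute("temporal-join-enrichment")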
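For contrast, the regular join from Hequn's [2] would look like the query below (reusing the made-up tables from the sketch above). Note that Flink has to keep *both* inputs in state for this variant, which is usually what you want to avoid when one side is an unbounded stream:

    // Assuming the dimension table from the sketch above is also
    // registered under a name:
    tEnv.registerTable("RatesHistory", ratesTable)

    // A regular join keeps both Orders and RatesHistory in state
    // indefinitely (unless an idle-state retention time is configured).
    val regularResult = tEnv.sqlQuery(
      """
        |SELECT o.o_id, o.o_amount * r.r_rate AS amount
        |FROM Orders AS o
        |JOIN RatesHistory AS r ON o.o_currency = r.r_currency
      """.stripMargin)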