Hi Jason,

The best option would indeed be to make the dimension data available in
an external system that Flink can query through a connector such as JDBC,
HBase or Hive. Those connectors do support lookup joins.
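
As a rough illustration of the JDBC route, here is a minimal sketch. It assumes the CSV dimension data has been loaded into a MySQL table and that the Kinesis-backed table exposes a processing-time column (for example `proc_time AS PROCTIME()`), which a lookup join needs. All table names, column names and connection settings (`dim_customer`, `events`, `enriched_events`, the JDBC URL) are hypothetical, and the cache options shown are the ones documented for the 1.14 JDBC connector, so please check them against the version you run.

```python
# Sketch only: every table name, column and connection setting below is a
# hypothetical placeholder, not taken from the original setup.

# Dimension table served from MySQL through the JDBC connector.
# The lookup cache options are optional; without them every probe hits the DB.
DIM_TABLE_DDL = """
CREATE TABLE dim_customer (
  customer_id STRING,
  customer_name STRING,
  PRIMARY KEY (customer_id) NOT ENFORCED
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:mysql://localhost:3306/refdata',
  'table-name' = 'dim_customer',
  'lookup.cache.max-rows' = '10000',
  'lookup.cache.ttl' = '10min'
)
"""

# Lookup join: each row of `events` is enriched by querying the dimension
# table as of the row's processing time. `events` and `enriched_events`
# are assumed to be defined elsewhere (source and sink tables).
LOOKUP_JOIN_SQL = """
INSERT INTO enriched_events
SELECT e.event_id, e.customer_id, d.customer_name
FROM events AS e
JOIN dim_customer FOR SYSTEM_TIME AS OF e.proc_time AS d
  ON e.customer_id = d.customer_id
"""


def run(table_env):
    """Register the dimension table, then submit the lookup join.

    `table_env` is expected to behave like a PyFlink TableEnvironment,
    i.e. to provide an execute_sql(statement) method.
    """
    table_env.execute_sql(DIM_TABLE_DDL)
    return table_env.execute_sql(LOOKUP_JOIN_SQL)
```

The cache settings trade freshness for fewer database round trips, so a short TTL is the safer choice if the dimension data changes often.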

Best regards,


On Thu, 20 Jan 2022 at 22:11, Jason Yi <93t...@gmail.com> wrote:

> Thanks for the quick response.
> Is there any best or suggested practice for the use case of when we have
> data sets in a filesystem that we want to use in Flink as reference data
> (like dimension data)?
>    - Would making dimension data a Hive table or loading it into a table
>    in RDBMS (like MySQL) be the best option for the use case?
>    - Or should we consider having a staging area where Flink's output
>    would be stored, and then having another application (like Spark)
>    join Flink's output to the dimension data?
> Jason.
> On Thu, Jan 20, 2022 at 12:23 PM Martijn Visser <mart...@ververica.com>
> wrote:
>> Hi Jason,
>> It's not (properly) supported and we should update the documentation.
>> There is no out of the box possibility to use a file from filesystem as a
>> lookup table as far as I know.
>> Best regards,
>> Martijn
>> On Thu, 20 Jan 2022 at 18:44, Jason Yi <93t...@gmail.com> wrote:
>>> Hello,
>>> I have data sets in s3 and want to use them as lookup tables in Flink. I
>>> defined tables with the filesystem connector and joined the tables to a
>>> table, defined with the Kinesis connector, in my Flink application. I
>>> expected its output to be written to s3, but no data was written to a sink
>>> table.
>>> According to the Flink doc (
>>> https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/overview/#supported-connectors),
>>> filesystem is available for a lookup source. I wonder if this is true.
>>> If the filesystem connector is not available for lookup tables, is there
>>> any alternative way to use data from s3 as a lookup table in Flink?
>>> Flink version: 1.14.0 (on EMR 6.5)
>>> Kinesis source table: a watermark was defined.
>>> Lookup data: CSV data in s3.
>>> Sink table: Hudi connector
>>> Please let me know if I'm missing anything.
>>> Thanks in advance.
>>> Jason.
>> --
>> Martijn Visser | Product Manager
>> mart...@ververica.com
>> <https://www.ververica.com/>
