Re: Question - Filesystem connector for lookup table

Martijn Visser Thu, 20 Jan 2022 22:06:54 -0800

Hi Jason,

The best option would indeed be to make the dimension data available in
something like a database which you can access via JDBC, HBase or Hive.
Those do support lookups.
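
To make the idea concrete, here is a minimal sketch of the pattern, not Flink
code: the CSV dimension data is loaded once into an indexed database table, and
each incoming event is enriched by a keyed lookup at processing time. In Flink
SQL the equivalent would be a lookup join (FOR SYSTEM_TIME AS OF) against a
table backed by a connector that supports lookups, such as JDBC. The table name
product_dim, the column names, and the sample rows are all hypothetical, and
SQLite stands in for the real database.

```python
import csv
import io
import sqlite3

# Hypothetical dimension data, standing in for a CSV file read from S3.
DIMENSION_CSV = """product_id,product_name
p1,Keyboard
p2,Mouse
"""


def load_dimension(conn, csv_text):
    """Load CSV rows into a table keyed by product_id so lookups are cheap."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS product_dim ("
        "product_id TEXT PRIMARY KEY, product_name TEXT)"
    )
    rows = [
        (r["product_id"], r["product_name"])
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    conn.executemany("INSERT OR REPLACE INTO product_dim VALUES (?, ?)", rows)
    conn.commit()


def enrich(conn, events):
    """Look up each event's key in the dimension table as the event arrives."""
    for event in events:
        row = conn.execute(
            "SELECT product_name FROM product_dim WHERE product_id = ?",
            (event["product_id"],),
        ).fetchone()
        # Left-join semantics: keep the event even if no dimension row matches.
        yield {**event, "product_name": row[0] if row else None}


conn = sqlite3.connect(":memory:")  # assumption: SQLite replaces MySQL etc.
load_dimension(conn, DIMENSION_CSV)

# Hypothetical events, standing in for records from the Kinesis source.
events = [
    {"order_id": 1, "product_id": "p2"},
    {"order_id": 2, "product_id": "p9"},
]
enriched = list(enrich(conn, events))
```

Because the lookup happens per event, updates to the dimension table are
picked up by later events without restarting the job, which is the main reason
to prefer a lookup-capable store over a static file.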


Best regards,

Martijn

On Thu, 20 Jan 2022 at 22:11, Jason Yi <93t...@gmail.com> wrote:

> Thanks for the quick response.
>
> Is there a best or suggested practice for the case where we have data
> sets in a filesystem that we want to use in Flink as reference data
> (like dimension data)?
>
>    - Would making the dimension data a Hive table, or loading it into a
>    table in an RDBMS (like MySQL), be the best option for this use case?
>    - Or should we consider a staging area where Flink's output would be
>    stored, and then use another application (like Spark) to join that
>    output to the dimension data?
>
> Jason.
>
> On Thu, Jan 20, 2022 at 12:23 PM Martijn Visser <mart...@ververica.com>
> wrote:
>
>> Hi Jason,
>>
>> It's not (properly) supported and we should update the documentation.
>>
>> As far as I know, there is no out-of-the-box way to use a file from a
>> filesystem as a lookup table.
>>
>> Best regards,
>>
>> Martijn
>>
>> On Thu, 20 Jan 2022 at 18:44, Jason Yi <93t...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I have data sets in s3 and want to use them as lookup tables in Flink. I
>>> defined tables with the filesystem connector and joined them to a table
>>> defined with the Kinesis connector in my Flink application. I expected
>>> the output to be written to s3, but no data was written to the sink
>>> table.
>>>
>>> According to the Flink doc (
>>> https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/overview/#supported-connectors),
>>> filesystem is available as a lookup source. I wonder if this is true.
>>>
>>> If the filesystem connector is not available for lookup tables, is there
>>> any alternative way to use data from s3 as a lookup table in Flink?
>>>
>>> Flink version: 1.14.0 (on EMR 6.5)
>>> Kinesis source table: a watermark was defined.
>>> Lookup data: CSV data in s3.
>>> Sink table: Hudi connector
>>>
>>> Please let me know if I'm missing anything.
>>>
>>> Thanks in advance.
>>> Jason.
>>>
>> --
>>
>> Martijn Visser | Product Manager
>>
>> mart...@ververica.com
