Re: How to stream CSV from S3?

John Smith Tue, 28 Jul 2020 00:54:22 -0700

Also, this is where I find the "connectors" section of the docs confusing.
File system isn't listed under data streaming, but env.readCsvFile seems like
it could do the trick?
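
For reference, env.readCsvFile() belongs to the DataSet ExecutionEnvironment, not to StreamExecutionEnvironment, so in the DataStream API the closest fit appears to be readFile() with FileProcessingMode.PROCESS_CONTINUOUSLY. Below is a minimal sketch, assuming Flink 1.10 with one of the S3 filesystem plugins installed. The bucket path, class name, job name and 10-second scan interval are made up for illustration, and the comma split is a naive stand-in for real CSV parsing.

```java
import org.apache.flink.api.java.io.TextInputFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.FileProcessingMode;

public class S3CsvMonitorJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Hypothetical bucket and prefix; replace with a real location.
        // Assumes an S3 filesystem plugin (e.g. flink-s3-fs-hadoop) is set up.
        String path = "s3://my-bucket/incoming/";

        // Read files line by line; each CSV row arrives as one String.
        TextInputFormat format = new TextInputFormat(new Path(path));

        // Re-scan the path every 10 seconds (interval chosen arbitrarily)
        // and emit the lines of files that appear under it.
        DataStream<String> lines = env.readFile(
                format, path, FileProcessingMode.PROCESS_CONTINUOUSLY, 10_000L);

        // Naive split on commas; does not handle quoted fields or headers.
        lines.map(line -> line.split(",", -1).length + " fields: " + line)
             .print();

        env.execute("s3-csv-monitor");
    }
}
```

Note that in this mode a file that is modified gets re-processed in full, so the approach fits best when files are written once and never appended to.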


On Tue., Jul. 28, 2020, 3:46 a.m. John Smith, <java.dev....@gmail.com>
wrote:

> Basically, I want to "monitor" a bucket on S3 and, for every file that gets
> created in that bucket, read it and stream it.
>
> If I understand correctly, I can just use env.readCsvFile() and configure
> it to continuously read a folder path?
>
>
> On Tue., Jul. 28, 2020, 1:38 a.m. Jingsong Li, <jingsongl...@gmail.com>
> wrote:
>
>> Hi John,
>>
>> Do you mean you want to read S3 CSV files using partition/bucket pruning?
>>
>> If you are just using the DataSet API, you can use CsvInputFormat to read
>> CSV files.
>>
>> If you want to use the Table/SQL API: in 1.10, the CSV format in the Table
>> API does not support partitioned tables, so the only way is to specify the
>> partition/bucket path and read a single directory.
>>
>> In 1.11, the Table/SQL filesystem connector with the CSV format supports
>> partitioned tables, with complete support for partition semantics [1].
>>
>> [1]
>> https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/filesystem.html
>>
>> Best,
>> Jingsong
>>
>> On Mon, Jul 27, 2020 at 10:54 PM John Smith <java.dev....@gmail.com>
>> wrote:
>>
>>> Hi, using Flink 1.10
>>>
>>> 1- How do we go about reading CSV files that are copied to s3 buckets?
>>> 2- Is there a source that can tail S3 and start reading a CSV when it is
>>> copied to S3?
>>> 3- Is that part of the table APIs?
>>>
>>
>>
>> --
>> Best, Jingsong Lee
>>
>
