Hi Ruben,

By looking at the code, it seems you should be able to do that. At least for batch workloads we are using org.apache.flink.formats.csv.CsvFileSystemFormatFactory.CsvInputFormat, which is a FileInputFormat that supports the mentioned configuration option.

The problem is that this might not have been exposed via SQL properties yet. So you would need to write your own property-to-InputFormat factory, similar to:

https://github.com/apache/flink/blob/master/flink-formats/flink-csv/src/main/java/org/apache/flink/formats/csv/CsvFileSystemFormatFactory.java

What you could do is create your own factory that extends the one above, so you can set additional properties. Not a nice solution, but a workaround for now.
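A rough, uncompiled sketch of that workaround follows. The identifier "recursive-csv" is made up, and the exact override points (factoryIdentifier, createReader(ReaderContext)) are from memory of the Flink 1.11 FileSystemFormatFactory interface, so please verify them against the linked CsvFileSystemFormatFactory source for your version. The key point is that FileInputFormat exposes setNestedFileEnumeration(boolean), which is what the "recursive.file.enumeration" option toggles:

```java
import org.apache.flink.api.common.io.FileInputFormat;
import org.apache.flink.api.common.io.InputFormat;
import org.apache.flink.formats.csv.CsvFileSystemFormatFactory;
import org.apache.flink.table.data.RowData;

// Sketch only: reuse the CSV reader but force nested file enumeration.
public class RecursiveCsvFormatFactory extends CsvFileSystemFormatFactory {

    @Override
    public String factoryIdentifier() {
        // Hypothetical identifier; you would then use 'format' = 'recursive-csv'
        // in the table definition instead of 'csv'.
        return "recursive-csv";
    }

    @Override
    public InputFormat<RowData, ?> createReader(ReaderContext context) {
        InputFormat<RowData, ?> format = super.createReader(context);
        if (format instanceof FileInputFormat) {
            // This is the switch that "recursive.file.enumeration" controls.
            ((FileInputFormat<RowData>) format).setNestedFileEnumeration(true);
        }
        return format;
    }
}
```

You would also need to register the factory via the usual service-loader mechanism (META-INF/services) so the SQL planner can find it; the docs linked below describe that part.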

More information on how to write your own factory can be found here:

https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/sourceSinks.html

I hope this helps.

Regards,
Timo

On 09.11.20 09:27, Ruben Laguna wrote:
Is it possible?

For Dataset I've found [1] :

parameters.setBoolean("recursive.file.enumeration", true);

// pass the configuration to the data source
DataSet<String> logs = env.readTextFile("file:///path/with.nested/files")
                          .withParameters(parameters);


But can I achieve something similar with the Table SQL?

I have the following directory structure
/myfiles/20201010/00/00restoffilename1.csv
/myfiles/20201010/00/00restoffilename2.csv
...
/myfiles/20201010/00/00restoffilename3000.csv
/myfiles/20201010/01/01restoffilename1.csv
....
/myfiles/20201010/00/00restoffilename3000.csv

So for each day I have 256 subdirectories, from 00 to FF, and each of those directories can have 1000-3000 files, and I would like to load all those files in one go.
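[Editor's note: for reference, the set of files that recursive enumeration has to cover can be illustrated with plain JDK code, no Flink involved. This is only a sketch of the layout described above, using made-up file names; a Flink FileInputFormat with recursive.file.enumeration=true performs the equivalent traversal internally.]

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class RecursiveListing {

    // Collect every .csv file under root, descending into all subdirectories.
    static List<Path> listCsvFiles(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.toString().endsWith(".csv"))
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny copy of the day/hour layout: myfiles/20201010/<hh>/<file>.csv
        Path root = Files.createTempDirectory("myfiles");
        for (String hour : new String[] {"00", "01", "FF"}) {
            Path dir = Files.createDirectories(root.resolve("20201010").resolve(hour));
            Files.createFile(dir.resolve(hour + "file1.csv"));
        }
        System.out.println(listCsvFiles(root).size()); // prints 3
    }
}
```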

[1]: https://ci.apache.org/projects/flink/flink-docs-stable/dev/batch/#recursive-traversal-of-the-input-path-directory

--
/Rubén
