However, this workaround only works when you have access to the
underlying FileInputFormat. With the SQL and Table APIs you do not, so
you cannot apply it there. What we could do is open a PR that adds glob
support at the FileInputFormat level, so that all APIs benefit.
I'll do it if everyone agrees.
Best
Etienne Chauchot
On 25/03/2021 13:12, Etienne Chauchot wrote:
Hi all,
In case it is useful to some of you:
I have a large batch job that needs globs (*.parquet, for example) to
read its input files. Globs do not seem to work out of the box (see
https://issues.apache.org/jira/browse/FLINK-6417).
But there is a workaround:
/* in practice, any concrete subclass of FileInputFormat;
   extractDir(filePath) extracts the parent dir */
final FileInputFormat inputFormat =
    new FileInputFormat(new Path(extractDir(filePath)));
/* filePath contains the glob; the whole path needs to be provided to
   GlobFilePathFilter */
inputFormat.setFilesFilter(
    new GlobFilePathFilter(Collections.singletonList(filePath), Collections.emptyList()));
inputFormat.setNestedFileEnumeration(true);
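
The extractDir helper is not shown in the snippet above. As a minimal
sketch, assuming it should return the deepest directory of the path that
contains no glob character, it might look like the following. The class
name GlobPaths and the method body are hypothetical and not part of
Flink; only the name extractDir comes from the snippet.

```java
// Hypothetical helper class; not part of Flink.
class GlobPaths {

    // Returns the deepest directory of `path` that contains no glob
    // metacharacter (*, ?, [ or {). This is the directory that would be
    // passed to the FileInputFormat constructor, while the full path
    // (glob included) goes to GlobFilePathFilter.
    static String extractDir(String path) {
        // Find the first glob metacharacter, if any.
        int firstGlob = -1;
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            if (c == '*' || c == '?' || c == '[' || c == '{') {
                firstGlob = i;
                break;
            }
        }
        // Without a glob, fall back to the parent directory of the file.
        int end = (firstGlob < 0) ? path.length() : firstGlob;
        int lastSlash = path.lastIndexOf('/', end - 1);
        if (lastSlash < 0) {
            return ".";      // relative pattern with no directory part
        }
        if (lastSlash == 0) {
            return "/";      // glob sits directly under the root
        }
        return path.substring(0, lastSlash);
    }
}
```

With this sketch, extractDir("/data/input/*.parquet") yields
"/data/input", and a glob in an intermediate directory such as
"/data/2021-*/part-*.parquet" yields "/data", which is why nested file
enumeration has to be enabled.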
Hope it helps some people.
Etienne Chauchot