Hi Lorenzo,

    Reading a previous thread [1] and the source code, I think you can set 
inputFormat.setNestedFileEnumeration(true) to also scan nested files.
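
A minimal sketch of how that could look with your setup (this assumes an
Avro-generated class MyRecord and a placeholder base path, which are not from
your thread; adjust to your own types and bucket):

    import org.apache.flink.core.fs.Path;
    import org.apache.flink.formats.avro.AvroInputFormat;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.source.FileProcessingMode;

    public class NestedS3Ingest {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();

            // Placeholder base path; the connector writes date-bucketed subdirectories under it.
            String basePath = "s3://my-bucket/topics/";

            AvroInputFormat<MyRecord> inputFormat =
                    new AvroInputFormat<>(new Path(basePath), MyRecord.class);

            // The key setting: also enumerate files in nested subdirectories.
            inputFormat.setNestedFileEnumeration(true);

            env.readFile(inputFormat, basePath,
                         FileProcessingMode.PROCESS_CONTINUOUSLY, 60_000)
               .print();

            env.execute("Monitor nested S3 directories");
        }
    }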

Best,
Yun

[1] 
https://lists.apache.org/thread.html/86a23b4c44d92c3adeb9ff4a708365fe4099796fb32deb6319e0e17f%40%3Cuser.flink.apache.org%3E



------------------------------------------------------------------
Sender: Lorenzo Nicora <lorenzo.nic...@gmail.com>
Date: 2020/06/11 21:31:20
Recipient: user <user@flink.apache.org>
Subject: Reading files from multiple subdirectories

Hi,

This is related to the same case I am discussing in another thread, but not related to 
AVRO this time :)

I need to ingest files that an S3 Sink Kafka Connector periodically adds to an S3 
bucket.
The files are bucketed by date and time, as often happens.

Is there any way, using Flink only, to monitor a base path and detect new files 
in any of its subdirectories? 
Or do I need something external to move new files into a single directory?

I am currently using 
env.readFile(inputFormat, path, PROCESS_CONTINUOUSLY, 60000)
with AvroInputFormat, but it seems to monitor only a single directory.


Cheers
Lorenzo 
