Re: Does FileSource download all remote files for generating splits

2022-01-27 Thread Meghajit Mazumdar
Thanks Caizhi. This clarifies it.

On Fri, Jan 28, 2022 at 12:06 PM Caizhi Weng wrote:
> Hi!
>
> FileEnumerator never reads the actual content of a file. FileEnumerator
> lives in job managers and it only reads the necessary metadata of the file
> (for example, how large the file is) so that it can

Re: Does FileSource download all remote files for generating splits

2022-01-27 Thread Caizhi Weng
Hi! FileEnumerator never reads the actual content of a file. FileEnumerator lives in job managers and it only reads the necessary metadata of the file (for example, how large the file is) so that it can split the work across all task managers. Corresponding file readers, on the other hand, live i
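
To illustrate the point about metadata-only enumeration, here is a minimal sketch in plain Java. It is not Flink's actual FileEnumerator code or API: the class SplitSketch, the nested Split type, the method enumerateSplits, and the maxSplitSize parameter are all hypothetical names chosen for this example. The only input is the file's path and size, which is the kind of metadata a file system can report without transferring the file's content.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; these names are not part of Flink.
class SplitSketch {

    /** A byte range of one file, to be handed to a reader later. */
    static final class Split {
        final String path;
        final long offset;
        final long length;

        Split(String path, long offset, long length) {
            this.path = path;
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * Builds splits from metadata alone (path and size).
     * No file content is opened or read here.
     */
    static List<Split> enumerateSplits(String path, long fileSize, long maxSplitSize) {
        if (maxSplitSize <= 0) {
            throw new IllegalArgumentException("maxSplitSize must be positive");
        }
        List<Split> splits = new ArrayList<>();
        for (long offset = 0; offset < fileSize; offset += maxSplitSize) {
            // The last split may be shorter than maxSplitSize.
            long length = Math.min(maxSplitSize, fileSize - offset);
            splits.add(new Split(path, offset, length));
        }
        return splits;
    }
}
```

Under these assumptions, a 250-byte file with a 100-byte cap yields three splits of 100, 100, and 50 bytes. Only the component that later processes a split would need to read the corresponding byte range.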

Does FileSource download all remote files for generating splits

2022-01-27 Thread Meghajit Mazumdar
Hello, I have a question about the FileSource in Flink 1.14. When FileSource is set to read from a remote GCS URL, I understand that the FileEnumerator is actu