Re: FileSource Usage

2022-01-20 Thread Guowei Ma
Hi Meghajit, Thanks for sharing your use case. I found a workaround that you could try: name your files in a timestamp style. More details can be found here [1]. Another small concern is that Flink is a distributed system, which means that we cannot assume any order even if we
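
As a rough illustration of the timestamp-style naming mentioned above, here is a minimal sketch. It assumes zero-padded UTC timestamps as file names, so that sorting the names as plain strings matches chronological order. The helper name `timestamped_name` and the `.parquet` suffix are hypothetical, the details behind [1] are not reproduced here, and nothing in this sketch makes Flink itself read files in that order.

```python
from datetime import datetime, timezone


def timestamped_name(ts: datetime, suffix: str = ".parquet") -> str:
    """Hypothetical helper: build a file name from a zero-padded UTC timestamp."""
    # Fixed-width fields (year first, seconds last) make string order
    # agree with time order.
    return ts.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ") + suffix


# Example arrival times, deliberately out of order.
arrivals = [
    datetime(2022, 1, 20, 9, 30, 0, tzinfo=timezone.utc),
    datetime(2022, 1, 19, 23, 5, 0, tzinfo=timezone.utc),
    datetime(2022, 1, 20, 0, 0, 0, tzinfo=timezone.utc),
]

names = [timestamped_name(t) for t in arrivals]

# Sorting the names as strings recovers the chronological order.
ordered = sorted(names)
for name in ordered:
    print(name)
```

Whether a listing is actually returned in this order still depends on the FileSystem implementation, so a reader of the bucket may need to sort the names explicitly.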

Re: FileSource Usage

2022-01-20 Thread Meghajit Mazumdar
Hi Guowei, Thanks for your answer. Regarding your question, *> Currently there is no such public interface, which you could extend to implement your own strategy. Would you like to share the specific problem you currently meet?* The GCS bucket that we are trying to read from is periodically popul

Re: FileSource Usage

2022-01-20 Thread Guowei Ma
Hi Meghajit, 1. From the implementation [1], the order of the splits depends on the implementation of the FileSystem. 2. From the implementation [2], the order of the files also depends on the implementation of the FileSystem. 3. Currently there is no such public interface, which you could extend to imp

FileSource Usage

2022-01-19 Thread Meghajit Mazumdar
Hello, We are using FileSource to process Parquet files and had a few questions about it. We would really appreciate it if somebody could help answer them: 1. For a given file, does FileSource read the contents inside it in order? In o