I’m using both a local (Unix) file system and hdfs. I’m going to check those to get ideas, thank you!
I’m also checking the internal code of the class and my own older patch code. On Fri 12. Oct 2018 at 21:32, Fabian Hueske <fhue...@gmail.com> wrote: > Hi, > > Which file system are you reading from? If you are reading from S3, this > might be cause by S3's eventual consistency property. > Have a look at FLINK-9940 [1] for a more detailed discussion. > There is also an open PR [2], that you could try to patch the source > operator with. > > Best, Fabian > > [1] https://issues.apache.org/jira/browse/FLINK-9940 > [2] https://github.com/apache/flink/pull/6613 > > Am Fr., 12. Okt. 2018 um 20:41 Uhr schrieb Juan Miguel Cejuela < > jua...@tagtog.net>: > >> Dear flinksters, >> >> >> I'm using the class `ContinuousFileMonitoringFunction` as a source to >> monitor a folder for new incoming files.* I have the problem that not >> all the files that are sent to the folder get processed / triggered by the >> function*. Specific details of my workflow is that I send up to 1k files >> per minute, and that I consume the stream as a `AsyncDataStream`. >> >> I myself raised an unrelated issue with the >> `ContinuousFileMonitoringFunction` class some time ago ( >> https://issues.apache.org/jira/browse/FLINK-8046): if two or more files >> shared the very same timestamp, only the first one (non-deterministically >> chosen) would be processed. However, I patched the file myself to fix that >> problem by using a LinkedHashMap to remember which files had been really >> processed before or not. My patch is working fine as far as I can tell. >> >> The problem seems to be rather that some files (when many are sent at >> once to the same folder) do not even get triggered/activated/registered by >> the class. >> >> >> Am I properly explaining my problem? >> >> >> Any hints to solve this challenge would be greatly appreciated ! ❤ THANK >> YOU >> >> -- >> Juanmi, CEO and co-founder @ 🍃tagtog.net >> >> Follow tagtog updates on 🐦 Twitter: @tagtog_net >> <https://twitter.com/tagtog_net> >> >> -- Juanmi, CEO and co-founder @ 🍃tagtog.net Follow tagtog updates on 🐦 Twitter: @tagtog_net <https://twitter.com/tagtog_net>