Hi,

Hadoop FileInputFormats (by default) also filter out hidden files, i.e. files whose names start with "." or "_". You can override this behaviour in Flink by subclassing TextInputFormat and overriding the acceptFile() method, and then use the custom input format with ExecutionEnvironment.readFile().
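For context, the default hidden-file rule treats any name beginning with "_" or "." as hidden, which is why files like "_part-1-0.csv" silently disappear when reading a directory. A minimal pure-Scala sketch of that predicate (the helper name here is made up for illustration; the real check lives inside Flink's FileInputFormat):

```scala
// Sketch of the default hidden-file rule applied by Hadoop-style input formats.
// The function name is hypothetical; it only illustrates the filtering logic.
def isHiddenByDefault(fileName: String): Boolean =
  fileName.startsWith("_") || fileName.startsWith(".")

// "_part-1-0.csv" is treated as hidden and skipped during directory scans,
// while "part-1-0.csv" would be read normally.
```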
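A sketch of the override itself, assuming Flink 1.3's DataSet API is on the classpath (the class name and directory path are hypothetical, not from the thread):

```scala
// Sketch only — assumes Flink 1.3; class and path names are illustrative.
import org.apache.flink.api.scala._
import org.apache.flink.api.java.io.TextInputFormat
import org.apache.flink.core.fs.{FileStatus, Path}

// A TextInputFormat that no longer filters out "_"-prefixed files.
class AcceptUnderscoreTextInputFormat(path: Path) extends TextInputFormat(path) {
  override def acceptFile(fileStatus: FileStatus): Boolean = {
    val name = fileStatus.getPath.getName
    // Keep skipping "."-prefixed files, but let "_part-..." files through.
    !name.startsWith(".")
  }
}

val env = ExecutionEnvironment.getExecutionEnvironment
val dir = "file:///path/to/output"  // hypothetical directory
val testInput = env.readFile(new AcceptUnderscoreTextInputFormat(new Path(dir)), dir)
testInput.print()
```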
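On the BucketingSink side of the question: the prefixes and suffixes of its in-progress, pending, and part files are set via methods on the sink. A sketch, assuming Flink 1.3's flink-connector-filesystem is on the classpath (the path is hypothetical, and the values shown are believed to be the defaults):

```scala
// Sketch only — assumes Flink 1.3 with flink-connector-filesystem.
import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink

val sink = new BucketingSink[String]("/base/path")  // hypothetical output path
sink.setPartPrefix("part")            // prefix of finished part files
sink.setInProgressPrefix("_")         // file currently being written
sink.setInProgressSuffix(".in-progress")
sink.setPendingPrefix("_")            // closed, waiting for a checkpoint
sink.setPendingSuffix(".pending")
```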
Regarding BucketingSink, you can change both the prefixes and suffixes of the various files using its configuration methods.

Best,
Aljoscha

> On 27. Jun 2017, at 11:53, Adarsh Jain <eradarshj...@gmail.com> wrote:
>
> Thanks Stefan, my colleague Shashank has filed a bug for the same in jira:
>
> https://issues.apache.org/jira/browse/FLINK-6993
>
> Regards,
> Adarsh
>
> On Fri, Jun 23, 2017 at 8:19 PM, Stefan Richter <s.rich...@data-artisans.com> wrote:
> Hi,
>
> I suggest that you simply open an issue for this in our jira, describing the improvement idea. That should be the fastest way to get this changed.
>
> Best,
> Stefan
>
>> Am 23.06.2017 um 15:08 schrieb Adarsh Jain <eradarshj...@gmail.com>:
>>
>> Hi Stefan,
>>
>> I think I found the problem; try it with a file whose name starts with an underscore, like "_part-1-0.csv".
>>
>> When saving, Flink prepends a "_" to the file name, but when reading at the folder level it does not pick up those files.
>>
>> Can you suggest a setting so that it does not prepend an underscore when saving a file?
>>
>> Regards,
>> Adarsh
>>
>> On Fri, Jun 23, 2017 at 3:24 PM, Stefan Richter <s.rich...@data-artisans.com> wrote:
>> No, that doesn’t make a difference and also works.
>>
>>> Am 23.06.2017 um 11:40 schrieb Adarsh Jain <eradarshj...@gmail.com>:
>>>
>>> I am using "val env = ExecutionEnvironment.getExecutionEnvironment"; can this be the problem?
>>>
>>> With "import org.apache.flink.api.scala.ExecutionEnvironment"
>>>
>>> Using Scala in my program.
>>>
>>> Regards,
>>> Adarsh
>>>
>>> On Fri, Jun 23, 2017 at 3:01 PM, Stefan Richter <s.rich...@data-artisans.com> wrote:
>>> I just copy-pasted your code, adding the missing "val env = LocalEnvironment.createLocalEnvironment()", and exchanged the string with a local directory containing some test files that I created. No other changes.
>>>
>>>> Am 23.06.2017 um 11:25 schrieb Adarsh Jain <eradarshj...@gmail.com>:
>>>>
>>>> Hi Stefan,
>>>>
>>>> Thanks for your efforts in checking the same; it still doesn't work for me.
>>>>
>>>> Can you copy-paste the code you used? Maybe I am making some silly mistake and am not able to spot it.
>>>>
>>>> Thanks again.
>>>>
>>>> Regards,
>>>> Adarsh
>>>>
>>>> On Fri, Jun 23, 2017 at 2:32 PM, Stefan Richter <s.rich...@data-artisans.com> wrote:
>>>> Hi,
>>>>
>>>> I tried this out on the current master and the 1.3 release, and both work for me: everything behaves exactly as expected for file names, a directory, and even nested directories.
>>>>
>>>> Best,
>>>> Stefan
>>>>
>>>>> Am 22.06.2017 um 21:13 schrieb Adarsh Jain <eradarshj...@gmail.com>:
>>>>>
>>>>> Hi Stefan,
>>>>>
>>>>> Yes, you understood right: when I give the full path including the filename it works fine, but when I give the path only up to the directory it does not read the data and doesn't print any exceptions either. I am also not sure why it is behaving like this.
>>>>>
>>>>> It should be easily replicable, in case you can try. That would be really helpful.
>>>>>
>>>>> Regards,
>>>>> Adarsh
>>>>>
>>>>> On Thu, Jun 22, 2017 at 9:00 PM, Stefan Richter <s.rich...@data-artisans.com> wrote:
>>>>> Hi,
>>>>>
>>>>> I am not sure I am getting the problem right: the code works if you use a file name, but it does not work for directories? What exactly is not working? Do you get any exceptions?
>>>>>
>>>>> Best,
>>>>> Stefan
>>>>>
>>>>>> Am 22.06.2017 um 17:01 schrieb Adarsh Jain <eradarshj...@gmail.com>:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am trying to use "Recursive Traversal of the Input Path Directory" in Flink 1.3 using Scala; a snippet of my code is below. If I give an exact file name it works fine. Ref: https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/batch/index.html
>>>>>>
>>>>>> import org.apache.flink.api.java.utils.ParameterTool
>>>>>> import org.apache.flink.api.java.{DataSet, ExecutionEnvironment}
>>>>>> import org.apache.flink.configuration.Configuration
>>>>>>
>>>>>> val config = new Configuration
>>>>>> config.setBoolean("recursive.file.enumeration", true)
>>>>>>
>>>>>> val featuresSource: String = "file:///Users/adarsh/Documents/testData/featurecsv/31c710ac40/2017/06/22"
>>>>>>
>>>>>> val testInput = env.readTextFile(featuresSource).withParameters(config)
>>>>>> testInput.print()
>>>>>>
>>>>>> Please guide how to fix this.
>>>>>>
>>>>>> Regards,
>>>>>> Adarsh