Re: Is there a way to get file "metadata" as part of stream?

2020-08-03 Thread Till Rohrmann
Hi John, out of the box, Flink does not provide this functionality. However, you might be able to write your own CsvInputFormat which overrides fillRecord so that it generates a CSV record where the first field contains the filename. You can obtain the filename from the field currentSplit. I haven
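
To illustrate the record shape described above (filename as the first field, followed by the parsed CSV fields), here is a minimal plain-Java sketch. It deliberately does not use the Flink API: `FilenameTagger`, `prependFileName` and `tagLines` are hypothetical names invented for this example, and the naive `split(",")` ignores quoted fields. In an actual custom input format, the same prepend step would presumably live inside the overridden fillRecord, with the name taken from currentSplit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Flink: shows the "filename as first field" record layout.
class FilenameTagger {

    // Returns a new array with the file name at index 0, followed by the original fields.
    static String[] prependFileName(String fileName, String[] fields) {
        String[] tagged = new String[fields.length + 1];
        tagged[0] = fileName;
        System.arraycopy(fields, 0, tagged, 1, fields.length);
        return tagged;
    }

    // Splits each CSV line naively on commas (no quoting support) and tags it with the file name.
    static List<String[]> tagLines(String fileName, List<String> csvLines) {
        List<String[]> rows = new ArrayList<>();
        for (String line : csvLines) {
            // limit -1 keeps trailing empty fields such as "a,b,"
            rows.add(prependFileName(fileName, line.split(",", -1)));
        }
        return rows;
    }
}
```

Every row of one file then carries the same first field, which can be used downstream as the tag for the rows.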

Is there a way to get file "metadata" as part of stream?

2020-07-31 Thread John Smith
Hi, I am reading a CSV file using env.readFile() with RowCsvInputFormat. Is there a way to get the filename as part of the row stream? The file contains a unique identifier to tag the rows with.

Re: Get file metadata

2015-07-01 Thread Robert Metzger
Okay. We filter files starting with underscores because that is the same behavior as Hadoop. Hadoop always creates some underscore files, so without the filter Flink would read these files when reading the results of a MapReduce job. On Wed, Jul 1, 2015 at 12:15 PM, Ronny Bräunlich wrote: > Hi Robert, > > just ig
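
The filtering rule discussed here can be sketched in plain Java as follows. This is not the Flink source: `HiddenFileFilter`, `accept` and `visible` are hypothetical names for illustration. The underscore check comes from the thread; the additional dot-prefix check is an assumption based on the usual Hadoop hidden-file convention.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the acceptFile() check mentioned in the thread.
class HiddenFileFilter {

    // Rejects names starting with "_" (from the thread) or "." (assumed Hadoop convention).
    static boolean accept(String fileName) {
        return !fileName.startsWith("_") && !fileName.startsWith(".");
    }

    // Returns only the names that would be read under the rule above.
    static List<String> visible(List<String> fileNames) {
        List<String> kept = new ArrayList<>();
        for (String name : fileNames) {
            if (accept(name)) {
                kept.add(name);
            }
        }
        return kept;
    }
}
```

Under this rule a directory whose files all start with an underscore yields no input at all, which matches what Ronny observed with his .json files.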

Re: Get file metadata

2015-07-01 Thread Ronny Bräunlich
Hi Robert, just ignore my previous question. My files started with underscore and I just found out that FileInputFormat does filter for underscores in acceptFile(). Cheers, Ronny On 01.07.2015 at 11:35, Robert Metzger wrote: > Hi Ronny, > > check out this answer on SO: > http://stackoverfl

Re: Get file metadata

2015-07-01 Thread Ronny Bräunlich
Hi Robert, thank you for your quick answer. Just one additional question: When I use the ExecutionEnvironment like this: DataSource<String> files = env.readTextFile("file:///Users/me/path/to/file/dir"); Shouldn’t it read all the files in dir? I have three .json files there but when I print the result, n

Re: Get file metadata

2015-07-01 Thread Robert Metzger
Hi Ronny, check out this answer on SO: http://stackoverflow.com/questions/30599616/create-objects-from-input-files-in-apache-flink It is a similar use case ... I guess you can get the metadata from the input split as well. On Wed, Jul 1, 2015 at 11:30 AM, Ronny Bräunlich wrote: > Hello, > > I w

Get file metadata

2015-07-01 Thread Ronny Bräunlich
Hello, I want to read a directory containing text files with Flink. As I already found out, I can simply point the environment to the directory and it will read all the files. What I couldn’t find out is whether it’s possible to keep the file metadata somehow. Concretely, I need the timestamp, the filename and