Re: Customer inputformat

2017-08-03 Thread Ted Yu
Did you use StreamExecutionEnvironment.createFileInput() ? What did the modification times of the 2 files look like (were they the newest) ? Cheers On Mon, Jul 31, 2017 at 12:42 PM, Mohit Anchlia wrote: > Thanks! When I give path to a directory flink is only reading 2 files. It > seems to be p

Re: Customer inputformat

2017-07-31 Thread Mohit Anchlia
Thanks! When I give path to a directory flink is only reading 2 files. It seems to be picking these 2 files randomly. On Mon, Jul 31, 2017 at 12:05 AM, Fabian Hueske wrote: > Hi Mohit, > > as Ted said, there are plenty of InputFormats which are based on > FileInputFormat. > FileInputFormat also

Re: Customer inputformat

2017-07-31 Thread Fabian Hueske
Hi Mohit, as Ted said, there are plenty of InputFormats which are based on FileInputFormat. FileInputFormat also supports reading all files in a directory. Simply specify the path of the directory. Check StreamExecutionEnvironment.createFileInput() which takes a several parameters such as a FileI

Re: Customer inputformat

2017-07-30 Thread Ted Yu
For #1, you can find quite a few classes which extend FileInputFormat. e.g. flink-connectors/flink-avro/src/main/java/org/apache/flink/api/java/io/AvroInputFormat.java:public class AvroInputFormat extends FileInputFormat implements ResultTypeQuer flink-core/src/main/java/org/apache/flink/api/commo

Re: Customer inputformat

2017-07-30 Thread Mohit Anchlia
Thanks. Few more questions: - Is there an example for FileInputFormat? - how to make it read all the files in a directory? - how to make an inputformat a streaming input instead of batch? Eg: read as new files come to a dir. Thanks again. On Sun, Jul 30, 2017 at 12:53 AM, Fabian Hueske wrote:

Re: Customer inputformat

2017-07-30 Thread Fabian Hueske
Hi, Flink calls the reachedEnd() method before it calls nextRecord() and closes the IF when reachedEnd() returns true. So, it should not return true until nextRecord() was called and the first and last record was emitted. You might also want to built your PDFFileInputFormat on FileInputFormat and

Customer inputformat

2017-07-29 Thread Mohit Anchlia
Hi, I created a custom input format. Idea behind this is to read all binary files from a directory and use each file as it's own split. Each split is read as one whole record. When I run it in flink I don't get any error but I am not seeing any output from .print. Am I missing something? *p