Thank you, Fabian. I'll try to do it.

On Mon, Feb 8, 2016 at 4:37 PM, Fabian Hueske <fhue...@gmail.com> wrote:
> The SerializedInputFormat extends the BinaryInputFormat, which expects a
> special block-wise encoding and certain metadata fields.
> It is not suited to reading arbitrary binary files such as a file with 64
> short values.
> I suggest implementing a custom input format based on FileInputFormat.
>
> Best, Fabian
>
> 2016-02-08 22:05 GMT+01:00 Saliya Ekanayake <esal...@gmail.com>:
>
>> Thank you, Fabian. It solved the compilation error, but at runtime I get
>> an end-of-file exception. I've put up sample code with data on GitHub:
>> https://github.com/esaliya/flinkit. The data file is a binary file
>> containing 64 Short values.
>>
>> 02/08/2016 16:01:19 CHAIN DataSource (at main(WordCount.java:25) (org.apache.flink.api.common.io.SerializedInputFormat)) -> FlatMap (count())(4/8) switched to FAILED
>> java.io.EOFException
>>     at java.io.DataInputStream.readShort(DataInputStream.java:315)
>>     at org.apache.flink.core.memory.InputViewDataInputStreamWrapper.readShort(InputViewDataInputStreamWrapper.java:92)
>>     at org.apache.flink.types.ShortValue.read(ShortValue.java:88)
>>     at org.apache.flink.api.common.io.SerializedInputFormat.deserialize(SerializedInputFormat.java:37)
>>     at org.apache.flink.api.common.io.SerializedInputFormat.deserialize(SerializedInputFormat.java:31)
>>     at org.apache.flink.api.common.io.BinaryInputFormat.nextRecord(BinaryInputFormat.java:274)
>>     at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:169)
>>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:584)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>> On Mon, Feb 8, 2016 at 3:50 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> please try to replace
>>> DataSet<ShortValue> ds = env.createInput(sif);
>>> by
>>> DataSet<ShortValue> ds = env.createInput(sif, ValueTypeInfo.SHORT_VALUE_TYPE_INFO);
>>>
>>> Best, Fabian
>>>
>>> 2016-02-08 19:33 GMT+01:00 Saliya Ekanayake <esal...@gmail.com>:
>>>
>>>> Till,
>>>>
>>>> I am still having trouble getting this to work. Here's my code
>>>> (https://github.com/esaliya/flinkit):
>>>>
>>>> String binaryFile = "src/main/resources/sample.bin";
>>>> SerializedInputFormat<ShortValue> sif = new SerializedInputFormat<>();
>>>> sif.setFilePath(binaryFile);
>>>> DataSet<ShortValue> ds = env.createInput(sif);
>>>> System.out.println(ds.count());
>>>>
>>>> I still get the same error, as shown below:
>>>>
>>>> Exception in thread "main" org.apache.flink.api.common.InvalidProgramException: The type returned by the input format could not be automatically determined. Please specify the TypeInformation of the produced type explicitly by using the 'createInput(InputFormat, TypeInformation)' method instead.
>>>>     at org.apache.flink.api.java.ExecutionEnvironment.createInput(ExecutionEnvironment.java:511)
>>>>     at org.saliya.flinkit.WordCount.main(WordCount.java:24)
>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>     at java.lang.reflect.Method.invoke(Method.java:497)
>>>>     at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
>>>>
>>>> On Mon, Feb 8, 2016 at 5:42 AM, Till Rohrmann <trohrm...@apache.org> wrote:
>>>>
>>>>> Hi Saliya,
>>>>>
>>>>> in order to set the file path for the SerializedInputFormat you first
>>>>> have to create it and then explicitly call setFilePath.
>>>>>
>>>>> final SerializedInputFormat<Record> inputFormat = new SerializedInputFormat<Record>();
>>>>> inputFormat.setFilePath(PATH_TO_FILE);
>>>>>
>>>>> env.createInput(inputFormat, myTypeInfo);
>>>>>
>>>>> Cheers,
>>>>> Till
>>>>>
>>>>> On Mon, Feb 8, 2016 at 7:00 AM, Saliya Ekanayake <esal...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I was trying to read a simple binary file using SerializedInputFormat,
>>>>>> as suggested in a different thread, but encountered the following error.
>>>>>> I tried to do what the exception suggests, but even though createInput()
>>>>>> returns a DataSet object, I couldn't find how to specify which file to
>>>>>> read.
>>>>>>
>>>>>> Any help is appreciated. The file I am trying to read is a simple
>>>>>> binary file containing Java short values. Is there any example of
>>>>>> reading binary files available?
>>>>>>
>>>>>> Exception in thread "main" org.apache.flink.api.common.InvalidProgramException: The type returned by the input format could not be automatically determined. Please specify the TypeInformation of the produced type explicitly by using the 'createInput(InputFormat, TypeInformation)' method instead.
>>>>>>
>>>>>> Thank you,
>>>>>> Saliya

--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
Cell 812-391-4914
http://saliya.org
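
For readers following this thread: the EOFException above is consistent with Fabian's explanation that the file holds nothing but raw short values, with none of the block metadata that BinaryInputFormat looks for. Below is a minimal plain-Java sketch of the record-reading logic that a custom format would have to wrap. It deliberately does not use any Flink API. The class name RawShortReader and its methods are hypothetical, and it assumes the file is big-endian (the byte order DataOutputStream writes), which the thread does not confirm for sample.bin.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Flink: reads and writes a headerless file of 16-bit values.
public class RawShortReader {

    // Writes the values as raw big-endian shorts, with no header or block metadata.
    static void writeShorts(Path file, short[] values) throws IOException {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(file))) {
            for (short v : values) {
                out.writeShort(v);
            }
        }
    }

    // Reads the whole file as big-endian shorts. The record count is derived
    // from the file size, so the loop never reads past the end of the file.
    static short[] readShorts(Path file) throws IOException {
        long size = Files.size(file);
        if (size % Short.BYTES != 0) {
            throw new IOException("File length " + size + " is not a multiple of 2 bytes");
        }
        short[] values = new short[(int) (size / Short.BYTES)];
        try (DataInputStream in =
                 new DataInputStream(new BufferedInputStream(Files.newInputStream(file)))) {
            for (int i = 0; i < values.length; i++) {
                values[i] = in.readShort();
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        // Round-trip 64 shorts through a temporary file (assumed layout: 128 bytes, no header).
        Path file = Files.createTempFile("sample", ".bin");
        short[] written = new short[64];
        for (int i = 0; i < written.length; i++) {
            written[i] = (short) (i - 32);
        }
        writeShorts(file, written);
        short[] read = readShorts(file);
        System.out.println(read.length + " shorts read");
        Files.delete(file);
    }
}
```

In a Flink job, the same loop would presumably live inside a FileInputFormat subclass as suggested above, with end-of-input decided from the split length rather than from an exception.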