The SerializedInputFormat extends the BinaryInputFormat, which expects a special block-wise encoding and certain metadata fields. It is not suited to reading arbitrary binary files, such as a file with 64 short values. I suggest implementing a custom input format based on FileInputFormat.
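To illustrate what such a format has to do with the raw bytes, here is a minimal sketch of just the decoding step in plain Java, without any Flink dependency. The class name RawShortDecoder and its decode method are hypothetical, and the sketch assumes the file holds nothing but big-endian 16-bit values (the byte order written by DataOutputStream.writeShort); the thread does not say how sample.bin was produced. A custom format based on FileInputFormat would apply the same decoding to the bytes of its input split instead of to the whole file.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: decodes a headerless file of raw 16-bit values.
final class RawShortDecoder {

    // Interprets the bytes as consecutive big-endian shorts (assumed byte order).
    static short[] decode(byte[] bytes) {
        if (bytes.length % 2 != 0) {
            throw new IllegalArgumentException(
                "Expected an even number of bytes, got " + bytes.length);
        }
        short[] values = new short[bytes.length / 2];
        ByteBuffer.wrap(bytes)
                  .order(ByteOrder.BIG_ENDIAN)
                  .asShortBuffer()
                  .get(values);
        return values;
    }

    public static void main(String[] args) throws IOException {
        // Write 64 raw shorts to a temporary file, with no header or block metadata.
        Path file = Files.createTempFile("sample", ".bin");
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(file))) {
            for (int i = 0; i < 64; i++) {
                out.writeShort(i);
            }
        }

        // Read the file back and decode it.
        short[] values = decode(Files.readAllBytes(file));
        System.out.println(values.length + " values, last = " + values[values.length - 1]);

        Files.delete(file);
    }
}
```

Such a file is exactly 128 bytes long and carries no block metadata of the kind BinaryInputFormat expects, which is why a plain decoder like this is enough for it.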
Best, Fabian

2016-02-08 22:05 GMT+01:00 Saliya Ekanayake <esal...@gmail.com>:

> Thank you, Fabian. It solved the compilation error, but at runtime I get
> an end-of-file exception. I've put up sample code with data at GitHub
> https://github.com/esaliya/flinkit. The data file is a binary file
> containing 64 Short values.
>
> 02/08/2016 16:01:19 CHAIN DataSource (at main(WordCount.java:25)
> (org.apache.flink.api.common.io.SerializedInputFormat)) -> FlatMap
> (count())(4/8) switched to FAILED
> java.io.EOFException
>     at java.io.DataInputStream.readShort(DataInputStream.java:315)
>     at org.apache.flink.core.memory.InputViewDataInputStreamWrapper.readShort(InputViewDataInputStreamWrapper.java:92)
>     at org.apache.flink.types.ShortValue.read(ShortValue.java:88)
>     at org.apache.flink.api.common.io.SerializedInputFormat.deserialize(SerializedInputFormat.java:37)
>     at org.apache.flink.api.common.io.SerializedInputFormat.deserialize(SerializedInputFormat.java:31)
>     at org.apache.flink.api.common.io.BinaryInputFormat.nextRecord(BinaryInputFormat.java:274)
>     at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:169)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:584)
>     at java.lang.Thread.run(Thread.java:745)
>
> On Mon, Feb 8, 2016 at 3:50 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>
>> Hi,
>>
>> please try to replace
>> DataSet<ShortValue> ds = env.createInput(sif);
>> by
>> DataSet<ShortValue> ds = env.createInput(sif, ValueTypeInfo.SHORT_VALUE_TYPE_INFO);
>>
>> Best, Fabian
>>
>> 2016-02-08 19:33 GMT+01:00 Saliya Ekanayake <esal...@gmail.com>:
>>
>>> Till,
>>>
>>> I am still having trouble getting this to work. Here's my code
>>> (https://github.com/esaliya/flinkit)
>>>
>>> String binaryFile = "src/main/resources/sample.bin";
>>> SerializedInputFormat<ShortValue> sif = new SerializedInputFormat<>();
>>> sif.setFilePath(binaryFile);
>>> DataSet<ShortValue> ds = env.createInput(sif);
>>> System.out.println(ds.count());
>>>
>>> I still get the same error as shown below
>>>
>>> Exception in thread "main"
>>> org.apache.flink.api.common.InvalidProgramException: The type returned by
>>> the input format could not be automatically determined. Please specify the
>>> TypeInformation of the produced type explicitly by using the
>>> 'createInput(InputFormat, TypeInformation)' method instead.
>>>     at org.apache.flink.api.java.ExecutionEnvironment.createInput(ExecutionEnvironment.java:511)
>>>     at org.saliya.flinkit.WordCount.main(WordCount.java:24)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>     at java.lang.reflect.Method.invoke(Method.java:497)
>>>     at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
>>>
>>> On Mon, Feb 8, 2016 at 5:42 AM, Till Rohrmann <trohrm...@apache.org>
>>> wrote:
>>>
>>>> Hi Saliya,
>>>>
>>>> in order to set the file path for the SerializedInputFormat you first
>>>> have to create it and then explicitly call setFilePath.
>>>>
>>>> final SerializedInputFormat<Record> inputFormat = new SerializedInputFormat<Record>();
>>>> inputFormat.setFilePath(PATH_TO_FILE);
>>>>
>>>> env.createInput(inputFormat, myTypeInfo);
>>>>
>>>> Cheers,
>>>> Till
>>>>
>>>> On Mon, Feb 8, 2016 at 7:00 AM, Saliya Ekanayake <esal...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I was trying to read a simple binary file using SerializedInputFormat
>>>>> as suggested in a different thread, but encountered the following error.
>>>>> I tried to do what the exception suggests, but even though createInput()
>>>>> returns a DataSet object I couldn't find how to specify which file to
>>>>> read.
>>>>>
>>>>> Any help is appreciated. The file I am trying to read is a simple
>>>>> binary file containing Java short values. Is there any example on
>>>>> reading binary files available?
>>>>>
>>>>> Exception in thread "main"
>>>>> org.apache.flink.api.common.InvalidProgramException: The type returned by
>>>>> the input format could not be automatically determined. Please specify the
>>>>> TypeInformation of the produced type explicitly by using the
>>>>> 'createInput(InputFormat, TypeInformation)' method instead.
>>>>>
>>>>> Thank you,
>>>>> Saliya
>>>>>
>>>>> --
>>>>> Saliya Ekanayake
>>>>> Ph.D. Candidate | Research Assistant
>>>>> School of Informatics and Computing | Digital Science Center
>>>>> Indiana University, Bloomington
>>>>> Cell 812-391-4914
>>>>> http://saliya.org