Re: streaming using DeserializationSchema

2016-02-13 Thread Martin Neumann
I ended up not using the DeserializationSchema and instead went with an AvroInputFormat when reading from a file. I would have preferred to keep the code simpler, but the map solution was a lot more complicated, since the raw data I have is in Avro binary format, so I cannot just read it and map i

Re: streaming using DeserializationSchema

2016-02-12 Thread Nick Dimiduk
My input file contains newline-delimited JSON records, one per text line. The records on the Kafka topic are JSON blobs encoded as UTF-8 and written as bytes. On Fri, Feb 12, 2016 at 1:41 PM, Martin Neumann wrote: > I'm trying the same thing now. > > I guess you need to read the file as byte arra
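
A rough sketch of the idea in this message, assuming each line of the dump holds exactly one record: encode every line to UTF-8 bytes so it matches the Kafka payload, then pass it to the same byte-level decoder. The `Function<byte[], T>` here is only a stand-in for a Flink DeserializationSchema, not the Flink API, and `DumpReplay` and `decodeLines` are hypothetical names used for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class DumpReplay {
    // Decode every non-empty dump line with the same byte-level decoder
    // that would be applied to the Kafka message bytes.
    // The decoder is a hypothetical stand-in for a DeserializationSchema.
    static <T> List<T> decodeLines(Iterable<String> lines, Function<byte[], T> decoder) {
        List<T> records = new ArrayList<>();
        for (String line : lines) {
            if (line.isEmpty()) {
                continue; // skip blank lines, e.g. a trailing newline in the dump
            }
            // Re-create the bytes as they would appear on the topic (UTF-8 encoded JSON).
            byte[] message = line.getBytes(StandardCharsets.UTF_8);
            records.add(decoder.apply(message));
        }
        return records;
    }

    public static void main(String[] args) {
        // Toy decoder for the demo: it only turns the bytes back into a String.
        // A real decoder would parse the JSON into a record type.
        Function<byte[], String> decoder = bytes -> new String(bytes, StandardCharsets.UTF_8);
        List<String> dump = List.of("{\"id\":1}", "", "{\"id\":2}");
        for (String record : decodeLines(dump, decoder)) {
            System.out.println(record);
        }
    }
}
```

For a real dump, the lines could come from `java.nio.file.Files.readAllLines(path, StandardCharsets.UTF_8)`. This only works for text formats with one record per line; binary Avro data cannot be split on newlines this way.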

Re: streaming using DeserializationSchema

2016-02-12 Thread Martin Neumann
I'm trying the same thing now. I guess you need to read the file as byte arrays somehow to make it work. What read function did you use? The mapper is not hard to write but the byte array stuff gives me a headache. cheers Martin On Fri, Feb 12, 2016 at 9:12 PM, Nick Dimiduk wrote: > Hi Mart

Re: streaming using DeserializationSchema

2016-02-12 Thread Nick Dimiduk
Hi Martin, I have the same use case. I wanted to be able to load from dumps of data in the same format as is on the Kafka queue. I created a new application main, call it the "job" instead of the "flow". I refactored my code a bit for building the flow so all that can be reused via factory method.
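
The refactoring described here might look roughly like the following. This is a sketch under assumptions, not the code from this thread: plain `java.util.stream.Stream` stands in for a Flink DataStream, and `FlowFactory`, `buildFlow`, and `runJob` are hypothetical names. The point is only that the processing steps live in one factory method, and each main supplies its own source of raw bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class FlowFactory {
    // Shared factory method: builds the processing steps on top of any
    // source of raw message bytes, regardless of where the bytes come from.
    static Stream<String> buildFlow(Stream<byte[]> source) {
        return source
            .map(bytes -> new String(bytes, StandardCharsets.UTF_8)) // decode the payload
            .map(String::trim)                                       // placeholder processing step
            .filter(record -> !record.isEmpty());                    // drop empty records
    }

    // The "job" entry point: feeds the shared flow from dump lines.
    // A "flow" entry point would call buildFlow with bytes read from Kafka instead.
    static List<String> runJob(List<String> dumpLines) {
        Stream<byte[]> source = dumpLines.stream()
            .map(line -> line.getBytes(StandardCharsets.UTF_8));
        return buildFlow(source).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(runJob(List.of(" {\"id\":1} ", "", "{\"id\":2}")));
    }
}
```

Keeping the source outside the factory method means the file-based "job" and the Kafka-based "flow" cannot drift apart in their processing logic.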

Re: streaming using DeserializationSchema

2016-02-12 Thread Martin Neumann
It's not only about testing; I will also need to run things against different datasets. I want to reuse as much of the code as possible to load the same data from a file instead of Kafka. Is there a simple way of loading the data from a file using the same conversion classes that I would use to tra

Re: streaming using DeserializationSchema

2016-02-11 Thread Gyula Fóra
Hey, a very simple thing you could do is to set up a simple Kafka producer in a Java program that will feed the data into a topic. This also has the additional benefit that you are actually testing against Kafka. Cheers, Gyula Martin Neumann wrote (on Fri, Feb 12, 2016, 0:20): > Hi,

streaming using DeserializationSchema

2016-02-11 Thread Martin Neumann
Hi, I have a stream program reading data from Kafka where the data is in Avro. I have my own DeserializationSchema to deal with it. For testing reasons I want to read a dump from HDFS instead. Is there a way to use the same DeserializationSchema to read from an Avro file stored on HDFS? cheers