Hi Andrew,

1a. In general, Flink can read and write Avro data through the AvroInputFormat and AvroOutputFormat in both the batch and the streaming API. For example, you can write the following:

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
DataStream dataStream = env.createInput(new AvroInputFormat(...));
// do something with dataStream
dataStream.write(new AvroOutputFormat(...));

1b. When reading from Kafka you are expected to define a DeserializationSchema [1]; in your case that is the only thing you need to implement to get your topology running.
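Something along these lines should work (an untested sketch: the exact package and signature of DeserializationSchema may differ slightly between Flink versions, and the class and type names here are just placeholders picked for illustration):

import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.specific.SpecificDatumReader;
import org.apache.avro.specific.SpecificRecord;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.typeutils.TypeExtractor;
import org.apache.flink.streaming.util.serialization.DeserializationSchema;

// Deserializes raw Kafka message bytes into an Avro-generated SpecificRecord type.
public class AvroDeserializationSchema<T extends SpecificRecord> implements DeserializationSchema<T> {

    private final Class<T> avroType;
    private transient SpecificDatumReader<T> reader;

    public AvroDeserializationSchema(Class<T> avroType) {
        this.avroType = avroType;
    }

    @Override
    public T deserialize(byte[] message) {
        try {
            if (reader == null) {
                // The Avro reader is not serializable, so create it lazily on the workers
                reader = new SpecificDatumReader<>(avroType);
            }
            BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(message, null);
            return reader.read(null, decoder);
        } catch (Exception e) {
            throw new RuntimeException("Could not deserialize Avro record", e);
        }
    }

    @Override
    public boolean isEndOfStream(T nextElement) {
        // A Kafka topic is treated as an unbounded stream
        return false;
    }

    @Override
    public TypeInformation<T> getProducedType() {
        return TypeExtractor.getForClass(avroType);
    }
}

You would then pass an instance of this schema to the Kafka consumer constructor together with your topic name and consumer properties, and add it as a source via env.addSource(...).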
2. The Scala API uses the same function names presented in 1a, and accepts the Java input and output format implementations.

[1] https://github.com/apache/flink/blob/master/flink-staging/flink-streaming/flink-streaming-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumer.java#L256

I hope this clarifies the situation.

Best,
Marton

On Mon, Oct 19, 2015 at 4:21 PM, Andrew Whitaker <aawhita...@gmail.com> wrote:
> I'm doing some research on Flink + Avro integration, and I've come across
> "org.apache.flink.api.java.io.AvroInputFormat" as a way to create a stream
> of Avro objects from a file. I had the following questions:
>
> 1. Is this the extent of Flink's integration with Avro? If I wanted to
> read Avro-serialized objects from a Kafka stream, would I have to write
> something to do this or is this functionality already built somewhere?
>
> 2. Is there an analogous InputFormat in Flink's Scala API? If not, what's
> the recommended way to work with Avro objects in Scala using Flink?
>
> Thanks,
>
> --
> Andrew Whitaker
> aawhita...@gmail.com | 540-521-5299 | @andrewwhitaker
>