Currently I have a single Flume agent that converts Apache logs into Avro and writes them to an HDFS sink. I'm looking to build a tiered topology and want the Avro records to be available to other Flume agents, so I tried writing the records through a Kafka channel/sink (a sketch of the setup follows).
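A minimal sketch of the kind of configuration I mean; agent names, the topic, hosts, and the log path are placeholders, not my exact config:

    # Tier-1 agent: reads Apache access logs and publishes events to Kafka
    agent1.sources = apacheSrc
    agent1.channels = memCh
    agent1.sinks = kafkaSink

    agent1.sources.apacheSrc.type = exec
    agent1.sources.apacheSrc.command = tail -F /var/log/apache2/access.log
    agent1.sources.apacheSrc.channels = memCh

    agent1.channels.memCh.type = memory

    agent1.sinks.kafkaSink.type = org.apache.flume.sink.kafka.KafkaSink
    agent1.sinks.kafkaSink.topic = apache-avro
    agent1.sinks.kafkaSink.brokerList = kafka-host:9092
    agent1.sinks.kafkaSink.channel = memCh

    # Tier-2 agent: consumes the same topic with a Kafka source
    agent2.sources = kafkaSrc
    agent2.sources.kafkaSrc.type = org.apache.flume.source.kafka.KafkaSource
    agent2.sources.kafkaSrc.zookeeperConnect = zk-host:2181
    agent2.sources.kafkaSrc.topic = apache-avro
    agent2.sources.kafkaSrc.channels = memCh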
Reading the records back with the Kafka source in the downstream agent fails with:

    Caused by: java.io.IOException: Not a data file.
        at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:105)
        at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)

For a tiered topology, should I instead be using an Avro sink that writes to a host/port which the next agent reads with an Avro source? Or is there another data format I should consider if I want to stick with Kafka as the channel/sink?
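For reference, the Avro-hop alternative I have in mind would look roughly like this (hostname and port are placeholders):

    # Tier-1 agent: forwards events over Avro RPC instead of Kafka
    agent1.sinks.avroSink.type = avro
    agent1.sinks.avroSink.hostname = collector-host
    agent1.sinks.avroSink.port = 4545
    agent1.sinks.avroSink.channel = memCh

    # Tier-2 agent: receives the events with an Avro source
    agent2.sources.avroSrc.type = avro
    agent2.sources.avroSrc.bind = 0.0.0.0
    agent2.sources.avroSrc.port = 4545
    agent2.sources.avroSrc.channels = memCh

Thanks!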