Currently I have a single Flume agent that converts Apache logs into Avro
records and writes them to an HDFS sink. I'm looking for ways to create a
tiered topology and want to make the Avro records available to other Flume
agents. I used a Kafka channel/sink to write these Avro records, but was
running into this error when using the Kafka source to read the records:

 Caused by: java.io.IOException: Not a data file.
    at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:105)
    at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
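
For reference, this is roughly my setup (agent, channel, topic, and host
names are placeholders; property names follow the Flume 1.7 Kafka
sink/source):

    # tier 1: Kafka sink publishing the Avro event bodies to a topic
    tier1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
    tier1.sinks.k1.channel = c1
    tier1.sinks.k1.kafka.bootstrap.servers = kafka01:9092
    tier1.sinks.k1.kafka.topic = apache-logs-avro

    # tier 2: Kafka source consuming the same topic
    tier2.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
    tier2.sources.r1.channels = c1
    tier2.sources.r1.kafka.bootstrap.servers = kafka01:9092
    tier2.sources.r1.kafka.topics = apache-logs-avro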

For a tiered topology, should I be using an Avro sink that writes to a
host/port for the other Flume agent to read using an Avro source (sketched
below)? Or is there any other data format I should consider if I want to
stick with Kafka as the channel/sink?

Thanks!
