Consuming plain JSON is a bit tricky for something like HDFS because all the output formats expect the data to have a schema. You can read the JSON data with the provided JsonConverter, but it'll be returned without a schema. The HDFS connector will currently fail on this because it expects a fixed structure.
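For illustration, here is a minimal sketch (not taken from the connector code) of the difference between a schemaless record and one that carries its schema inline. It assumes the `{"schema": ..., "payload": ...}` envelope that JsonConverter reads when `schemas.enable=true`; the `wrap_with_schema` helper and the example field names are hypothetical.

```python
import json


def wrap_with_schema(record, fields):
    """Wrap a plain JSON object in a schema/payload envelope.

    `fields` maps field name -> type name (e.g. "string", "int64").
    Hypothetical helper: the caller is assumed to know the schema up front.
    """
    schema = {
        "type": "struct",
        "optional": False,
        "fields": [
            {"field": name, "type": type_name, "optional": False}
            for name, type_name in fields.items()
        ],
    }
    return {"schema": schema, "payload": record}


if __name__ == "__main__":
    plain = {"id": 1, "name": "example"}  # schemaless: structure is implicit
    enveloped = wrap_with_schema(plain, {"id": "int64", "name": "string"})
    print(json.dumps(enveloped, indent=2))
```

If the producer already knows the schema, writing records in a form like this is one way the data could reach the connector with a fixed structure attached.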
Note however that it *does not* depend on the data already being in Avro format. Kafka Connect is specifically designed to abstract away the serialization format of data in Kafka so that connectors don't need to be written a half-dozen times to support different formats.

There are a couple of possibilities to allow the HDFS connector to handle schemaless (i.e. JSON-like) data. One possibility is to infer the schema automatically based on the incoming data. If you can make guarantees about the compatibility of the data, this could work with the existing connector code. Alternatively, an option could be added to handle this type of data and force file rotation if a new schema was encountered. The risk with this is that if you have data interleaved with different schemas (as might happen as you transition an app to a new format) and no easy way to project between them, you'll have a lot of small HDFS files for a while.

Dealing with schemaless data will be tricky for connectors like HDFS, but is definitely possible. But it's worth thinking through the right way to handle that data with a minimum of additional configuration options required.

-Ewen

On Wed, Feb 17, 2016 at 11:14 AM, Venkatesh Rudraraju <venkatengineer...@gmail.com> wrote:

> Hi,
>
> I tried using the HDFS connector sink with kafka-connect and works as
> described->
> http://docs.confluent.io/2.0.0/connect/connect-hdfs/docs/index.html
>
> My Scenario :
>
> I have plain Json data in a kafka topic. Can I still use HDFS connector
> sink to read data from kafka-topic and write to HDFS in avro format ?
>
> As I read from the documentation, HDFS connector expects data in kafka
> already in avro format? Is there a workaround where I can consume plain
> Json and write to HDFS in avro ? Say I have a schema for the plain json
> data.
>
> Thanks,
> Venkatesh
>

--
Thanks,
Ewen
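The schema-inference and forced-rotation ideas discussed in the reply can be sketched roughly as follows. This is only an illustration under stated assumptions, not how the HDFS connector works: `infer_schema` and `assign_to_files` are hypothetical names, the type labels are generic rather than Connect or Avro type names, and arrays are assumed to be homogeneous.

```python
import json


def infer_schema(value):
    """Infer a simple structural schema from a decoded JSON value."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if value is None:
        return "null"
    if isinstance(value, list):
        # Assumption: arrays are homogeneous, so the first element decides.
        return {"type": "array", "items": infer_schema(value[0]) if value else "null"}
    if isinstance(value, dict):
        return {"type": "struct",
                "fields": {k: infer_schema(v) for k, v in value.items()}}
    raise TypeError(f"unsupported JSON value: {value!r}")


def assign_to_files(raw_records):
    """Group records into files, rotating whenever the inferred schema changes."""
    files = []
    current_schema = None
    for raw in raw_records:
        record = json.loads(raw)
        schema = infer_schema(record)
        if schema != current_schema:  # new schema encountered: start a new file
            files.append({"schema": schema, "records": []})
            current_schema = schema
        files[-1]["records"].append(record)
    return files


if __name__ == "__main__":
    # Old and new record formats interleaved, as during an app migration.
    interleaved = ['{"id": 1}', '{"id": 2, "name": "a"}', '{"id": 3}']
    print(len(assign_to_files(interleaved)))  # prints 3
```

Three interleaved records end up in three separate files here, which is the small-files risk described in the reply; projecting between compatible schemas would be needed to avoid it.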