Re: Kafka ETL for Parquet

2016-08-02 Thread Shikhar Bhushan
Hi Kidong, What specific issues did you run into when trying this out? I think the basic idea would be to depend on the avro-serializer package and implement your custom Converter along the lines of AvroConverter. You only need the deserialization bits (`toConnectData`), and c
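A minimal sketch of the "deserialization only" idea mentioned in this message, not code from the thread. To keep it compilable without Kafka on the classpath, the `Converter` interface below is a hypothetical stand-in that only mirrors the three-method shape of Kafka Connect's `org.apache.kafka.connect.storage.Converter`; the real interface takes a `Schema` and returns a `SchemaAndValue` where this sketch uses `Object`. The class name `DeserializeOnlyConverter` and the UTF-8 decoding are placeholders for illustration; a real converter would decode Avro bytes at that point.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical stand-in for Kafka Connect's Converter interface so the
// sketch compiles on its own. The real interface uses Schema and
// SchemaAndValue instead of Object.
interface Converter {
    void configure(Map<String, ?> configs, boolean isKey);
    byte[] fromConnectData(String topic, Object schema, Object value);
    Object toConnectData(String topic, byte[] value);
}

// Hypothetical sink-side converter: only the deserialization path
// (toConnectData) is implemented.
class DeserializeOnlyConverter implements Converter {
    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
        // A real converter would read its settings here, for example
        // where to look up the Avro schema. Nothing to do in this sketch.
    }

    @Override
    public byte[] fromConnectData(String topic, Object schema, Object value) {
        // Serialization is only needed on the source side, so this
        // sketch leaves it out.
        throw new UnsupportedOperationException(
                "deserialize-only converter: fromConnectData is not implemented");
    }

    @Override
    public Object toConnectData(String topic, byte[] value) {
        if (value == null) {
            return null; // record without a value
        }
        // Placeholder decoding (assumption): treat the bytes as UTF-8 text.
        // A real implementation would decode Avro here.
        return new String(value, StandardCharsets.UTF_8);
    }
}
```

If a sink connector only ever reads from Kafka, leaving `fromConnectData` unsupported like this keeps the converter small; whether that is acceptable depends on how the converter is deployed.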

Re: Kafka ETL for Parquet

2016-08-01 Thread Kidong Lee
Thanks for your interest, Shikhar. Actually, I asked about and discussed this in the thread: https://mail-archives.apache.org/mod_mbox/kafka-users/201607.mbox/%3CCAE1jLMOnYb2ScNweoBdsXRHOxjYLe=ha-6igldntl95abuy...@mail.gmail.com%3E The problem for me was that it was not easy to understand the connec

Re: Kafka ETL for Parquet

2016-08-01 Thread Shikhar Bhushan
Er, mislinked HDFS connector :) https://github.com/confluentinc/kafka-connect-hdfs On Mon, Aug 1, 2016 at 3:39 PM Shikhar Bhushan wrote: > Hi Kidong, > > That's pretty cool! I'm curious what this offers over the Confluent HDFS > connector, though

Re: Kafka ETL for Parquet

2016-08-01 Thread Shikhar Bhushan
Hi Kidong, That's pretty cool! I'm curious what this offers over the Confluent HDFS connector, though. The README mentions not depending on the Schema Registry, and that the schema can be retrieved via the classpath and Consul. This functionality s

Kafka ETL for Parquet

2016-08-01 Thread Kidong Lee
Hi, I have written a simple Kafka ETL which consumes Avro-encoded data from Kafka and saves it as Parquet on HDFS: https://github.com/mykidong/kafka-etl-consumer It is implemented with the Kafka Consumer API and the Parquet Writer API. - Kidong Lee.