What is the best way to write Kafka data into HDFS?

2016-02-10 Thread R P
hen your data is partitioned and some partitions generate sporadic data. What are some best practices and options to write data from Kafka to HDFS? Thanks, R P

Re: What is the best way to write Kafka data into HDFS?

2016-02-11 Thread R P
Hello Steve, Thanks for the suggestion. It looks like this Git repo has not been updated in more than 10 months. Is this project still supported? Where can I find current usage and performance metrics? Thanks, R P From: steve.mo...@gmail.com on behalf of

Re: What is the best way to write Kafka data into HDFS?

2016-02-11 Thread R P
andle open until the file is committed? (Flume keeps file handles open, resulting in too many open files.) Can I write a custom serializer for kafka-connect? Thanks, R P From: Jay Kreps Sent: Thursday, February 11, 2016 11:45 AM To: users@kafka.apache.org Su

Question regarding compression of topics in Kafka

2016-03-18 Thread R P
disk are not getting compressed. The size of the data stored on disk is the same with or without compression. I am using the following configuration properties in the server.properties config file: compression.type=gzip and compressed.topics="gzip-topic". Thanks for reading; I appreciate any responses. Thanks, R P

Re: Question regarding compression of topics in Kafka

2016-03-19 Thread R P
ctor Kafka instance on Mac OS. I didn't see any difference in the size of the data stored on disk. In both cases, the data stored on disk in the log files was the same size as the data sent to Kafka. How do I verify that compression is being used and that data stored on disk takes less space due to compressi
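
One rough way to approach the verification question above, sketched under assumptions: sum the sizes of the `.log` segment files in the topic's partition directories and compare the total with the number of bytes the producer sent. The function name `segment_bytes` is hypothetical, and the sketch assumes the broker's log directory (whatever `log.dirs` points to) is readable locally and uses the usual `<topic>-<partition>` directory layout.

```python
import os

def segment_bytes(log_dir, topic):
    """Sum the sizes of *.log segment files in every partition directory of `topic`.

    Assumes the on-disk layout <log_dir>/<topic>-<partition>/<offset>.log.
    """
    total = 0
    prefix = topic + "-"
    for entry in os.listdir(log_dir):
        path = os.path.join(log_dir, entry)
        # Partition directories are named "<topic>-<partition number>".
        if not (os.path.isdir(path) and entry.startswith(prefix)):
            continue
        if not entry[len(prefix):].isdigit():
            continue  # skip other topics that merely share the prefix
        for name in os.listdir(path):
            # Count only segment data files, not index files.
            if name.endswith(".log"):
                total += os.path.getsize(os.path.join(path, name))
    return total
```

For example, `segment_bytes("/tmp/kafka-logs", "gzip-topic")` (the path is a placeholder) could be run after sending the same messages once with compression off and once with it on. Payloads that are already random or compressed may show little difference either way.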

Re: Question regarding compression of topics in Kafka

2016-03-19 Thread R P
, - R P From: Ben Stopford Sent: Friday, March 18, 2016 9:21 AM To: users@kafka.apache.org Subject: Re: Question regarding compression of topics in Kafka Assuming you’re using the new producer (org.apache.kafka.clients.producer) the property is called