Thanks Gwen. I went with Confluent 2.0 as it has Kafka 0.9, which matches the version in HDP 2.4. I installed confluent-kafka-connect-hdfs and confluent-common and soft-linked a couple of jars into Kafka's libs/.
I was able to start Kafka Connect but kafka.out was showing the following error:

[2016-08-05 20:57:01,187] ERROR Exception emitting metrics (org.apache.hadoop.metrics2.sink.kafka.KafkaTimelineMetricsReporter)
org.apache.hadoop.metrics2.sink.timeline.UnableToConnectException: java.net.ConnectException: Connection refused
        at org.apache.hadoop.metrics2.sink.timeline.AbstractTimelineMetricsSink.emitMetrics(AbstractTimelineMetricsSink.java:87)
        at org.apache.hadoop.metrics2.sink.kafka.KafkaTimelineMetricsReporter.access$200(KafkaTimelineMetricsReporter.java:58)
        at org.apache.hadoop.metrics2.sink.kafka.KafkaTimelineMetricsReporter$TimelineScheduledReporter.report(KafkaTimelineMetricsReporter.java:253)
        at org.apache.hadoop.metrics2.sink.kafka.ScheduledReporter.report(ScheduledReporter.java:185)
        at org.apache.hadoop.metrics2.sink.kafka.ScheduledReporter$1.run(ScheduledReporter.java:137)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Is there any configuration I should also look into?
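For what it's worth, "Connection refused" only says that a TCP connect was rejected by whatever endpoint the metrics reporter is pointed at. A minimal sketch of how one could probe an endpoint from the same machine, assuming Python is available there. The helper name can_connect is made up for illustration, and the log above does not show which host and port the reporter targets, so those would have to be looked up in the broker configuration first.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it is not part of Kafka or HDP.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, can_connect("sandbox.hortonworks.com", 8020) would check the NameNode address used in hdfs.url below. The same call with the metrics endpoint's host and port would show whether anything is listening there at all.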
I started with

./connect-standalone.sh ../config/connect-standalone.properties /etc/kafka-connect-hdfs/quickstart-hdfs.properties

And here is my quickstart-hdfs.properties:

name=hdfs-sink
connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
tasks.max=1
topics=hdfs
hdfs.url=hdfs://sandbox.hortonworks.com:8020
flush.size=3

Thanks,
-B

On Fri, Aug 5, 2016 at 3:31 PM, Gwen Shapira <g...@confluent.io> wrote:

> The installation instructions from Confluent will still work for you :)
>
> If you are using deb/rpm packages, basically add the repositories as
> explained here:
> http://docs.confluent.io/3.0.0/installation.html#rpm-packages-via-yum
>
> and then:
> sudo yum install confluent-kafka-connect-hdfs
> or
> sudo apt-get install confluent-kafka-connect-hdfs
>
> This will put the connector config in /etc/kafka-connect-hdfs and the
> connector jars in /usr/share/java/
>
> You may need to move the jar so it is on the classpath for connect
> (I'm not sure what's the default kafka classpath for HDP).
>
> BTW. We (Confluent) are testing the HDFS connector with HDP (we
> basically install Confluent platform on one machine and HDP on another
> and use Connect to move data) - so this setup should work :)
>
> Gwen
>
>
> On Fri, Aug 5, 2016 at 8:47 AM, Banias H <banias4sp...@gmail.com> wrote:
> > Hi,
> >
> > We are using Hortonworks HDP 2.4 with Apache Kafka 0.9 and we have an
> > in-house solution to pull messages from Kafka to HDFS. I would like to try
> > using kafka-connector-hdfs to push messages to HDFS. As far as I know,
> > Apache Kafka 0.9 doesn't come with kafka-connector-hdfs. What is a solid
> > way to run kafka-connector-hdfs in HDP? I won't be able to install
> > Confluent platform there though... I would appreciate any pointers. Thanks.
> >
> > -B
>
>
>
> --
> Gwen Shapira
> Product Manager | Confluent
> 650.450.2760 | @gwenshap
> Follow us: Twitter | blog