Let me describe my environment. I am currently working on two nodes:
1. A single-node Hadoop cluster (referred to below as Node1)
2. A single-node Kafka cluster (referred to below as Node2)
Node2 has one broker running with a topic (iot.test.stream), plus a command-line producer and a command-line consumer to test the Kafka install. The producer can send messages and the consumer is receiving them.

Node1 (the Hadoop cluster) has the Kafka hadoop-consumer code built. I have edited the /kafka-0.8/contrib/hadoop-consumer/test/test.properties file with the following:

kafka.etl.topic=iot.test.stream
hdfs.default.classpath.dir=/tmp/kafka/lib
hadoop.job.ugi=kafka,hadoop
kafka.server.uri=tcp://idh251-kafka:9095
input=/tmp/kafka/data
output=/tmp/kafka/output
kafka.request.limit=-1
...

I have copied copy-jars.sh to /tmp/kafka/lib (on HDFS). Next, from the /kafka-0.8/contrib/hadoop-consumer folder on Node1, I run:

./run-class.sh kafka.etl.impl.SimpleKafkaETLJob test/test.properties

and get a ClassNotFoundException for the kafka.etl.impl.SimpleKafkaETLJob class. What am I missing? My understanding was that running this script would pull messages for that topic from Node2 into HDFS on Node1. I just want to do an end-to-end test to see that messages coming into Kafka are being stored in HDFS, with the minimal amount of code change required.

Thanks,
Abhi

--
Abhi Basu
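For context on why I suspect a classpath problem: as far as I can tell, run-class.sh (like the other Kafka launch scripts) builds its CLASSPATH by globbing *.jar files under the build output directories, so if the hadoop-consumer contrib jar was never built, the class simply never makes it onto the classpath. Here is a minimal sketch of that classpath-assembly pattern; the directory and jar names are made up for illustration and are not the actual kafka-0.8 layout:

```shell
#!/bin/sh
# Sketch of the classpath assembly I believe run-class.sh performs.
# Hypothetical stand-in directory; the real script globs the kafka-0.8
# build output instead.
base_dir=$(mktemp -d)
mkdir -p "$base_dir/lib"
# Stand-in jars; in the real tree these would come from building contrib.
touch "$base_dir/lib/hadoop-consumer.jar" "$base_dir/lib/kafka-core.jar"

CLASSPATH=""
for jar in "$base_dir"/lib/*.jar; do
  CLASSPATH="$CLASSPATH:$jar"
done
CLASSPATH=${CLASSPATH#:}   # strip the leading colon

# If the contrib jar was never built, nothing matches the glob, the class
# is absent from CLASSPATH, and Java throws ClassNotFoundException.
echo "$CLASSPATH"
```

So my working theory is that I am missing a build step for the contrib module before invoking run-class.sh.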