Hello. I am new to Kafka.
I am wondering how to read log files with Kafka and have them parsed in Spark. Please correct me if I am wrong: I want to build a pipeline that picks up files dropped into a specific folder (or Hive, or some other storage) and uses each file as the input to a Kafka producer; on the other side, the messages are consumed and passed to Spark for processing.

If that approach is realistic, I tried the following, but it is not working:

./kafka-console-producer.sh --broker-list sandbox.hortonworks.com:6667 --topic kafka-topic1 < test.csv

I am on Hortonworks 2.5 with Kafka 0.10. The same command without the input redirection works for me:

./kafka-console-producer.sh --broker-list sandbox.hortonworks.com:6667 --topic kafka-topic1

Can anyone help? Thank you very much.

*------------------------------------------------*
*Sincerely yours,*
*Raymond*
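For reference, a minimal sketch of the Spark consumer side of the pipeline described above (this is an assumption-laden illustration, not a tested setup: it assumes Spark 2.x Structured Streaming with the Kafka source available, and reuses the broker address and topic name from the commands above):

```python
from pyspark.sql import SparkSession

# Assumption: Spark 2.x with the spark-sql-kafka package on the classpath.
spark = (SparkSession.builder
         .appName("kafka-csv-consumer")
         .getOrCreate())

# Subscribe to the topic the console producer writes to
# (broker and topic taken from the commands in the question).
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "sandbox.hortonworks.com:6667")
       .option("subscribe", "kafka-topic1")
       .load())

# Each Kafka record's value arrives as bytes; cast it to a string so
# each CSV line from test.csv can be parsed downstream.
lines = raw.selectExpr("CAST(value AS STRING) AS line")

# For a first test, just print the consumed lines to the console.
query = (lines.writeStream
         .format("console")
         .outputMode("append")
         .start())
query.awaitTermination()
```

This would typically be launched with something like `spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.11:<spark-version> consumer.py`, with the package version matched to the installed Spark version; it requires a running Spark installation and a reachable Kafka broker, so it is a sketch of the shape of the job rather than something runnable as-is.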