Last time I checked, Camus didn't support storing data as Parquet, which
is a deal breaker for me. Otherwise it works well for my Kafka topics with
low data volume.
I am currently using Spark Streaming to ingest data, generate semi-real-time
stats, publish them to a dashboard, and dump the full dataset
Because using Spark Streaming looks a lot simpler. What's the
difference between Camus and Kafka Streaming for this case? Where does Camus excel?
Rendy
On Wed, May 6, 2015 at 2:15 PM, Saisai Shao wrote:
Kafka also has a Hadoop consumer API for this kind of job; please refer to
http://kafka.apache.org/081/documentation.html#kafkahadoopconsumerapi
2015-05-06 12:22 GMT+08:00 MrAsanjar . :
Why not try https://github.com/linkedin/camus? Camus is a Kafka-to-HDFS
pipeline.
On Tue, May 5, 2015 at 11:13 PM, Rendy Bambang Junior <
rendy.b.jun...@gmail.com> wrote:
Hi all,
I am planning to load data from Kafka to HDFS. Is it normal to use Spark
Streaming to load data from Kafka to HDFS? What are the concerns with doing this?
There is no processing to be done by Spark; the goal is only to store data
from Kafka in HDFS for storage and for further Spark processing.
Rendy
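
For readers following the thread, here is a rough, standard-library-only sketch of the dump pattern the question describes. It is not Spark or Camus code. It assumes a local directory stands in for HDFS, a plain Python list stands in for a Kafka topic, and batches are cut by record count rather than by time interval. The helper names (batch_dir_name, micro_batches, dump_batches) and the "events" prefix are hypothetical. The directory naming imitates the prefix-TIME_IN_MS[.suffix] convention that Spark Streaming's saveAsTextFiles uses for each batch.

```python
import os
import tempfile
from itertools import islice


def batch_dir_name(prefix, batch_time_ms, suffix=""):
    """Build a per-batch name in the 'prefix-TIME_IN_MS[.suffix]' style."""
    name = f"{prefix}-{batch_time_ms}"
    return f"{name}.{suffix}" if suffix else name


def micro_batches(records, batch_size):
    """Yield lists of at most batch_size records (count-based stand-in
    for a time-based batch interval)."""
    it = iter(records)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def dump_batches(records, out_root, batch_size, start_ms=0, interval_ms=1000):
    """Write every micro-batch to its own directory, one record per line.

    Returns the list of part-file paths that were written.
    """
    written = []
    for i, batch in enumerate(micro_batches(records, batch_size)):
        batch_time_ms = start_ms + i * interval_ms
        batch_dir = os.path.join(
            out_root, batch_dir_name("events", batch_time_ms, "txt")
        )
        os.makedirs(batch_dir, exist_ok=True)
        part_path = os.path.join(batch_dir, "part-00000")
        with open(part_path, "w", encoding="utf-8") as fh:
            fh.write("\n".join(batch) + "\n")
        written.append(part_path)
    return written
```

Calling dump_batches(["a", "b", "c", "d", "e"], tempfile.mkdtemp(), batch_size=2) writes three separate directories. Each batch gets its own output directory, so a short batch interval combined with a low-volume topic tends to leave many small files behind, which is one thing to weigh when using a streaming job purely as a loader.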