Hello all,

At Tresata we wrote a library that provides batch integration between Spark and Kafka: distributed writes of RDDs to Kafka, and distributed reads of RDDs from Kafka.

Our main use cases are (in lambda-architecture jargon):
* periodic appends from Kafka to the immutable master dataset on HDFS using Spark
* making non-streaming data available in Kafka via periodic data drops from HDFS using Spark; this is to facilitate merging the speed and batch layers in Spark Streaming
* distributed writes from Spark Streaming
See here: https://github.com/tresata/spark-kafka

Best, Koert