Thanks for the heads-up, Alberto, that's good to know. We were about to start a few projects working with Spark Streaming + Kafka; sounds like there's still quite a bit of work to be done there.
-Will

On Fri, Mar 13, 2015 at 3:38 PM, Alberto Miorin <amiorin78+ka...@gmail.com> wrote:

> We are currently using spark streaming 1.2.1 with kafka and write-ahead
> log.
> I will only say one thing : "a nightmare". ;-)
>
> Let's see if things are better with 1.3.0 :
> http://spark.apache.org/docs/1.3.0/streaming-kafka-integration.html
>
> On Fri, Mar 13, 2015 at 8:33 PM, William Briggs <wrbri...@gmail.com>
> wrote:
>
>> Spark Streaming also has built-in support for Kafka, and as of Spark 1.2,
>> it supports using an HDFS write-ahead log to ensure zero data loss while
>> streaming:
>> https://databricks.com/blog/2015/01/15/improved-driver-fault-tolerance-and-zero-data-loss-in-spark-streaming.html
>>
>> -Will
>>
>> On Fri, Mar 13, 2015 at 3:28 PM, Alberto Miorin <
>> amiorin78+ka...@gmail.com> wrote:
>>
>>> I'll try this too. It looks very promising.
>>>
>>> Thx
>>>
>>> On Fri, Mar 13, 2015 at 8:25 PM, Gwen Shapira <gshap...@cloudera.com>
>>> wrote:
>>>
>>> > There's a KafkaRDD that can be used in Spark:
>>> > https://github.com/tresata/spark-kafka. It doesn't exactly replace
>>> > Camus, but should be useful in building Camus-like system in Spark.
>>> >
>>> > On Fri, Mar 13, 2015 at 12:15 PM, Alberto Miorin
>>> > <amiorin78+ka...@gmail.com> wrote:
>>> > > We use spark on mesos. I don't want to partition our cluster because of one
>>> > > YARN job (camus).
>>> > >
>>> > > Best
>>> > >
>>> > > Alberto
>>> > >
>>> > > On Fri, Mar 13, 2015 at 7:43 PM, Otis Gospodnetic <
>>> > > otis.gospodne...@gmail.com> wrote:
>>> > >
>>> > >> Just curious - why - is Camus not suitable/working?
>>> > >>
>>> > >> Thanks,
>>> > >> Otis
>>> > >> --
>>> > >> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
>>> > >> Solr & Elasticsearch Support * http://sematext.com/
>>> > >>
>>> > >>
>>> > >> On Fri, Mar 13, 2015 at 2:33 PM, Alberto Miorin <amiorin78+ka...@gmail.com>
>>> > >> wrote:
>>> > >>
>>> > >> > I was wondering if anybody has already tried to mirror a kafka topic to
>>> > >> > hdfs just copying the log files from the topic directory of the broker
>>> > >> > (like 00000000000023244237.log).
>>> > >> >
>>> > >> > The file format is very simple :
>>> > >> > https://twitter.com/amiorin/status/576448691139121152/photo/1
>>> > >> >
>>> > >> > Implementing an InputFormat should not be so difficult.
>>> > >> >
>>> > >> > Any drawbacks?
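
For anyone who wants to try the original idea of reading a copied segment file such as 00000000000023244237.log, here is a rough sketch of what a parser might look like. It assumes the 0.8-era on-disk layout (8-byte offset, 4-byte message size, then CRC, magic byte, attributes byte, length-prefixed key, length-prefixed value). That layout is an assumption based on the Kafka 0.8 message format, not something taken from the linked image, and the names LogEntry, read_entries and encode_entry are made up for illustration. It is not an InputFormat, only the record-decoding part one would need inside one.

```python
import struct
import zlib
from typing import Iterator, NamedTuple, Optional


class LogEntry(NamedTuple):
    """One decoded record (hypothetical type, for illustration only)."""
    offset: int
    key: Optional[bytes]
    value: Optional[bytes]
    crc_ok: bool


def read_entries(data: bytes) -> Iterator[LogEntry]:
    """Yield entries from the raw bytes of a segment file.

    Assumed layout per entry (big-endian):
      offset:int64, size:int32, crc:uint32, magic:int8, attributes:int8,
      key_len:int32 (-1 = null), key, value_len:int32 (-1 = null), value
    """
    pos = 0
    while pos + 12 <= len(data):
        offset, size = struct.unpack_from(">qi", data, pos)
        pos += 12
        # Stop on a truncated or implausibly small trailing entry.
        if size < 14 or pos + size > len(data):
            break
        msg = data[pos:pos + size]
        pos += size

        crc, _magic, _attrs, key_len = struct.unpack_from(">Ibbi", msg, 0)
        p = 10
        key = None
        if key_len >= 0:
            key = msg[p:p + key_len]
            p += key_len
        (value_len,) = struct.unpack_from(">i", msg, p)
        p += 4
        value = None if value_len < 0 else msg[p:p + value_len]

        # The CRC covers everything after the CRC field itself.
        crc_ok = (zlib.crc32(msg[4:]) & 0xFFFFFFFF) == crc
        yield LogEntry(offset, key, value, crc_ok)


def encode_entry(offset: int, key: Optional[bytes], value: Optional[bytes]) -> bytes:
    """Build one entry in the assumed layout (used here only to test the reader)."""
    def field(b: Optional[bytes]) -> bytes:
        return struct.pack(">i", -1) if b is None else struct.pack(">i", len(b)) + b

    body = struct.pack(">bb", 0, 0) + field(key) + field(value)
    msg = struct.pack(">I", zlib.crc32(body) & 0xFFFFFFFF) + body
    return struct.pack(">qi", offset, len(msg)) + msg


if __name__ == "__main__":
    sample = encode_entry(23244237, None, b"hello") + encode_entry(23244238, b"k", b"world")
    for entry in read_entries(sample):
        print(entry)
```

Two caveats with this sketch: it does not unpack compressed message sets (the value would then hold a nested, compressed batch), and it simply stops at a partially written trailing entry, which is what one would expect to see when copying a segment the broker is still appending to.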