Also very interested in hearing about them. I prefer war stories in the form of Jira issues for the relevant project ;) There's a good chance we can make things less horrible if issues are reported.
Gwen

On Fri, Mar 13, 2015 at 12:48 PM, Andrew Otto <ao...@wikimedia.org> wrote:
>> We are currently using spark streaming 1.2.1 with kafka and write-ahead log.
>> I will only say one thing : "a nightmare". ;-)
>
> I’d be really interested in hearing about your experience here. I’m
> exploring streaming frameworks a bit, and Spark Streaming is just so easy to
> use and set up. I’d be nice if it worked well.
>
>> On Mar 13, 2015, at 15:38, Alberto Miorin <amiorin78+ka...@gmail.com> wrote:
>>
>> We are currently using spark streaming 1.2.1 with kafka and write-ahead log.
>> I will only say one thing : "a nightmare". ;-)
>>
>> Let's see if things are better with 1.3.0 :
>> http://spark.apache.org/docs/1.3.0/streaming-kafka-integration.html
>>
>> On Fri, Mar 13, 2015 at 8:33 PM, William Briggs <wrbri...@gmail.com> wrote:
>>
>>> Spark Streaming also has built-in support for Kafka, and as of Spark 1.2,
>>> it supports using an HDFS write-ahead log to ensure zero data loss while
>>> streaming:
>>> https://databricks.com/blog/2015/01/15/improved-driver-fault-tolerance-and-zero-data-loss-in-spark-streaming.html
>>>
>>> -Will
>>>
>>> On Fri, Mar 13, 2015 at 3:28 PM, Alberto Miorin <amiorin78+ka...@gmail.com> wrote:
>>>
>>>> I'll try this too. It looks very promising.
>>>>
>>>> Thx
>>>>
>>>> On Fri, Mar 13, 2015 at 8:25 PM, Gwen Shapira <gshap...@cloudera.com> wrote:
>>>>
>>>>> There's a KafkaRDD that can be used in Spark:
>>>>> https://github.com/tresata/spark-kafka. It doesn't exactly replace
>>>>> Camus, but should be useful in building Camus-like system in Spark.
>>>>>
>>>>> On Fri, Mar 13, 2015 at 12:15 PM, Alberto Miorin <amiorin78+ka...@gmail.com> wrote:
>>>>>
>>>>>> We use spark on mesos. I don't want to partition our cluster because of one
>>>>>> YARN job (camus).
>>>>>>
>>>>>> Best
>>>>>>
>>>>>> Alberto
>>>>>>
>>>>>> On Fri, Mar 13, 2015 at 7:43 PM, Otis Gospodnetic <otis.gospodne...@gmail.com> wrote:
>>>>>>
>>>>>>> Just curious - why - is Camus not suitable/working?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Otis
>>>>>>> --
>>>>>>> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
>>>>>>> Solr & Elasticsearch Support * http://sematext.com/
>>>>>>>
>>>>>>> On Fri, Mar 13, 2015 at 2:33 PM, Alberto Miorin <amiorin78+ka...@gmail.com> wrote:
>>>>>>>
>>>>>>>> I was wondering if anybody has already tried to mirror a kafka topic to
>>>>>>>> hdfs just copying the log files from the topic directory of the broker
>>>>>>>> (like 00000000000023244237.log).
>>>>>>>>
>>>>>>>> The file format is very simple :
>>>>>>>> https://twitter.com/amiorin/status/576448691139121152/photo/1
>>>>>>>>
>>>>>>>> Implementing an InputFormat should not be so difficult.
>>>>>>>>
>>>>>>>> Any drawbacks?
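
For reference, here is a rough sketch of what reading such a segment file might involve. It is not taken from the thread: it assumes the Kafka 0.8-era on-disk layout (per entry: 8-byte offset, 4-byte size, then CRC32, magic byte, attributes byte, length-prefixed key and value, all big-endian) and handles only uncompressed magic-0 messages. The names `read_segment`, `build_message`, and `base_offset` are made up for illustration; `build_message` exists only to produce sample bytes.

```python
import os
import struct
import zlib


def base_offset(path):
    """Segment files are named after the offset of their first message."""
    return int(os.path.basename(path).split(".")[0])


def build_message(offset, key, value):
    """Encode one uncompressed magic-0 entry (only used to make sample data)."""
    body = struct.pack(">bb", 0, 0)  # magic=0, attributes=0 (no compression)
    for field in (key, value):
        if field is None:
            body += struct.pack(">i", -1)  # -1 length marks a null field
        else:
            body += struct.pack(">i", len(field)) + field
    crc = zlib.crc32(body) & 0xFFFFFFFF  # CRC covers everything after itself
    message = struct.pack(">I", crc) + body
    return struct.pack(">qi", offset, len(message)) + message


def read_segment(data):
    """Yield (offset, key, value) for each complete entry in the given bytes."""
    pos = 0
    while pos + 12 <= len(data):
        offset, size = struct.unpack_from(">qi", data, pos)
        pos += 12
        if size < 14 or pos + size > len(data):
            break  # truncated tail, e.g. a segment that is still being written
        crc, magic, attrs = struct.unpack_from(">Ibb", data, pos)
        if zlib.crc32(data[pos + 4:pos + size]) & 0xFFFFFFFF != crc:
            raise ValueError("CRC mismatch at offset %d" % offset)
        if magic != 0 or attrs & 0x07:
            # Compressed entries wrap a nested message set; not handled here.
            raise ValueError("unsupported message at offset %d" % offset)
        cur = pos + 6
        (klen,) = struct.unpack_from(">i", data, cur)
        cur += 4
        key = None if klen < 0 else data[cur:cur + klen]
        cur += max(klen, 0)
        (vlen,) = struct.unpack_from(">i", data, cur)
        cur += 4
        value = None if vlen < 0 else data[cur:cur + vlen]
        pos += size
        yield offset, key, value


sample = build_message(0, None, b"hello") + build_message(1, b"k", b"world")
for entry in read_segment(sample + b"\x00\x00\x00"):  # trailing partial entry
    print(entry)
```

Even under these assumptions, a copy-the-files approach would still have to deal with compressed message sets, the partially written active segment, and the same partition data being present on several replica brokers.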