Spark Streaming also has built-in support for Kafka, and as of Spark 1.2,
it supports using an HDFS write-ahead log to ensure zero data loss while
streaming:
https://databricks.com/blog/2015/01/15/improved-driver-fault-tolerance-and-zero-data-loss-in-spark-streaming.html

-Will

On Fri, Mar 13, 2015 at 3:28 PM, Alberto Miorin <amiorin78+ka...@gmail.com>
wrote:

> I'll try this too. It looks very promising.
>
> Thx
>
> On Fri, Mar 13, 2015 at 8:25 PM, Gwen Shapira <gshap...@cloudera.com>
> wrote:
>
> > There's a KafkaRDD that can be used in Spark:
> > https://github.com/tresata/spark-kafka
> > It doesn't exactly replace Camus, but it should be useful for building a
> > Camus-like system in Spark.
> >
> > On Fri, Mar 13, 2015 at 12:15 PM, Alberto Miorin
> > <amiorin78+ka...@gmail.com> wrote:
> > > We use Spark on Mesos. I don't want to partition our cluster because of
> > > one YARN job (Camus).
> > >
> > > Best
> > >
> > > Alberto
> > >
> > > On Fri, Mar 13, 2015 at 7:43 PM, Otis Gospodnetic
> > > <otis.gospodne...@gmail.com> wrote:
> > >
> > >> Just curious - why - is Camus not suitable/working?
> > >>
> > >> Thanks,
> > >> Otis
> > >> --
> > >> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> > >> Solr & Elasticsearch Support * http://sematext.com/
> > >>
> > >>
> > >> On Fri, Mar 13, 2015 at 2:33 PM, Alberto Miorin
> > >> <amiorin78+ka...@gmail.com> wrote:
> > >>
> > >> > I was wondering if anybody has already tried to mirror a Kafka topic to
> > >> > HDFS by just copying the log files from the topic directory of the
> > >> > broker (like 00000000000023244237.log).
> > >> >
> > >> > The file format is very simple:
> > >> > https://twitter.com/amiorin/status/576448691139121152/photo/1
> > >> >
> > >> > Implementing an InputFormat should not be so difficult.
> > >> >
> > >> > Any drawbacks?
> > >> >
> > >>
> >
>
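
For the file-copy idea in the original question, here is a rough sketch of what reading one of those .log segment files could look like. It assumes the Kafka 0.8 on-disk layout as I understand it: each entry is an 8-byte offset and a 4-byte size, followed by a message made of CRC32, magic byte, attributes byte, length-prefixed key and length-prefixed value. Please verify this against your broker version before relying on it. The function names are made up for illustration, `encode_message` exists only to produce sample bytes, and compressed message sets (where the value wraps a nested set) are not handled.

```python
import struct
import zlib


def encode_message(offset, key, value):
    """Build one log entry in the assumed v0 layout (sample data only)."""
    body = struct.pack(">bb", 0, 0)  # magic=0, attributes=0 (no compression)
    for field in (key, value):
        if field is None:
            body += struct.pack(">i", -1)  # a null field is stored as length -1
        else:
            body += struct.pack(">i", len(field)) + field
    crc = zlib.crc32(body) & 0xFFFFFFFF  # CRC covers everything after the CRC field
    message = struct.pack(">I", crc) + body
    return struct.pack(">qi", offset, len(message)) + message


def read_messages(data):
    """Yield (offset, key, value) for each complete entry in the segment bytes."""
    pos = 0
    while pos + 12 <= len(data):
        offset, size = struct.unpack_from(">qi", data, pos)
        pos += 12
        if size < 14 or pos + size > len(data):
            break  # truncated or partially written trailing entry
        message = data[pos:pos + size]
        pos += size
        (crc,) = struct.unpack_from(">I", message, 0)
        if zlib.crc32(message[4:]) & 0xFFFFFFFF != crc:
            raise ValueError("CRC mismatch at offset %d" % offset)
        cursor = 6  # skip CRC (4 bytes), magic (1 byte), attributes (1 byte)
        fields = []
        for _ in range(2):  # key first, then value
            (length,) = struct.unpack_from(">i", message, cursor)
            cursor += 4
            if length < 0:
                fields.append(None)
            else:
                fields.append(message[cursor:cursor + length])
                cursor += length
        yield offset, fields[0], fields[1]
```

To try it on a real file, something like `read_messages(open("00000000000023244237.log", "rb").read())` should work for a small segment. A Hadoop InputFormat would need the same per-entry loop plus a way to split on entry boundaries.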
