Thanks for the link and info, Cody!

Regards,
Aakash


On Tue, Nov 15, 2016 at 7:47 PM, Cody Koeninger <c...@koeninger.org> wrote:

> Generating / defining an RDD is not the same thing as running the
> compute() method of an RDD.  The direct stream definitely runs Kafka
> consumers on the executors.
>
> If you want more info, the blog post and video linked from
> https://github.com/koeninger/kafka-exactly-once refer to the 0.8
> implementation, but the general design is similar for the 0.10
> version.
>
> I think the likelihood of an official release supporting 0.9 is fairly
> slim at this point; it's a year out of date and wouldn't be a drop-in
> dependency change.
>
>
> On Tue, Nov 15, 2016 at 5:50 PM, aakash aakash <email2aak...@gmail.com>
> wrote:
> >
> >
> >> You can use the 0.8 artifact to consume from a 0.9 broker
> >
> > We are currently using "Camus" in production, and one of the main goals of
> > moving to Spark is to use the new Kafka consumer API of Kafka 0.9. In our
> > case we need the security provisions available in 0.9; that is why we
> > cannot use the 0.8 client.
> >
> >> Where are you reading documentation indicating that the direct stream
> >> only runs on the driver?
> >
> > I might be wrong here, but I see that the new Kafka + Spark streaming
> > code extends InputDStream, and its documentation says: "Input streams
> > that can generate RDDs from new data by running a service/thread only on
> > the driver node (that is, without running a receiver on worker nodes)."
> >
> > Thanks and regards,
> > Aakash Pradeep
> >
> >
> > On Tue, Nov 15, 2016 at 2:55 PM, Cody Koeninger <c...@koeninger.org>
> > wrote:
> >>
> >> It'd probably be worth no longer marking the 0.8 interface as
> >> experimental.  I don't think it's likely to be subject to active
> >> development at this point.
> >>
> >> You can use the 0.8 artifact to consume from a 0.9 broker
> >>
> >> Where are you reading documentation indicating that the direct stream
> >> only runs on the driver?  It runs consumers on the worker nodes.
> >>
> >>
> >> On Tue, Nov 15, 2016 at 10:58 AM, aakash aakash <email2aak...@gmail.com>
> >> wrote:
> >> > Re-posting it at dev group.
> >> >
> >> > Thanks and Regards,
> >> > Aakash
> >> >
> >> >
> >> > ---------- Forwarded message ----------
> >> > From: aakash aakash <email2aak...@gmail.com>
> >> > Date: Mon, Nov 14, 2016 at 4:10 PM
> >> > Subject: using Spark Streaming with Kafka 0.9/0.10
> >> > To: user-subscr...@spark.apache.org
> >> >
> >> >
> >> > Hi,
> >> >
> >> > I am planning to use Spark Streaming to consume messages from Kafka
> >> > 0.9. I have a couple of questions regarding this:
> >> >
> >> > 1. I see the APIs are annotated with @Experimental. Can you please
> >> > tell me when they are planned to be made production ready?
> >> > 2. Currently, I see we are using Kafka 0.10, so I am curious why we
> >> > did not start with Kafka 0.9 instead of 0.10. As I see it, the 0.10
> >> > Kafka client would not be compatible with the 0.9 client, since there
> >> > are some changes in the arguments of the consumer API.
> >> > 3. The current API extends InputDStream, and as per the documentation
> >> > that means RDDs will be generated by running a service/thread only on
> >> > the driver node instead of the worker nodes. Can you please explain
> >> > why this is done and what is required to make sure it runs on the
> >> > worker nodes?
> >> >
> >> >
> >> > Thanks in advance !
> >> >
> >> > Regards,
> >> > Aakash
> >> >
> >
> >
>
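For reference, the direct stream discussed above is created along these lines with the 0.10 integration (spark-streaming-kafka-0-10). This is a minimal sketch: the broker address, group id, topic name, and batch interval are illustrative, and `sc` is assumed to be an existing SparkContext. Although the DStream itself is defined on the driver, the Kafka consumers that fetch records run on the executors.

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

// Kafka consumer parameters; bootstrap.servers and group.id are illustrative.
val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "localhost:9092",
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "example-group",
  "auto.offset.reset" -> "latest",
  "enable.auto.commit" -> (false: java.lang.Boolean)
)

val ssc = new StreamingContext(sc, Seconds(5)) // sc: existing SparkContext

// Define the direct stream on the driver; record fetching happens in
// consumers running on the executors.
val stream = KafkaUtils.createDirectStream[String, String](
  ssc,
  PreferConsistent,
  Subscribe[String, String](Seq("example-topic"), kafkaParams)
)

stream.map(record => (record.key, record.value)).print()

ssc.start()
ssc.awaitTermination()
```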
