Is it OK to use ProtoBuf for sending messages to Kafka? I do not see
anyone using it.

Please direct me to some code samples showing how to use it in Spark
Structured Streaming.

Thanks again.


On Sat, Nov 12, 2016 at 11:44 PM, shyla deshpande <deshpandesh...@gmail.com>
wrote:

> Thanks everyone. Very good discussion.
>
> Thanks, Jacek, for the code snippet. I downloaded your Mastering Apache
> Spark PDF. I love it.
>
> I have one more question,
>
>
> On Sat, Nov 12, 2016 at 2:21 PM, Sean McKibben <grap...@graphex.com>
> wrote:
>
>> I think one of the advantages of using Akka Streams within Spark is the
>> fact that it is a general-purpose stream processing toolset with
>> backpressure, not necessarily specific to Kafka. If things work out with
>> the approach, Spark could be of great benefit as a coordination
>> framework for discrete streams processed on each executor. I've been toying
>> with the idea of making essentially an RDD of task messages, where each
>> task becomes an Akka stream; these are materialized on multiple executors
>> and each is completed as that executor's 'task', allowing Spark to coordinate
>> the completion of the entire job. For example, I might make an RDD which is
>> just a set of URLs that I want to download and produce to Kafka, but let's
>> say I have so many URLs that I need to coordinate that work across many
>> servers. Using Spark with a foreachPartition block, I might set up an
>> Akka stream to accomplish that task in a backpressured, stream-oriented
>> way, so that I could have the entire Spark job complete when all of the
>> URLs had been produced to Kafka, using individual Akka Streams within each
>> executor.
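
The pattern described above can be sketched without Spark or Akka. What
follows is only an illustrative stand-in in plain Python, not Sean's
implementation: split_into_partitions plays the role of the RDD's
partitioning, process_partition plays the role of the body of a
foreachPartition block, and a bounded queue stands in for the backpressure
that Akka Streams would provide. The download step and the produce callback
are hypothetical placeholders; nothing here touches the network or Kafka.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(items, num_partitions):
    """Round-robin split, standing in for how an RDD is divided into partitions."""
    return [items[i::num_partitions] for i in range(num_partitions)]


def process_partition(urls, produce, buffer_size=2):
    """Run one bounded pipeline over a partition; return how many items were produced.

    The bounded queue is the backpressure: put() blocks the download stage
    whenever the queue already holds buffer_size unconsumed items.
    """
    buffer = queue.Queue(maxsize=buffer_size)
    done = object()  # sentinel marking the end of this partition's stream
    count = 0

    def download_stage():
        for url in urls:
            payload = f"body-of:{url}"  # hypothetical download; no network I/O
            buffer.put((url, payload))  # blocks while the buffer is full
        buffer.put(done)

    downloader = threading.Thread(target=download_stage)
    downloader.start()
    while True:
        item = buffer.get()
        if item is done:
            break
        produce(*item)  # hypothetical stand-in for a Kafka producer send
        count += 1
    downloader.join()
    return count


def run_job(urls, num_partitions, produce):
    """Process all partitions in parallel; return per-partition counts once all finish."""
    partitions = split_into_partitions(urls, num_partitions)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        return list(pool.map(lambda part: process_partition(part, produce), partitions))
```

Calling run_job(urls, 3, produce) with any two-argument callable returns one
count per partition once every partition has drained, which mirrors the idea
of the Spark job completing only when all of the URLs have been produced.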
>>
>> I realize that this is not the original question on this thread, and I
>> don't mean to hijack that. I am also interested in the potential of Akka
>> Stream sources for a Spark Streaming job directly, which could potentially
>> be adapted for both Kafka and non-Kafka use cases, with the emphasis for me
>> being on use cases which aren't necessarily Kafka-specific. There are some
>> portions which feel like a bit of a mismatch, but with Structured Streaming,
>> I think there is greater opportunity for some kind of symbiotic adapter
>> layer on the input side of things. I think the Apache Gearpump
>> <https://gearpump.apache.org/overview.html> project in incubation may
>> demonstrate how this adaptation can be approached, and the nascent Alpakka
>> project <https://github.com/akka/alpakka> is an example of the generic
>> applications of Akka Streams.
>>
>> It is important to note that Akka Streams are billed as a toolbox and not
>> a framework, because they don't handle coordination of parallelism or
>> multi-host concurrency. I think Spark could end up being a very convenient
>> framework to handle this aspect of a distributed application's
>> architecture. It may be able to do some of this without any modification to
>> either of these projects, but I haven't had the experience of actually
>> attempting the implementation yet.
>>
>>
>> On Nov 12, 2016, at 9:42 AM, Jacek Laskowski <ja...@japila.pl> wrote:
>>
>> Hi Luciano,
>>
>> Mind sharing why you would want a Structured Streaming source/sink for Akka
>> if Kafka's available and Akka Streams has a Kafka module? #curious
>>
>> Regards,
>> Jacek Laskowski
>> ----
>> https://medium.com/@jaceklaskowski/
>> Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
>> Follow me at https://twitter.com/jaceklaskowski
>>
>>
>> On Sat, Nov 12, 2016 at 4:07 PM, Luciano Resende <luckbr1...@gmail.com>
>> wrote:
>>
>> If you are interested in Akka streaming, it is being maintained in Apache
>> Bahir. For Akka there isn't a Structured Streaming version yet, but we
>> would be interested in collaborating on the Structured Streaming version
>> for sure.
>>
>> On Thu, Nov 10, 2016 at 8:46 AM shyla deshpande <deshpandesh...@gmail.com>
>> wrote:
>>
>>
>> I am using Spark 2.0.1. I wanted to build a data pipeline using Kafka,
>> Spark Streaming and Cassandra using Structured Streaming. But the Kafka
>> source support for Structured Streaming is not yet available. So now I am
>> trying to use Akka Streams as the source for Spark Streaming.
>>
>> I want to make sure I am heading in the right direction. Please direct me
>> to any sample code and reading material for this.
>>
>> Thanks
>>
>
