Is it OK to use ProtoBuf for sending messages to Kafka? I do not see anyone using it.
Please direct me to some code samples of how to use it in Spark Structured Streaming. Thanks again.

On Sat, Nov 12, 2016 at 11:44 PM, shyla deshpande <deshpandesh...@gmail.com> wrote:

> Thanks everyone. Very good discussion.
>
> Thanks Jacek, for the code snippet. I downloaded your Mastering Apache
> Spark pdf. I love it.
>
> I have one more question,
>
>
> On Sat, Nov 12, 2016 at 2:21 PM, Sean McKibben <grap...@graphex.com>
> wrote:
>
>> I think one of the advantages of using akka-streams within Spark is the
>> fact that it is a general purpose stream processing toolset with
>> backpressure, not necessarily specific to Kafka. If things work out with
>> the approach, Spark could be a great benefit to use as a coordination
>> framework for discrete streams processed on each executor. I've been toying
>> with the idea of making essentially an RDD of task messages, where each
>> task becomes an akka stream which is materialized on one of multiple
>> executors and completed as that executor's 'task', allowing Spark to
>> coordinate the completion of the entire job. For example, I might make an
>> RDD which is just a set of URLs that I want to download and produce to
>> Kafka, but let's say I have so many URLs that I need to coordinate that
>> work across many servers. Using Spark with a foreachPartition block, I
>> might set up an akka-stream to accomplish that task in a backpressured,
>> stream-oriented way, so that I could have the entire Spark job complete
>> when all of the URLs had been produced to Kafka, using individual Akka
>> Streams within each executor.
>>
>> I realize that this is not the original question on this thread, and I
>> don't mean to hijack that. I am also interested in the potential of Akka
>> Stream sources for a Spark Streaming job directly, which could potentially
>> be adapted for both Kafka and non-Kafka use cases, with the emphasis for me
>> being on use cases which aren't necessarily Kafka specific. There are some
>> portions which feel like a bit of a mismatch, but with Structured Streams,
>> I think there is greater opportunity for some kind of symbiotic adapter
>> layer on the input side of things. I think the Apache Gearpump
>> <https://gearpump.apache.org/overview.html> project in incubation may
>> demonstrate how this adaptation can be approached, and the nascent Alpakka
>> project <https://github.com/akka/alpakka> is an example of the generic
>> applications of Akka Streams.
>>
>> It is important to note that Akka Streams are billed as a toolbox and not
>> a framework, because they don't handle coordination of parallelism or
>> multi-host concurrency. I think Spark could end up being a very convenient
>> framework to handle this aspect of a distributed application's
>> architecture. It may be able to do some of this without any modification to
>> either of these projects, but I haven't had the experience of actually
>> attempting the implementation yet.
>>
>>
>> On Nov 12, 2016, at 9:42 AM, Jacek Laskowski <ja...@japila.pl> wrote:
>>
>> Hi Luciano,
>>
>> Mind sharing why to have a structured streaming source/sink for Akka
>> if Kafka's available and Akka Streams has a Kafka module? #curious
>>
>> Pozdrawiam,
>> Jacek Laskowski
>> ----
>> https://medium.com/@jaceklaskowski/
>> Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
>> Follow me at https://twitter.com/jaceklaskowski
>>
>>
>> On Sat, Nov 12, 2016 at 4:07 PM, Luciano Resende <luckbr1...@gmail.com>
>> wrote:
>>
>> If you are interested in Akka streaming, it is being maintained in Apache
>> Bahir. For Akka there isn't a structured streaming version yet, but we
>> would be interested in collaborating in the structured streaming version
>> for sure.
>>
>> On Thu, Nov 10, 2016 at 8:46 AM shyla deshpande <deshpandesh...@gmail.com>
>> wrote:
>>
>>
>> I am using Spark 2.0.1. I wanted to build a data pipeline using Kafka,
>> Spark Streaming and Cassandra using Structured Streaming. But the Kafka
>> source support for Structured Streaming is not yet available. So now I am
>> trying to use Akka Stream as the source to Spark Streaming.
>>
>> Want to make sure I am heading in the right direction. Please direct me to
>> any sample code and reading material for this.
>>
>> Thanks
>>
>> --
>> Sent from my Mobile device
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org