Suppose I use rdd.joinWithCassandraTable("keySpace", "table1"). Does this do a
full table scan? That is something we must avoid at any cost.
On Tue, Sep 22, 2015 at 3:03 PM, Artem Aliev wrote:
> All that code should look like:
> stream.filter(...).map(x => (key, ...)).joinWithCassandraTable(...).map(...).saveToCassandra(...)
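A minimal sketch of that pipeline, assuming a DStream[String] of Kafka payloads; extractKey, the partition key, and the counter column are placeholders, not names from this thread. joinWithCassandraTable looks up only the keys present in the RDD, which is also why it does not trigger the full table scan asked about above:

import com.datastax.spark.connector._
import org.apache.spark.streaming.dstream.DStream

// Hypothetical key extractor; replace with your own parsing.
def extractKey(msg: String): String = msg.split(",").head

def process(stream: DStream[String]): Unit =
  stream
    .filter(_.nonEmpty)
    .map(msg => Tuple1(extractKey(msg)))   // key by the table's partition key
    .foreachRDD { rdd =>
      rdd
        // Fetches only the rows matching the keys in this RDD, not the whole table.
        .joinWithCassandraTable("keySpace", "table1")
        .map { case (Tuple1(key), row) => (key, row.getInt("counter") + 1) }
        .saveToCassandra("keySpace", "table1", SomeColumns("key", "counter"))
    }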
I have a scenario like this:
I read a DStream of messages from Kafka. Now if my RDD contains 10 messages,
for each message I need to query Cassandra, do some modification, and
update the records in the DB. If there is no option of passing the SparkContext
to workers to read/write the DB, the only opti
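One way to do that per-message read-modify-write without putting the SparkContext on the workers is CassandraConnector, which is serializable and can open a session per partition. A hedged sketch; the keyspace, table, columns, and queries below are placeholders:

import com.datastax.spark.connector.cql.CassandraConnector
import org.apache.spark.SparkConf
import org.apache.spark.streaming.dstream.DStream

def updateRecords(conf: SparkConf, stream: DStream[String]): Unit = {
  val connector = CassandraConnector(conf)  // serializable, safe to ship to workers
  stream.foreachRDD { rdd =>
    rdd.foreachPartition { messages =>
      connector.withSessionDo { session =>  // one session per partition, not per message
        messages.foreach { msg =>
          val row = session.execute(
            "SELECT counter FROM keySpace.table1 WHERE key = ?", msg).one()
          val updated = if (row == null) 1 else row.getInt("counter") + 1
          session.execute(
            "UPDATE keySpace.table1 SET counter = ? WHERE key = ?",
            Int.box(updated), msg)
        }
      }
    }
  }
}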
And probably the original source code:
https://gist.github.com/koen-dejonghe/39c10357607c698c0b04
On Tue, Sep 22, 2015 at 10:37 AM, Petr Novak wrote:
> To complete the design pattern:
>
> http://stackoverflow.com/questions/30450763/spark-streaming-and-connection-pool-implementation
>
> Petr
>
To complete the design pattern:
http://stackoverflow.com/questions/30450763/spark-streaming-and-connection-pool-implementation
Petr
On Mon, Sep 21, 2015 at 10:02 PM, Romi Kuntsman wrote:
> Cody, that's a great reference!
> As shown there - the best way to connect to an external database from the
>
Cody, that's a great reference!
As shown there - the best way to connect to an external database from the
workers is to create a connection pool on (each) worker.
The driver must pass, via broadcast, the connection string, but not the
connection object itself and not the SparkContext.
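A sketch of that pattern with plain JDBC, where only the URL string is broadcast and each executor lazily opens its own connection; ConnectionHolder and the table are hypothetical stand-ins for a real pool (e.g. HikariCP) and a real schema:

import java.sql.{Connection, DriverManager}
import org.apache.spark.broadcast.Broadcast
import org.apache.spark.streaming.dstream.DStream

// Singleton object: instantiated independently in each executor JVM,
// never serialized from the driver.
object ConnectionHolder {
  private var conn: Connection = _
  def get(url: String): Connection = synchronized {
    if (conn == null || conn.isClosed) conn = DriverManager.getConnection(url)
    conn
  }
}

def save(stream: DStream[String], url: Broadcast[String]): Unit =
  stream.foreachRDD { rdd =>
    rdd.foreachPartition { records =>
      val conn = ConnectionHolder.get(url.value) // the string travels, the connection doesn't
      val stmt = conn.prepareStatement("INSERT INTO messages (msg) VALUES (?)")
      records.foreach { r => stmt.setString(1, r); stmt.executeUpdate() }
      stmt.close()
    }
  }

The broadcast itself is created once on the driver, e.g. val url = sc.broadcast(connectionString).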
That isn't accurate; I think you're confused about foreach.
Look at
http://spark.apache.org/docs/latest/streaming-programming-guide.html#design-patterns-for-using-foreachrdd
On Mon, Sep 21, 2015 at 7:36 AM, Romi Kuntsman wrote:
> foreach is something that runs on the driver, not the workers.
foreach is something that runs on the driver, not the workers.
If you want to perform some function on each record from Cassandra, you
need to do cassandraRdd.map(func), which will run distributed on the Spark
workers.
*Romi Kuntsman*, *Big Data Engineer*
http://www.totango.com
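For example (the connection host, keyspace, table, and column below are placeholders; in spark-shell, where a configured sc already exists, the setup lines can be dropped):

import com.datastax.spark.connector._
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("cassandra-map-example")
  .set("spark.cassandra.connection.host", "127.0.0.1")  // placeholder host
val sc = new SparkContext(conf)

// The table scan and the map both run distributed on the workers; only
// the ten sampled results come back to the driver.
sc.cassandraTable("keySpace", "table1")
  .map(row => row.getString("name").toUpperCase)
  .take(10)
  .foreach(println)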
Yes, but I need to read from the Cassandra DB within a Spark
transformation, something like:

dstream.foreachRDD { rdd =>
  rdd.foreach { message =>
    sc.cassandraTable()
    ...
  }
}

Since rdd.foreach gets executed on the workers, how can I make the
SparkContext available on the workers?
You can use a broadcast variable for passing connection information.
Cheers
> On Sep 21, 2015, at 4:27 AM, Priya Ch wrote:
>
> Can I use this SparkContext on executors?
> In my application, I have a scenario of reading certain records from the DB
> within an RDD. Hence I need the SparkContext to read from
The SparkContext is available on the driver, not on executors.
To read from Cassandra, you can use something like this:
https://github.com/datastax/spark-cassandra-connector/blob/master/doc/2_loading.md
*Romi Kuntsman*, *Big Data Engineer*
http://www.totango.com
Can I use this SparkContext on executors?
In my application, I have a scenario of reading certain records from the DB
within an RDD. Hence I need the SparkContext to read from the DB (Cassandra in
our case). If the SparkContext can't be sent to the executors, what is the
workaround for this?
add @transient?
On Mon, Sep 21, 2015 at 11:36 AM, Petr Novak wrote:
> add @transient?
>
> On Mon, Sep 21, 2015 at 11:27 AM, Priya Ch
> wrote:
>
>> Hello All,
>>
>> How can I pass the SparkContext as a parameter to a method in an object?
>> Because passing the SparkContext is giving me TaskNotSerializable
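A sketch of what @transient buys here (hypothetical class): the annotated field is skipped when the enclosing instance is serialized, so the SparkContext never travels with the task:

import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Without @transient on sc, serializing a Processor instance (e.g. because one
// of its methods is referenced inside an RDD closure) would drag the
// SparkContext along and fail with "Task not serializable".
class Processor(@transient val sc: SparkContext) extends Serializable {
  def lengths(rdd: RDD[String]): RDD[Int] = rdd.map(_.length)
}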